[ICLR2025] LLaVA-HR: High-Resolution Large Language-Vision Assistant
Updated Aug 14, 2024 - Python
Clara: An agentic multimodal AI assistant that can see through your webcam, listen to your voice, think with Gemini, and speak back using ElevenLabs. Built with LangGraph, OpenCV, Groq, and Gradio.
A Next.js multimodal AI chat app that uses open-source models served through Hugging Face.