
ElevenLabs Avatar Integration Demo

Complete open-source example for integrating animated avatars with ElevenLabs conversational AI using Mascotbot SDK. Real-time lip sync, WebSocket support, and production-ready React components.

πŸš€ Quick Start

Deploy with Vercel

After deploying with Vercel:

  1. Add the Mascotbot SDK package (mascotbot-sdk-react-0.1.6.tgz) to your cloned repository
  2. Add your mascot .riv file to the public folder
  3. Commit and push these changes to trigger a rebuild

Prerequisites

  • Node.js 18+
  • Mascotbot SDK (provided as .tgz file after subscription)
  • Mascot .riv file (provided with SDK subscription)
  • ElevenLabs API key and Agent ID
  • Mascotbot API key

Manual Installation

  1. Clone this repository:
git clone https://github.com/mascotbot/elevenlabs-avatar.git
cd elevenlabs-avatar
  2. Copy the Mascotbot SDK package to the project root:
cp /path/to/mascotbot-sdk-react-0.1.6.tgz ./
  3. Copy your mascot .riv file to the public folder:
cp /path/to/mascot.riv ./public/
  4. Install dependencies:
npm install
# or
pnpm install
  5. Set up environment variables:
cp .env.example .env.local
  6. Update .env.local with your credentials:
MASCOT_BOT_API_KEY=your_mascot_bot_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_AGENT_ID=your_elevenlabs_agent_id
  7. Run the development server:
npm run dev
# or
pnpm dev

Open http://localhost:3000 to see the demo in action!

🎯 What This Demo Shows

This example demonstrates:

  • Real-time Lip Sync: Perfect viseme synchronization with ElevenLabs audio streams
  • WebSocket Integration: Automatic data extraction from ElevenLabs connections
  • Natural Mouth Movements: Human-like lip sync processing that avoids robotic over-articulation
  • Production-Ready Components: Complete implementation ready for deployment

πŸ“ Project Structure

elevenlabs-avatar/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ app/
β”‚   β”‚   β”œβ”€β”€ page.tsx          # Main demo page with ElevenLabs avatar
β”‚   β”‚   β”œβ”€β”€ layout.tsx        # Root layout
β”‚   β”‚   β”œβ”€β”€ globals.css       # Global styles
β”‚   β”‚   └── api/
β”‚   β”‚       └── get-signed-url/
β”‚   β”‚           └── route.ts  # API endpoint for ElevenLabs authentication
β”‚   └── components/           # Additional components (if needed)
β”œβ”€β”€ public/                   # Static assets
β”œβ”€β”€ .env.example             # Environment variables template
β”œβ”€β”€ package.json             # Project dependencies
└── README.md               # This file

πŸ”§ Key Features

1. Automatic Viseme Injection

The Mascotbot proxy endpoint automatically injects viseme (mouth shape) data into the ElevenLabs WebSocket stream, enabling lip synchronization without any changes to your ElevenLabs client code.
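Conceptually, the proxied stream interleaves ElevenLabs audio events with viseme events. Here is a minimal sketch of how a client could tell them apart; the `viseme` message type and its fields are illustrative assumptions, not the documented Mascotbot wire format:

```typescript
// Hypothetical shapes for messages arriving over the proxied WebSocket.
// The `viseme` type and its fields are assumptions for illustration only.
interface AudioMessage {
  type: "audio";
  audio_base_64: string;
}

interface VisemeMessage {
  type: "viseme";
  viseme: string;      // e.g. "aa", "oh", "sil"
  timestampMs: number; // when this mouth shape should appear
}

type ProxyMessage = AudioMessage | VisemeMessage;

// Narrowing helper: route viseme events to the avatar, audio to playback.
function isVisemeMessage(msg: ProxyMessage): msg is VisemeMessage {
  return msg.type === "viseme";
}
```

In practice the SDK handles this routing for you; the sketch only shows why no changes to the ElevenLabs client are needed — viseme events simply ride alongside the audio events on the same connection.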

2. Natural Lip Sync Processing

// Human-like mouth movements with configurable parameters
// Important: Use useState to maintain stable object reference
const [lipSyncConfig] = useState({
  minVisemeInterval: 40,
  mergeWindow: 60,
  keyVisemePreference: 0.6,
  preserveSilence: true,
  similarityThreshold: 0.4,
  preserveCriticalVisemes: true,
  criticalVisemeMinDuration: 80,
});

useMascotElevenlabs({
  conversation,
  naturalLipSync: true,
  naturalLipSyncConfig: lipSyncConfig,
});

3. Pre-fetched URLs for Instant Connection

The demo pre-fetches signed URLs and refreshes them every 9 minutes, ensuring instant connection when users click "Start Conversation".
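The refresh logic boils down to a small staleness check. The 9-minute window comes from the demo; the helper and type names below are assumptions for illustration:

```typescript
// Signed URLs are refreshed every 9 minutes in the demo; treat anything
// older than that as stale and re-fetch before starting a conversation.
const SIGNED_URL_TTL_MS = 9 * 60 * 1000;

interface CachedSignedUrl {
  url: string;
  fetchedAtMs: number;
}

// Pure helper: decide whether the cached URL should be refreshed.
function isStale(cached: CachedSignedUrl | null, nowMs: number): boolean {
  if (cached === null) return true;
  return nowMs - cached.fetchedAtMs >= SIGNED_URL_TTL_MS;
}
```

In the demo, a check like this would run on an interval (and on click) so that "Start Conversation" always has a fresh URL ready instead of waiting on a round trip.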

πŸ› οΈ Customization

Using Your Own Avatar

The demo expects a mascot .riv file in the public folder. The file path is configured in src/app/page.tsx:

const mascotUrl = "/mascot.riv"; // Place your .riv file in the public folder

You can also use a CDN URL:

const mascotUrl = "https://your-cdn.com/your-mascot.riv";

Ensure your Rive file has the required inputs:

  • is_speaking - Boolean input for lip sync
  • gesture - Optional trigger for animated reactions

Adjusting Lip Sync Settings

The demo includes full configuration for natural lip sync. Always use useState to maintain a stable object reference:

const [lipSyncConfig] = useState({
  minVisemeInterval: 40,         // Minimum time between visemes (ms)
  mergeWindow: 60,               // Window for merging similar shapes (ms)
  keyVisemePreference: 0.6,      // Preference for distinctive shapes (0-1)
  preserveSilence: true,         // Keep silence visemes
  similarityThreshold: 0.4,      // Threshold for merging (0-1)
  preserveCriticalVisemes: true, // Never skip important shapes
  criticalVisemeMinDuration: 80, // Min duration for critical visemes (ms)
});

You can adjust these values based on your needs:

  • Higher minVisemeInterval: Smoother, less articulated movements
  • Lower minVisemeInterval: More precise articulation
  • Higher keyVisemePreference: More emphasis on distinctive mouth shapes
  • Higher similarityThreshold: More aggressive merging of similar visemes
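To build intuition for mergeWindow and similarityThreshold, here is a simplified sketch of the kind of pass such processing might perform. The similarity function and data shape are invented for illustration and do not mirror the SDK's internals:

```typescript
interface Viseme {
  shape: string; // e.g. "aa", "oh", "sil"
  timeMs: number;
}

// Toy similarity: identical shapes are fully similar, others not at all.
// The real SDK presumably uses a richer, graded measure.
function similarity(a: Viseme, b: Viseme): number {
  return a.shape === b.shape ? 1 : 0;
}

// Drop a viseme when it arrives within `mergeWindow` ms of the previously
// kept one and is at least `similarityThreshold` similar to it.
function mergeVisemes(
  input: Viseme[],
  mergeWindow: number,
  similarityThreshold: number,
): Viseme[] {
  const out: Viseme[] = [];
  for (const v of input) {
    const prev = out[out.length - 1];
    if (
      prev !== undefined &&
      v.timeMs - prev.timeMs < mergeWindow &&
      similarity(prev, v) >= similarityThreshold
    ) {
      continue; // merged into the previous viseme
    }
    out.push(v);
  }
  return out;
}
```

Under this model, raising similarityThreshold means fewer pairs qualify for merging, and widening mergeWindow means more pairs do — which is why the two knobs trade articulation precision against smoothness.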

Styling

The demo uses Tailwind CSS for styling. Modify the classes in src/app/page.tsx to match your design requirements.

πŸ“ Environment Variables

Create a .env.local file with the following variables:

# Mascotbot API Key (get from app.mascot.bot)
MASCOT_BOT_API_KEY=mascot_xxxxxxxxxxxxxx

# ElevenLabs Credentials
ELEVENLABS_API_KEY=sk_xxxxxxxxxxxxxx
ELEVENLABS_AGENT_ID=agent_xxxxxxxxxxxxxx
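These variables are read server-side by the get-signed-url route. A minimal sketch of how such a route might assemble its request to ElevenLabs — the endpoint path and `xi-api-key` header are assumptions, so check the current ElevenLabs API reference before relying on them:

```typescript
// Hypothetical request builder for the get-signed-url route.
// The endpoint path and `xi-api-key` header are assumptions about the
// ElevenLabs API, not guaranteed by this demo.
function buildSignedUrlRequest(agentId: string, apiKey: string) {
  const base = "https://api.elevenlabs.io/v1/convai/conversation/get_signed_url";
  return {
    url: `${base}?agent_id=${encodeURIComponent(agentId)}`,
    headers: { "xi-api-key": apiKey },
  };
}
```

Keeping this on the server matters: the ElevenLabs API key never reaches the browser, and the client only ever receives the short-lived signed URL.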

🚨 Important Notes

WebSocket Proxy Requirement

Do NOT connect directly to ElevenLabs WebSocket URLs. The avatar lip-sync requires viseme data that only the Mascotbot proxy provides. Direct connections will result in no mouth movement.

Browser Requirements

  • Modern browser with WebGL2 support
  • Microphone access for voice interaction
  • Stable internet connection for WebSocket streaming

Performance

  • Less than 50ms audio-to-visual delay
  • WebGL2 acceleration for smooth 120fps animation
  • Minimal CPU usage (less than 1%)

πŸ› Troubleshooting

Avatar Not Moving?

  1. Check browser console for WebSocket errors
  2. Verify environment variables are set correctly
  3. Ensure Rive file has correct input names (is_speaking, gesture)
  4. Confirm you're using the Mascotbot proxy endpoint, not direct ElevenLabs connection

Connection Failed?

  1. Verify your API keys are correct
  2. Check that your ElevenLabs agent is active
  3. Ensure microphone permissions are granted
  4. Look for errors in the browser console

Lip Sync Out of Sync?

  1. Check network latency
  2. Adjust natural lip sync parameters
  3. Try different presets based on speech speed
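This README does not enumerate the presets, but as a starting point you could hand-roll preset objects along these lines. All values below are illustrative guesses for fast vs. slow speech, not presets shipped with the SDK:

```typescript
// Illustrative hand-rolled "presets" for the naturalLipSyncConfig shown
// above; these values are guesses, not presets shipped with the SDK.
const fastSpeechConfig = {
  minVisemeInterval: 30,    // tighter spacing for rapid articulation
  mergeWindow: 45,
  keyVisemePreference: 0.7,
  preserveSilence: true,
  similarityThreshold: 0.5, // merge more aggressively at speed
  preserveCriticalVisemes: true,
  criticalVisemeMinDuration: 60,
};

const slowSpeechConfig = {
  minVisemeInterval: 55,    // smoother, less articulated movement
  mergeWindow: 80,
  keyVisemePreference: 0.5,
  preserveSilence: true,
  similarityThreshold: 0.3,
  preserveCriticalVisemes: true,
  criticalVisemeMinDuration: 100,
};
```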

πŸ“š Documentation

For complete documentation on the Mascotbot SDK and all available features, visit:

πŸ“„ License

This demo is provided as an open-source example for Mascotbot subscribers. You're free to use, modify, and deploy it as needed for your projects.

🀝 Support


Built with ❀️ by the Mascotbot team
