Voice activity detector (VAD) for the browser with a simple API
Updated Oct 29, 2025 - TypeScript
Voice Activity Detector (VAD): low-latency, high-performance, and lightweight
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
Very fast, accurate speaker diarization
A real-time Voice Activity Detection (VAD) library for iOS and macOS using Silero models powered by ONNX Runtime. Includes advanced noise suppression and audio preprocessing with WebRTC APM, supporting seamless WAV data output with header metadata.
iOS Voice Activity Detection (VAD). Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
This repository shows how to use Silero VAD with ONNX Runtime Web to run VAD completely in the browser.
A sophisticated real-time voice assistant that seamlessly integrates speech recognition, AI reasoning, and neural text-to-speech synthesis. It is designed for natural conversational interactions with advanced tool-calling capabilities.
VAD is a cross-platform Dart binding for the VAD JavaScript library. This package provides access to a Voice Activity Detection (VAD) system, allowing Flutter applications to start and stop VAD-based listening and handle various VAD events.
Uses the excellent Silero VAD with the ONNX Runtime C API for fast detection of audio segments that contain speech
Audio transcription using MLX Whisper and VAD-based silence processing
Python script for detecting silences with Silero VAD and transcribing with the Whisper AI model.
This repo provides an addon that can perform VAD model inference in Node.js and Electron environments, based on cmake-js and FastDeploy. Silero VAD is a pre-trained enterprise-grade Voice Activity Detector.
C++ implementation of real-time Voice Activity Detection (VAD) using Silero models with ONNX Runtime and WebRTC Audio Processing. Provides precise voice segmentation and cross-platform XCFramework support.
Real-time speech-to-text translation over WebSocket. Streams Opus or raw PCM audio from client to server for live transcription and optional translation. Supports CLI and Python API.
Experimental voice user interface (VUI) to interact with an agentic AI assistant
Test comparison of two VAD models with English and multilingual speech datasets
A voice assistant with local LLM as a backend
Enterprise VAD (Voice Activity Detection) in C# (.NET 6.0+) with Microsoft.ML.Net, ONNXRuntime and DirectML. An easy, efficient, and performant Silero VAD implementation. PRs are always welcome.