
Real-Time MLX Voice Agent

Ultra-responsive, full-duplex voice assistant tuned for Apple Silicon + MLX. The end-to-end round trip (speech → STT → LLM → TTS) stays consistently under 1 second, even while handling barge-in.

Watch the demo

Highlights

  • On-device VAD, STT (Whisper), LLM, and Kokoro TTS sharing a single MLX runtime
  • Sentence-streaming LLM responses with immediate, cancellable TTS playback (see the sketch after this list)
  • Client-side AudioWorklet correlator for robust echo suppression and barge-in
  • Rolling audio buffer to preserve context around interruptions
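
How the streaming handoff works, as a minimal sketch: LLM tokens are buffered until a sentence boundary, each completed sentence is spoken right away, and a shared cancel event aborts both generation and playback on barge-in. stream_tokens and speak are hypothetical stand-ins here, not the wrappers used in this repo.

# Hedged sketch (not the repo's actual code) of sentence-streaming with cancellable TTS.
import asyncio
import re

SENTENCE_END = re.compile(r"[.!?]\s")

async def stream_tokens():
    # Placeholder token stream; the real handler would yield tokens from the MLX LLM.
    for tok in "Sure. The weather looks clear today. Anything else? ".split(" "):
        yield tok + " "
        await asyncio.sleep(0.02)

async def speak(sentence: str, cancel: asyncio.Event):
    # Placeholder for Kokoro synthesis + playback; checks the cancel flag between chunks.
    for _ in range(5):
        if cancel.is_set():
            return  # barge-in: stop playback immediately
        await asyncio.sleep(0.05)
    print("spoke:", sentence.strip())

async def respond(cancel: asyncio.Event):
    buf = ""
    async for tok in stream_tokens():
        if cancel.is_set():
            return  # user interrupted mid-generation
        buf += tok
        m = SENTENCE_END.search(buf)
        if m:
            sentence, buf = buf[: m.end()], buf[m.end():]
            await speak(sentence, cancel)  # start TTS as soon as a sentence completes
    if buf.strip():
        await speak(buf, cancel)

asyncio.run(respond(asyncio.Event()))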

Quick Start

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python server.py

Then open http://localhost:8000 and hit Start Listening.

Configuration

  • config.py: audio rate, VAD thresholds, queue sizes, model choices (illustrative sketch after this list)
  • index.html: INTERRUPTION object for interruption sensitivity
  • requirements.txt: pin runtime dependencies for MLX + Kokoro
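
The exact constants live in config.py; the names below are illustrative guesses at the kind of knobs it exposes, not the repo's actual settings.

# Hypothetical config.py values, for orientation only.
SAMPLE_RATE = 16_000        # mic capture rate fed to Whisper
VAD_THRESHOLD = 0.5         # Silero VAD speech-probability cutoff
MIN_SILENCE_MS = 400        # trailing silence needed to close a speech segment
SEGMENT_QUEUE_SIZE = 8      # max segments buffered for transcription
WHISPER_MODEL = "whisper-large-v3-turbo"      # example model choice
LLM_MODEL = "Llama-3.2-3B-Instruct-4bit"      # example model choice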

Architecture (at a glance)

Mic (16 kHz) → Correlator → VAD State Machine → Segment Queue
      ↓                                         ↓
  WebSocket ↔ Browser UI                 MLX Whisper STT
                                               ↓
                                    MLX LLM → Kokoro TTS
                                               ↓
                                        Playback + barge-in feedback
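
A minimal sketch of how the server side of this diagram can fit together, assuming a FastAPI WebSocket endpoint at /ws. transcribe, generate_reply, and speak are hypothetical stand-ins for the wrappers in transcriber.py, llm_handler.py, and tts_handler.py, not this repo's actual APIs.

import asyncio
from fastapi import FastAPI, WebSocket

app = FastAPI()
segments: asyncio.Queue = asyncio.Queue()    # speech segments awaiting transcription

def transcribe(pcm: bytes) -> str:           # stand-in for the MLX Whisper wrapper
    return "hello"

async def generate_reply(text: str):         # stand-in for the MLX LLM, sentence-streamed
    yield "Hi there."

def speak(sentence: str) -> bytes:           # stand-in for Kokoro TTS synthesis
    return b"\x00" * 320

@app.websocket("/ws")
async def ws_endpoint(ws: WebSocket):
    await ws.accept()
    worker = asyncio.create_task(pipeline(ws))
    try:
        while True:
            chunk = await ws.receive_bytes()  # 16 kHz PCM from the browser correlator
            await segments.put(chunk)         # the real server runs VAD before queueing
    finally:
        worker.cancel()

async def pipeline(ws: WebSocket):
    while True:
        segment = await segments.get()
        text = transcribe(segment)                    # STT
        async for sentence in generate_reply(text):   # LLM, one sentence at a time
            await ws.send_bytes(speak(sentence))      # TTS audio back for playback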

Repository Map

  • server.py – FastAPI WebSocket server + pipeline orchestration
  • audio_buffer.py, vad_detector.py – speech segmentation utilities
  • transcriber.py, llm_handler.py, tts_handler.py – model wrappers
  • index.html, correlator.js – interactive UI with barge-in logic

Credits

Silero VAD, Whisper, MLX, Kokoro TTS, and Pipecat SmartTurn for end-of-utterance (EoU) detection. MIT licensed.
