Turn detection plugin for LiveKit Agents using Namo Turn Detector models.

    pip install livekit-plugins-namo-turn-detector

- Single-Language Models: Memory-efficient models for Vietnamese, English, Chinese (NEW ✨)
- Multilingual Support: 23 languages with a unified multilingual model
- High Accuracy: Language-specific models outperform baseline models
- Fast & Efficient: Optimized inference with 66% less memory for single-language apps
- Async API: Built on LiveKit's inference runner for optimal performance
- Easy Integration: Drop-in replacement for existing turn detectors
Most memory-efficient option - loads only one language model (~200MB):

Vietnamese:

    from livekit import agents
    from livekit.plugins import namo_turn_detector

    async def entrypoint(ctx: agents.JobContext):
        model = namo_turn_detector.vi_model.VietnameseModel(threshold=0.7)
        # chat_ctx is the conversation's chat context, built elsewhere in the agent
        prob = await model.predict_end_of_turn(chat_ctx)

English:

    from livekit import agents
    from livekit.plugins import namo_turn_detector

    async def entrypoint(ctx: agents.JobContext):
        model = namo_turn_detector.en_model.EnglishModel(threshold=0.7)
        prob = await model.predict_end_of_turn(chat_ctx)

Chinese:

    from livekit import agents
    from livekit.plugins import namo_turn_detector

    async def entrypoint(ctx: agents.JobContext):
        model = namo_turn_detector.zh_model.ChineseModel(threshold=0.7)
        prob = await model.predict_end_of_turn(chat_ctx)

Benefits:
- ✅ 66% less memory (~200MB vs ~600MB)
- ✅ 3x faster initialization
- ✅ Highest accuracy for the language
- ✅ Best for single-language production apps
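
The 66% figure follows from the two approximate memory estimates quoted in this README. A quick check of that arithmetic, using only those numbers (the variable names are illustrative):

```python
# Approximate footprints quoted in this README, in MB.
single_model_mb = 200   # one single-language model
all_three_mb = 600      # LanguageSpecificModel, which loads en, vi, and zh

# Fraction of memory saved by loading one model instead of three.
savings = (all_three_mb - single_model_mb) / all_three_mb
print(f"Memory saved: {savings:.1%}")
```

The result is two thirds, which the list above quotes as 66%. Both inputs are rough estimates, so treat the percentage as approximate too.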
Use when you need to switch between English, Vietnamese, or Chinese:

    from livekit import agents
    from livekit.plugins.namo_turn_detector.language_specific import LanguageSpecificModel

    async def entrypoint(ctx: agents.JobContext):
        # Loads all 3 models (en, vi, zh) - ~600MB
        model = LanguageSpecificModel(language="vi", threshold=0.7)
        # chat_ctx is the conversation's chat context, built elsewhere in the agent
        prob = await model.predict_end_of_turn(chat_ctx)

Use when you need support for many languages:

    from livekit import agents
    from livekit.plugins.namo_turn_detector.multilingual import MultilingualModel

    async def entrypoint(ctx: agents.JobContext):
        model = MultilingualModel(threshold=0.7)
        prob = await model.predict_end_of_turn(chat_ctx)

Comparison across English, Vietnamese, and Chinese:

Sample: "Hello, how are you?"
• Namo Multilingual: 0.8757 (16ms) - EOT: True
• Namo English-Specific: 0.0002 (13ms) - EOT: False
• LiveKit Multilingual: 0.2838 (33ms) - EOT: True
• LiveKit English: 0.4596 (4ms) - EOT: True
Sample: "What's the weather like today?"
• Namo Multilingual: 0.8032 (15ms) - EOT: True
• Namo English-Specific: 0.9999 (9ms) - EOT: True ⭐
• LiveKit Multilingual: 0.7799 (27ms) - EOT: True
• LiveKit English: 0.9409 (3ms) - EOT: True
Sample: "Xin chào, bạn khỏe không?" (Hello, how are you?)
• Namo Multilingual: 0.8651 (25ms) - EOT: True
• Namo Vietnamese-Specific: 0.9857 (36ms) - EOT: True ⭐
• LiveKit Multilingual: 0.0322 (20ms) - EOT: False
Sample: "Thời tiết hôm nay thế nào?" (What's the weather today?)
• Namo Multilingual: 0.5168 (27ms) - EOT: False
• Namo Vietnamese-Specific: 0.9952 (4ms) - EOT: True ⭐
• LiveKit Multilingual: 0.2988 (22ms) - EOT: False
Sample: "Vay ở đâu" (Where to borrow) - Incomplete phrase
• Namo Multilingual: 0.6599 (20ms) - EOT: False
• Namo Vietnamese-Specific: 0.9875 (10ms) - EOT: True ⭐
• LiveKit Multilingual: 0.5106 (25ms) - EOT: False
Sample: "你好,你好吗?" (Hello, how are you?)
• Namo Multilingual: 0.6525 (30ms) - EOT: False
• Namo Chinese-Specific: 0.8777 (16ms) - EOT: True ⭐
• LiveKit Multilingual: 0.8520 (20ms) - EOT: True
Sample: "今天天气怎么样?" (What's the weather today?)
• Namo Multilingual: 0.6818 (18ms) - EOT: False
• Namo Chinese-Specific: 0.9090 (34ms) - EOT: True ⭐
• LiveKit Multilingual: 0.9707 (20ms) - EOT: True
Key Insights:
- Language-Specific models show superior accuracy for their target languages
- Namo Multilingual provides consistent performance across all languages
- Inference speed is competitive: Namo predictions took 4-36ms in the samples above, most of them 10-30ms
- On every Vietnamese sample, both Namo models score higher than the LiveKit multilingual baseline (0.52-1.00 vs 0.03-0.51)
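
The EOT flags reported for the Namo models are consistent with comparing each probability against the default threshold of 0.7. The sketch below replays that comparison on the Namo scores listed above. `is_end_of_turn` is a hypothetical helper, not part of the plugin, and the `>=` comparison is an assumption about how the threshold is applied. The LiveKit rows are left out because their flags appear to depend on different thresholds.

```python
# (model, probability, reported EOT flag), copied from the samples above.
NAMO_RESULTS = [
    ("Multilingual", 0.8757, True),
    ("English-Specific", 0.0002, False),
    ("Multilingual", 0.8032, True),
    ("English-Specific", 0.9999, True),
    ("Multilingual", 0.8651, True),
    ("Vietnamese-Specific", 0.9857, True),
    ("Multilingual", 0.5168, False),
    ("Vietnamese-Specific", 0.9952, True),
    ("Multilingual", 0.6599, False),
    ("Vietnamese-Specific", 0.9875, True),
    ("Multilingual", 0.6525, False),
    ("Chinese-Specific", 0.8777, True),
    ("Multilingual", 0.6818, False),
    ("Chinese-Specific", 0.9090, True),
]


def is_end_of_turn(prob: float, threshold: float = 0.7) -> bool:
    """Hypothetical helper: assume a turn ends when prob >= threshold."""
    return prob >= threshold


# Rows where the assumed rule disagrees with the flag reported above.
mismatches = [row for row in NAMO_RESULTS if is_end_of_turn(row[1]) != row[2]]
print(f"{len(mismatches)} mismatches out of {len(NAMO_RESULTS)} rows")
```

Raising or lowering `threshold` trades premature cut-ins against added latency. The multilingual scores between 0.5 and 0.7 are the ones that would flip first.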

    from livekit.plugins import namo_turn_detector

    namo_turn_detector.vi_model.VietnameseModel(threshold: float = 0.7)
    namo_turn_detector.en_model.EnglishModel(threshold: float = 0.7)
    namo_turn_detector.zh_model.ChineseModel(threshold: float = 0.7)

Parameters:
- threshold: Detection threshold (0.0-1.0), default 0.7

Properties:
- language: Language code ("vi", "en", or "zh")
- model: Model name (e.g., "namo-vi")
- threshold: Current detection threshold

Methods:
- predict_end_of_turn(chat_ctx, timeout=10.0) -> float: Returns probability (0.0-1.0)
- unlikely_threshold(language) -> float: Get model's threshold for language

Memory Usage: ~200MB per model (loads only one language)

    LanguageSpecificModel(language: str, threshold: float = 0.7)

Parameters:
- language: Language code ("en", "vi", "zh")
- threshold: Detection threshold (0.0-1.0)

Methods:
- predict_end_of_turn(chat_ctx, timeout=10.0) -> float: Returns probability (0.0-1.0)
- unlikely_threshold(language) -> float: Get model's threshold for language

Memory Usage: ~600MB (loads all 3 models: en, vi, zh)

    MultilingualModel(threshold: float = 0.7)

Methods:
- predict_end_of_turn(chat_ctx, timeout=10.0) -> float: Returns probability (0.0-1.0)
- unlikely_threshold(language) -> float: Get model's threshold for language

Memory Usage: ~400MB (single multilingual model for 23 languages)
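
All three model types expose the same two methods, so calling code can stay model-agnostic. The sketch below shows one way to combine them. `StubTurnDetector` and `turn_is_complete` are hypothetical stand-ins written for this example so that it runs without the plugin or downloaded model files. It also assumes `unlikely_threshold` is a plain synchronous method, as the signature above suggests.

```python
import asyncio


class StubTurnDetector:
    """Hypothetical stand-in exposing the two documented methods."""

    def __init__(self, threshold: float = 0.7, fixed_prob: float = 0.9):
        self.threshold = threshold
        self._fixed_prob = fixed_prob

    async def predict_end_of_turn(self, chat_ctx, timeout: float = 10.0) -> float:
        # A real model would run inference here; the stub returns a fixed score.
        await asyncio.sleep(0)
        return self._fixed_prob

    def unlikely_threshold(self, language: str) -> float:
        # Assumption: the same threshold applies to every language.
        return self.threshold


async def turn_is_complete(model, chat_ctx, language: str) -> bool:
    """Compare the predicted probability with the model's threshold."""
    prob = await model.predict_end_of_turn(chat_ctx, timeout=10.0)
    return prob >= model.unlikely_threshold(language)


async def main() -> None:
    model = StubTurnDetector(fixed_prob=0.9)
    print(await turn_is_complete(model, chat_ctx=None, language="en"))


asyncio.run(main())
```

Swapping the stub for `VietnameseModel`, `LanguageSpecificModel`, or `MultilingualModel` should leave `turn_is_complete` unchanged, provided the real methods behave as documented above.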

    python main.py download-files

Choose the right model for your use case:

| Model | Languages | Memory | Init Speed | Accuracy | Best For |
|---|---|---|---|---|---|
| VietnameseModel | Vietnamese | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | Vietnamese-only apps |
| EnglishModel | English | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | English-only apps |
| ChineseModel | Chinese | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | Chinese-only apps |
| LanguageSpecificModel | EN, VI, ZH | ~600MB | ⚡ Slow | ⭐⭐⭐ High | Multi-lang apps (3 langs) |
| MultilingualModel | 23 languages | ~400MB | ⚡⚡ Medium | ⭐⭐ Good | Global apps (many langs) |

Recommendation: Use single-language models (VietnameseModel, EnglishModel, ChineseModel) for production apps serving one language. They provide 66% memory savings and 3x faster initialization.
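
The selection guidance above can be written as a small decision function. This is a sketch of one reading of that guidance, not part of the plugin: `pick_model` is a hypothetical helper that returns class names as strings, and it does not check that a language is actually among the 23 the multilingual model supports.

```python
# Single-language classes named in this README, keyed by language code.
SINGLE_LANGUAGE_MODELS = {
    "vi": "VietnameseModel",
    "en": "EnglishModel",
    "zh": "ChineseModel",
}


def pick_model(languages: set[str]) -> str:
    """Hypothetical helper: map required language codes to a model class name."""
    if not languages:
        raise ValueError("at least one language code is required")
    supported = set(SINGLE_LANGUAGE_MODELS)
    if len(languages) == 1 and languages <= supported:
        # One of vi/en/zh only: smallest footprint (~200MB).
        return SINGLE_LANGUAGE_MODELS[next(iter(languages))]
    if languages <= supported:
        # Several of vi/en/zh: loads all three models (~600MB).
        return "LanguageSpecificModel"
    # Anything else falls back to the multilingual model (~400MB).
    return "MultilingualModel"


print(pick_model({"vi"}))
print(pick_model({"en", "zh"}))
print(pick_model({"en", "fr"}))
```

An app that needs only two of the three languages might still prefer two single-language models (~400MB) over `LanguageSpecificModel` (~600MB). The helper ignores that option to stay close to the table.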

- Single-Language Models: Vietnamese (vi), English (en), Chinese (zh)
- Multi-Language Model (LanguageSpecificModel): English (en), Vietnamese (vi), Chinese (zh)
- Multilingual Model (23 languages): Arabic, Bengali, Chinese, Danish, Dutch, English, Finnish, French, German, Hindi, Indonesian, Italian, Japanese, Korean, Marathi, Norwegian, Polish, Portuguese, Russian, Spanish, Turkish, Ukrainian, Vietnamese

Apache-2.0
- Models: Namo Turn Detector v1 by VideoSDK
- Framework: LiveKit Agents
@software{namo2025,
title = {Namo Turn Detector v1: Semantic Turn Detection for Conversational AI},
author = {VideoSDK Team},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/collections/videosdk-live/namo-turn-detector-v1-68d52c0564d2164e9d17ca97}
}