
Translate the video from one language to another and embed dubbing & subtitles.


pyVideoTrans

A Powerful Open Source Video Translation / Audio Transcription / AI Dubbing / Subtitle Translation Tool

Chinese | Documentation | Online Q&A


pyVideoTrans is dedicated to seamlessly converting videos from one language to another, offering a complete workflow that includes speech recognition, subtitle translation, multi-role dubbing, and audio-video synchronization. It supports both local offline deployment and a wide variety of mainstream online APIs.


✨ Core Features

  • 🎥 Fully Automatic Video Translation: One-click workflow: Speech Recognition (ASR) -> Subtitle Translation -> Speech Synthesis (TTS) -> Video Synthesis.
  • 🎙️ Audio Transcription / Subtitle Generation: Batch convert audio/video to SRT subtitles, supporting Speaker Diarization to distinguish between different roles.
  • 🗣️ Multi-Role AI Dubbing: Assign different AI dubbing voices to different speakers.
  • 🧬 Voice Cloning: Integrates models like F5-TTS, CosyVoice, GPT-SoVITS for zero-shot voice cloning.
  • 🧠 Powerful Model Support:
    • ASR: Faster-Whisper (Local), OpenAI Whisper, Alibaba Qwen, ByteDance Volcano, Azure, Google, etc.
    • LLM Translation: DeepSeek, ChatGPT, Claude, Gemini, Ollama (Local), Alibaba Bailian, etc.
    • TTS: Edge-TTS (Free), OpenAI, Azure, Minimaxi, ChatTTS, ChatterBox, etc.
  • 🖥️ Interactive Editing: Supports pausing and manual proofreading at each stage (recognition, translation, dubbing) to ensure accuracy.
  • 🛠️ Utility Toolkit: Includes auxiliary tools such as vocal separation, video/subtitle merging, audio-video alignment, and transcript matching.
  • 💻 Command Line Interface (CLI): Supports headless operation, convenient for server deployment or batch processing.

🚀 Quick Start (Windows Users)

We provide a pre-packaged .exe version for Windows 10/11 users, requiring no Python environment configuration.

  1. Download: Click to download the latest pre-packaged version
  2. Unzip: Extract the compressed file to a path (e.g., D:\pyVideoTrans).
  3. Run: Double-click sp.exe inside the folder to launch.

Note:

  • Do not run directly from within the compressed archive.
  • To use GPU acceleration, ensure CUDA 12.8 and cuDNN 9.11 are installed.

🛠️ Source Deployment (macOS / Linux / Windows Developers)

We recommend using uv for package management for faster speed and better environment isolation.

1. Prerequisites

  • Python: version 3.10 to 3.12 is recommended.
  • FFmpeg: Must be installed and configured in the environment variables.
    • macOS: brew install ffmpeg libsndfile git
    • Linux (Ubuntu/Debian): sudo apt-get install ffmpeg libsndfile1-dev
    • Windows: Download FFmpeg and configure Path, or place ffmpeg.exe and ffprobe.exe directly in the project directory.
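Before launching, it can save debugging time to confirm the FFmpeg binaries are actually reachable. A minimal pre-flight sketch (the `missing_tools` helper is ours, not part of pyVideoTrans):

```python
import shutil

def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the required executables that are NOT found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

missing = missing_tools()
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("All FFmpeg tools found")
```

On Windows, `shutil.which` also picks up an `ffmpeg.exe` placed in the current working directory if that directory is on `Path`, matching the project-directory option above.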

2. Install uv (If not installed)

# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

3. Clone and Install

# 1. Clone the repository (Ensure path has no spaces/Chinese characters)
git clone https://github.com/jianchang512/pyvideotrans.git
cd pyvideotrans

# 2. Install dependencies (uv automatically syncs environment)
uv sync

4. Launch Software

Launch GUI:

uv run sp.py

Use the CLI (see the documentation for detailed parameters):

# Video Translation Example
uv run cli.py --task vtv --name "./video.mp4" --source_language_code zh --target_language_code en

# Audio to Subtitle Example
uv run cli.py --task stt --name "./audio.wav" --model_name large-v3
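For batch processing on a server, one invocation per file can be generated from a folder. A sketch using the same `stt` flags as the example above (`build_stt_commands` is a hypothetical helper, not part of the project):

```python
from pathlib import Path

def build_stt_commands(folder, model="large-v3"):
    """Build one cli.py invocation per .wav file in folder."""
    return [
        ["uv", "run", "cli.py", "--task", "stt",
         "--name", str(p), "--model_name", model]
        for p in sorted(Path(folder).glob("*.wav"))
    ]

for cmd in build_stt_commands("./audios"):
    print(" ".join(cmd))  # swap print for subprocess.run(cmd, check=True) to execute
```

Printing the commands first (a dry run) makes it easy to verify paths and flags before kicking off a long unattended job.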

5. (Optional) GPU Acceleration Configuration

If you have an NVIDIA graphics card, execute the following commands to install the CUDA-supported PyTorch version:

# Uninstall CPU version
uv remove torch torchaudio

# Install CUDA version (Example for CUDA 12.x)
uv add torch==2.7 torchaudio==2.7 --index-url https://download.pytorch.org/whl/cu128
uv add nvidia-cublas-cu12 nvidia-cudnn-cu12
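After reinstalling, it is worth confirming that the new torch build can actually see the GPU. A small check script (run it with `uv run python`; the wording of the messages is ours):

```python
# Report whether the installed torch build has a usable CUDA device.
try:
    import torch
except ImportError:
    status = "torch is not installed in this environment"
else:
    if torch.cuda.is_available():
        status = f"CUDA available: {torch.cuda.get_device_name(0)}"
    else:
        status = "torch is installed, but CUDA is not available (CPU build or driver issue)"
print(status)
```

If this reports no CUDA despite an NVIDIA card, the usual suspects are a CPU-only torch wheel or a driver older than the CUDA 12.8 / cuDNN 9.11 requirement noted above.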

🧩 Supported Channels & Models (Partial)

| Category | Channel/Model | Description |
| --- | --- | --- |
| ASR (Speech Recognition) | Faster-Whisper (Local) | Recommended; fast, high accuracy |
| | WhisperX / Parakeet | Supports timestamp alignment & speaker diarization |
| | Alibaba Qwen3-ASR / ByteDance Volcano | Online API; excellent for Chinese |
| Translation (LLM/MT) | DeepSeek / ChatGPT | Context-aware, more natural translation |
| | Google / Microsoft | Traditional machine translation; fast |
| | Ollama / M2M100 | Fully local offline translation |
| TTS (Speech Synthesis) | Edge-TTS | Free Microsoft service; natural-sounding |
| | F5-TTS / CosyVoice | Voice cloning; requires local deployment |
| | GPT-SoVITS / ChatTTS | High-quality open-source TTS |
| | 302.AI / OpenAI / Azure | High-quality commercial APIs |

📚 Documentation & Support

⚠️ Disclaimer

This software is an open-source, free, non-commercial project. Users are solely responsible for any legal consequences arising from the use of this software (including but not limited to calling third-party APIs or processing copyrighted video content). Please comply with local laws and regulations and the terms of use of relevant service providers.

🙏 Acknowledgements

This project builds primarily on the following open-source projects (partial list):


Created by jianchang512
