A powerful, GPU-accelerated offline audio transcriber using OpenAI's Whisper models. Built with Python and Gradio.
- GPU Acceleration: Uses CUDA for fast transcription when an NVIDIA GPU is available (falls back to CPU).
- Offline Privacy: All processing happens locally on your machine.
- Multiple Models: Choose from Tiny (fastest) to Large (most accurate).
- Simple UI: Clean web interface powered by Gradio.
- Export: Save transcriptions directly to text files.
- Python 3.10 or higher
- NVIDIA GPU (Recommended for speed, but works on CPU)
- FFmpeg installed and added to system PATH.
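Whisper shells out to FFmpeg for audio decoding, so it must be discoverable on PATH. A quick stdlib check (no third-party packages needed):

```python
# Check that FFmpeg is discoverable on the system PATH.
import shutil

ffmpeg_path = shutil.which("ffmpeg")
print("FFmpeg found at:", ffmpeg_path if ffmpeg_path else "NOT FOUND -- install it and add it to PATH")
```

If this prints `NOT FOUND`, install FFmpeg and restart your terminal before running the app.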
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/VocalText.git
  cd VocalText
  ```

- Create a virtual environment:

  ```bash
  conda create -n vocalText python=3.10
  conda activate vocalText
  ```

- Install dependencies:

  ```bash
  # Install PyTorch with CUDA support (for NVIDIA GPUs)
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
  # Install other requirements
  pip install -r requirements.txt
  ```
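After installing, you can verify that PyTorch sees your GPU. This sketch guards the import so it also runs cleanly before PyTorch is installed:

```python
# Sanity check: is PyTorch installed, and is CUDA usable?
import importlib.util

def cuda_status() -> str:
    """Report CUDA availability without crashing if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    return "cuda available" if torch.cuda.is_available() else "cpu only"

print(cuda_status())
```

If this prints `cpu only` on a machine with an NVIDIA GPU, you likely installed the CPU-only PyTorch build; reinstall using the CUDA index URL above.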
Option 1: Double-click `start_app.bat` (Windows only).

Option 2: Command line:

```bash
conda activate vocalText
python app.py
```

The interface will open in your default web browser automatically.
Models are downloaded automatically on first use to your local cache.
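By default, the openai-whisper package caches model weights under `~/.cache/whisper` (overridable via the `XDG_CACHE_HOME` environment variable). A small stdlib snippet to locate the cache on your machine:

```python
# Resolve the default Whisper model cache directory.
# Assumes the openai-whisper convention: $XDG_CACHE_HOME/whisper, else ~/.cache/whisper.
import os
from pathlib import Path

def whisper_cache_dir() -> Path:
    base = os.environ.get("XDG_CACHE_HOME", str(Path.home() / ".cache"))
    return Path(base) / "whisper"

print(whisper_cache_dir())
```

Deleting files in this directory frees disk space; the model will simply be re-downloaded the next time you select it.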
| Model | Size | Speed | Accuracy |
|---|---|---|---|
| Tiny | 39 MB | Very Fast | Low |
| Base | 74 MB | Fast | Moderate |
| Small | 244 MB | Moderate | Good |
| Medium | 769 MB | Slow | High |
| Large | 1.5 GB | Slowest | Best |
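When choosing a model, a simple rule of thumb is to pick the largest one that fits your download/VRAM budget. A hypothetical helper (sizes mirror the table above; these are download sizes, not runtime memory):

```python
# Hypothetical helper: pick the largest Whisper model within a size budget (MB).
MODEL_SIZES_MB = {"tiny": 39, "base": 74, "small": 244, "medium": 769, "large": 1536}

def pick_model(budget_mb: int) -> str:
    """Return the largest model no bigger than budget_mb, defaulting to 'tiny'."""
    fitting = [name for name, size in MODEL_SIZES_MB.items() if size <= budget_mb]
    return max(fitting, key=MODEL_SIZES_MB.get) if fitting else "tiny"

print(pick_model(300))  # -> "small"
```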
- Select a model from the dropdown menu.
- Upload an audio file.
- You can trim the audio file down in the viewport by clicking the scissor icon.
- Click the "Transcribe" button.
- If the selected model hasn't been downloaded yet, the first transcription will take a while as the model downloads. Once it is cached, transcription starts immediately.
- The transcription will be displayed in the viewport.
- You can copy the text from the viewport or save it to a text file.
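The steps above can also be run without the UI via the openai-whisper Python API. This is a minimal sketch, not the app's actual implementation; `audio.mp3` and `output.txt` are placeholder paths:

```python
def transcribe_to_file(audio_path: str, out_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with Whisper and save the text. Requires openai-whisper and FFmpeg."""
    import whisper  # imported lazily so this module loads even without the package

    model = whisper.load_model(model_name)   # downloads the model on first use
    result = model.transcribe(audio_path)
    text = result["text"].strip()
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(text)
    return text

# Example (uncomment with a real audio file):
# print(transcribe_to_file("audio.mp3", "output.txt", model_name="small"))
```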
MIT License


