Jiayou-Chao/transcribe_youtube

transcribe_youtube

A command-line tool and Python package that downloads the audio track of a YouTube video and transcribes it with OpenAI's Whisper model.

Note: This package downloads YouTube audio with yt-dlp, which is more reliable and more actively maintained than pytube.

Features

  • Download audio from YouTube videos
  • Transcribe audio files with Whisper
  • Support for multiple Whisper models (tiny, base, small, medium, large, turbo)
  • Language detection and specification
  • Translation to English
  • Multiple output formats (txt, json, srt, vtt)
  • Flexible CLI: download only, transcribe only, or both

Installation

Prerequisites

This tool requires ffmpeg to be installed on your system (needed for both Whisper and yt-dlp):

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on MacOS using Homebrew
brew install ffmpeg

# on Windows using Chocolatey
choco install ffmpeg

# on Windows using Scoop
scoop install ffmpeg
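
Because both Whisper and yt-dlp shell out to ffmpeg, it can help to confirm that the executable is visible on PATH before running the tool. The following is a minimal sketch using only the Python standard library; the helper names `find_ffmpeg` and `ffmpeg_status` are hypothetical and are not part of this package.

```python
import shutil


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_status(path):
    """Turn the lookup result into a short human-readable message."""
    if path is None:
        return "ffmpeg not found on PATH; install it before running the tool"
    return f"ffmpeg found at {path}"


print(ffmpeg_status(find_ffmpeg()))
```

If the message reports that ffmpeg is missing, install it with one of the package-manager commands above and open a new shell so the updated PATH is picked up.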

Installing the package

# Install from the current directory
pip install .

# Or install directly from GitHub
pip install git+https://github.com/Jiayou-Chao/transcribe_youtube.git

Usage

Command-line interface

The tool provides three main commands:

1. Download audio from YouTube

# Basic usage
transcribe-youtube download --url "https://www.youtube.com/watch?v=VIDEO_ID"

# Specify output path
transcribe-youtube download --url "https://www.youtube.com/watch?v=VIDEO_ID" --output "audio.mp3"
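
For readers curious about what an audio-only download with yt-dlp can look like in Python, here is a rough sketch. It is an assumption about one reasonable configuration, not a copy of this package's implementation: the function names, the MP3 codec choice, and the 192 kbps quality setting are all illustrative.

```python
def build_audio_options(output_path="audio.mp3"):
    """Build a yt-dlp options dict that extracts the best audio track as MP3.

    Illustrative settings only; the package may configure yt-dlp differently.
    """
    # The FFmpegExtractAudio postprocessor sets the final extension itself,
    # so the output template uses the path without ".mp3".
    stem = output_path[:-4] if output_path.lower().endswith(".mp3") else output_path
    return {
        "format": "bestaudio/best",
        "outtmpl": stem + ".%(ext)s",
        "postprocessors": [
            {
                "key": "FFmpegExtractAudio",
                "preferredcodec": "mp3",
                "preferredquality": "192",
            }
        ],
        "quiet": True,
    }


def download_audio_sketch(url, output_path="audio.mp3"):
    """Download the audio of `url` and return the path of the resulting MP3."""
    import yt_dlp  # imported lazily so build_audio_options works without yt-dlp

    options = build_audio_options(output_path)
    with yt_dlp.YoutubeDL(options) as ydl:
        ydl.download([url])
    # The postprocessor writes "<stem>.mp3".
    return options["outtmpl"].replace(".%(ext)s", ".mp3")
```

The conversion step relies on ffmpeg, which is why it is listed as a prerequisite.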

2. Transcribe an existing audio file

# Basic usage (prints to stdout)
transcribe-youtube transcribe --file "audio.mp3"

# With model selection
transcribe-youtube transcribe --file "audio.mp3" --model medium

# Specify language (auto-detected if not specified)
transcribe-youtube transcribe --file "audio.mp3" --language Japanese

# Translate to English
transcribe-youtube transcribe --file "audio.mp3" --task translate

# Save to file with specific format
transcribe-youtube transcribe --file "audio.mp3" --output-dir "./transcripts" --output-format srt
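
To make the srt output format more concrete, the sketch below converts Whisper-style segments (dicts with "start", "end", and "text" keys, as found under "segments" in Whisper's transcription result) into SubRip text. The helper names are hypothetical, and the file the tool actually writes may differ in details such as line wrapping.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render a list of segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


example = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
    {"start": 2.5, "end": 5.0, "text": " Let's get started."},
]
print(segments_to_srt(example))
```

The vtt format is similar in spirit but uses a "WEBVTT" header and a period instead of a comma before the milliseconds.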

3. Download and transcribe in one step

# Basic usage (prints to stdout)
transcribe-youtube run --url "https://www.youtube.com/watch?v=VIDEO_ID"

# With all options
transcribe-youtube run --url "https://www.youtube.com/watch?v=VIDEO_ID" \
                      --model medium \
                      --language Japanese \
                      --task translate \
                      --output-dir "./transcripts" \
                      --output-format json \
                      --keep-audio
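
When the run command is driven from a script, assembling the argument list in Python avoids shell-quoting problems with URLs. The sketch below only builds the argument list, using the flags shown above; `build_run_command` is a hypothetical helper, not something the package provides.

```python
import subprocess


def build_run_command(url, model=None, language=None, task=None,
                      output_dir=None, output_format=None, keep_audio=False):
    """Assemble the argument list for `transcribe-youtube run`.

    Only options that are explicitly set are added, so the tool's own
    defaults apply to everything else.
    """
    command = ["transcribe-youtube", "run", "--url", url]
    optional = [
        ("--model", model),
        ("--language", language),
        ("--task", task),
        ("--output-dir", output_dir),
        ("--output-format", output_format),
    ]
    for flag, value in optional:
        if value is not None:
            command += [flag, str(value)]
    if keep_audio:
        command.append("--keep-audio")
    return command


command = build_run_command(
    "https://www.youtube.com/watch?v=VIDEO_ID",
    model="medium",
    output_dir="./transcripts",
    output_format="json",
    keep_audio=True,
)
print(command)
# To execute it (requires the package to be installed):
# subprocess.run(command, check=True)
```

Passing the list directly to subprocess.run, without shell=True, means the URL's "?" and "&" characters need no escaping.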

Available models

Model      Parameters  Languages               Relative speed (vs. large)
tiny       39M         Multilingual            ~10x
tiny.en    39M         English only            ~10x
base       74M         Multilingual            ~7x
base.en    74M         English only            ~7x
small      244M        Multilingual            ~4x
small.en   244M        English only            ~4x
medium     769M        Multilingual            ~2x
medium.en  769M        English only            ~2x
large      1550M       Multilingual            1x
turbo      809M        Multilingual (default)  ~8x
Note: For English-only applications, the .en models tend to perform better than their multilingual counterparts. The turbo model is an optimized version of large-v3 that transcribes faster with minimal loss of accuracy.
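
The model list and the note on .en variants can be turned into a small selection rule. This is a sketch under the assumption that the tool accepts the model names exactly as listed; `pick_model` and `ENGLISH_ONLY_VARIANTS` are hypothetical names used for illustration.

```python
# Base models that have an English-only ".en" counterpart in the list above.
ENGLISH_ONLY_VARIANTS = {"tiny", "base", "small", "medium"}


def pick_model(model_name, english_only):
    """Return the ".en" variant when the audio is English and one exists."""
    if english_only and model_name in ENGLISH_ONLY_VARIANTS:
        return model_name + ".en"
    # large and turbo have no ".en" variant, so they are returned unchanged.
    return model_name


print(pick_model("small", english_only=True))
print(pick_model("turbo", english_only=True))
```

The returned name could then be passed as the --model value on the command line or as model_name in the Python API.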

Python API

You can also use the package programmatically:

from transcribe_youtube import core

# Download audio
audio_path = core.download_audio("https://www.youtube.com/watch?v=VIDEO_ID")

# Transcribe audio
result = core.transcribe_audio(
    audio_path,
    model_name="medium",
    language="en",
    output_dir="./transcripts",
    output_format="txt"
)

# Download and transcribe in one step
result, audio_path = core.run(
    "https://www.youtube.com/watch?v=VIDEO_ID",
    model_name="medium",
    language="en",
    task="transcribe",
    output_dir="./transcripts",
    output_format="txt",
    keep_audio=True
)
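
When several videos need to be processed, a thin wrapper can call the one-step function for each URL and keep failures from stopping the batch. The sketch below takes the function to call as a parameter so that it does not depend on how core.run behaves internally; `transcribe_many` is a hypothetical helper, and what each result contains is whatever the supplied function returns.

```python
def transcribe_many(urls, run, **options):
    """Call run(url, **options) for each URL.

    Returns two dicts keyed by URL: successful results and error messages.
    """
    results, errors = {}, {}
    for url in urls:
        try:
            results[url] = run(url, **options)
        except Exception as exc:  # keep going so one bad URL does not end the batch
            errors[url] = str(exc)
    return results, errors
```

With the package installed, a call might look like transcribe_many(urls, core.run, model_name="medium", output_dir="./transcripts", output_format="txt"), using the same keyword arguments as the example above.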

License

MIT License
