Gradio-based web application for automating the translation of manga/comic page images using AI. Targets speech bubbles and text outside of speech bubbles (OSB text). Supports 54 languages and custom font packs.
- Detection: Speech bubble detection & segmentation (YOLO + SAM 2.1/3)
- Cleaning: Inpaint speech bubbles and OSB text (Flux.2 Klein, Flux.1 Kontext, or OpenCV)
- Translation: LLM-powered OCR & translation (54 languages)
- Rendering: Text rendering with alignment and custom font packs
- Upscaling: 2x-AnimeSharpV4 for enhanced output quality
- Processing: Single/batch processing with directory preservation and ZIP support
- Interfaces: Web UI (Gradio) and CLI
- Automation: One-click translation; no intervention required
- Python 3.10+
- PyTorch (CPU, CUDA, ROCm, MPS, XPU)
- Font pack with `.ttf`/`.otf` files; included with portable package
- LLM for Japanese source text; VLM for other languages (API or local)
Download the standalone zip from the releases page: Portable Build
Requirements:
- Windows: Bundled Python/Git included; no additional requirements
- Linux/macOS: Python 3.10+ and Git must be installed on your system
Setup:
- Extract the zip file
- Run the setup script for your platform:
  - Windows: Double-click `setup.bat`
  - Linux/macOS: Run `./setup.sh` in a terminal
- PyTorch version is automatically detected and installed based on your system
- Open the launcher script created in `./MangaTranslator/`:
  - Windows: `start-webui.bat`
  - Linux/macOS: `start-webui.sh`
Includes the Komika (normal text), Cookies (OSB text), Comicka (either), and Roboto (supports accents) font packs
Tip
If you need to transfer to a fresh portable package:
- You can safely move the `fonts`, `models`, and `output` directories to the new portable package
- You might be able to move the `runtime` directory over, assuming the same setup configuration is wanted
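The moves above can be sketched in shell; `MangaTranslator-old` and `MangaTranslator-new` are placeholder paths standing in for your actual package locations:

```shell
# Move user data from an old portable package into a fresh one.
# Both directory names below are placeholders.
OLD="MangaTranslator-old"
NEW="MangaTranslator-new"
for d in fonts models output; do
  if [ -d "$OLD/$d" ]; then
    mv "$OLD/$d" "$NEW/$d"
  fi
done
echo "Transfer complete."
```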
- Clone and enter the repo
git clone https://github.com/meangrinch/MangaTranslator.git
cd MangaTranslator
- Create and activate a virtual environment (recommended)
python -m venv venv
# Windows PowerShell/CMD
.\venv\Scripts\activate
# Linux/macOS
source venv/bin/activate
- Install PyTorch (see: PyTorch Install)
# Example (CUDA 13.0)
pip install torch==2.9.1+cu130 torchvision==0.24.1+cu130 --extra-index-url https://download.pytorch.org/whl/cu130
# Example (CPU)
pip install torch torchvision
- Install Nunchaku (optional, for the Flux.1 Kontext Nunchaku backend)
- Nunchaku wheels are not on PyPI. Install directly from the v1.2.0 GitHub release URL that matches your OS and Python version. CUDA only; requires an NVIDIA 2000-series GPU or newer.
# Example (Windows, Python 3.13, PyTorch 2.9.1)
pip install https://github.com/nunchaku-ai/nunchaku/releases/download/v1.2.0/nunchaku-1.2.0+torch2.9-cp313-cp313-win_amd64.whl
Note
Nunchaku is not required when using Flux models via the SDNQ backend.
- Install dependencies
pip install -r requirements.txt
- The application will automatically download and use all required models
- Put font packs as subfolders in `fonts/` with `.otf`/`.ttf` files
- Prefer filenames that include `italic`/`bold` (or both) so variants are detected
- Example structure:
fonts/
├─ CC Wild Words/
│ ├─ CCWildWords-Regular.otf
│ ├─ CCWildWords-Italic.otf
│ ├─ CCWildWords-Bold.otf
│ └─ CCWildWords-BoldItalic.otf
└─ Komika/
├─ KOMIKA-HAND.ttf
└─ KOMIKA-HANDBOLD.ttf
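To illustrate why `bold`/`italic` in filenames matters, variant detection can be approximated as a case-insensitive substring check on the filename (a hypothetical sketch, not the app's actual detection logic):

```shell
# Classify a font file as regular/bold/italic/bold-italic by its filename.
classify_variant() {
  name=$(basename "$1")
  name=${name%.*}                                  # strip extension
  name=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *bold*italic*|*italic*bold*) echo "bold-italic" ;;
    *bold*)                      echo "bold" ;;
    *italic*)                    echo "italic" ;;
    *)                           echo "regular" ;;
  esac
}

classify_variant "CCWildWords-BoldItalic.otf"   # bold-italic
classify_variant "KOMIKA-HANDBOLD.ttf"          # bold
```

Files lacking these keywords (e.g. `KOMIKA-HAND.ttf`) would fall back to the regular variant.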
- Providers: Google, OpenAI, Anthropic, xAI, DeepSeek, Z.ai, Moonshot AI, OpenRouter, OpenAI-Compatible
- Web UI: configure provider/model/key in the Config tab (stored locally)
- CLI: pass keys/URLs as flags or via env vars
- Env vars: `GOOGLE_API_KEY`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `XAI_API_KEY`, `DEEPSEEK_API_KEY`, `ZAI_API_KEY`, `MOONSHOT_API_KEY`, `OPENROUTER_API_KEY`, `OPENAI_COMPATIBLE_API_KEY`
- OpenAI-compatible default URL: `http://localhost:1234/v1`
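For instance, a provider key can be supplied via its environment variable before invoking the CLI; the key value below is a placeholder:

```shell
# Linux/macOS; on Windows CMD use: set GOOGLE_API_KEY=...
export GOOGLE_API_KEY="AI...your-key-here"
```

The same key can also be passed as a flag (`--google-api-key`), as in the CLI examples below.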
If you want to use the OSB text pipeline, you need a Hugging Face token with access to the following repositories:
- `deepghs/AnimeText_yolo`
- `black-forest-labs/FLUX.1-Kontext-dev` (only required if using Flux.1 Kontext with the Nunchaku backend)
- Sign in or create a Hugging Face account
- Visit and accept the terms on:
- AnimeText_yolo
- FLUX.1 Kontext (dev) (optional, if using Kontext with Nunchaku)
- SAM 3 (optional, if using SAM 3 instead of SAM 2.1)
- Create a new access token in your Hugging Face settings with read access to gated repos ("Read access to contents of public gated repos")
- Add the token to the app:
  - Web UI: set `hf_token` in Config
  - Env var (alternative): set `HUGGINGFACE_TOKEN`
- Save config to preserve the token across sessions
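Equivalently, the env-var route can be set up on the command line before launching; the token value below is a placeholder:

```shell
# Linux/macOS; on Windows CMD use: set HUGGINGFACE_TOKEN=...
export HUGGINGFACE_TOKEN="hf_your_token_here"
```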
- Portable package:
  - Windows: Double-click `start-webui.bat` inside the `MangaTranslator` folder
  - Linux/macOS: Run `./start-webui.sh` inside the `MangaTranslator` folder
- Manual install:
  - Run `python app.py --open-browser`
Options: `--models` (default `./models`), `--fonts` (default `./fonts`), `--port` (default 7676), `--cpu`.
First launch can take ~1–2 minutes.
Once launched, configure your LLM provider in the Config tab, then upload images and click Translate.
Examples:
# Single image, Japanese → English, Google provider
python main.py --input <image_path> \
--font-dir "fonts/Komika" --provider Google --google-api-key <AI...>
# Batch folder, custom source/target languages, OpenAI-Compatible provider (LM Studio)
python main.py --input <folder_path> --batch \
--font-dir "fonts/Komika" \
--input-language <src_lang> --output-language <tgt_lang> \
--provider OpenAI-Compatible --openai-compatible-url http://localhost:1234/v1 \
--output ./output
# Single image, Japanese → English (Google), OSB text pipeline, custom OSB text font
python main.py --input <image_path> \
--font-dir "fonts/Komika" --provider Google --google-api-key <AI...> \
--osb-enable --osb-font-name "fonts/fast_action"
# Cleaning-only mode (no translation/text rendering)
python main.py --input <image_path> --cleaning-only
# Upscaling-only mode (no detection/translation, only upscale)
python main.py --input <image_path> --upscaling-only --image-upscale-mode final --image-upscale-factor 2.0
# Test mode (no translation; render placeholder text)
python main.py --input <image_path> --test-mode
# Full options
python main.py --help
- Windows: Run `update.bat` from the portable package root
- Linux/macOS: Run `./update.sh` from the portable package root
From the repo root:
git pull
pip install -r requirements.txt # Or activate the venv first, if present
ML Models & Libraries
- YOLOv8m Speech Bubble Detector: kitsumed
- Comic Speech Bubble Detector YOLOv8m: ogkalu
- SAM 2.1: Segment Anything in Images and Videos: Meta AI
- SAM 3: Meta AI
- FLUX.1 Kontext: Black Forest Labs
- FLUX.2 Klein 4B: Black Forest Labs
- FLUX.2 Klein 9B: Black Forest Labs
- Nunchaku: Nunchaku AI
- SDNQ Quants: Disty0
- 2x-AnimeSharpV4: Kim2091
- Manga OCR: kha-white
- Manga109 YOLO: deepghs
- AnimeText YOLO: deepghs

