meangrinch/MangaTranslator

MangaTranslator

A Gradio-based web application that automates the translation of manga/comic page images using AI. Handles both speech-bubble text and outside-speech-bubble (OSB) text. Supports 54 languages and custom font packs.

Original → Translated (with a single click)


Features

  • Detection: Speech bubble detection & segmentation (YOLO + SAM 2.1/3)
  • Cleaning: Inpaint speech bubbles and OSB text (Flux.2 Klein, Flux.1 Kontext, or OpenCV)
  • Translation: LLM-powered OCR & translation (54 languages)
  • Rendering: Text rendering with alignment and custom font packs
  • Upscaling: 2x-AnimeSharpV4 for enhanced output quality
  • Processing: Single/batch processing with directory preservation and ZIP support
  • Interfaces: Web UI (Gradio) and CLI
  • Automation: One-click translation; no intervention required

Requirements

  • Python 3.10+
  • PyTorch (CPU, CUDA, ROCm, MPS, XPU)
  • Font pack with .ttf/.otf files (included with the portable package)
  • LLM for Japanese source text; VLM for other languages (API or local)

Install

Portable Package (Recommended)

Download the standalone zip from the releases page: Portable Build

Requirements:

  • Windows: Bundled Python/Git included; no additional requirements
  • Linux/macOS: Python 3.10+ and Git must be installed on your system

Setup:

  1. Extract the zip file
  2. Run the setup script for your platform:
    • Windows: Double-click setup.bat
    • Linux/macOS: Run ./setup.sh in terminal
  3. PyTorch version is automatically detected and installed based on your system
  4. Open the launcher script created in ./MangaTranslator/:
    • Windows: start-webui.bat
    • Linux/macOS: start-webui.sh

Includes the Komika (normal text), Cookies (OSB text), Comicka (either), and Roboto (supports accents) font packs

Tip

If you need to migrate to a fresh portable package:

  • You can safely move the fonts, models, and output directories to the new portable package
  • The runtime directory can usually be moved as well, provided you want the same setup configuration
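As a minimal sketch of the move (the OLD/NEW paths are placeholders; the directory names come from the tip above):

```shell
# Demo: move user data from an old portable package into a new one.
# OLD/NEW are placeholder paths; fonts/models/output are the
# directories named in the tip above.
OLD=./MangaTranslator-old
NEW=./MangaTranslator-new
mkdir -p "$OLD/fonts" "$OLD/models" "$OLD/output" "$NEW"  # demo setup only
for d in fonts models output; do
  mv "$OLD/$d" "$NEW/" 2>/dev/null || true
done
```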

Manual install

  1. Clone and enter the repo
git clone https://github.com/meangrinch/MangaTranslator.git
cd MangaTranslator
  2. Create and activate a virtual environment (recommended)
python -m venv venv
# Windows PowerShell/CMD
.\venv\Scripts\activate
# Linux/macOS
source venv/bin/activate
  3. Install PyTorch (see: PyTorch Install)
# Example (CUDA 13.0)
pip install torch==2.9.1+cu130 torchvision==0.24.1+cu130 --extra-index-url https://download.pytorch.org/whl/cu130
# Example (CPU)
pip install torch torchvision
  4. Install Nunchaku (optional, for the Flux.1 Kontext Nunchaku backend)
  • Nunchaku wheels are not on PyPI. Install directly from the v1.2.0 GitHub release URL, matching your OS and Python version. CUDA only; requires an RTX 2000-series GPU or newer.
# Example (Windows, Python 3.13, PyTorch 2.9.1)
pip install https://github.com/nunchaku-ai/nunchaku/releases/download/v1.2.0/nunchaku-1.2.0+torch2.9-cp313-cp313-win_amd64.whl

Note

Nunchaku is not required to use Flux models via the SDNQ backend.
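When picking a wheel from the release page, the cpXXX tag in the filename must match your interpreter. You can print your tag like so:

```shell
# Print the CPython tag (e.g. cp313) that appears in Nunchaku wheel filenames
python -c "import sys; print('cp%d%d' % sys.version_info[:2])"
```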

  5. Install dependencies
pip install -r requirements.txt

Post-Install Setup

Models

  • The application will automatically download and use all required models

Fonts

  • Put font packs as subfolders in fonts/ with .otf/.ttf files
  • Prefer filenames that include "Italic", "Bold", or "BoldItalic" so variants are detected
  • Example structure:
fonts/
├─ CC Wild Words/
│  ├─ CCWildWords-Regular.otf
│  ├─ CCWildWords-Italic.otf
│  ├─ CCWildWords-Bold.otf
│  └─ CCWildWords-BoldItalic.otf
└─ Komika/
   ├─ KOMIKA-HAND.ttf
   └─ KOMIKA-HANDBOLD.ttf
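The layout above can be scaffolded like this (the font files here are empty placeholders; real .otf files go in their place):

```shell
# Create an example font pack; variants are detected from
# "Italic"/"Bold" appearing in the filenames (per the note above)
mkdir -p "fonts/CC Wild Words"
touch "fonts/CC Wild Words/CCWildWords-Regular.otf" \
      "fonts/CC Wild Words/CCWildWords-Italic.otf" \
      "fonts/CC Wild Words/CCWildWords-Bold.otf" \
      "fonts/CC Wild Words/CCWildWords-BoldItalic.otf"
```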

LLM setup

  • Providers: Google, OpenAI, Anthropic, xAI, DeepSeek, Z.ai, Moonshot AI, OpenRouter, OpenAI-Compatible
  • Web UI: configure provider/model/key in the Config tab (stored locally)
  • CLI: pass keys/URLs as flags or via env vars
  • Env vars: GOOGLE_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY, XAI_API_KEY, DEEPSEEK_API_KEY, ZAI_API_KEY, MOONSHOT_API_KEY, OPENROUTER_API_KEY, OPENAI_COMPATIBLE_API_KEY
  • OpenAI-compatible default URL: http://localhost:1234/v1
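For CLI use, you might export one of the keys listed above instead of passing it as a flag (the key value here is a placeholder; the run command is shown commented out):

```shell
# Supply the Google key via the environment instead of a CLI flag
export GOOGLE_API_KEY="your-key-here"   # placeholder value
# python main.py --input page.png --font-dir "fonts/Komika" --provider Google
```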

OSB text setup (optional)

If you want to use the OSB text pipeline, you need a Hugging Face token with access to the following repositories:

  • deepghs/AnimeText_yolo
  • black-forest-labs/FLUX.1-Kontext-dev (only required if using Flux.1 Kontext with Nunchaku backend)

Steps to create a token:

  1. Sign in or create a Hugging Face account
  2. Visit and accept the terms on the repository pages listed above
  3. Create a new access token in your Hugging Face settings with read access to gated repos ("Read access to contents of public gated repos")
  4. Add the token to the app:
    • Web UI: set hf_token in Config
    • Env var (alternative): set HUGGINGFACE_TOKEN
  5. Save config to preserve the token across sessions
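Step 4's env-var alternative looks like this in a shell session (the token value is a placeholder):

```shell
# Make the gated-repo token available to the app for this session
export HUGGINGFACE_TOKEN="hf_xxxxxxxx"   # placeholder token
```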

Run

Web UI (Gradio)

  • Portable package:
    • Windows: Double-click start-webui.bat inside the MangaTranslator folder
    • Linux/macOS: Run ./start-webui.sh inside the MangaTranslator folder
  • Manual install:
    • Run python app.py --open-browser from the repo root

Options: --models (default ./models), --fonts (default ./fonts), --port (default 7676), --cpu. First launch can take ~1–2 minutes.

Once launched, configure your LLM provider in the Config tab, then upload images and click Translate.

CLI

Examples:

# Single image, Japanese → English, Google provider
python main.py --input <image_path> \
  --font-dir "fonts/Komika" --provider Google --google-api-key <AI...>

# Batch folder, custom source/target languages, OpenAI-Compatible provider (LM Studio)
python main.py --input <folder_path> --batch \
  --font-dir "fonts/Komika" \
  --input-language <src_lang> --output-language <tgt_lang> \
  --provider OpenAI-Compatible --openai-compatible-url http://localhost:1234/v1 \
  --output ./output

# Single image, Japanese → English (Google), OSB text pipeline, custom OSB text font
python main.py --input <image_path> \
  --font-dir "fonts/Komika" --provider Google --google-api-key <AI...> \
  --osb-enable --osb-font-name "fonts/fast_action"

# Cleaning-only mode (no translation/text rendering)
python main.py --input <image_path> --cleaning-only

# Upscaling-only mode (no detection/translation, only upscale)
python main.py --input <image_path> --upscaling-only --image-upscale-mode final --image-upscale-factor 2.0

# Test mode (no translation; render placeholder text)
python main.py --input <image_path> --test-mode

# Full options
python main.py --help

Documentation

Updating

Portable Package

  • Windows: Run update.bat from the portable package root
  • Linux/macOS: Run ./update.sh from the portable package root

Manual Install

From the repo root:

git pull
pip install -r requirements.txt  # Or activate venv first if present

License & credits

ML Models & Libraries