Custom-trained handwriting models delivering up to 99% transcription accuracy on curated collections.
Thoth is a clean, fast, and modern desktop app for transcribing historical manuscripts. It pairs a robust preprocessing pipeline with fine‑tuned, task‑specific models trained on period scripts to achieve exceptional results.
Note: Accuracy is reported from internal evaluations on representative, curated datasets. Results can vary based on handwriting style, scan quality, and preprocessing. See "Accuracy and Models" below for details.
- Custom-trained models: Purpose‑built LoRA adapters for historical scripts and hands
- Up to 99% accuracy: High fidelity on curated collections and clear scans
- Beautiful, focused UI: Load, transcribe, and export in seconds
- Fast inference: Optimized TrOCR pipeline with GPU acceleration
- Offline-first: Run fully locally with your own models and data
Thoth helps historians, archivists, librarians, and digital‑humanities teams transcribe challenging handwritten sources. Built on the TrOCR architecture and fine‑tuned via Parameter‑Efficient Fine‑Tuning (LoRA), it provides high accuracy while remaining efficient and portable.
- State-of-the-art HTR powered by TrOCR
- Custom-trained adapters for specific periods (e.g., 18th-century English cursive)
- Clean PyQt6 desktop app: cross‑platform, minimal clicks
- Advanced preprocessing: binarization, denoising, segmentation
- Batch processing and easy model switching
- Export anywhere: copy to clipboard or save as text
- Python 3.8+
- CUDA-capable GPU (recommended for training)
- 24GB+ VRAM for model fine-tuning
```bash
# Clone the repository
git clone https://github.com/rtw878/thoth.git
cd thoth

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Test the installation
python test_functionality.py
```

## Accuracy and Models

- Our best in‑house adapters reach up to 99% accuracy on curated test sets that match their training domain (script, period, scan quality).
- Accuracy depends on handwriting style, document condition, and preprocessing. For new collections, training or fine‑tuning your own adapter usually yields the best results.
- You can drop your own fine‑tuned adapters into `models/` and select them in the app.
```bash
python src/app/main.py
```

The GUI provides:
- Image loading and display
- Model selection from available trained models
- Real-time transcription with progress indication
- Text export and clipboard functionality
```bash
# Download IAM dataset
python src/data/download_data.py IAM

# Download Bentham dataset
python src/data/download_data.py Bentham
```

```bash
# Train on IAM dataset
python src/model/train_model.py --dataset IAM --output-dir models/my_model

# Train with custom configuration
python src/model/train_model.py --config configs/my_config.yml
```

```bash
# Evaluate a trained model
python src/evaluate/evaluate_model.py models/my_model --output-file results.json
```

```bash
# Preprocess a single image
python src/preprocess/image_pipeline.py path/to/image.png

# Preprocess a folder of images
python src/preprocess/image_pipeline.py path/to/folder/
```
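The binarization step can be illustrated with a self-contained Otsu threshold over a flat list of 8-bit grayscale values. This is a simplified stand-in for the actual pipeline in `src/preprocess/image_pipeline.py`, whose internals are not shown here:

```python
def otsu_threshold(pixels):
    """Compute Otsu's threshold for 8-bit grayscale values.

    Picks the threshold that maximizes between-class variance,
    the classic criterion for global binarization.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_bg = 0.0
    weight_bg = 0
    best_thresh, best_var = 0, -1.0
    for t in range(256):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var = var_between
            best_thresh = t
    return best_thresh

def binarize(pixels, threshold):
    """Map pixels to 0 (ink) or 255 (background) around the threshold."""
    return [0 if p <= threshold else 255 for p in pixels]
```

In practice the pipeline operates on 2D image arrays; the 1D version keeps the idea visible without image-library dependencies.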
```
thoth/
├── LICENSE
├── README.md
├── requirements.txt
├── .gitignore
├── configs/config.yml       # Main configuration file
├── test_functionality.py    # Test script
├── data/                    # Data directories
│   ├── raw/                 # Original datasets
│   ├── interim/             # Intermediate processed data
│   └── processed/           # Final model-ready data
├── models/                  # Trained model checkpoints
├── notebooks/               # Jupyter notebooks for exploration
├── reports/figures/         # Generated plots and visualizations
├── src/                     # Source code
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   ├── download_data.py     # Dataset downloading
│   │   └── dataset_manager.py   # Dataset management
│   ├── preprocess/
│   │   ├── __init__.py
│   │   └── image_pipeline.py    # Image preprocessing
│   ├── model/
│   │   ├── __init__.py
│   │   ├── train_model.py       # Model training
│   │   └── inference.py         # Model inference
│   ├── evaluate/
│   │   ├── __init__.py
│   │   └── evaluate_model.py    # Model evaluation
│   └── app/
│       ├── __init__.py
│       └── main.py              # GUI application
└── docs/                    # Documentation
    ├── conf.py
    └── index.rst
```
The project uses a centralized configuration system in `configs/config.yml`. Key configuration sections:
- Data paths: Directory structure for datasets
- Preprocessing: Image processing parameters
- Model parameters: TrOCR and LoRA configuration
- Training arguments: Learning rate, batch size, etc.
- Application settings: GUI configuration and available models
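A hedged sketch of what a `configs/config.yml` covering these sections might look like. The key names below are illustrative assumptions; consult the shipped file for the actual schema:

```yaml
data:
  raw_dir: data/raw
  interim_dir: data/interim
  processed_dir: data/processed

preprocessing:
  binarize: true
  denoise: true

model:
  base_model: microsoft/trocr-large-handwritten
  lora:
    r: 16
    alpha: 32
    dropout: 0.1

training:
  learning_rate: 5.0e-5
  batch_size: 8
  num_epochs: 10

app:
  available_models:
    - models/my_model
```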
- IAM Handwriting Database: Modern English handwriting (research use only)
- Bentham Collection: 18th-19th century English script
- READ-ICFHR 2016: Early Modern German documents
- Custom datasets: Support for local datasets
Thoth uses:
- Base model: `microsoft/trocr-large-handwritten`
- Fine‑tuning: LoRA (Low‑Rank Adaptation) adapters for efficiency
- Evaluation: CER (Character Error Rate) and WER (Word Error Rate)
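CER and WER are both edit-distance ratios: Levenshtein distance over characters (or words) divided by the reference length. A minimal reference implementation for illustration (not Thoth's evaluation code):

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution
            ))
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance / reference length."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / max(len(ref_words), 1)
```

A "99% accuracy" figure at the character level corresponds to a CER of 0.01 on the evaluated set.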
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
```bash
# Install development dependencies
pip install -r requirements.txt

# Run tests
python test_functionality.py

# Build documentation
cd docs
make html
```

- Use Black for code formatting
- Follow PEP8 guidelines
- Include type hints and docstrings
- Write unit tests for new functionality
This project is licensed under the MIT License - see the LICENSE file for details.
Note: Some datasets (like IAM) have non-commercial licenses. Please check dataset licenses before use.
If you use Thoth in your research, please cite:
[TODO: Add citation information]
- Built on the Hugging Face Transformers library
- Uses TrOCR architecture
- Inspired by the Transcribe Bentham project
- Supported by LoRA for parameter-efficient fine-tuning
For questions and support:
- Open an issue on GitHub
- Check the documentation in `docs/`
- Review the example configurations in `configs/`
Built with care for historians and researchers. If you use Thoth, consider starring the repo and sharing feedback—your datasets and use‑cases help us improve models and accuracy.
