A language model enables accurate structural variant detection in whole-genome amplified long-read sequencing
A deep learning-powered tool to identify chimeric artifacts introduced by whole genome amplification (WGA).
No installation required! Try ChimeraLM instantly in your browser:
🤗 Launch Web Demo on Hugging Face Spaces
Perfect for:
- 🧪 Testing with individual sequences
- 📊 Visualizing prediction confidence scores
- 🎓 Learning about chimeric artifact detection
- 🔬 Quick validation before batch processing
For production use with BAM files and batch processing, install the CLI tool below.
pip install chimeralm
Requirements: Python 3.10, 3.11, or 3.12
For GPU support, installation instructions, and troubleshooting, see the Installation Guide.
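Before installing, it can help to confirm that the active interpreter is one of the supported versions. The check below is a minimal sketch based only on the version list above; `is_supported` and `SUPPORTED` are hypothetical names for illustration, not part of ChimeraLM.

```python
import sys

# Python versions listed in the requirements above.
SUPPORTED = {(3, 10), (3, 11), (3, 12)}


def is_supported(version=None):
    """Return True if the (major, minor) pair is in the supported set.

    `version` may be any tuple such as (3, 11) or (3, 11, 4);
    it defaults to the running interpreter.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor in SUPPORTED


if not is_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} "
          "is outside the versions listed for ChimeraLM")
```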
# Predict chimeric reads (CPU)
chimeralm predict your_data.bam
# Predict with GPU acceleration
chimeralm predict your_data.bam --gpus 1 --batch-size 24
# Filter BAM to remove chimeric reads
chimeralm filter your_data.bam your_data.predictions
Output:
- Predictions: Tab-separated file with read names and labels (0=biological, 1=chimeric)
- Filtered BAM: {input}.filtered.sorted.bam with chimeric reads removed
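To get a quick sense of how many reads were flagged, the predictions can be tallied with a few lines of Python. This is a rough sketch that assumes a plain two-column, tab-separated file (read name, then label) with no header row; the description above suggests this layout but does not spell it out, so adjust the column handling if the real file differs. `summarize_predictions` is a hypothetical helper, not a ChimeraLM function.

```python
import csv
from collections import Counter
from pathlib import Path

# Label meanings as documented above: 0 = biological, 1 = chimeric.
LABEL_NAMES = {"0": "biological", "1": "chimeric"}


def summarize_predictions(path):
    """Count reads per label in a tab-separated predictions file.

    Assumption: column 1 is the read name, column 2 is the label,
    and there is no header row.
    """
    counts = Counter()
    with Path(path).open(newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            counts[LABEL_NAMES.get(row[1].strip(), "unknown")] += 1
    return counts
```

Calling `summarize_predictions("predictions.tsv")` on such a file returns a `Counter` keyed by `"biological"`, `"chimeric"`, and `"unknown"` for any label that is neither 0 nor 1. The file name here is a placeholder; use the path of the predictions file your run actually produced.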
Need more help? See the Quick Start Tutorial for a complete walkthrough.
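For scripted runs, the predict and filter commands shown above can be driven from Python with `subprocess`. The sketch below only reassembles the command lines that appear in this README; it assumes that `--batch-size` is accepted with or without `--gpus` and that the predictions path passed to `filter` is the one written by `predict`, neither of which is confirmed here. `build_commands` and `run_pipeline` are hypothetical helper names.

```python
import subprocess


def build_commands(bam, predictions, gpus=0, batch_size=None):
    """Return the predict and filter command lines as argument lists."""
    predict = ["chimeralm", "predict", bam]
    if gpus:
        predict += ["--gpus", str(gpus)]
    if batch_size is not None:
        predict += ["--batch-size", str(batch_size)]
    filter_cmd = ["chimeralm", "filter", bam, predictions]
    return [predict, filter_cmd]


def run_pipeline(bam, predictions, **options):
    """Run predict, then filter, stopping at the first failure."""
    for command in build_commands(bam, predictions, **options):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

For example, `run_pipeline("your_data.bam", "your_data.predictions", gpus=1, batch_size=24)` issues the same GPU prediction and filter commands listed in the quick start. Passing argument lists rather than a shell string avoids quoting problems with file names that contain spaces.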
Full documentation is available at ylab-hi.github.io/ChimeraLM
Key Resources:
- Installation Guide - Setup with pip, conda, uv, or from source
- Quick Start Tutorial - Your first prediction in 15 minutes
- CLI Reference - Complete command documentation
- BAM Filtering Tutorial - Comprehensive filtering guide
- Performance Optimization - Speed up your analysis
- Troubleshooting - Common issues and solutions
- 🌐 Interactive Web Demo: Try it online at Hugging Face Spaces, no installation needed!
- 🎯 High Accuracy: Deep learning model trained on real WGA data
- ⚡ GPU Accelerated: Optimized for CUDA, MPS (Apple Silicon), and CPU
- 🚀 Easy to Use: Simple CLI with sensible defaults
- 📦 Fast Processing: Batch inference with configurable parallelism
- 🖥️ Local Web Interface: Run the web UI locally with chimeralm ui
- 🏭 Production Ready: Includes filtering, sorting, and indexing of BAM files
- DeepChopper - Identify chimera artifacts induced by internal adapter sequences in Nanopore direct RNA sequencing
Contributions are welcome! See our Contributing Guide for development setup and guidelines.
If you use ChimeraLM in your research, please cite:
@software{chimeralm2025,
title={ChimeraLM: A genomic language model to identify chimera artifacts},
author={Li, Yangyang and Guo, Qingxiang and Yang, Rendong},
year={2025},
url={https://github.com/ylab-hi/ChimeraLM}
}
Apache License 2.0 - see LICENSE for details.