Comprehensive tutorials, analysis code, and reproducible workflows demonstrating scDiagnostics for systematic assessment of cell type annotation in single-cell transcriptomics data.
Manuscript: Christidis, A., Ghazi, A., Chawla, S., Turaga, N., Gentleman, R., & Geistlinger, L. scDiagnostics: systematic assessment of cell type annotation in single-cell transcriptomics data. Submitted.
Analysis and Results: Manuscript Website
We demonstrate scDiagnostics using two real-world single-cell datasets:
1. COVID-19 PBMC scRNA-seq
- Single-cell RNA-seq data from severe COVID-19 patients and healthy controls
- Source: CZI CELLxGENE (Stephenson et al., 2021)
- Use case: Discovery and characterization of disease-associated immune cell states
2. MERFISH Mouse Colitis
- Spatial transcriptomics from a mouse model of inflammatory bowel disease
- Source: MerfishData Bioconductor package (Cadinu et al., 2024)
- Use case: Spatial validation of annotation quality and disease-associated cell states
For each dataset, we predict cell type labels using four popular annotation tools:
- Azimuth — Weighted k-NN mapping
- SingleR — Correlation-based assignment
- CellTypist — Machine learning classifier
- scVI/scArches — Deep learning with VAE (GPU-accelerated)
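To make the idea of correlation-based assignment concrete, here is a toy sketch in base R. It is a simplified illustration on simulated data, not SingleR's actual algorithm; the gene names, cell type names, and expression values are all made up for the example.

```r
# Toy illustration of correlation-based label assignment (simplified;
# NOT SingleR's actual algorithm). All data below are simulated.
set.seed(1)
genes <- paste0("gene", 1:50)

# Hypothetical reference: one mean expression profile per cell type.
ref <- cbind(
  T_cell   = rnorm(50, mean = 5),
  B_cell   = rnorm(50, mean = 5),
  Monocyte = rnorm(50, mean = 5)
)
rownames(ref) <- genes

# Simulated query cells: noisy copies of known reference profiles.
truth <- c("T_cell", "Monocyte", "B_cell", "T_cell")
query <- sapply(truth, function(ct) ref[, ct] + rnorm(50, sd = 0.3))
colnames(query) <- paste0("cell", seq_along(truth))

# Spearman correlation of every query cell against every reference profile
# (rows: query cells, columns: reference cell types).
cors <- cor(query, ref, method = "spearman")

# Assign each cell the label of its best-correlated reference profile.
predicted <- colnames(ref)[max.col(cors, ties.method = "first")]
result <- data.frame(
  cell      = rownames(cors),
  predicted = predicted,
  score     = apply(cors, 1, max)
)
print(result)
```

The per-cell `score` column is the kind of built-in confidence measure that an annotation tool may report. A high score only says the cell resembles one reference profile more than the others, which is one reason a separate diagnostic step is useful.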
To install the required R packages for the COVID-19 analysis:
#| eval: false
source("R/covid/R_Package_Installation_Pipeline.R")
Or, for the MERFISH analysis:
#| eval: false
source("R/merfish/R_Package_Installation_Pipeline.R")
All pre-processed datasets with annotations are available on Zenodo and can be downloaded with the provided script:
#| eval: false
source("data/downloadData.R")
downloadData()
This automated script downloads all four SingleCellExperiment/SpatialExperiment objects into your data/covid/ and data/merfish/ directories. For manual download, visit the Zenodo repository.
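Once the download finishes, it can help to confirm that the files landed where expected. The sketch below lists what is present in the two data directories. The helper name `list_downloaded` is hypothetical, and the assumption that the objects are stored as `.rds` files is ours rather than something stated by the repository; adjust `pattern` if the files use a different extension.

```r
# Hypothetical helper: list downloaded objects in the data directories.
# Assumption: objects are stored as .rds files; change `pattern` otherwise.
list_downloaded <- function(dirs = c("data/covid", "data/merfish"),
                            pattern = "\\.rds$") {
  found <- lapply(dirs, function(d) {
    # A missing directory yields an empty result rather than an error.
    if (!dir.exists(d)) return(character(0))
    list.files(d, pattern = pattern, full.names = TRUE, ignore.case = TRUE)
  })
  names(found) <- dirs
  found
}

files <- list_downloaded()
n_found <- sum(lengths(files))
message(n_found, " file(s) found; four objects are expected after download.")
```

If the files are indeed RDS files, each one can then be read with `readRDS()`.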
See detailed instructions: Setup & Installation, Accessing Data
Full tutorials and analysis code are available at https://ccb-hms.github.io/scDiagnosticsManuscript/:
Analysis environment setup, data retrieval, and reproducible analysis workflows:
- Setup & Installation — Install R and Python dependencies (GPU recommended for scVI/scArches)
- Accessing Data — Download pre-processed datasets from Zenodo
- Cell type annotation — Apply all four annotation methods to query data
Quick start, core functionality, and common analysis workflows:
- scDiagnostics Overview — Introduction to diagnostic framework and key concepts
- COVID-19 Analysis — Annotation assessment and anomaly detection in scRNA-seq data
- MERFISH Analysis — Annotation assessment and anomaly detection in spatial transcriptomics data
- Exploring Annotation Tool Diagnostics — Complementary aspects of scDiagnostics and built-in quality metrics from major annotation tools
All required R packages are automatically installed by running:
#| eval: false
source("R/covid/R_Package_Installation_Pipeline.R")
source("R/merfish/R_Package_Installation_Pipeline.R")
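After the installation scripts have run, a quick check can confirm that key packages are available before starting an analysis. The package list below is an illustrative assumption drawn from names mentioned in this README, not the repository's full dependency set.

```r
# Assumption: illustrative subset of packages, not the complete dependency list.
pkgs <- c("scDiagnostics", "SingleCellExperiment", "SpatialExperiment", "SingleR")

# requireNamespace() returns TRUE/FALSE without attaching the package.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All checked packages are installed.")
}
```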
For GPU-accelerated scVI/scArches annotation:
conda env create -f environment-scvi.yml
conda activate scvi-env
See Setup & Installation for detailed instructions.
If you use this code, data, or analyses, please cite:
@article{christidis2024scDiagnostics,
author = {Christidis, A. and Ghazi, A. and Chawla, S. and Turaga, N. and Gentleman, R. and Geistlinger, L.},
title = {scDiagnostics: systematic assessment of cell type annotation in single-cell transcriptomics data},
year = {2026},
note = {Submitted}
}
Code and Scripts: github.com/ccb-hms/scDiagnosticsManuscript
For questions or feedback, please open an issue on GitHub.