DBO-DKFZ/wsi-calib

Evaluating calibration for WSI classification


Download data

Setup directories and generate patch locations

  1. Set the environment variables, for example by adding the following lines to your .bashrc file:
export TCGA_ROOT_DIR=path/to/tcga/slides
export MCO_ROOT_DIR=path/to/mco/slides
export SLIDE_PROCESS_DIR=path/to/local/storage
  2. Copy the content of the data/ directory to $SLIDE_PROCESS_DIR:
|-- MCO-SCalib
|   |-- folds/
|   |-- slide_information.csv
|   |-- test_slides.csv
|
|-- TCGA-CRC-SCalib
|   |-- slide_information.csv
  3. Compute the patch locations using our preprocessing repository at https://github.com/DBO-DKFZ/wsi_preprocessing-crc (tested with version 0.2, which can be checked out with git checkout v0.2). The file structure should then look as follows:
|-- MCO-SCalib
|   |-- folds/
|   |-- patches_256/
|   |-- patches_512/
|   |-- slide_information.csv
|   |-- test_slides.csv
|
|-- TCGA-CRC-SCalib
|   |-- patches_256/
|   |-- patches_512/
|   |-- slide_information.csv
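
Before moving on, it can help to confirm that the layout above is actually in place. The following is a minimal sketch and not part of the repository: the helper name missing_entries and the EXPECTED table are hypothetical, with the entries copied from the tree above.

```python
from pathlib import Path

# Expected layout after step 3, copied from the tree above.
# Entries ending in "/" are directories; all others are files.
EXPECTED = {
    "MCO-SCalib": [
        "folds/",
        "patches_256/",
        "patches_512/",
        "slide_information.csv",
        "test_slides.csv",
    ],
    "TCGA-CRC-SCalib": [
        "patches_256/",
        "patches_512/",
        "slide_information.csv",
    ],
}


def missing_entries(root, expected=EXPECTED):
    """Return the relative paths from `expected` that do not exist under `root`."""
    root = Path(root)
    missing = []
    for dataset, entries in expected.items():
        for entry in entries:
            path = root / dataset / entry
            # A trailing slash in EXPECTED marks a directory.
            exists = path.is_dir() if entry.endswith("/") else path.is_file()
            if not exists:
                missing.append(f"{dataset}/{entry}")
    return missing
```

Calling missing_entries(os.environ["SLIDE_PROCESS_DIR"]) (after importing os) returns an empty list once all three steps have completed.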

Setup conda environment for repository

We recommend using Miniconda to install the required dependencies for this project:

conda env create -f environment.yml
conda activate slidecalib

Extract features for patch locations

The code for feature extraction is located in the preprocess/ directory and the config files are located under configs/.

For feature extraction with the Ciga model, the weights need to be downloaded from https://github.com/ozanciga/self-supervised-histopathology and the corresponding *.ckpt file needs to be put in the checkpoints/ directory.

An example command for extracting features for the MCO slides is

python preprocess/extract_features.py --config configs/features/features_mco_512.yaml

As the backend for loading image patches from the slides, we support both OpenSlide and cuCIM.
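
To illustrate what the two backends do, here is a minimal sketch of reading a single patch with either library. It is not the code in preprocess/: the function read_patch and its parameters are hypothetical, and it assumes patches are read at level 0 with (x, y) as the top-left corner. The two branches return different types (a PIL image and a NumPy array).

```python
def read_patch(slide_path, x, y, size=512, backend="openslide"):
    """Read one square patch at level 0; (x, y) is the top-left corner."""
    if backend == "openslide":
        # Imported lazily so that only the chosen backend must be installed.
        import openslide

        slide = openslide.OpenSlide(slide_path)
        try:
            # OpenSlide returns an RGBA PIL image; drop the alpha channel.
            return slide.read_region((x, y), 0, (size, size)).convert("RGB")
        finally:
            slide.close()
    if backend == "cucim":
        import numpy as np
        from cucim import CuImage

        region = CuImage(slide_path).read_region(
            location=(x, y), size=(size, size), level=0
        )
        # Convert the returned region to a height x width x channels array.
        return np.asarray(region)
    raise ValueError(f"unknown backend: {backend!r}")
```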

Train different models

Once the features are extracted, we can train the different model architectures on the extracted features. The corresponding config files are again provided in the configs/ directory.

Before training a model, EXPERIMENT_LOCATION must be set as an environment variable (for example by adding the following line to ~/.bashrc):

export EXPERIMENT_LOCATION=path/to/store/results
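
A quick way to catch a forgotten export is to check the environment before starting a run. This is a minimal sketch, not part of the repository: the helper unset_variables is hypothetical, and the list simply collects the four variables named in this README, which may be more than a given script actually reads.

```python
import os

# The four environment variables mentioned in this README.
REQUIRED_VARS = [
    "TCGA_ROOT_DIR",
    "MCO_ROOT_DIR",
    "SLIDE_PROCESS_DIR",
    "EXPERIMENT_LOCATION",
]


def unset_variables(names=REQUIRED_VARS, environ=None):
    """Return the names that are missing or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]
```

An empty return value means every variable is set to a non-empty value.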

For example, to train the CLAM model with our pipeline, run:

python train.py --config configs/train_mco_clam.yaml

The example expects the following directory content:

|-- MCO-SCalib
|   |-- features_512_resnet18-ciga
|   |-- folds/
|   |-- patches_512/
|   |-- slide_information.csv
|   |-- test_slides.csv
|
|-- TCGA-CRC-SCalib
|   |-- features_512_resnet18-ciga
|   |-- patches_512/
|   |-- slide_information.csv

With the provided config file, the model predictions are automatically stored for the MCO test data, as well as for the TCGA slides.

To train the model on a different fold of MCO slides, the fold parameter can be set from the command line with:

python train.py --config configs/train_mco_clam.yaml --data.fold X

where X must be in the range [1, 5].
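
To train on all five folds in sequence, the command above can be generated in a loop. This is a minimal sketch under the assumption that it is run from the repository root; the helpers fold_commands and run_all are hypothetical, and only the train.py arguments shown above are used.

```python
import subprocess
import sys


def fold_commands(config="configs/train_mco_clam.yaml", folds=range(1, 6)):
    """Build one train.py command (as an argument list) per fold."""
    return [
        [sys.executable, "train.py", "--config", config, "--data.fold", str(fold)]
        for fold in folds
    ]


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_all(fold_commands()) starts the five training runs one after another.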

Evaluate models

We provide a Jupyter notebook in the notebooks/ directory to perform evaluations on the stored predictions.
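
As an illustration of what a calibration evaluation computes, here is a minimal sketch of the expected calibration error (ECE), one common calibration metric. It is not taken from the notebook, whose metrics and prediction format may differ. The function assumes two hypothetical inputs: the predicted confidence per slide and whether each prediction was correct.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean of |accuracy - mean confidence| over bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Equal-width bins [k/n, (k+1)/n); a confidence of 1.0 goes to the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        ece += len(members) / total * abs(accuracy - mean_conf)
    return ece
```

A perfectly calibrated model yields an ECE of 0; larger values indicate a larger gap between confidence and accuracy.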

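To better track notebooks with git, add the following lines to .git/config (git applies the filter only to files that a .gitattributes entry such as *.ipynb filter=strip-notebook-output assigns to it):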

[filter "strip-notebook-output"]
    clean = "jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR"
