Chaowei Chen¹, Li Yu¹, Shiquan Min¹, Shunfang Wang¹,²,*
Paper: arXiv:2408.13735
State Space Models (SSMs), especially Mamba, have shown great promise in medical image segmentation due to their ability to model long-range dependencies with linear computational complexity. However, accurate medical image segmentation requires the effective learning of both multi-scale detailed feature representations and global contextual dependencies. Although existing works have attempted to address this issue by integrating CNNs and SSMs to leverage their respective strengths, they have not designed specialized modules to effectively capture multi-scale feature representations, nor have they adequately addressed the directional sensitivity problem when applying Mamba to 2D image data. To overcome these limitations, we propose a Multi-Scale Vision Mamba UNet model for medical image segmentation, termed MSVM-UNet. Specifically, by introducing multi-scale convolutions in the VSS blocks, we can more effectively capture and aggregate multi-scale feature representations from the hierarchical features of the VMamba encoder and better handle 2D visual data. Additionally, the large kernel patch expanding (LKPE) layers achieve more efficient upsampling of feature maps by simultaneously integrating spatial and channel information. Extensive experiments on the Synapse and ACDC datasets demonstrate that our approach is more effective than some state-of-the-art methods in capturing and aggregating multi-scale feature representations and modeling long-range dependencies between pixels.
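The multi-scale VSS blocks described above revolve around aggregating convolutional responses at several kernel sizes before fusing them. The snippet below is only a minimal PyTorch sketch of that idea, not the authors' implementation; the depthwise branches, kernel sizes, and summation-based fusion are illustrative assumptions (the actual modules are defined in the paper and in the model/ directory).
# Illustrative sketch only: aggregating multi-scale depthwise convolutions.
# Kernel sizes and the summation + pointwise fusion are assumptions, not the MSVM-UNet modules.
import torch
import torch.nn as nn

class MultiScaleConvSketch(nn.Module):
    def __init__(self, channels, kernel_sizes=(1, 3, 5)):
        super().__init__()
        # One depthwise convolution per kernel size, so each branch has a different receptive field.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )
        # Pointwise convolution to mix channels after the scales are aggregated.
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        # Sum the multi-scale responses, then fuse across channels.
        return self.fuse(sum(branch(x) for branch in self.branches))

if __name__ == "__main__":
    feat = torch.randn(1, 96, 56, 56)            # hypothetical encoder feature map
    print(MultiScaleConvSketch(96)(feat).shape)  # torch.Size([1, 96, 56, 56])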
- Synapse Multi-Organ Segmentation
- ACDC for Automated Cardiac Segmentation
We recommend the following environment:
Ubuntu <= 22 / CUDA 11.8.0 / Python 3.8 / PyTorch >= 2.0.0
Depending on your environment, you can install CUDA 11.8.0 in your home directory. The following commands download the CUDA toolkit installer, make it executable, and install the toolkit to the $HOME/cuda-11.8 directory:
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
chmod +x cuda_11.8.0_520.61.05_linux.run
./cuda_11.8.0_520.61.05_linux.run --silent --toolkit --override --installpath=$HOME/cuda-11.8
Add CUDA to your system PATH and library path by editing ~/.bashrc. These commands set up the environment variables so
your system can find the CUDA compiler and libraries:
vim ~/.bashrc
# Add the following lines to ~/.bashrc:
export CUDA_HOME=$HOME/cuda-11.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
# Then reload the configuration:
source ~/.bashrc
Check that CUDA 11.8 is correctly installed by running the NVIDIA CUDA compiler version command. You should see output similar to:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
Create a new conda environment named msvmunet with Python 3.8, activate it, and install PyTorch 2.0.0 with CUDA 11.8
support:
conda create -n msvmunet python=3.8
conda activate msvmunet
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
Check if your GCC version is 11. If not, install GCC 11 and G++ 11. The Selective Scan CUDA kernels require GCC 11 for proper compilation:
$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
If your GCC version is not 11, install it and set it as the default compiler:
sudo apt install gcc-11 g++-11
vim ~/.bashrc
# Add the following lines to ~/.bashrc:
export CC=gcc-11
export CXX=g++-11
# Then reload the configuration:
source ~/.bashrc
Install the Triton-implemented Selective Scan module, which is a core component of the Mamba architecture. This step compiles the CUDA kernels using GCC 11:
cd kernels/selective_scan
CC=gcc-11 CXX=g++-11 pip install -e .
Install all required Python packages listed in requirements.txt:
pip install -r requirements.txt
The medpy package depends on SimpleITK, which sometimes fails to install automatically because it needs to be downloaded from GitHub. To avoid potential installation issues, install medpy directly without its dependencies (the other required dependencies are already covered in requirements.txt):
pip install medpy==0.5.2 --no-deps
If you need to reproduce baseline methods such as VM-UNet or Swin-UMamba, install the following additional dependencies:
pip install causal_conv1d==1.0.0
pip install mamba_ssm==1.0.1
If the above installation gets stuck or fails, you can download the offline wheel packages and install them locally: causal_conv1d, mamba_ssm.
pip install causal_conv1d-1.2.0.post2+cu118torch2.0cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
pip install mamba_ssm-1.0.1+cu118torch2.0cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
If you are using a cloud computing platform such as GPUShare, simply select a machine that matches our recommended environment specifications. In this case, you can skip Steps 1, 2, 3, and 5 (the CUDA and GCC setup steps above), as CUDA and GCC should already be properly configured on the cloud instance.
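After completing the installation, a quick optional check is to confirm that PyTorch reports the expected versions and can see a GPU (the exact version strings may differ slightly on your machine):
# Optional sanity check of the PyTorch + CUDA setup.
import torch

print(torch.__version__)          # expected: 2.0.0
print(torch.version.cuda)         # expected: 11.8
print(torch.cuda.is_available())  # expected: True when the driver and GPU are visible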
Before preparing data and training models, you need to set up the following environment variables in your shell configuration file (~/.bashrc or ~/.zshrc):
# Add these lines to ~/.bashrc or ~/.zshrc
export DATASET_HOME=/path/to/your/datasets # Root directory for all datasets
export PRETRAIN_HOME=/path/to/your/pretrained # Directory for pretrained model weights
# Then reload the configuration:
source ~/.bashrc # or source ~/.zshrc
Example Setup:
export DATASET_HOME=$HOME/datasets
export PRETRAIN_HOME=$HOME/pretrained_models
These environment variables are used by the training and testing scripts to locate datasets and pretrained models, making it easier to manage different data locations across different machines.
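As a rough illustration of how these variables are consumed, a script can resolve dataset and checkpoint paths as sketched below; the actual argument names and path handling in the repo's scripts may differ:
# Illustrative only: resolving paths from DATASET_HOME and PRETRAIN_HOME.
import os

dataset_home = os.environ.get("DATASET_HOME", os.path.expanduser("~/datasets"))
pretrain_home = os.environ.get("PRETRAIN_HOME", os.path.expanduser("~/pretrained_models"))

synapse_dir = os.path.join(dataset_home, "mis", "synapse")  # dataset layout shown below
pretrained_ckpt = os.path.join(pretrain_home, "vssm1_tiny_0230s_ckpt_epoch_264.pth")
print(synapse_dir)
print(pretrained_ckpt)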
mkdir -p $DATASET_HOME/mis
- Option 1: Sign up at the official Synapse website and download the dataset
- Option 2: Download the preprocessed data
Extract the dataset to: $DATASET_HOME/mis/synapse/
Download the preprocessed ACDC dataset from Google Drive of MT-UNet
Extract the dataset to: $DATASET_HOME/mis/acdc/
Expected Directory Structure:
$DATASET_HOME/
└── mis/
├── synapse/
│ ├── train/
│ └── test/
└── acdc/
├── train/
├── valid/
└── test/
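If you want to double-check that the extracted data matches this layout, a small optional helper such as the following (not part of the repo) lists which of the expected directories are present:
# Optional helper: verify the expected dataset directories under $DATASET_HOME/mis.
import os

root = os.path.join(os.environ["DATASET_HOME"], "mis")
expected = [
    "synapse/train", "synapse/test",
    "acdc/train", "acdc/valid", "acdc/test",
]
for rel in expected:
    path = os.path.join(root, rel)
    status = "ok" if os.path.isdir(path) else "missing"
    print(f"{status:8s} {path}")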
mkdir -p $PRETRAIN_HOME
Download the pretrained VMamba-Tiny V2 model from the VMamba official release:
- Model: vssm1_tiny_0230s_ckpt_epoch_264.pth
- Save to:
$PRETRAIN_HOME/vssm1_tiny_0230s_ckpt_epoch_264.pth
This pretrained model is used for encoder initialization to improve training convergence and performance.
Expected Directory Structure:
$PRETRAIN_HOME/
└── vssm1_tiny_0230s_ckpt_epoch_264.pth
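For reference, encoder initialization from this checkpoint generally follows the outline below. The key filtering and renaming is handled by the repo's model code, so treat this purely as a sketch; in particular, the assumption that the weights are nested under a "model" key follows the usual VMamba checkpoint format.
# Sketch only: loading the pretrained VMamba-Tiny weights for encoder initialization.
import os
import torch

ckpt_path = os.path.join(os.environ["PRETRAIN_HOME"], "vssm1_tiny_0230s_ckpt_epoch_264.pth")
checkpoint = torch.load(ckpt_path, map_location="cpu")
state_dict = checkpoint.get("model", checkpoint)  # assumption: weights nested under "model"

# encoder = build_encoder(...)  # hypothetical: the VMamba-Tiny encoder built by the repo's model code
# missing, unexpected = encoder.load_state_dict(state_dict, strict=False)
# strict=False skips keys (e.g., the classification head) that the segmentation encoder does not use.
print(type(state_dict), len(state_dict))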
Use the following command to train and evaluate MSVM-UNet:
bash ./run_msvm_unet.sh
Note: The model/ directory contains implementations of several baseline methods (Att-UNet, Trans-UNet, Swin-UNet, VM-UNet, Swin-UMamba, etc.) for comparison purposes. These are not required for MSVM-UNet training but are included for reproducibility of the experiments in the paper.
@article{chen2024msvmunet,
title={MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation},
author={Chaowei Chen and Li Yu and Shiquan Min and Shunfang Wang},
journal={arXiv preprint arXiv:2408.13735},
year={2024}
}
We thank the authors of TransUNet, SLDGroup, Mamba, VMamba, VM-UNet, and Swin-UMamba for making their valuable code & data publicly available.


