[NeurIPS 2025] ProBayes

Official implementation of Rationalized All-Atom Protein Design with Unified Multi-modal Bayesian Flow.

Environment Set Up

Use our Docker image (Recommended)

You can use our Docker image to set up the environment quickly:

docker pull hanlinwu/probayes

Next, download the data following the instructions in Download Data.
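Once the image is pulled and the data is in place, a container can be started along the following lines. This is only a sketch: the mount point `/workspace`, the interactive `bash` entry command, and the helper name `probayes_docker_cmd` are assumptions, not something the repository documents. The function prints the command so it can be inspected before it is run.

```shell
#!/usr/bin/env bash
# Sketch: build a `docker run` command for the ProBayes image.
# Assumptions (not from the README): the /workspace mount point and
# starting an interactive bash shell inside the container.
IMAGE="hanlinwu/probayes"

probayes_docker_cmd() {
    # $1: host directory holding the code and the unzipped data
    # --gpus all exposes every host GPU (needs the NVIDIA Container Toolkit);
    # -v mounts the project into the container; -w sets the working directory.
    printf 'docker run --rm -it --gpus all -v "%s:/workspace" -w /workspace %s bash\n' \
        "$1" "$IMAGE"
}

# Print the command for the current directory so it can be reviewed first.
probayes_docker_cmd "$PWD"
```

Copy the printed line into the terminal to start the container, and adjust the mount if the data lives outside the project root.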

Manual Installation

You can also build the environment manually if Docker is not available. Run the following command to create a conda environment with most of the packages:

conda env create -f probayes_env.yml

Install PyRosetta

The PyRosetta version we use is pyrosetta-2024.35+release.45abd6a-cp310-cp310-linux_x86_64.whl. Download this file onto your server and install it:

pip install pyrosetta-2024.35+release.45abd6a-cp310-cp310-linux_x86_64.whl
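The `cp310` tag in the wheel name indicates a build for CPython 3.10, and `linux_x86_64` a build for 64-bit Linux, so pip will reject the file under a different interpreter. A minimal sketch for checking this before installing follows; the helper names `wheel_python_tag` and `active_python_tag` are hypothetical, and the `cp` prefix assumes a CPython interpreter.

```shell
WHEEL="pyrosetta-2024.35+release.45abd6a-cp310-cp310-linux_x86_64.whl"

# Wheel names end in <python tag>-<abi tag>-<platform tag>.whl;
# print the python tag, i.e. the third dash-separated field from the end.
wheel_python_tag() {
    echo "$1" | awk -F- '{ print $(NF-2) }'
}

# Tag of the interpreter that `python` resolves to (assumes CPython).
active_python_tag() {
    python -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
}

want="$(wheel_python_tag "$WHEEL")"
echo "wheel expects: $want"   # prints "wheel expects: cp310"
```

Inside the conda environment, `[ "$(active_python_tag)" = "$want" ]` then tells you whether the active interpreter matches the wheel.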

Install pdbfixer

pip install git+https://github.com/pandegroup/pdbfixer.git

Install torch-scatter

pip install torch-scatter==2.1.2+pt20cu117 -f https://data.pyg.org/whl/torch-2.0.0+cu117.html

Install the project

pip install -e .

Download Data

Download Datasets

Download the pre-processed dataset files from this link and unzip them in the project root.

Raw data can be found in PepBench&PepBDB and SAbDab&RAbD.

Download Pre-Computed Cache

Download the pre-computed cache files for Bayesian flow from this link and unzip them in the project root.

Download Checkpoints

We provide our checkpoints and the designed PDB files for benchmark (PepBench, PepBDB, RAbD) evaluation here.

Project Structure

After installation, the project structure should look like this:

/probayes
|-- README.md
|-- cache_files
|-- ckpts
|-- configs
|-- logs
|-- openfold
|-- probayes
|-- probayes.egg-info
|-- probayes_data
|-- probayes_data.zip
|-- probayes_env.yml
|-- remote
|-- scripts
|-- setup.py
|-- train.py
|-- train_antibody.py
|-- train_antibody_ddp.py
|-- train_pep.py
`-- train_pep_ddp.py

Now you can reproduce our benchmark metrics.
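A quick way to confirm that the downloads were unzipped in the right place is to check the project root against the listing above. The sketch below is one way to do that; the function name `check_layout` is hypothetical, and `logs` and `probayes.egg-info` are left out on the assumption that they are created later by training and by `pip install -e .`.

```shell
# Report which expected entries are missing from a project root.
# The entry list is taken from the directory listing in this README.
check_layout() {
    root="${1:-.}"
    missing=0
    for entry in cache_files ckpts configs probayes probayes_data scripts setup.py; do
        if [ ! -e "$root/$entry" ]; then
            echo "missing: $entry"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}
```

Run `check_layout .` from the project root. It prints one line per missing entry and returns a non-zero status if anything is absent.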

Reproducing Benchmark Metrics

You may need to add execute permission to the evaluation binaries, e.g.:

chmod +x probayes/remote/PepGLAD/evaluation/DockQ/fnat
chmod +x probayes/remote/ppflow/bin/TMscore/TMscore

Then compile TMscore.cpp:

g++ -static -O3 -ffast-math -lm -o evaluation/TMscore evaluation/TMscore.cpp
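If the repository layout differs on your machine, the permission step can be made tolerant of missing files. The following sketch wraps `chmod +x` in a hypothetical helper, `make_executable`, that warns about absent paths instead of failing. The two paths are the ones given above.

```shell
# Add execute permission to each listed binary that exists;
# warn on stderr about paths that are absent instead of failing.
make_executable() {
    for bin in "$@"; do
        if [ -f "$bin" ]; then
            chmod +x "$bin"
        else
            echo "not found, skipped: $bin" >&2
        fi
    done
}

# Paths as given in this README, relative to the current directory.
make_executable \
    probayes/remote/PepGLAD/evaluation/DockQ/fnat \
    probayes/remote/ppflow/bin/TMscore/TMscore
```

A "not found" warning usually means the command was run from a different directory than the one the paths are relative to.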

All training and evaluation scripts can be found in scripts/. To reproduce the benchmark metric scores:

  1. Peptide codesign
source scripts/eval_ckpt_peptide.sh
  2. Peptide Binding Conformation Generation / Folding
source scripts/eval_ckpt_folding.sh
  3. Antibody design
source scripts/eval_ckpt_antibody.sh

You can select the dataset by changing the CKPT_DIR variable in the corresponding bash script.
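To run all three evaluations unattended, the scripts can be sourced one after another with a log file per task. This is a sketch under assumptions: the function name `run_evals` and the `eval_logs` directory are hypothetical, and each script is sourced in a subshell so that variables such as CKPT_DIR set by one script do not carry over into the next.

```shell
# Run each evaluation script in its own subshell and keep a log per task.
# Script names follow the list in this README; eval_logs is an assumed location.
run_evals() {
    script_dir="${1:-scripts}"
    log_dir="${2:-eval_logs}"
    mkdir -p "$log_dir"
    for task in peptide folding antibody; do
        script="$script_dir/eval_ckpt_$task.sh"
        if [ ! -f "$script" ]; then
            echo "skipped (not found): $script"
            continue
        fi
        # `.` is the portable spelling of `source`; the subshell isolates each run.
        ( . "$script" ) > "$log_dir/$task.log" 2>&1 \
            && echo "done: $task" || echo "failed: $task"
    done
}
```

Call `run_evals` from the project root after setting CKPT_DIR in each script. The per-task status lines go to the terminal and the script output goes to `eval_logs/`.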

Training

We provide our training scripts here:

  1. Peptide codesign
source scripts/train_ddp_antibody_codesign.sh -d pepbench
  2. Peptide Binding Conformation Generation / Folding
source scripts/train_ddp_pep_folding.sh -d pepbench
  3. Antibody design
source scripts/train_ddp_antibody_codesign.sh

The default settings require 4 x 80 GB GPUs, and training takes 10 to 24 hours.
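Before launching a long run, it can help to confirm that the machine has enough large-memory GPUs. The sketch below counts GPUs at or above a memory threshold from a list of per-GPU totals in MiB. The helper name `count_big_gpus` and the 80000 MiB cutoff are assumptions chosen for illustration, since 80 GB cards report totals that vary slightly by model.

```shell
# Count GPUs whose total memory (MiB, one value per line on stdin)
# is at least the threshold given as the first argument.
count_big_gpus() {
    awk -v min="$1" '$1 + 0 >= min + 0 { n++ } END { print n + 0 }'
}

# Example with made-up readings: four large cards and one smaller one.
printf '81920\n81920\n40960\n81920\n81920\n' | count_big_gpus 80000   # prints 4
```

On a real machine, feed it the output of `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` and compare the result with the four GPUs the default settings expect.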

You can check the benchmark scores in wandb.

Acknowledgements

We would like to express our gratitude to the following repositories for their valuable contributions:
