This repository is based on the open-source codebase of Retro* and Agent-R1.
cd Retro-R1
conda env create --file environment.yml
conda activate Retro_R1
pip install flash-attn==2.7.4.post1
pip install -e packages/mlp_retrosyn
pip install -e packages/rdchiral
unzip verl.zip
cd verl
pip install --no-deps -e .

To reproduce the results in the paper, we also need the additional files containing the training dataset, evaluation datasets (USPTO, ChEMBL-1000), starting molecules, and the template rules.
Files from Retro* (such as the single-step model weights V1) can be downloaded from the link; put the folders (dataset/, one_step_model/) under this directory. The single-step model weights V2 and V3 can be downloaded from the link; put the files retro_star_value_ours.ckpt (V2) and retro_star_zero_ours.ckpt (V3) under the one_step_model/ directory. The single-step model weights V4 are not released, but they can be trained by following the instructions in PDVN. We can also provide the trained weights on request. Put the trained weights under the one_step_model/ directory and name the file retro_star_V4.ckpt. The ChEMBL-1000 test set can be downloaded from the link; put it under the dataset/ folder.
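Before training or evaluation, it can help to confirm that the downloaded files ended up where the instructions place them. The following is a minimal sketch, not part of the repository: the function name `missing_paths` and the `EXPECTED` list are hypothetical, and the list covers only the folder and checkpoint names stated above. The V1 weight files and the ChEMBL-1000 file are not checked because their file names are not given here.

```python
from pathlib import Path

# Paths named in the instructions above, relative to the repository root.
# This list is an assumption assembled from the text, not an official manifest.
EXPECTED = [
    "dataset",
    "one_step_model",
    "one_step_model/retro_star_value_ours.ckpt",  # single-step weights V2
    "one_step_model/retro_star_zero_ours.ckpt",   # single-step weights V3
    "one_step_model/retro_star_V4.ckpt",          # single-step weights V4
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing files or folders:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected files and folders are present.")
```

Run it from the repository root; an empty result from `missing_paths()` only means the listed names exist, not that the downloads are complete or uncorrupted.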
python ./examples/data_preprocess/reaction.py
bash training_script.py

Note that this script uses only one node with eight GPUs. To use multiple nodes, please follow the instructions of verl.
Modify the script export.sh, changing ori_pth, ckpt_pth, and export_pth, to export the model from the best checkpoint.
bash export.py
bash test_retro_script.py
bash test_chembl_script.py