
Predictive Preference Learning from Human Interventions (PPL)

NeurIPS 2025 Spotlight

Official code release for the paper: Predictive Preference Learning from Human Interventions

Webpage | Poster | Paper | Demo

Installation

git clone https://github.com/metadriverse/PPL.git
cd PPL

# Create Conda environment
conda create -n ppl python=3.7
conda activate ppl

# Install dependencies
pip install -r requirements.txt
pip install -e .
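
As a quick sanity check, you can verify that the simulator imports cleanly (this assumes requirements.txt installs the MetaDrive simulator under the package name metadrive):

# Sanity check: confirm the MetaDrive simulator is importable
# (assumes it was installed by requirements.txt)
python -c "import metadrive"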

Launch Experiments

Predictive Preference Learning (Ours)

To reproduce the main experiment reported in the paper, run the training script train_ppl_metadrive.py in the folder ppl/experiments/metadrive. Training a performant driving agent takes about 12 minutes. We also provide a simpler toy environment, enabled with the --toy_env flag.

cd ~/PPL

# Run toy experiment
python ppl/experiments/metadrive/train_ppl_metadrive.py \
--toy_env

# Run full experiment
python ppl/experiments/metadrive/train_ppl_metadrive.py \
--wandb \
--wandb_project WANDB_PROJECT_NAME \
--wandb_team WANDB_ENTITY_NAME

You can specify the output length H of the trajectory predictor with --num_predicted_steps H. You can also set different preference horizons L with --preference_horizon L.
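
For example, the following command (with illustrative values) trains a predictor that outputs H = 5 future steps and forms preferences over a horizon of L = 10:

# Example: H = 5 predicted steps, preference horizon L = 10 (illustrative values)
python ppl/experiments/metadrive/train_ppl_metadrive.py \
--num_predicted_steps 5 \
--preference_horizon 10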

To train a neural expert that approximates the human policy, run the following command:

# Train PPO expert (Optional)
python ppl/experiments/metadrive/train_ppo_metadrive.py

Baselines

We also provide code for the baselines and ablation studies.

For example, to run the Proxy Value Propagation baseline (Peng et al., 2023), use the following command:

# Run Proxy Value Propagation (Baseline)
python ppl/experiments/metadrive/train_pvp_metadrive.py

You can also set the flag --only_bc_loss=True in our PPL method to verify that the DPO-like preference loss improves PPL's training performance.

# PPL without Preference Loss (Baseline)
python ppl/experiments/metadrive/train_ppl_metadrive.py \
--only_bc_loss=True

Reference

Predictive Preference Learning from Human Interventions (NeurIPS 2025 Spotlight):

@article{cai2025predictive,
  title={Predictive Preference Learning from Human Interventions},
  author={Cai, Haoyuan and Peng, Zhenghao and Zhou, Bolei},
  journal={Advances in Neural Information Processing Systems},
  year={2025}
}
