Reproducing results on MLE-Bench #1317

Hello,

I would like to reproduce the results reported for the R&D Agent using GPT-5 on the MLE-Bench leaderboard (https://github.com/openai/mle-bench/tree/main?tab=readme-ov-file).

Could you please provide the detailed instructions or artifacts required to replicate this setup? Specifically, I am looking for:
- The specific configuration files (or hyperparameters) used.
- The exact command-line arguments to run the evaluation.

Thank you!
