CoCoEvo: Co-Evolution of Programs and Test Cases to Enhance Code Generation

Dataset: Download the test.jsonl file here and save it as data/leetcode_contest.jsonl.
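Before launching a run, it can help to confirm that the downloaded file parses as JSON Lines. The following is a minimal sketch, not part of the repository: the helper name load_jsonl is hypothetical, and it assumes only that each non-empty line holds one JSON value. It makes no assumption about which fields a record contains.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file and return one parsed value per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as error:
                raise ValueError(f"{path}: invalid JSON on line {line_number}") from error
    return records


if __name__ == "__main__":
    dataset_path = Path("data/leetcode_contest.jsonl")  # target path named in the Dataset step
    if dataset_path.exists():
        problems = load_jsonl(dataset_path)
        print(f"{len(problems)} records loaded from {dataset_path}")
    else:
        print(f"{dataset_path} not found; download test.jsonl first")
```

A ValueError naming the offending line usually points to a truncated or partially downloaded file.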

Run

Generation

Run {run_type}

python {file_name} \
--config 'config/config-leetcode-contest-qwen2.5-coder-32b.json' \
--run_type {run_type} \
--api_key 'xxx' \
--base_url 'xxx'

The methods must be run in the order shown below. Each method to the right of an arrow depends on the results (generated code or test cases) produced by the methods to its left.

Sampling / Gen_tests -> Sampling+Filtering / CodeT / MBR_Exec / Self_repair, Gen_tests -> Reflexion
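The ordering above can be sketched as a small dependency graph. This is an illustrative helper, not code from the repository. It assumes that "Sampling / Gen_tests" means both stages must have finished before the methods to the right of the arrow start, which is one reading of the line above. The names PREREQUISITES and execution_order are hypothetical; the keys are the run_type values from the table in this README. It requires Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# Prerequisites per run_type, read off the ordering line above.
# Assumption: "Sampling / Gen_tests" means both stages are required.
PREREQUISITES = {
    "sampling": set(),
    "gen_tests": set(),
    "sampling_filtering": {"sampling", "gen_tests"},
    "codet": {"sampling", "gen_tests"},
    "mbr_exec": {"sampling", "gen_tests"},
    "self_repair": {"sampling", "gen_tests"},
    "reflexion": {"gen_tests"},
}


def execution_order(targets):
    """Return the run_types needed for `targets`, prerequisites first."""
    needed = set()
    stack = list(targets)
    while stack:
        run_type = stack.pop()
        if run_type not in needed:
            needed.add(run_type)
            stack.extend(PREREQUISITES[run_type])
    # Restrict the graph to what is needed, then sort it topologically.
    graph = {name: PREREQUISITES[name] for name in sorted(needed)}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(["codet", "reflexion"]))
```

Asking for "reflexion" alone yields only gen_tests followed by reflexion, so sampling is not run when nothing requested depends on it.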

run_type

Method	file_name	run_type
Sampling	b_sampling.py	sampling
Sampling+Filtering	b_sampling_filtering.py	sampling_filtering
Gen_tests	b_gen_tests.py	gen_tests
CodeT	b_codet.py	codet
MBR_Exec	b_mbr_exec.py	mbr_exec
Self_repair	b_self_repair.py	self_repair
Reflexion	b_reflexion.py	reflexion
CoCoEvo	coevod.py	coevo
Evolution (CoCoEvo w/o test evolution)	evolutiond.py	evolution
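The table can also be expressed as a lookup from run_type to script, which makes it easy to assemble the command shown earlier for any method. This is a sketch under stated assumptions: SCRIPTS, DEFAULT_CONFIG, and build_command are hypothetical names, the flags are the ones used in the commands in this README, and the command is only printed, not executed.

```python
import shlex

# run_type -> script, copied from the table above.
SCRIPTS = {
    "sampling": "b_sampling.py",
    "sampling_filtering": "b_sampling_filtering.py",
    "gen_tests": "b_gen_tests.py",
    "codet": "b_codet.py",
    "mbr_exec": "b_mbr_exec.py",
    "self_repair": "b_self_repair.py",
    "reflexion": "b_reflexion.py",
    "coevo": "coevod.py",
    "evolution": "evolutiond.py",
}

DEFAULT_CONFIG = "config/config-leetcode-contest-qwen2.5-coder-32b.json"


def build_command(run_type, api_key, base_url, config=DEFAULT_CONFIG):
    """Return the argument list for one generation run."""
    if run_type not in SCRIPTS:
        raise KeyError(f"unknown run_type: {run_type!r}")
    return [
        "python", SCRIPTS[run_type],
        "--config", config,
        "--run_type", run_type,
        "--api_key", api_key,
        "--base_url", base_url,
    ]


if __name__ == "__main__":
    # Print a shell-ready command; 'xxx' stands in for real credentials.
    print(shlex.join(build_command("codet", "xxx", "xxx")))
```

Returning a list rather than a string keeps the arguments safe to pass to subprocess.run without shell quoting.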

Run CoCoEvo

python coevod.py \
--config 'config/config-leetcode-contest-qwen2.5-coder-32b.json' \
--run_type 'coevo' \
--api_key 'xxx' \
--base_url 'xxx'

Evaluation

Evaluate CoCoEvo / Evolution

python count_code_population.py \
--result_dir 'result/leetcode_contest/qwen2.5-coder-32b/coevod'

Evaluate generated test cases

# evaluate generated tests
python b_gen_tests_eval.py \
--config 'config/config-leetcode-contest-qwen2.5-coder-32b.json' \
--run_type 'gen_tests_eval'

# show evaluation result
python show_tests.py --result_dir='result/leetcode_contest/qwen2.5-coder-32b' --run_type='gen_tests_eval'

For other methods, use submit.py

# submit to private tests
python submit.py \
--config 'config/config-leetcode-contest-qwen2.5-coder-32b.json' \
--run_type {run_type}

Other Baselines

See the baselines directory (AgentCoder, CodeCOT, INTERVENOR).

Citation

@ARTICLE{11098743,
  author={Li, Kefan and Yuan, Yuan and Yu, Hongyue and Guo, Tingyu and Cao, Shijie},
  journal={IEEE Transactions on Evolutionary Computation}, 
  title={CoCoEvo: Co-Evolution of Programs and Test Cases to Enhance Code Generation}, 
  year={2025},
  volume={},
  number={},
  pages={1-1},
  keywords={Codes;Accuracy;Maintenance engineering;Evolutionary computation;Electronic mail;Software development management;Programming;Dynamic scheduling;Computer bugs;Training;Large Language Models;Code Generation;Test Case Generation;Co-Evolution},
  doi={10.1109/TEVC.2025.3593272}}
