Description
Hi, thanks for your great work!
I'm running some benchmarks and noticed that on the master branch the system sometimes goes into an almost infinite continue_chaining loop, performing "search_in_memory" tool calls with slightly different inputs derived from the originally requested user content (it appends a user_message with type=continue_chaining after each iteration of the loop).
In the "public_evaluation" branch, the "search_in_memory" tool is usually called once and is followed by a user_message of type=heartbeat (I guess that is what leads to the final send_message tool call).
Is there any way to change the behavior of the master branch so that it finishes benchmarks after a limited number of tool calls? I ask because the code in public_evaluation seems to be quite obsolete.
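To make the request concrete, here is a minimal sketch of the kind of cap I have in mind. It is not taken from the project's code: `StepResult`, `run_with_cap`, `step`, and `max_tool_calls` are all hypothetical names, and I am only assuming that each agent step reports whether it wants to keep chaining.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    """Outcome of one agent step (hypothetical shape, not the project's type)."""
    tool_name: str
    continue_chaining: bool


def run_with_cap(step: Callable[[int], StepResult],
                 max_tool_calls: int = 5) -> List[StepResult]:
    """Run chained steps, stopping after at most max_tool_calls tool calls."""
    history: List[StepResult] = []
    for i in range(max_tool_calls):
        result = step(i)
        history.append(result)
        if not result.continue_chaining:
            # The agent finished on its own; no cap needed.
            return history
    # Cap reached: record a forced final reply instead of chaining again.
    history.append(StepResult(tool_name="send_message", continue_chaining=False))
    return history


def always_search(_: int) -> StepResult:
    """Simulates the looping behavior: every step asks to continue."""
    return StepResult(tool_name="search_in_memory", continue_chaining=True)


capped = run_with_cap(always_search, max_tool_calls=3)
print([r.tool_name for r in capped])
```

With a cap like this, a run that would otherwise loop ends with three "search_in_memory" calls followed by a single forced "send_message".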
Thank you!
UPD: the problem was observed with the GPT-5-nano and GPT-5-mini models.