Skip to content

Conversation

@pramodith
Copy link
Collaborator

What does this PR do?

Add support for async tool calls in GRPOTrainer.

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@pramodith
Copy link
Collaborator Author

pramodith commented Dec 23, 2025

FYI: I had to hack the parse_response function in chat_template_utils for the test cases around agent training to pass. I think this has to do with some changes in the transformers lib and is currently being looked into by the team.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec
Copy link
Member

I think this has to do with some changes in the transformers lib and is currently being looked into by the team

Yes, I have noticed that some recent changes have caused CI to fail, but I haven't take time to look into this yet

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for asynchronous tool calls in GRPOTrainer, extending agent training capabilities to handle both synchronous and asynchronous tool functions. The implementation consolidates the async event loop infrastructure to support both async reward functions and async tools on a shared daemon thread.

Key changes:

  • Separated tool dictionaries into _sync_tool_dict and _async_tool_dict to handle both synchronous and asynchronous tools
  • Consolidated async event loop management from separate reward and tool loops into a single async_loop shared by both
  • Updated _tool_call_loop method to execute async tools using asyncio.gather with proper exception handling

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
trl/trainer/grpo_trainer.py Added async tool support by splitting tool dictionaries, consolidating event loop management, and implementing async tool execution in _tool_call_loop
tests/test_grpo_trainer.py Added async_multiply_tool test function and parametrized test_training_with_tools to test both sync and async tools separately
docs/source/grpo_trainer.md Updated documentation to show that tools can be sync or async, added example async_add function

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@pramodith pramodith requested a review from qgallouedec January 16, 2026 23:26
@pramodith pramodith enabled auto-merge (squash) January 18, 2026 00:36
Comment on lines -380 to -388
self._has_async_reward_funcs = any(asyncio.iscoroutinefunction(func) for func in self.reward_funcs)
if self._has_async_reward_funcs:
self.async_reward_loop_thread, self.async_reward_loop, self.async_reward_loop_ready_event = (
start_event_loop_in_daemon(name="GRPOTrainer-AsyncRewardLoop")
)
# wait until the event loop is running in the daemon thread
self.async_reward_loop_ready_event.wait()
atexit.register(shutdown_event_loop_in_daemon, self.async_reward_loop_thread, self.async_reward_loop)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need this anymore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do, just moved it down after we examine all the tools for async functions too.

@pramodith pramodith merged commit 497aee0 into huggingface:main Jan 19, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants