fix: handle parallel tool calls with same index in streaming mode #1958

Draft
wingding12 wants to merge 1 commit into huggingface:main from wingding12:fix/streaming-parallel-tool-calls-1569

Summary

Fixes #1569 - Parallel function calling does not work in streaming mode

When streaming tool calls, some providers (e.g., Ollama via LiteLLM) return multiple parallel tool calls that all carry the same index. Previously, these were merged into a single tool call whose arguments were concatenated, for example {"query": "Emma Bull"}{"query": "Virginia Woolf"}. That string is not valid JSON, so tool execution failed.

Root Cause

The agglomerate_stream_deltas() function used the index field as a unique key for tool calls. When multiple parallel tool calls had the same index (as returned by Ollama), they would be merged together instead of being treated as separate calls.
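The failure mode described above can be reproduced with a minimal sketch. This is not the code from src/smolagents/models.py: the ToolCallDelta dataclass, the merge_by_index() function, and the web_search tool name are hypothetical stand-ins, assumed only to illustrate what happens when index is the sole merge key.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToolCallDelta:
    """Hypothetical stand-in for one streamed tool-call fragment."""
    index: int
    id: Optional[str] = None
    name: Optional[str] = None
    arguments: str = ""


def merge_by_index(deltas: List[ToolCallDelta]) -> List[ToolCallDelta]:
    """Accumulate deltas using `index` alone as the key (the problematic behavior)."""
    merged = {}
    for delta in deltas:
        slot = merged.setdefault(delta.index, ToolCallDelta(index=delta.index))
        slot.id = delta.id or slot.id
        slot.name = delta.name or slot.name
        slot.arguments += delta.arguments  # every fragment is appended to the same slot
    return list(merged.values())


# Two complete, parallel calls that share index 0 (assumed provider output shape).
deltas = [
    ToolCallDelta(index=0, name="web_search", arguments='{"query": "Emma Bull"}'),
    ToolCallDelta(index=0, name="web_search", arguments='{"query": "Virginia Woolf"}'),
]
calls = merge_by_index(deltas)

# The merged arguments are two JSON objects back to back, which json.loads rejects.
try:
    json.loads(calls[0].arguments)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False
```

Both calls collapse into a single entry, and the combined arguments string can no longer be parsed.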

Solution

Modified agglomerate_stream_deltas() to detect when a stream delta represents a new tool call vs a continuation:

  • If IDs differ between deltas, create a new tool call
  • If function names differ, create a new tool call
  • If the accumulated arguments already form a complete JSON object and new arguments arrive, create a new tool call (the previous call has finished)

Added helper functions:

  • _is_complete_json() - Check if a string is a valid, complete JSON object
  • _should_create_new_tool_call() - Determine if a delta starts a new tool call
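A rough sketch of how the two helpers might look, based only on the three rules listed above. The PR diff is not shown here, so the signatures, the dict-based delta shape, and the names is_complete_json() and should_create_new_tool_call() (written without the leading underscore) are assumptions, not the actual implementation.

```python
import json


def is_complete_json(text: str) -> bool:
    """Return True only if `text` parses as a JSON object (a dict)."""
    if not text or not text.strip():
        return False
    try:
        return isinstance(json.loads(text), dict)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


def should_create_new_tool_call(current: dict, delta: dict) -> bool:
    """Decide whether `delta` starts a new tool call or continues `current`.

    Both arguments are assumed to be dicts with optional "id", "name",
    and "arguments" keys.
    """
    cur_id, new_id = current.get("id"), delta.get("id")
    if cur_id and new_id and cur_id != new_id:
        return True  # rule 1: IDs differ

    cur_name, new_name = current.get("name"), delta.get("name")
    if cur_name and new_name and cur_name != new_name:
        return True  # rule 2: function names differ

    # rule 3: the current call is already complete and more arguments arrive
    return bool(delta.get("arguments")) and is_complete_json(current.get("arguments") or "")
```

Under these assumptions, a partial fragment such as '{"query": "Em' keeps the call open, while a second full object arriving after '{"query": "Emma Bull"}' is treated as a new call.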

Tool calls without IDs now get assigned unique UUIDs to ensure proper execution.
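One way the ID assignment could work is sketched below. The ensure_tool_call_ids() helper, the dict shape, and the choice of uuid.uuid4() are assumptions for illustration; the PR description only states that unique UUIDs are assigned, not how.

```python
import uuid
from typing import Dict, List


def ensure_tool_call_ids(tool_calls: List[Dict]) -> List[Dict]:
    """Give every tool call that lacks an `id` a fresh UUID string (hypothetical helper)."""
    for call in tool_calls:
        if not call.get("id"):
            call["id"] = str(uuid.uuid4())
    return tool_calls


calls = ensure_tool_call_ids([
    {"id": None, "name": "web_search", "arguments": '{"query": "Emma Bull"}'},
    {"id": None, "name": "web_search", "arguments": '{"query": "Virginia Woolf"}'},
    {"id": "call_123", "name": "web_search", "arguments": '{"query": "Ursula Le Guin"}'},
])
```

IDs supplied by the provider are left untouched, and only missing ones are filled in, so each separated call can be matched to its own result.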

Files Changed

| File | Change |
| --- | --- |
| src/smolagents/models.py | Core fix - rewrite agglomerate_stream_deltas() |
| tests/test_models.py | Unit tests for new behavior |
| tests/test_agents.py | Integration test for ToolCallingAgent streaming |
| tests/test_gradio_ui.py | Integration test for Gradio UI streaming |

Test plan

  • Unit tests for agglomerate_stream_deltas():
    • Same-index parallel tool calls are separated (exact bug scenario)
    • Incremental JSON arguments are still accumulated correctly
    • Different IDs with same index are separated
    • Different function names with same index are separated
    • Multiple indices work correctly
  • Integration test for ToolCallingAgent streaming with parallel calls
  • Integration test for Gradio UI streaming with parallel calls
  • Existing test test_agglomerate_stream_deltas still passes
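The first two unit-test scenarios in the plan can be illustrated with a self-contained sketch. It does not import smolagents: separate_tool_calls() is a hypothetical, simplified accumulator over plain dicts, written here only so the tests have something to run against, and the tests in tests/test_models.py may look different.

```python
import json


def _complete(text: str) -> bool:
    """True if `text` parses as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except ValueError:
        return False


def separate_tool_calls(deltas):
    """Hypothetical accumulator: merge continuation fragments, split new calls."""
    calls = []      # all calls, in arrival order
    open_call = {}  # index -> call currently being accumulated for that index
    for d in deltas:
        cur = open_call.get(d["index"])
        starts_new = cur is None or bool(
            (cur["id"] and d.get("id") and cur["id"] != d["id"])
            or (cur["name"] and d.get("name") and cur["name"] != d["name"])
            or (d.get("arguments") and _complete(cur["arguments"]))
        )
        if starts_new:
            cur = {"id": None, "name": None, "arguments": ""}
            calls.append(cur)
            open_call[d["index"]] = cur
        cur["id"] = cur["id"] or d.get("id")
        cur["name"] = cur["name"] or d.get("name")
        cur["arguments"] += d.get("arguments") or ""
    return calls


def test_same_index_parallel_calls_are_separated():
    deltas = [
        {"index": 0, "name": "web_search", "arguments": '{"query": "Emma Bull"}'},
        {"index": 0, "name": "web_search", "arguments": '{"query": "Virginia Woolf"}'},
    ]
    calls = separate_tool_calls(deltas)
    queries = [json.loads(c["arguments"])["query"] for c in calls]
    assert queries == ["Emma Bull", "Virginia Woolf"]


def test_incremental_arguments_still_accumulate():
    deltas = [
        {"index": 0, "id": "call_1", "name": "web_search", "arguments": ""},
        {"index": 0, "arguments": '{"query": "Em'},
        {"index": 0, "arguments": 'ma Bull"}'},
    ]
    calls = separate_tool_calls(deltas)
    assert len(calls) == 1
    assert json.loads(calls[0]["arguments"]) == {"query": "Emma Bull"}
```

The two tests pull in opposite directions on purpose: the first requires splitting on a shared index, the second requires that ordinary fragment-by-fragment streaming is not split.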

…ggingface#1569)

When streaming tool calls, some providers (e.g., Ollama via LiteLLM)
return multiple parallel tool calls with the same index. Previously,
these would be incorrectly merged into a single tool call with
concatenated arguments, causing tool execution failures.

This fix modifies agglomerate_stream_deltas() to detect when a stream
delta represents a new tool call vs a continuation of an existing one:
- If IDs differ between deltas, create a new tool call
- If function names differ, create a new tool call
- If current arguments form a complete JSON and new arguments arrive,
  create a new tool call (indicates previous call finished)

Additionally, tool calls without IDs now get assigned unique UUIDs.

Changes:
- models.py: Rewrite agglomerate_stream_deltas() with helper functions
  _is_complete_json() and _should_create_new_tool_call() to properly
  separate parallel tool calls with the same index
- test_models.py: Add comprehensive unit tests for the new behavior
- test_agents.py: Add integration test verifying fix works through
  ToolCallingAgent streaming pipeline
- test_gradio_ui.py: Add integration test verifying fix works through
  Gradio UI streaming pipeline

Fixes huggingface#1569
