Python: Fix: Preserve text content when usage data is present in streaming chunks #3436
base: main
Conversation
Pull request overview
This pull request fixes a critical bug in the OpenAI chat client's streaming response parser where text content was lost when usage data and text content appeared in the same chunk. This issue specifically affected providers like Gemini that include both types of data in a single streaming chunk.
Changes:
- Removed the early return when `chunk.usage` is present in `_parse_response_update_from_openai`
- Usage data is now appended to the `contents` list instead of triggering an early return
- Text and tool calls are processed normally after usage data, allowing all content types to be preserved
```python
# BUGFIX: Process usage alongside text/tool calls instead of early return
# Gemini (and potentially other providers) include both usage and content in the same chunk
```
Copilot AI · Jan 26, 2026
The comment format "BUGFIX:" is not consistent with the existing comment style in this file. Other explanatory comments in the codebase (e.g., line 414: "When you enable asynchronous content filtering...") use a more descriptive, natural language style without special prefixes. Consider rephrasing to match the existing style, such as: "Process usage alongside text/tool calls to preserve all content types. Some providers (e.g., Gemini) include both usage and content in the same chunk."
```diff
- # BUGFIX: Process usage alongside text/tool calls instead of early return
- # Gemini (and potentially other providers) include both usage and content in the same chunk
+ # Process usage alongside text and tool calls to preserve all content types.
+ # Some providers (for example Gemini) include both usage and content in the same chunk.
```
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
- Remove early return in `_parse_response_update_from_openai` when `chunk.usage` exists
- Process usage alongside text and tool calls in the same chunk
- Fixes streaming with Gemini and other providers that include usage and text in the same chunk
Force-pushed from 2d4d8a1 to 30ebc75
Fix: Preserve text content when usage data is present in streaming chunks
Summary
Fixes a streaming response parsing bug where text content is lost when `chunk.usage` is present in the same chunk as `delta.content`. This affects providers like Gemini that include both usage and text in the same chunk.

Fixes #[ISSUE_NUMBER]
Problem
The current implementation in `_parse_response_update_from_openai` performs an early return when `chunk.usage` is present.

Impact:
- Affected providers: Gemini (confirmed), potentially others that follow the OpenAI spec strictly
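The early-return pattern described above can be illustrated with a minimal sketch. The dict-based chunk shape and the function name are illustrative stand-ins, not the framework's real API (the actual method is `_parse_response_update_from_openai` in `agent_framework/openai/_chat_client.py`):

```python
# Minimal sketch of the buggy early-return pattern described in this PR.
# Chunk shape and names are illustrative, not the framework's real API.
def parse_update_buggy(chunk: dict) -> list[dict]:
    contents: list[dict] = []
    if chunk.get("usage") is not None:
        contents.append({"type": "usage", "usage": chunk["usage"]})
        # Early return: any text delta carried by this same chunk is dropped.
        return contents
    for choice in chunk.get("choices", []):
        text = choice.get("delta", {}).get("content")
        if text:
            contents.append({"type": "text", "text": text})
    return contents

# A Gemini-style chunk carrying both usage and a text delta loses its text:
chunk = {"usage": {"total_tokens": 42},
         "choices": [{"delta": {"content": "Hello"}}]}
print(parse_update_buggy(chunk))  # only the usage entry survives
```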
Solution
Remove the early return and process usage alongside text/tool calls:
- Usage data is appended to the `contents` list (instead of triggering an early return)
- Text and tool calls from `choices` are then processed normally
- All content types are included in the resulting `ChatResponseUpdate`
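Sketched against a toy chunk shape (illustrative names, not the framework's real API), the fix amounts to appending usage and then falling through to the normal delta processing:

```python
# Minimal sketch of the fix: usage is appended, then text/tool-call deltas
# are still processed. Chunk shape and names are illustrative.
def parse_update_fixed(chunk: dict) -> list[dict]:
    contents: list[dict] = []
    if chunk.get("usage") is not None:
        # Append usage instead of returning early, so any text delta in
        # the same chunk is still parsed below.
        contents.append({"type": "usage", "usage": chunk["usage"]})
    for choice in chunk.get("choices", []):
        text = choice.get("delta", {}).get("content")
        if text:
            contents.append({"type": "text", "text": text})
    return contents

chunk = {"usage": {"total_tokens": 42},
         "choices": [{"delta": {"content": "Hello"}}]}
print(parse_update_fixed(chunk))  # both the usage entry and the text survive
```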
Modified Files
- `agent_framework/openai/_chat_client.py`: `_parse_response_update_from_openai` method (lines 307-348); removed the early return on `chunk.usage`

Tests Added
- `test_parse_response_update_with_usage_and_text()`: Verifies text is preserved when usage is present
- `test_parse_response_update_usage_only()`: Verifies usage-only chunks work correctly
- `test_parse_response_update_text_only()`: Verifies text-only chunks (regression test)

Verification
Before Fix
After Fix
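The three added tests can be approximated with a self-contained sketch, using a stand-in `parse_update` with the fixed behaviour (names and chunk shape are illustrative, not the framework's API):

```python
# Hedged stand-in for the fixed parser, mirroring the three added tests.
def parse_update(chunk: dict) -> list[dict]:
    contents: list[dict] = []
    if chunk.get("usage") is not None:
        contents.append({"type": "usage", "usage": chunk["usage"]})
    for choice in chunk.get("choices", []):
        text = choice.get("delta", {}).get("content")
        if text:
            contents.append({"type": "text", "text": text})
    return contents

# usage + text in one chunk (the Gemini case): both are preserved
both = parse_update({"usage": {"total_tokens": 7},
                     "choices": [{"delta": {"content": "Hi"}}]})
assert [c["type"] for c in both] == ["usage", "text"]

# usage-only chunk: still produces a usage entry
usage_only = parse_update({"usage": {"total_tokens": 7}, "choices": []})
assert [c["type"] for c in usage_only] == ["usage"]

# text-only chunk (regression): behaviour unchanged
text_only = parse_update({"choices": [{"delta": {"content": "Hi"}}]})
assert [c["type"] for c in text_only] == ["text"]
```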
Backward Compatibility
✅ OpenAI: No impact (likely sends usage in separate chunks)
✅ Azure OpenAI: No impact (tested)
✅ Existing tests: All pass
Testing Performed
Checklist
Additional Notes
This fix enables full Gemini streaming support.
The change is minimal and focused on the root cause, ensuring no side effects on other providers.