Python: Fix: Preserve text content when usage data is present in streaming chunks #3436
base: main
Conversation
Pull request overview
This pull request fixes a critical bug in the OpenAI chat client's streaming response parser where text content was lost when usage data and text content appeared in the same chunk. This issue specifically affected providers like Gemini that include both types of data in a single streaming chunk.
Changes:
- Removed the early return when `chunk.usage` is present in `_parse_response_update_from_openai`
- Usage data is now appended to the `contents` list instead of triggering an early return
- Text and tool calls are processed normally after usage data, allowing all content types to be preserved
```python
# BUGFIX: Process usage alongside text/tool calls instead of early return
# Gemini (and potentially other providers) include both usage and content in the same chunk
```
Copilot AI · Jan 26, 2026
The comment format "BUGFIX:" is not consistent with the existing comment style in this file. Other explanatory comments in the codebase (e.g., line 414: "When you enable asynchronous content filtering...") use a more descriptive, natural language style without special prefixes. Consider rephrasing to match the existing style, such as: "Process usage alongside text/tool calls to preserve all content types. Some providers (e.g., Gemini) include both usage and content in the same chunk."
```diff
- # BUGFIX: Process usage alongside text/tool calls instead of early return
- # Gemini (and potentially other providers) include both usage and content in the same chunk
+ # Process usage alongside text and tool calls to preserve all content types.
+ # Some providers (for example Gemini) include both usage and content in the same chunk.
```
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
- Remove early return in `_parse_response_update_from_openai` when `chunk.usage` exists
- Process usage alongside text and tool calls in the same chunk
- Fixes streaming with Gemini and other providers that include usage and text in the same chunk
Force-pushed from 2d4d8a1 to 30ebc75
Fix: Preserve text content when usage data is present in streaming chunks
Summary
Fixes a streaming response parsing bug where text content is lost when `chunk.usage` is present in the same chunk as `delta.content`. This affects providers like Gemini that include both usage and text in the same chunk.

Fixes #[ISSUE_NUMBER]
Problem
The current implementation in `_parse_response_update_from_openai` performs an early return when `chunk.usage` is present.

Impact:
- Affected providers: Gemini (confirmed), potentially others that follow the OpenAI spec strictly
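The early-return pattern described above can be illustrated with a minimal sketch. The dict-based chunk shape and the function name are illustrative stand-ins, not the framework's real API (the actual method is `_parse_response_update_from_openai` in `agent_framework/openai/_chat_client.py`):

```python
# Minimal sketch of the buggy early-return pattern described in this PR.
# Chunk shape and names are illustrative, not the framework's real API.
def parse_update_buggy(chunk: dict) -> list[dict]:
    contents: list[dict] = []
    if chunk.get("usage") is not None:
        contents.append({"type": "usage", "usage": chunk["usage"]})
        # Early return: any text delta carried by this same chunk is dropped.
        return contents
    for choice in chunk.get("choices", []):
        text = choice.get("delta", {}).get("content")
        if text:
            contents.append({"type": "text", "text": text})
    return contents

# A Gemini-style chunk carrying both usage and a text delta loses its text:
chunk = {"usage": {"total_tokens": 42},
         "choices": [{"delta": {"content": "Hello"}}]}
print(parse_update_buggy(chunk))  # only the usage entry survives
```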
Solution
Remove the early return and process usage alongside text/tool calls:
- Usage data is appended to the `contents` list (instead of triggering an early return)
- Text and tool calls from `choices` are then processed normally
- All content types are included in the resulting `ChatResponseUpdate`
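Sketched against a toy chunk shape (illustrative names, not the framework's real API), the fix amounts to appending usage and then falling through to the normal delta processing:

```python
# Minimal sketch of the fix: usage is appended, then text/tool-call deltas
# are still processed. Chunk shape and names are illustrative.
def parse_update_fixed(chunk: dict) -> list[dict]:
    contents: list[dict] = []
    if chunk.get("usage") is not None:
        # Append usage instead of returning early, so any text delta in
        # the same chunk is still parsed below.
        contents.append({"type": "usage", "usage": chunk["usage"]})
    for choice in chunk.get("choices", []):
        text = choice.get("delta", {}).get("content")
        if text:
            contents.append({"type": "text", "text": text})
    return contents

chunk = {"usage": {"total_tokens": 42},
         "choices": [{"delta": {"content": "Hello"}}]}
print(parse_update_fixed(chunk))  # both the usage entry and the text survive
```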
Modified Files
- `agent_framework/openai/_chat_client.py`: `_parse_response_update_from_openai` method (lines 307-348); removed the early return on `chunk.usage`

Tests Added
- `test_parse_response_update_with_usage_and_text()`: Verifies text is preserved when usage is present
- `test_parse_response_update_usage_only()`: Verifies usage-only chunks work correctly
- `test_parse_response_update_text_only()`: Verifies text-only chunks (regression test)

Verification
Before Fix
After Fix
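The three added tests can be approximated with a self-contained sketch, using a stand-in `parse_update` with the fixed behaviour (names and chunk shape are illustrative, not the framework's API):

```python
# Hedged stand-in for the fixed parser, mirroring the three added tests.
def parse_update(chunk: dict) -> list[dict]:
    contents: list[dict] = []
    if chunk.get("usage") is not None:
        contents.append({"type": "usage", "usage": chunk["usage"]})
    for choice in chunk.get("choices", []):
        text = choice.get("delta", {}).get("content")
        if text:
            contents.append({"type": "text", "text": text})
    return contents

# usage + text in one chunk (the Gemini case): both are preserved
both = parse_update({"usage": {"total_tokens": 7},
                     "choices": [{"delta": {"content": "Hi"}}]})
assert [c["type"] for c in both] == ["usage", "text"]

# usage-only chunk: still produces a usage entry
usage_only = parse_update({"usage": {"total_tokens": 7}, "choices": []})
assert [c["type"] for c in usage_only] == ["usage"]

# text-only chunk (regression): behaviour unchanged
text_only = parse_update({"choices": [{"delta": {"content": "Hi"}}]})
assert [c["type"] for c in text_only] == ["text"]
```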
Backward Compatibility
✅ OpenAI: No impact (likely sends usage in separate chunks)
✅ Azure OpenAI: No impact (tested)
✅ Existing tests: All pass
Testing Performed
Checklist
Additional Notes
This fix enables full Gemini streaming support.
The change is minimal and focused on the root cause, ensuring no side effects on other providers.