Fix batch atomicity when condensation forgets ObservationEvents #1450

xingyaoww · 2025-12-18T23:34:16Z

Summary

This PR fixes a bug where condensation breaks action-observation pairs when an ObservationEvent is forgotten, causing the Anthropic API to return an error.

Problem

When condensation forgets an ObservationEvent, the corresponding ActionEvent is correctly filtered out by filter_unmatched_tool_calls. However, other ActionEvents in the same batch (same llm_response_id) were NOT filtered out, which breaks the Anthropic API requirement that tool_use blocks must have corresponding tool_result blocks.

Error Message

litellm.BadRequestError: AnthropicException - {"type":"error","error":{"type":"invalid_request_error","message":"messages.28: `tool_use` ids were found without `tool_result` blocks immediately after: toolu_01L5zJ74i3tPdZDVGoMzeMHm. Each `tool_use` block must have a corresponding `tool_result` block in the next message."}}

Reproduction Scenario

Create a batch of 2 actions from the same LLM response (same llm_response_id):
- Action1 (batch A) with tool_call_id="tool_call_1"
- Action2 (batch A) with tool_call_id="tool_call_2"
Create matching observations:
- Obs1 with tool_call_id="tool_call_1"
- Obs2 with tool_call_id="tool_call_2"
Condensation forgets Obs1 (but not Action1, Action2, or Obs2)
After filtering:
- Action1 is filtered out (no matching Obs1)
- Action2 is kept (has matching Obs2)
- Obs2 is kept (has matching Action2)
Result: [Action2, Obs2]

But Action1 and Action2 were originally in the same LLM response! When Action2 is sent to the LLM without Action1, it breaks the batch structure.

Solution

This fix adds a second pass in filter_unmatched_tool_calls to enforce batch atomicity after filtering. If any ActionEvent in a batch is filtered out (due to missing ObservationEvent), all ActionEvents in that batch are also filtered out, along with their corresponding ObservationEvents.

Changes

Modified filter_unmatched_tool_calls to call a new _enforce_batch_atomicity_on_filtered method after the initial filtering
Added _enforce_batch_atomicity_on_filtered method that:
- Builds a map of llm_response_id -> list of ActionEvent IDs
- Identifies batches where some (but not all) ActionEvents were filtered out
- Removes all ActionEvents in those batches and their corresponding ObservationEvents

Testing

Added 4 new test cases in test_view_condensation_batch_atomicity.py:

test_batch_atomicity_when_observation_forgotten: Tests the main bug scenario
test_batch_atomicity_when_multiple_observations_forgotten: Tests with multiple forgotten observations
test_batch_atomicity_different_batches_independent: Tests that different batches are independent
test_single_action_batch_observation_forgotten: Tests single-action batches

All existing tests continue to pass (1419 tests).

Fixes #1449

@xingyaoww can click here to continue refining the PR

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:5f33286-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-5f33286-python \
  ghcr.io/openhands/agent-server:5f33286-python

All tags pushed for this build

ghcr.io/openhands/agent-server:5f33286-golang-amd64
ghcr.io/openhands/agent-server:5f33286-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:5f33286-golang-arm64
ghcr.io/openhands/agent-server:5f33286-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:5f33286-java-amd64
ghcr.io/openhands/agent-server:5f33286-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:5f33286-java-arm64
ghcr.io/openhands/agent-server:5f33286-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:5f33286-python-amd64
ghcr.io/openhands/agent-server:5f33286-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:5f33286-python-arm64
ghcr.io/openhands/agent-server:5f33286-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:5f33286-golang
ghcr.io/openhands/agent-server:5f33286-java
ghcr.io/openhands/agent-server:5f33286-python

About Multi-Architecture Support

Each variant tag (e.g., 5f33286-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 5f33286-python-amd64) are also available if needed

github-actions · 2025-12-18T23:39:20Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/context
view.py	142	38	73%	41, 46, 51–52, 57–58, 63–67, 83–87, 89, 150–151, 182, 193–194, 200, 203, 254–255, 257, 259, 281–284, 287, 289–290, 297, 299–300
TOTAL	12921	5804	55%

When condensation forgets an ObservationEvent, the corresponding ActionEvent is correctly filtered out by filter_unmatched_tool_calls. However, other ActionEvents in the same batch (same llm_response_id) were NOT filtered out, which breaks the Anthropic API requirement that tool_use blocks must have corresponding tool_result blocks. This fix refactors the batch atomicity logic: 1. Extract common batch-building logic into _build_action_batches helper 2. Reuse _enforce_batch_atomicity in filter_unmatched_tool_calls to ensure batch atomicity is enforced after filtering unmatched tool calls 3. Also remove ObservationEvents whose ActionEvents were removed due to batch atomicity Also updated existing tests to properly mock the id and llm_response_id attributes on ActionEvent mocks. Fixes #1449 Co-authored-by: openhands <[email protected]>

csmith49 · 2025-12-19T05:15:13Z

Hey @xingyaoww, this might be fine but do you think you could take a look at #1412 instead? It builds off of #1380, which also needs a look.

ETA: I don't think our fixes are incompatible, but the first linked PR sets out an approach that'll be super helpful for future condenser work as well.

enyst

Thank you for looking into this!

You know, I read the description and it made sense; I started reading the code and it made sense; then I went back to the error message and... maybe it doesn't actually explain it?
messages.28: tool_use ids were found without tool_result blocks immediately after: toolu_01L5zJ74i3tPdZDVGoMzeMHm.
Each tool_use block must have a corresponding tool_result block in the next message.

If we sent (action2, obs2), action2 has a tool result, it's obs2 🤔

It's also curious that it removed the obs, but it had the action... which means the action must have been in keep_first events, those we don't touch afaik. Possible!

💭 Could it be that it inserts the summary between them? This code:

summary_offset=self.keep_first,

enyst · 2025-12-19T05:40:14Z

Oh, right, summary interfering between (so we send action2, summary, obs2) might have been exactly what happened in the issue linked to your PR, Calvin

Bug: Condensation summary can be inserted between action and observation, breaking LLM API message ordering #1395

xingyaoww · 2025-12-19T15:05:31Z

@csmith49 i do like your solution better! sorry for not checking that PR out early! I'll close this one

This commit migrates tests from PR #1450 and adds the necessary logic to enforce batch atomicity when ObservationEvents are forgotten during condensation. Changes: - Add _build_action_batches helper method to extract batch information - Update _enforce_batch_atomicity to accept warn_on_violation parameter - Update filter_unmatched_tool_calls to enforce batch atomicity after filtering unmatched tool calls - Add test_view_condensation_batch_atomicity.py with tests for: - Batch atomicity when observation is forgotten - Batch atomicity when multiple observations are forgotten - Different batches remain independent - Single action batch observation forgotten - Update existing tests to include llm_response_id and id attributes on mock ActionEvents Co-authored-by: openhands <[email protected]>

xingyaoww · 2025-12-19T15:59:59Z

ok after some discussion, @csmith49 think maybe we can merge this first & do a patch release so we can fix the CLI first before we merge his PR for a more fundamental fix

openhands-sdk/openhands/sdk/context/view.py

csmith49 · 2025-12-19T16:19:30Z

openhands-sdk/openhands/sdk/context/view.py

-
-        This prevents partial batches from being sent to the LLM, which can cause
-        API errors when thinking blocks are separated from their tool calls.
+    ) -> tuple[dict[str, list[str]], dict[str, str], dict[str, ToolCallID]]:


This return type is pretty scary. Not necessary for this quick fix PR (just making a note for later), but we should either split this into three separate functions or build a data class that has each element of the tuple as a field and return that so callers can access the mappings by name.

csmith49 · 2025-12-19T16:22:37Z

openhands-sdk/openhands/sdk/context/view.py

+        # Second pass: enforce batch atomicity for ActionEvents
+        # If any ActionEvent in a batch is removed, all ActionEvents in that
+        # batch should also be removed
+        removed_event_ids = View._enforce_batch_atomicity(events, removed_event_ids)


This call means the normal View.from_events function will call _enforce_batch_atomicity twice -- that's probably fine? Or maybe an indication that these functions need to be called after any change to the events.

Not a problem now, just leaving a note for the future integration work.

csmith49

LGTM, modulo the typing downgrade from EventID -> str

xingyaoww · 2025-12-19T17:32:17Z

@OpenHands please address all the code review comments from @csmith49

openhands-ai · 2025-12-19T17:32:26Z

I'm on it! xingyaoww can track my progress at all-hands.dev

@csmith49

Address code review feedback from @csmith49 to use the proper EventID type alias instead of raw str for event IDs and llm_response_id in the _build_action_batches method. Co-authored-by: openhands <[email protected]>

openhands-ai · 2025-12-19T17:37:01Z

I've addressed the code review comment from @csmith49 and pushed the changes to the PR.

Summary of Changes

Code Review Comment Addressed:

@csmith49 commented: "Think these str types should be EventID" on the _build_action_batches method

Changes Made:

Updated the _build_action_batches method in openhands-sdk/openhands/sdk/context/view.py to use the proper EventID type alias instead of raw str for:
- Return type annotation: tuple[dict[EventID, list[EventID]], dict[EventID, EventID], dict[EventID, ToolCallID]]
- Local variable type annotations: batches, action_id_to_response_id, and action_id_to_tool_call_id

Verification:

✅ Pre-commit hooks pass (ruff format, ruff lint, pycodestyle, pyright, import dependency rules)
✅ All 10 related batch atomicity tests pass

Note: The other two comments from @csmith49 were explicitly noted as "not necessary for this quick fix PR" and "not a problem now, just leaving a note for the future integration work" - so no action was required for those.

The changes have been pushed to the fix/condensation-batch-atomicity branch.

_{View full conversation}

xingyaoww force-pushed the fix/condensation-batch-atomicity branch from 9fc5a33 to 89b0097 Compare December 19, 2025 03:23

xingyaoww force-pushed the fix/condensation-batch-atomicity branch from 89b0097 to 4554540 Compare December 19, 2025 03:34

xingyaoww requested review from csmith49, enyst and neubig December 19, 2025 03:39

xingyaoww marked this pull request as ready for review December 19, 2025 03:39

Merge branch 'main' into fix/condensation-batch-atomicity

9d1a497

enyst reviewed Dec 19, 2025

View reviewed changes

xingyaoww closed this Dec 19, 2025

xingyaoww reopened this Dec 19, 2025

csmith49 reviewed Dec 19, 2025

View reviewed changes

openhands-sdk/openhands/sdk/context/view.py Outdated Show resolved Hide resolved

neubig removed their request for review December 19, 2025 16:19

csmith49 reviewed Dec 19, 2025

View reviewed changes

csmith49 approved these changes Dec 19, 2025

View reviewed changes

Use EventID type instead of str in _build_action_batches

8cdc6f4

Address code review feedback from @csmith49 to use the proper EventID type alias instead of raw str for event IDs and llm_response_id in the _build_action_batches method. Co-authored-by: openhands <[email protected]>

xingyaoww enabled auto-merge (squash) December 19, 2025 17:38

Merge branch 'main' into fix/condensation-batch-atomicity

c206a33

xingyaoww merged commit eb06d52 into main Dec 19, 2025
16 checks passed

xingyaoww deleted the fix/condensation-batch-atomicity branch December 19, 2025 17:40

Fix batch atomicity when condensation forgets ObservationEvents #1450

Fix batch atomicity when condensation forgets ObservationEvents #1450

Uh oh!

Conversation

xingyaoww commented Dec 18, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Error Message

Reproduction Scenario

Solution

Changes

Testing

Uh oh!

github-actions bot commented Dec 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

csmith49 commented Dec 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

enyst left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

enyst commented Dec 19, 2025

Uh oh!

xingyaoww commented Dec 19, 2025

Uh oh!

xingyaoww commented Dec 19, 2025

Uh oh!

Uh oh!

csmith49 Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

csmith49 Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

csmith49 left a comment

Choose a reason for hiding this comment

Uh oh!

xingyaoww commented Dec 19, 2025

Uh oh!

openhands-ai bot commented Dec 19, 2025

Uh oh!

openhands-ai bot commented Dec 19, 2025

Summary of Changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

xingyaoww commented Dec 18, 2025 •

edited by github-actions bot

Loading

github-actions bot commented Dec 18, 2025 •

edited

Loading

csmith49 commented Dec 19, 2025 •

edited

Loading

enyst left a comment •

edited

Loading