feat(spans): Flush oversized segments in chunks#111820

Open
lvthanh03 wants to merge 2 commits into master from tony/not-drop-spans

Conversation

@lvthanh03
Member

Refs STREAM-826

  • Adds an option, spans.buffer.flush-oversized-segments, that allows flushing entire segments that exceed spans.buffer.max-segment-bytes.
  • Adds the _chunk_segment() function in the flusher, which splits span payloads into chunks, each under max-segment-bytes.

When the option is enabled, the flusher produces one Kafka message per chunk instead of one per segment.
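
For readers skimming the description, here is a minimal sketch of the kind of greedy splitting described above. It is not the PR's actual `_chunk_segment()` implementation: the function name `chunk_segment`, its signature, and the choice to let a single span larger than the limit form its own chunk are all assumptions made for illustration.

```python
def chunk_segment(spans: list[bytes], max_segment_bytes: int) -> list[list[bytes]]:
    """Greedily pack span payloads into chunks of at most max_segment_bytes.

    Hypothetical helper, not the function from this PR. A single span that is
    larger than the limit still ends up in a chunk of its own (assumption).
    """
    chunks: list[list[bytes]] = []
    current_chunk: list[bytes] = []
    current_size = 0

    for span in spans:
        size = len(span)
        # Close the current chunk if adding this span would exceed the limit.
        if current_chunk and current_size + size > max_segment_bytes:
            chunks.append(current_chunk)
            current_chunk = []
            current_size = 0
        current_chunk.append(span)
        current_size += size

    # Flush whatever is left over as the final chunk.
    if current_chunk:
        chunks.append(current_chunk)
    return chunks


payloads = [b"a" * 40, b"b" * 40, b"c" * 40, b"d" * 10]
chunks = chunk_segment(payloads, max_segment_bytes=100)
print([sum(len(s) for s in c) for c in chunks])  # [80, 50]
```

With chunking like this, each inner list would correspond to one Kafka message, which matches the "one message per chunk" behaviour described above.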

@linear-code

linear-code bot commented Mar 30, 2026

@github-actions github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) on Mar 30, 2026
Contributor

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

chunks.append(current_chunk)

if len(chunks) > 1:
    metrics.incr("spans.buffer.flusher.oversized_segments_chunked")
Member

Can we also track a distribution on the number of chunks we have? metrics.timing(..., len(chunks))?
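
A rough sketch of what this suggestion could look like, using a stand-in in-memory recorder rather than Sentry's metrics module. `InMemoryMetrics`, `record_chunk_metrics`, and the key `example.chunk_count` are hypothetical names for illustration; recording the chunk count for every flushed segment (not only chunked ones) is also an assumption.

```python
from collections import defaultdict


class InMemoryMetrics:
    """Stand-in recorder for illustration only; not Sentry's metrics API."""

    def __init__(self) -> None:
        self.counters: dict[str, int] = defaultdict(int)
        self.distributions: dict[str, list[float]] = defaultdict(list)

    def incr(self, key: str, amount: int = 1) -> None:
        self.counters[key] += amount

    def distribution(self, key: str, value: float) -> None:
        self.distributions[key].append(value)


def record_chunk_metrics(metrics: InMemoryMetrics, chunks: list, chunk_count_key: str) -> None:
    # Existing counter from the diff: count segments that needed chunking.
    if len(chunks) > 1:
        metrics.incr("spans.buffer.flusher.oversized_segments_chunked")
    # Suggested addition: record how many chunks each segment produced.
    metrics.distribution(chunk_count_key, len(chunks))


recorder = InMemoryMetrics()
record_chunk_metrics(recorder, [[b"a"], [b"b"], [b"c"]], "example.chunk_count")
record_chunk_metrics(recorder, [[b"d"]], "example.chunk_count")
print(recorder.distributions["example.chunk_count"])  # [3, 1]
```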

Member

@untitaker untitaker left a comment

We also drop spans during ingestion in the Redis Lua script, and those spans are still dropped. Instead, we should stop merging sets and flush them out individually. That would make it unnecessary to do the same in the flusher, and it preserves the invariant that there is only one root span per flushed chunk. Right now it's possible that multiple unrelated spans (i.e. distinct trees) are flushed in a single chunk, and it's not clear to me how the segments consumer handles that (or whether it's supposed to).
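
To make the invariant concern concrete, here is a small sketch that counts how many distinct trees end up in one chunk. It assumes spans are dicts with `span_id` and `parent_span_id` keys, and `count_local_roots` is a hypothetical helper; neither is taken from this PR.

```python
def count_local_roots(chunk: list[dict]) -> int:
    """Count spans whose parent is not present in the same chunk.

    Each such span is the top of a separate tree within the chunk, so a
    result above 1 means the chunk mixes unrelated trees.
    """
    span_ids = {span["span_id"] for span in chunk}
    return sum(1 for span in chunk if span.get("parent_span_id") not in span_ids)


chunk = [
    {"span_id": "a", "parent_span_id": None},  # true root span
    {"span_id": "b", "parent_span_id": "a"},   # child of "a", same tree
    {"span_id": "x", "parent_span_id": "y"},   # parent "y" is in another chunk
]
print(count_local_roots(chunk))  # 2
```

Purely size-based chunking can separate a child from its parent, so a check along these lines would flag chunks that violate the one-root-per-chunk invariant.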
