csharp: cap pre-@type token buffer in JsonParser.MergeAny to prevent O(N^2) memory growth #26851

Open
MindflareX wants to merge 1 commit into protocolbuffers:main from MindflareX:fix/csharp-mergeany-token-cap

@MindflareX

Summary

JsonParser.MergeAny scans an Any object's body for the @type property by consuming and recording tokens into a per-call List<JsonToken> until @type is reached. The recording is unbounded and includes the full content of any nested objects encountered before @type, so an Any whose value subtree itself contains further @type-last Any objects forces every MergeAny stack frame to buffer the unread remainder of its body before recursing.

The combined cost across N nested layers grows as O(N²): a small input (a few hundred KB) can allocate gigabytes of GC memory and exhaust the parser well before any recursion-depth limit is reached. Callers that legitimately raise JsonParser.Settings.RecursionLimit remain affected regardless of any other limit checks.
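The quadratic shape can be seen from a token-counting model alone. The sketch below is Python, not the C# parser, and it rests on stated assumptions: the innermost body `{}` counts as 2 tokens, each wrapping layer adds 5 tokens (`{`, the `value` name, the `@type` name, the type URL string, `}`), and each MergeAny frame records everything ahead of its own @type. The function names are hypothetical.

```python
def tokens_in_payload(depth):
    """Token count of a depth-`depth` @type-last payload under the toy model."""
    # `{}` is 2 tokens; every wrapping layer contributes 5 more.
    return 2 + 5 * depth


def buffered_tokens(depth):
    """Total tokens recorded across all frames when the buffer is unbounded."""
    total = 0
    for layer in range(depth, 0, -1):
        # Before reaching its own "@type", a frame records `{`, the "value"
        # name, and the entire nested subtree one layer down.
        total += 2 + tokens_in_payload(layer - 1)
    return total


if __name__ == "__main__":
    for depth in (1000, 2000, 4000):
        print(depth, tokens_in_payload(depth), buffered_tokens(depth))
```

Under this model the payload grows linearly with depth while the recorded total grows with the square of the depth, so the ratio between the two keeps widening as the nesting gets deeper.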

Empirical reproduction

Building a payload of the form {"value":{"value":{...{}...,"@type":".../Any"},...},"@type":".../Any"} (i.e. value first, @type last at every layer) against the unpatched parser:

| Depth | Payload size | GC alloc delta | Amplification |
|-------|--------------|----------------|---------------|
| 1,000 | 60 KB        | 83 MB          | 1,381×        |
| 2,000 | 120 KB       | 329 MB         | 2,745×        |
| 4,000 | 240 KB       | 1.31 GB        | 5,476×        |
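For anyone who wants to regenerate the input, the payload shape described above could be built along these lines. This is a Python sketch, not the harness used for the measurements; `build_type_last_payload` is a hypothetical name, and the type URL is left as a required argument because the description abbreviates it as `.../Any`.

```python
def build_type_last_payload(depth, type_url):
    """Return JSON text with `depth` nested layers, "value" first and "@type" last.

    `type_url` must be supplied by the caller; the description above elides it.
    """
    body = "{}"  # innermost value
    for _ in range(depth):
        # Wrap the current body so that "@type" follows "value" at this layer.
        body = '{"value":' + body + ',"@type":"' + type_url + '"}'
    return body


if __name__ == "__main__":
    # Placeholder URL copied from the abbreviated form in the description.
    print(build_type_last_payload(2, ".../Any"))
```

The loop builds the text from the inside out, so every layer ends with its @type property, which is the ordering that forces each frame to buffer its whole value subtree.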

Doubling the depth roughly quadruples the allocation, which is consistent with quadratic growth. On the unpatched parser the amplification is reachable whatever value RecursionLimit is set to.
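The doubling ratios can be checked directly from the figures in the table. A small Python sketch, assuming 1.31 GB is read as 1310 MB (a binary-unit reading would shift the last ratio slightly); `doubling_ratios` is a hypothetical helper.

```python
# GC allocation deltas from the table, in MB, keyed by nesting depth.
# Assumption: 1.31 GB is taken as 1310 MB.
measured_mb = {1000: 83.0, 2000: 329.0, 4000: 1310.0}


def doubling_ratios(samples):
    """Ratio of each measurement to the one at the previous (half) depth."""
    depths = sorted(samples)
    return [samples[b] / samples[a] for a, b in zip(depths, depths[1:])]


if __name__ == "__main__":
    for ratio in doubling_ratios(measured_mb):
        print(round(ratio, 2))
```

Both ratios land just under 4, which is what a cost proportional to the square of the depth predicts when the depth doubles.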

Fix

Bound the per-call token buffer to 100 tokens. The protobuf JSON canonical form places @type first (1 token recorded); even in the @type-last form this cap allows roughly 50 scalar fields ahead of @type, which is more than sufficient for any well-formed Any body. When the cap is exceeded, the parser throws InvalidProtocolBufferException with a clear message pointing the user at the @type placement.
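The capped recording loop can be sketched as follows. This is a Python model under stated assumptions, not the C# change itself: tokens are plain `(kind, value)` tuples, `TypeUrlBufferLimitError` is a hypothetical stand-in for InvalidProtocolBufferException, and the exact boundary comparison in the real patch may differ.

```python
MAX_PRE_TYPE_TOKENS = 100  # cap value quoted in the description above


class TypeUrlBufferLimitError(ValueError):
    """Hypothetical stand-in for InvalidProtocolBufferException."""


def record_until_type(tokens, cap=MAX_PRE_TYPE_TOKENS):
    """Buffer the tokens of one Any body until its own "@type" name appears.

    `tokens` yields (kind, value) pairs; kind is "start", "end", "name",
    or "scalar". Returns the buffered tokens, excluding the "@type" name.
    """
    buffered = []
    depth = 0
    for kind, value in tokens:
        if kind == "start":
            depth += 1
        elif kind == "end":
            depth -= 1
            if depth == 0:
                break  # the Any object closed without an "@type"
        elif kind == "name" and value == "@type" and depth == 1:
            return buffered  # only an "@type" at this object's own depth counts
        if len(buffered) >= cap:
            raise TypeUrlBufferLimitError(
                f'more than {cap} tokens before "@type"; place "@type" first'
            )
        buffered.append((kind, value))
    raise ValueError('Any body ended without an "@type" field')


def scalar_fields_then_type(field_count):
    """Illustrative token stream: `field_count` scalar fields, then "@type"."""
    tokens = [("start", None)]
    for i in range(field_count):
        tokens += [("name", f"f{i}"), ("scalar", i)]
    # Placeholder type URL, for illustration only.
    tokens += [("name", "@type"), ("scalar", "example.invalid/Placeholder"), ("end", None)]
    return tokens
```

With @type first, `record_until_type(scalar_fields_then_type(0))` buffers a single token (the opening brace), matching the canonical case. Each scalar field costs two tokens (name plus value), which is where the figure of roughly 50 fields comes from.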

The cap is intentionally small and constant (rather than scaled to RecursionLimit) so that the worst-case per-call buffer is bounded irrespective of how callers configure their settings.

Test plan

  • Added regression test Any_TypeUrlBufferLimit (200-deep @type-last payload, asserts rejection with a message mentioning @type)
  • All 333 JsonParserTest cases pass (332 existing + 1 new)
  • Existing Any_RegularMessage and Any_WellKnownType tests, which use @type-last payloads with 5-8 pre-@type tokens, continue to pass
  • Built against netstandard1.1, netstandard2.0, net45, net50

Related

This is a separate root cause from the recursion-depth bypass addressed in #26835 and applies independently of that fix. After #26835 lands, the default RecursionLimit=100 happens to bound the worst-case memory amplification to ~1 MB, but any caller that raises the limit (e.g. for legitimately deep typed-Any payloads) remains exposed to the full quadratic growth without this change.

JsonParser.MergeAny scans an Any object's body for the "@type" property
by consuming and recording tokens into a per-call List<JsonToken> until
"@type" is reached. The recording is unbounded and includes the full
content of any nested objects encountered before "@type", so an Any
whose "value" subtree itself contains further "@type"-last Any objects
forces every MergeAny stack frame to buffer the unread remainder of its
body before recursing.

The combined cost across N nested layers grows as O(N^2), allowing a
small input (a few hundred KB) to allocate gigabytes of GC memory and
exhaust the parser well before any recursion-depth limit is reached.
Callers that legitimately raise JsonParser.Settings.RecursionLimit
remain affected by this regardless of any other limit checks.

Bound the per-call buffer to 100 tokens. The protobuf JSON canonical
form places "@type" first; this cap is generous enough for any
well-formed Any body in either ordering, while preventing the
quadratic growth.

Includes a regression test that constructs a 200-deep "@type"-last
Any payload and asserts it is rejected with an
InvalidProtocolBufferException mentioning "@type".
@MindflareX MindflareX requested a review from a team as a code owner April 11, 2026 17:55
@MindflareX MindflareX requested review from jskeet and removed request for a team April 11, 2026 17:55
@jskeet jskeet removed their request for review April 11, 2026 18:05
@jskeet
Contributor

jskeet commented Apr 11, 2026

I don't have the time to review this at this time.

@MindflareX
Author

Understood, thanks. It's independent of #26835, so no ordering needed — feel free to come back to it (or hand it off) whenever.
