upb: add ASLR-based seed to integer hash function#26907
Open
MindflareX wants to merge 1 commit intoprotocolbuffers:mainfrom
Open
upb: add ASLR-based seed to integer hash function#26907MindflareX wants to merge 1 commit intoprotocolbuffers:mainfrom
MindflareX wants to merge 1 commit intoprotocolbuffers:mainfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Author
|
@googlebot I signed it! |
0a58f23 to
991a6f1
Compare
Author
The integer hash function `upb_inthash()` is completely deterministic (no seed), unlike the string hash function which uses an ASLR-based seed via `_upb_seed`. This allows an attacker to trivially precompute keys that all hash to the same bucket, causing O(N^2) insertion time for N map entries during protobuf parsing. For example, a ~500KB protobuf message with 50,000 colliding int32 map keys takes ~1000x longer to parse than the same number of non-colliding keys. This affects all UPB-backed runtimes: Python, Ruby, PHP, and Rust. The fix adds a separate ASLR-based seed (`_upb_int_seed`) that is XORed into the key before hashing, matching the approach already used for string-keyed maps. This makes the hash function non-deterministic across process invocations when ASLR is enabled.
991a6f1 to
0954699
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The integer hash function
upb_inthash()inupb/hash/common.cis completely deterministic — it uses no seed or randomization. In contrast, the string hash function already uses an ASLR-based seed (_upb_seed) via Wyhash.This asymmetry allows an attacker to trivially precompute integer keys that all hash to the same bucket in
upb_inttable, causing O(N²) insertion time for N entries. Any protobuf message with amap<int32, ...>,map<int64, ...>,map<uint32, ...>, ormap<uint64, ...>field parsed from untrusted input is affected.Impact
All UPB-backed protobuf runtimes are affected: Python, Ruby, PHP, and Rust.
Empirical measurements (Python 3.13, protobuf 6.33.6, x86-64):
A ~500 KB protobuf message with 50,000 colliding keys takes over 1,000× longer to parse than the same message with non-colliding keys. The scaling is quadratic — 200,000 entries would take ~37 seconds.
Collision construction
For
map<int32, ...>on 64-bit, the hash is just(uint32_t)key(high 32 bits are zero). The bucket ishash & maskwheremask = table_size - 1. Keys that are multiples of the final table size (a power of 2) all hash to bucket 0 at every intermediate table size:Prior art
_upb_seed(line 448, 455)6bde8c417, Feb 2025) due to Ruby test flakiness, then restored (commit8ef81fbd9)absl::Hashwith per-allocation randomized salt — not affectedLinkedHashMapwith tree fallback at 8 collisions — partially mitigatedFix
Added a separate ASLR-based seed variable (
_upb_int_seed) and XOR it into the key before hashing, matching the approach already used for string-keyed maps. This makes the hash function non-deterministic across process invocations when ASLR is enabled.Files changed
upb/hash/common.c— sourceruby/ext/google/protobuf_c/ruby-upb.c— Ruby amalgamationphp/ext/google/protobuf/php-upb.c— PHP amalgamation