Fix hash table corruption from removing entries during iteration. #307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
+102 −29

Conversation
The relmetacache_flush function was calling hash_search with HASH_REMOVE while iterating through RelMetaCache using hash_seq_search. This corrupts the iterator state because removing an entry deallocates memory that the iterator references, reorganizes bucket chains that invalidate the iterator's position, and may trigger dynamic resizing. The result is segmentation faults from accessing freed memory, skipped entries when the iterator loses its position, or infinite loops from corrupted chain pointers. Fixed by collecting all relids in a first pass, then removing them in a second pass after iteration completes.

The relmetacache_prune function had the same issue when removing invalid cache entries during iteration. It was calling hash_search with HASH_REMOVE on entries marked as invalid while hash_seq_search was actively scanning the hash table. This causes identical corruption problems as the iterator's cached bucket index and chain pointers become stale when entries are physically removed from the table. Fixed using the same two-pass pattern: collect relids of invalid entries during iteration, then remove them after the scan finishes.

The reset_channel_stats function was removing all statistics entries from SpockHash while iterating with hash_seq_search. This violated the fundamental rule that hash tables must not be modified during sequential scans. The iterator maintains pointers to the current entry and the next entry to visit, both of which become dangling pointers when hash_search removes the current entry. Under concurrent load this causes crashes with "hash table corrupted" errors and potential loss of statistics data. Fixed by copying all keys during iteration, then removing entries in a separate loop after the iterator completes its scan.
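A minimal sketch of the two-pass pattern described above, using PostgreSQL's dynahash and List APIs; the CacheEntry layout and the function name are illustrative, not the actual spock structures:

```c
#include "postgres.h"
#include "nodes/pg_list.h"
#include "utils/hsearch.h"

/* Illustrative entry layout; the real cache entries carry more fields. */
typedef struct CacheEntry
{
	Oid			relid;			/* hash key */
} CacheEntry;

static void
cache_flush_two_pass(HTAB *cache)
{
	HASH_SEQ_STATUS status;
	CacheEntry *entry;
	List	   *relids = NIL;
	ListCell   *lc;

	/* Pass 1: scan the table without modifying it, remembering each key. */
	hash_seq_init(&status, cache);
	while ((entry = (CacheEntry *) hash_seq_search(&status)) != NULL)
		relids = lappend_oid(relids, entry->relid);

	/*
	 * Pass 2: the scan has completed, so removing entries can no longer
	 * corrupt an active iterator.
	 */
	foreach(lc, relids)
	{
		Oid			relid = lfirst_oid(lc);

		(void) hash_search(cache, &relid, HASH_REMOVE, NULL);
	}

	list_free(relids);
}
```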
mason-sharp
requested changes
Jan 7, 2026
Contributor
In the code:

```c
old_ctxt = MemoryContextSwitchTo(RelMetaCacheContext);
RelMetaCache = hash_create("spock relation metadata cache",
                           RELMETACACHE_INITIAL_SIZE,
                           &ctl, hash_flags);
(void) MemoryContextSwitchTo(old_ctxt);
```

It is not necessary to use MemoryContextSwitchTo at all - we already instructed hash_create to use a specific memory context. Hence, we may remove the old_ctxt variable altogether.
danolivo
requested changes
Jan 8, 2026
The previous commit added extensive documentation explaining the hash table iterator corruption issue. While thorough, these comments were overly verbose for production code. This commit condenses them while preserving the essential information: the problem (calling hash_search with HASH_REMOVE during hash_seq_search corrupts the iterator state), the consequences (crashes, skipped entries), and the solution (the two-pass approach: collect keys, then remove). Simplified the comments in relmetacache_flush, relmetacache_prune, and reset_channel_stats without changing any code logic.
The hash_create call was configured with HASH_CONTEXT flag and ctl.hcxt = RelMetaCacheContext, which instructs the hash table API to perform all allocations in the specified context automatically. The explicit MemoryContextSwitchTo calls around hash_create and hash_search were therefore redundant and have been removed. This simplifies the code by removing 2 variable declarations and 4 lines of context switch boilerplate. The hash table operations will continue to use RelMetaCacheContext for all allocations as configured.
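A sketch of what the simplified setup looks like; only ctl.hcxt and the HASH_CONTEXT flag are taken from the description above, while the key/entry sizes and the HASH_ELEM | HASH_BLOBS flags are illustrative assumptions (CacheEntry is the hypothetical type from the earlier sketch):

```c
HASHCTL		ctl;

MemSet(&ctl, 0, sizeof(ctl));
ctl.keysize = sizeof(Oid);			/* assumed key layout */
ctl.entrysize = sizeof(CacheEntry);	/* illustrative entry type */
ctl.hcxt = RelMetaCacheContext;

/*
 * With HASH_CONTEXT, dynahash performs all of its allocations in ctl.hcxt,
 * so no MemoryContextSwitchTo() is needed around hash_create() or the
 * later hash_search() calls.
 */
RelMetaCache = hash_create("spock relation metadata cache",
						   RELMETACACHE_INITIAL_SIZE,
						   &ctl,
						   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
```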
1. Missing NULL checks after hash_search(HASH_ENTER) on HASH_FIXED_SIZE tables. When SpockHash or SpockGroupHash reaches capacity, hash_search returns NULL instead of expanding the table, and dereferencing that NULL causes segmentation faults. Added NULL checks in handle_stats_counter(), spock_group_attach(), and spock_group_progress_update().

2. Race condition in handle_stats_counter(): the function releases LW_SHARED, checks hash fullness without holding a lock, then reacquires LW_EXCLUSIVE. Between these locks, concurrent processes can fill the table, causing HASH_ENTER to return NULL unexpectedly. The added NULL check handles this TOCTOU (time-of-check-to-time-of-use) race.

3. Lock mismatch in get_channel_stats() causing torn reads of 64-bit counter values. The reader held only LW_SHARED while writers used per-entry spinlocks (entry->mutex). Added spinlock acquisition in the reader to ensure atomic access and prevent data races that violate the C memory model.

All three fixes prevent crashes from NULL pointer dereferences and ensure data consistency for statistics counters in shared-memory hash tables (see the sketch below).
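A rough sketch of the first and third fixes, assuming each entry carries a slock_t mutex and 64-bit counters as the description suggests; the SpockStatsEntryExample type and both function names are invented for illustration, not the actual spock code:

```c
#include "postgres.h"
#include "storage/spin.h"
#include "utils/hsearch.h"

/* Illustrative entry: a per-entry spinlock protecting 64-bit counters. */
typedef struct SpockStatsEntryExample
{
	Oid			key;
	slock_t		mutex;
	uint64		counters[4];
} SpockStatsEntryExample;

static void
stats_counter_increment(HTAB *stats_hash, Oid key, int counter_id)
{
	SpockStatsEntryExample *entry;
	bool		found;

	/*
	 * On a HASH_FIXED_SIZE table, HASH_ENTER returns NULL once the table is
	 * full instead of growing it, so the result must be checked.
	 */
	entry = (SpockStatsEntryExample *)
		hash_search(stats_hash, &key, HASH_ENTER, &found);
	if (entry == NULL)
		return;					/* table full: skip the update, don't crash */

	if (!found)
	{
		SpinLockInit(&entry->mutex);
		memset(entry->counters, 0, sizeof(entry->counters));
	}

	SpinLockAcquire(&entry->mutex);
	entry->counters[counter_id]++;
	SpinLockRelease(&entry->mutex);
}

static uint64
stats_counter_read(SpockStatsEntryExample *entry, int counter_id)
{
	uint64		value;

	/*
	 * Readers take the same spinlock as writers, so 64-bit counter values
	 * cannot be read half-updated (torn).
	 */
	SpinLockAcquire(&entry->mutex);
	value = entry->counters[counter_id];
	SpinLockRelease(&entry->mutex);

	return value;
}
```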
Replace the two-pass approach (collect keys, then remove) with the simpler pattern used in PostgreSQL core code: iterate once and remove entries during iteration. This matches the style used for ShippableCache and other similar hash table flush operations in PostgreSQL. The previous implementation re-initialized the iterator after each removal to avoid iterator corruption, but dynahash explicitly allows deleting the entry just returned by hash_seq_search, so removing entries during iteration works correctly when flushing all entries.
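A sketch of the resulting single-pass flush; it relies on dynahash's documented guarantee that deleting the entry most recently returned by hash_seq_search is safe (CacheEntry is the same illustrative type as in the earlier sketches):

```c
static void
cache_flush_single_pass(HTAB *cache)
{
	HASH_SEQ_STATUS status;
	CacheEntry *entry;

	hash_seq_init(&status, cache);
	while ((entry = (CacheEntry *) hash_seq_search(&status)) != NULL)
	{
		/* Deleting the just-returned entry is explicitly allowed. */
		if (hash_search(cache, &entry->relid, HASH_REMOVE, NULL) == NULL)
			elog(ERROR, "hash table corrupted");
	}
}
```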
Contributor
There are already 6 commits on the branch. Maybe squash/restructure them somehow?
Contributor (Author)
New clean PR created.