Skip to content

[BugFix][Relax] Add structural_equal verification to subroutine cache lookup#18962

Open
3em0 wants to merge 3 commits intoapache:mainfrom
3em0:fix/subroutine-hash-collision-v2
Open

[BugFix][Relax] Add structural_equal verification to subroutine cache lookup#18962
3em0 wants to merge 3 commits intoapache:mainfrom
3em0:fix/subroutine-hash-collision-v2

Conversation

@3em0
Copy link
Copy Markdown

@3em0 3em0 commented Apr 1, 2026

Summary

  • SubroutineMixin._get_subroutine() used structural_hash as the sole cache key without structural_equal verification. If two different arg_sinfo values produced the same 64-bit hash (collision), the cache would return a previously compiled function with mismatched parameter shapes, leading to silently incorrect compiled output.
  • Changed the cache to store a list of (arg_sinfo, result) pairs per hash bucket and verify with structural_equal on lookup, consistent with the pattern in block_builder.cc.
  • Added a security advisory document and regression test.

Root Cause

The subroutine cache (cls._gvar) was keyed by (structural_hash(arg_sinfo), is_dataflow). A hash match was treated as proof of structural equality, skipping the necessary structural_equal check. This is a hash-only lookup anti-pattern — hash determines the bucket, but equality must confirm the match.

For comparison, block_builder.cc correctly uses StructuralHash + StructuralEqual together as the hash and equality functions for std::unordered_map.

Test plan

  • Existing test test_linear passes (no regression)
  • New test test_different_shapes_produce_distinct_subroutines passes — verifies that the same Module class with different input shapes generates distinct subroutines

🤖 Generated with Claude Code

… lookup

SubroutineMixin._get_subroutine() used structural_hash as the sole
cache key without structural_equal verification. If two different
arg_sinfo values produced the same 64-bit hash, the cache would
return a previously compiled function with mismatched parameter
shapes, leading to silently incorrect compiled output.

The fix changes the cache to store a list of (arg_sinfo, result)
pairs per hash bucket and verifies each candidate with
structural_equal before returning. This follows the same pattern
used in block_builder.cc (StructuralHash + StructuralEqual).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a correctness issue in the subroutine cache of SubroutineMixin._get_subroutine() where relying solely on structural_hash could lead to incorrect function reuse. The fix implements a bucketed cache that verifies hits using structural_equal. A security advisory and a regression test are also included. Feedback suggests further refining the cache lookup_key by including the function object to prevent collisions between different implementations with identical input signatures.

3em0 and others added 2 commits April 1, 2026 10:21
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
The previous test used nn.op.add on tensors with incompatible shapes
(batch_size, 32) + (batch_size, 64). Changed to use Linear layers
that project different input dimensions to the same output dimension,
making the add valid while still testing distinct subroutine generation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant