Fix dynamic filter pushdown to wrong table with same-named columns #20303
+264
−58
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #20213
Rationale for this change
When both sides of a join have columns with the same name (e.g.
k), the dynamic filter from an outer join was incorrectly pushed to both children instead of only the correct one. With small row groups this caused wrong results (0 rows instead of the expected result set).The root cause was that
FilterColumnCheckerinfilter_pushdown.rsmatched columns by name only. When the parent pushed a filter referencing columnkat index 2 (the right child'sk), the name-based check foundkin the left child's schema too, and incorrectly pushed the filter to both sides.What changes are included in this PR?
Approach adopted from #20192:
FilterColumnCheckernow matches on(name, index)pairs instead of just names, preventing incorrect cross-side pushdown when columns share namesChildFilterDescription::from_child_with_allowed_columns— new method that restricts pushdown to an explicit set of allowed(name, index)pairsChildFilterDescription::all_unsupported— helper to mark all filters unsupported for a childHashJoinExec::gather_filters_for_pushdown— builds per-side allowed-column sets fromcolumn_indices(+ optionalprojection), useslr_is_preservedto gate pushdown eligibility per join typelr_is_preserved— mirrors the logical optimizer's preserved-side logic, enabling parent filter pushdown for non-inner join types (Left, Right, Semi, Anti, Mark)Are these changes tested?
Yes:
lr_is_preservedcovering all join typesCOUNT(*)andSELECT *correctnessAre there any user-facing changes?
🤖 Generated with Claude Code