planner: generate PK filters prototype (wip)#66292
planner: generate PK filters prototype (wip)#66292terry1purcell wants to merge 4 commits intopingcap:masterfrom
Conversation
|
Posted PR review with 3 inline P2 comments: #66292 (review) |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Hi @terry1purcell. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #66292 +/- ##
================================================
- Coverage 77.7002% 77.2551% -0.4451%
================================================
Files 2006 1928 -78
Lines 548386 536430 -11956
================================================
- Hits 426097 414420 -11677
- Misses 120629 122002 +1373
+ Partials 1660 8 -1652
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| // Optimize the sub-plan to get a physical plan | ||
| nthPlanBackup := sctx.GetSessionVars().StmtCtx.StmtHints.ForceNthPlan | ||
| sctx.GetSessionVars().StmtCtx.StmtHints.ForceNthPlan = -1 | ||
| optFlag := rule.FlagPredicatePushDown | rule.FlagBuildKeyInfo | rule.FlagPruneColumns |
There was a problem hiding this comment.
P2: MIN/MAX subqueries lose equality predicates
The cloned DataSource for MIN/MAX subqueries loses its equality predicates during sub-optimization. The code re-runs optimization with predicate pushdown enabled, but DataSource.PredicatePushDown receives an empty predicates slice and overwrites ds.PushedDownConds, resulting in IndexFullScan instead of efficient range scans.
This causes 2 full secondary index scans per qualifying index (MIN + MAX), which can cause severe performance regressions on large tables.
Evidence: Integration test shows IndexFullScan without range predicates at tests/integrationtest/r/planner/core/casetest/rule/rule_generate_pk_filter.result:15. Root cause: empty predicates passed to DataSource.PredicatePushDown at pkg/planner/core/operator/logicalop/logical_datasource.go:175.
| } | ||
|
|
||
| // Inject conditions into DataSource's PushedDownConds | ||
| ds.PushedDownConds = append(ds.PushedDownConds, newConds...) |
There was a problem hiding this comment.
P2: DataSource filter state inconsistency
New PK filter predicates are appended only to PushedDownConds without updating AllConds. The ColumnPruner pass (which runs after this rule) only inspects AllConds to determine column usage, so it may incorrectly prune PK columns that are only referenced by the new predicates.
This can cause planner/executor errors when the predicates (containing ScalarSubQueryExpr) are evaluated and reference pruned columns.
Evidence: ds.PushedDownConds = append(ds.PushedDownConds, newConds...) here, but ColumnPruner only checks ds.AllConds at pkg/planner/core/operator/logicalop/logical_datasource.go:188.
|
@terry1purcell: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: ref #66297
This PR has the effect of taking the SQL from the issue - such as the following.
NOTE: There is an index on c, and another index on b - but no index on both columns (this is by design).
And generating "lookup" subqueries to produce the min/max PK values related to another indexes filters. An example that demonstrates how these are internally rewritten appears here. This allows the ranges to apply as either a TableRangeScan (against the PK), or against the ORDER BY index as filters against the index (since a clustered PK is part of a secondary index):
In this PR - this is triggered by using a hint to specify the generation of the PK ranges, and which index to generate the subqueries on:
Problem Summary:
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.