-
Notifications
You must be signed in to change notification settings - Fork 2.5k
[CALCITE-7279] Resolve ClickHouse identifier resolution error by aliasing nested JOIN projections #4692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[CALCITE-7279] Resolve ClickHouse identifier resolution error by aliasing nested JOIN projections #4692
Changes from 3 commits
a62efc6
6a0ce52
0f0a280
eb97c36
4d15d72
b32d386
fb9fc8e
b7c5a8d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -223,6 +223,73 @@ private static class AliasReplacementShuttle extends SqlShuttle { | |
| } | ||
| } | ||
|
|
||
| /** | ||
| * Wraps a nested join into a subquery with explicit column aliases. | ||
| * | ||
| * <p>This is specifically required for dialects like ClickHouse where the | ||
| * identifier resolver cannot resolve columns with internal table qualifiers | ||
| * (e.g., 'd1.loc') when they are wrapped in a subquery alias. By forcing | ||
| * an explicit 'AS' projection for every column, we ensure the identifiers | ||
| * are flattened and visible to the outer query block. | ||
| * | ||
| * @param input The Result of the nested join branch | ||
| * @param inputRel The RelNode representing the join branch to extract row types | ||
| * @param outerAlias The alias to be assigned to the wrapped subquery | ||
| * @return A new Result containing the wrapped SQL with explicit aliases | ||
| */ | ||
| protected Result wrapNestedJoin(Result input, RelNode inputRel, String outerAlias) { | ||
| final SqlParserPos pos = SqlParserPos.ZERO; | ||
|
|
||
| // Extract the underlying SqlSelect from the input Result. | ||
| // We manipulate the Select node directly to avoid redundant nesting | ||
| // often introduced by Result.asSelect(). | ||
| final SqlSelect innerSelect = input.asSelect(); | ||
| final SqlNodeList originalSelectList = innerSelect.getSelectList(); | ||
| final List<String> fieldNames = inputRel.getRowType().getFieldNames(); | ||
|
|
||
| final List<SqlNode> newSelectList = new ArrayList<>(); | ||
|
|
||
| // Iterate through the fields to build explicit projections. | ||
| // Example: transforms 'd1.deptno' into 'd1.deptno AS deptno'. | ||
| for (int i = 0; i < fieldNames.size(); i++) { | ||
| SqlNode expr = originalSelectList.get(i); | ||
| String targetName = fieldNames.get(i); | ||
|
|
||
| // If the expression is already aliased, strip the AS to get the raw expression. | ||
| if (expr.getKind() == SqlKind.AS) { | ||
| expr = ((SqlCall) expr).operand(0); | ||
| } | ||
|
|
||
| // Force an explicit alias. This breaks the link to the internal table | ||
| // qualifiers (like 'd1' or 'd2') and exports the column as a clean | ||
| // identifier for the outer join scope. | ||
| newSelectList.add( | ||
| SqlStdOperatorTable.AS.createCall( | ||
| pos, | ||
| expr, | ||
| new SqlIdentifier(targetName, pos))); | ||
| } | ||
|
|
||
| // Update the select list of the inner query with flattened aliases. | ||
| innerSelect.setSelectList(new SqlNodeList(newSelectList, pos)); | ||
|
|
||
| // Wrap the modified Select node with the outer alias (e.g., AS j). | ||
| SqlNode wrappedNode = | ||
| SqlStdOperatorTable.AS.createCall(pos, | ||
| innerSelect, | ||
| new SqlIdentifier(outerAlias, pos)); | ||
|
|
||
| // Return the new Result, resetting the internal clauses to prevent | ||
| // the Implementor from adding additional SELECT wrappers. | ||
| return new Result( | ||
| wrappedNode, | ||
| Collections.emptyList(), | ||
| outerAlias, | ||
| inputRel.getRowType(), | ||
| ImmutableMap.of(outerAlias, inputRel.getRowType())); | ||
| } | ||
|
|
||
|
|
||
| /** Visits a Join; called by {@link #dispatch} via reflection. */ | ||
| public Result visit(Join e) { | ||
| switch (e.getJoinType()) { | ||
|
|
@@ -233,7 +300,13 @@ public Result visit(Join e) { | |
| break; | ||
| } | ||
| final Result leftResult = visitInput(e, 0).resetAlias(); | ||
| final Result rightResult = visitInput(e, 1).resetAlias(); | ||
| Result rightResult = visitInput(e, 1).resetAlias(); | ||
|
|
||
| if (dialect.shouldWrapNestedJoin(e)) { | ||
| String newAlias = "t" + e.getId(); // alias | ||
|
||
| rightResult = wrapNestedJoin(rightResult, e.getRight(), newAlias); | ||
| } | ||
|
|
||
| final Context leftContext = leftResult.qualifiedContext(); | ||
| final Context rightContext = rightResult.qualifiedContext(); | ||
| final SqlNode sqlCondition; | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is quite strange. What if there are ambiguous fields?
What will happen if I want to have something like
t1.id as customer_id, t2.id as product_id?It will become
t1.id as id,t2.id as idAlso, can you point me to ClickHouse documentation that shows this limitation?
Have you tried creating an issue in the ClickHouse repository as well ?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the PR description with the specific error log from ClickHouse 25.12.1.
To answer your concern about ambiguity: Calcite's RelDataType already ensures that all field names in the row type are unique. My implementation uses these unique names for explicit aliasing (e.g., t1.id AS id, t2.id AS id_0). This guarantees that the subquery 'exports' a clean and deterministic schema to the outer query, resolving the ClickHouse scoping limitation without risking field collisions.
I also renamed the methods to shouldWrapNestedJoin and wrapNestedJoinWithAliases to better reflect their purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A test might be a good way to illustrate this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created an issue in clickhouse repo by myself ClickHouse/ClickHouse#93112. I hope they also can improve it.