Skip to content

Comments

[SPARK][SQL] Block constraints in CTAS/RTAS at parser level#54454

Open
yyanyy wants to merge 1 commit intoapache:masterfrom
yyanyy:block-ctas-constraints
Open

[SPARK][SQL] Block constraints in CTAS/RTAS at parser level#54454
yyanyy wants to merge 1 commit intoapache:masterfrom
yyanyy:block-ctas-constraints

Conversation

@yyanyy
Copy link
Contributor

@yyanyy yyanyy commented Feb 24, 2026

What changes were proposed in this pull request?

The grammar accepts constraint specifications (PRIMARY KEY, UNIQUE, CHECK, FOREIGN KEY) in CREATE TABLE AS SELECT and REPLACE TABLE AS SELECT, but the execution layer silently drops them. Neither the ANSI SQL standard nor PostgreSQL supports this syntax - the SQL standard makes table element lists and AS subquery clauses mutually exclusive. Block this at the parser level, consistent with existing CTAS/RTAS checks for schema columns and partition column types.

Why are the changes needed?

Explicitly throw exception for an unsupported case for clarity

Does this PR introduce any user-facing change?

No

How was this patch tested?

unit test

Was this patch authored or co-authored using generative AI tooling?

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

@yyanyy
Copy link
Contributor Author

yyanyy commented Feb 24, 2026

Also looked into updating this on grammar level (e.g. change .g4) files but decided to go with this approach as grammar level changes would require refactoring of either splitting create table and CTAS, or splitting tableElementList and code duplication in visit pattern; in comparison, the current approach follows established pattern (e.g. partition column check) and is much smaller.

ExpectedContext(fragment = sql2, start = 0, stop = 61))
}

test("CTAS statement with constraints") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move the test cases to ConstraintParseSuiteBase

@yyanyy yyanyy force-pushed the block-ctas-constraints branch from 30402a2 to eabd0b3 Compare February 24, 2026 20:06
comparePlans(parsed, expected)
}

test("CTAS with constraints is not allowed") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, the same test will be run 4 times. If it is hard to avoid that, let's revert and use the original plan resolution suite

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good; an alternative would be to parameterized the constraint definition string per subclass extending this base class and keep tests here, but indeed it will still result in many more test runs and feels unnecessarily complex for a simple change like this...

The grammar accepts constraint specifications (PRIMARY KEY, UNIQUE, CHECK,
FOREIGN KEY) in CREATE TABLE AS SELECT and REPLACE TABLE AS SELECT, but
the execution layer silently drops them. Neither the ANSI SQL standard
nor PostgreSQL supports this syntax — the SQL standard makes table element
lists and AS subquery clauses mutually exclusive. Block this at the parser
level, consistent with existing CTAS/RTAS checks for schema columns and
partition column types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@yyanyy yyanyy force-pushed the block-ctas-constraints branch from eabd0b3 to a1bd9a5 Compare February 24, 2026 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants