Support compound field access after subscripts, e.g. payload[1].a #21384
Description
DataFusion accepts compound field access like:
SELECT payload.a[1]
FROM t
but rejects:
SELECT payload[1].a
FROM t
and reports error 'Dot access not supported for non-string expr', even though both represent valid nested field access patterns and both are parsed by sqlparser as Expr::CompoundFieldAccess.
AST Shape
In sqlparser, both expressions are represented as Expr::CompoundFieldAccess.
payload.a[1]
Expr::CompoundFieldAccess
├── root: Identifier("payload")
└── access_chain:
1. Dot(Identifier("a"))
2. Subscript(Index(Value(Number("1"))))
Intended logical chain:
expr0 = Column("payload")
expr1 = GetField(expr0, NamedStructField("a"))
expr2 = GetField(expr1, ListIndex(1))
payload[1].a
Expr::CompoundFieldAccess
├── root: Identifier("payload")
└── access_chain:
1. Subscript(Index(Value(Number("1"))))
2. Dot(Identifier("a"))
Intended logical chain:
expr0 = Column("payload")
expr1 = GetField(expr0, ListIndex(1))
expr2 = GetField(expr1, NamedStructField("a"))
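The two logical chains above differ only in the order of their steps, so both can come from one left-to-right fold over the access chain. The following is a minimal sketch of that idea using simplified stand-in types. `Access`, `FieldAccess`, `Expr`, and `plan_chain` are hypothetical names for illustration and are not the actual sqlparser or DataFusion definitions.

```rust
// Simplified stand-ins for illustration only; NOT the real sqlparser/DataFusion types.

/// One step of the parsed access chain.
#[derive(Debug, Clone, PartialEq)]
enum Access {
    Dot(String),    // `.a`
    Subscript(i64), // `[1]`
}

/// The kind of field access applied at each logical step.
#[derive(Debug, Clone, PartialEq)]
enum FieldAccess {
    NamedStructField(String),
    ListIndex(i64),
}

/// A tiny logical expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    GetField(Box<Expr>, FieldAccess),
}

/// Fold the access chain left to right: each step wraps the expression
/// built so far, so the order of `Dot` and `Subscript` steps does not matter.
fn plan_chain(root: &str, chain: &[Access]) -> Expr {
    chain
        .iter()
        .fold(Expr::Column(root.to_string()), |inner, access| {
            let step = match access {
                Access::Dot(name) => FieldAccess::NamedStructField(name.clone()),
                Access::Subscript(index) => FieldAccess::ListIndex(*index),
            };
            Expr::GetField(Box::new(inner), step)
        })
}
```

In this model, `plan_chain("payload", &[Access::Subscript(1), Access::Dot("a".to_string())])` builds the `payload[1].a` chain: a `ListIndex(1)` step wrapped by a `NamedStructField("a")` step.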
Current Behavior
In sql_compound_field_access_to_expr, dot access currently accepts only a string literal form:
AccessExpr::Dot(SQLExpr::Value(SingleQuotedString | DoubleQuotedString))
and rejects:
AccessExpr::Dot(SQLExpr::Identifier(_))
As a result, payload[1].a fails to plan even though the field name a is statically known.
Expected Behavior
payload[1].a should be accepted and planned the same way as other named struct field accesses,
producing a GetFieldAccess::NamedStructField step after the subscript access.
In other words, these should both be supported:
payload.a[1]
payload[1].a
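One way to picture the requested change is a dot-access conversion that treats a plain identifier the same way as a quoted string. The sketch below models this with hypothetical stand-in types (`DotOperand`, `FieldAccess`, `dot_to_field_access`); it is not the real body of sql_compound_field_access_to_expr, and the actual fix may be structured differently.

```rust
// Hypothetical, simplified model of the operand of a dot access.
// These are NOT the real sqlparser/DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DotOperand {
    SingleQuotedString(String),
    DoubleQuotedString(String),
    Identifier(String),
    Other, // any operand whose field name is not statically known
}

#[derive(Debug, Clone, PartialEq)]
enum FieldAccess {
    NamedStructField(String),
}

/// Convert the operand of a dot access into a named struct field step.
fn dot_to_field_access(operand: &DotOperand) -> Result<FieldAccess, String> {
    match operand {
        // Forms described as currently accepted: quoted string literals.
        DotOperand::SingleQuotedString(name) | DotOperand::DoubleQuotedString(name) => {
            Ok(FieldAccess::NamedStructField(name.clone()))
        }
        // Proposed addition: a bare identifier also names the field statically.
        DotOperand::Identifier(name) => Ok(FieldAccess::NamedStructField(name.clone())),
        // Everything else keeps the existing error.
        DotOperand::Other => Err("Dot access not supported for non-string expr".to_string()),
    }
}
```

With the identifier arm in place, the `.a` in payload[1].a resolves to the same `NamedStructField("a")` step that a quoted field name would produce.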
Why This Matters
Downstream systems currently need SQL AST rewrites to convert payload[1].a into a form that
DataFusion accepts internally. Supporting this directly in DataFusion would remove that
workaround and make nested field access behavior more consistent.