@andygrove

Summary

  • Makes relabel_array in cast_column.rs recursive so it correctly handles nested type mismatches (field names/metadata differences) in List, LargeList, Map, and Struct arrays
  • The previous shallow ArrayData type swap caused panics when Arrow's ArrayData::build() validated child types recursively
  • Fixes three failure categories: List element naming ("item" vs "element"), Map field naming ("key_value" vs "entries"), and PARQUET:field_id metadata mismatches in nested structures
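To illustrate why a shallow type swap fails recursive validation, here is a minimal sketch using a toy model. The types `ToyDataType`, `ToyField`, and `ToyArray`, and the functions `validate`, `relabel_shallow`, and `relabel_recursive`, are hypothetical stand-ins written for this example. They are not the arrow-rs API and not the code in cast_column.rs; the actual change rebuilds arrays through Arrow's typed constructors.

```rust
// Toy stand-ins for Arrow's DataType, Field, and ArrayData. NOT the arrow-rs API.
#[derive(Clone, Debug, PartialEq)]
enum ToyDataType {
    Int32,
    List(Box<ToyField>),
    Struct(Vec<ToyField>),
}

#[derive(Clone, Debug, PartialEq)]
struct ToyField {
    name: String,
    data_type: ToyDataType,
}

#[derive(Clone, Debug, PartialEq)]
struct ToyArray {
    data_type: ToyDataType,
    children: Vec<ToyArray>,
}

fn toy_field(name: &str, data_type: ToyDataType) -> ToyField {
    ToyField { name: name.to_string(), data_type }
}

/// The child types a parent type declares, in child order.
fn child_types(data_type: &ToyDataType) -> Vec<&ToyDataType> {
    match data_type {
        ToyDataType::Int32 => Vec::new(),
        ToyDataType::List(f) => vec![&f.data_type],
        ToyDataType::Struct(fs) => fs.iter().map(|f| &f.data_type).collect(),
    }
}

/// Recursive validation in the spirit of the check described above: every
/// child's own type must equal the type its parent declares for it.
fn validate(array: &ToyArray) -> Result<(), String> {
    let expected = child_types(&array.data_type);
    if expected.len() != array.children.len() {
        return Err("child count mismatch".to_string());
    }
    for (want, child) in expected.into_iter().zip(&array.children) {
        if *want != child.data_type {
            return Err(format!("expected {:?}, found {:?}", want, child.data_type));
        }
        validate(child)?;
    }
    Ok(())
}

/// Shallow relabel: swaps only the top-level type and leaves children as-is.
fn relabel_shallow(array: &ToyArray, target: &ToyDataType) -> ToyArray {
    ToyArray { data_type: target.clone(), children: array.children.clone() }
}

/// Recursive relabel: relabels each child to the type the target declares for
/// it. Assumes source and target differ only in field names.
fn relabel_recursive(array: &ToyArray, target: &ToyDataType) -> ToyArray {
    let children = array
        .children
        .iter()
        .zip(child_types(target))
        .map(|(child, child_target)| relabel_recursive(child, child_target))
        .collect();
    ToyArray { data_type: target.clone(), children }
}

/// Builds Struct { a: List<list_field_name: Int32> } with matching children.
fn nested_example(list_field_name: &str) -> ToyArray {
    let values = ToyArray { data_type: ToyDataType::Int32, children: vec![] };
    let list_type =
        ToyDataType::List(Box::new(toy_field(list_field_name, ToyDataType::Int32)));
    let list = ToyArray { data_type: list_type.clone(), children: vec![values] };
    ToyArray {
        data_type: ToyDataType::Struct(vec![toy_field("a", list_type)]),
        children: vec![list],
    }
}
```

In this model, relabeling `nested_example("item")` to the type of `nested_example("element")` with `relabel_shallow` fails `validate`, because the struct now declares a List with an "element" field while its child still carries "item". `relabel_recursive` produces an array that validates.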

Test plan

  • Added unit test for List field name relabeling ("item" -> "element")
  • Added unit test for Map entries field relabeling ("key_value" -> "entries")
  • Added unit test for Struct with PARQUET:field_id metadata stripping
  • Added unit test for nested Struct containing List with different field names
  • All existing tests continue to pass
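The cases in the test plan share one property: source and target types have the same physical layout and differ only in field names or metadata. A minimal sketch of that distinction follows, again on a toy model. `LayoutType`, `LayoutField`, `layout_field`, and `same_layout` are hypothetical names for this example; they are not part of arrow-rs or this PR, and whether relabel_array performs such a check is an assumption.

```rust
use std::collections::BTreeMap;

// Toy stand-ins for Arrow's DataType and Field. NOT the arrow-rs API.
#[derive(Clone, Debug, PartialEq)]
enum LayoutType {
    Int32,
    Utf8,
    List(Box<LayoutField>),
    Struct(Vec<LayoutField>),
}

#[derive(Clone, Debug, PartialEq)]
struct LayoutField {
    name: String,
    data_type: LayoutType,
    metadata: BTreeMap<String, String>,
}

fn layout_field(name: &str, data_type: LayoutType, metadata: &[(&str, &str)]) -> LayoutField {
    LayoutField {
        name: name.to_string(),
        data_type,
        metadata: metadata
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}

/// True when two types match once field names and metadata are ignored,
/// i.e. when moving between them is a pure relabel with no data conversion.
fn same_layout(a: &LayoutType, b: &LayoutType) -> bool {
    match (a, b) {
        (LayoutType::Int32, LayoutType::Int32) => true,
        (LayoutType::Utf8, LayoutType::Utf8) => true,
        (LayoutType::List(x), LayoutType::List(y)) => same_layout(&x.data_type, &y.data_type),
        (LayoutType::Struct(xs), LayoutType::Struct(ys)) => {
            xs.len() == ys.len()
                && xs
                    .iter()
                    .zip(ys)
                    .all(|(x, y)| same_layout(&x.data_type, &y.data_type))
        }
        _ => false,
    }
}
```

Under this sketch, a struct whose field carries a `PARQUET:field_id` entry and the same struct without it compare unequal with `==` but are reported as the same layout, as are List types whose element field is named "item" versus "element".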

🤖 Generated with Claude Code

The shallow ArrayData type swap in relabel_array caused panics when
Arrow's ArrayData::build() validated child types recursively. This
rebuilds arrays from typed constructors (ListArray, LargeListArray,
MapArray, StructArray) so nested field name and metadata differences
are handled correctly.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@andygrove changed the title from "fix: make relabel_array recursive for nested type mismatches" to "fix: [df52] make relabel_array recursive for nested type mismatches" on Feb 11, 2026
@comphead merged commit 2482c43 into apache:df52 on Feb 11, 2026
76 of 109 checks passed
@andygrove deleted the fix/relabel-array-failures branch on February 11, 2026 22:10