Add fileNameGenerator to the constructor of IcebergInsertTableHandle by Zouxxyy · Pull Request #1571 · IBM/velox

Zouxxyy · 2026-01-07T09:05:04Z

Expose fileNameGenerator in the constructor of IcebergInsertTableHandle so that callers can provide a custom fileNameGenerator when creating the handle. For example, Gluten can use it to include the partition ID, task ID, or other information.
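To illustrate the idea of passing a generator through the constructor, here is a minimal, self-contained sketch. It does not use the Velox API: `SketchFileNameGenerator`, `DefaultNameGenerator`, `TaskAwareNameGenerator`, `SketchInsertTableHandle`, the `generate` method, and the file name layout are all hypothetical names chosen for illustration, and the real fileNameGenerator interface may look quite different.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a file name generator interface; not Velox's API.
class SketchFileNameGenerator {
 public:
  virtual ~SketchFileNameGenerator() = default;
  // Returns the name of the data file with the given sequence number.
  virtual std::string generate(uint32_t fileSequence) const = 0;
};

// Default behavior in this sketch: a plain sequence-based name.
class DefaultNameGenerator : public SketchFileNameGenerator {
 public:
  std::string generate(uint32_t fileSequence) const override {
    return "data-" + std::to_string(fileSequence) + ".parquet";
  }
};

// Custom behavior: embed a partition ID and a task ID in the name.
class TaskAwareNameGenerator : public SketchFileNameGenerator {
 public:
  TaskAwareNameGenerator(int32_t partitionId, int64_t taskId)
      : partitionId_(partitionId), taskId_(taskId) {}

  std::string generate(uint32_t fileSequence) const override {
    return std::to_string(partitionId_) + "-" + std::to_string(taskId_) +
        "-" + std::to_string(fileSequence) + ".parquet";
  }

 private:
  int32_t partitionId_;
  int64_t taskId_;
};

// Hypothetical handle: the generator is a constructor argument with a
// default, so existing callers keep the default naming.
class SketchInsertTableHandle {
 public:
  explicit SketchInsertTableHandle(
      std::shared_ptr<const SketchFileNameGenerator> fileNameGenerator =
          std::make_shared<DefaultNameGenerator>())
      : fileNameGenerator_(std::move(fileNameGenerator)) {}

  std::string nextFileName(uint32_t fileSequence) const {
    return fileNameGenerator_->generate(fileSequence);
  }

 private:
  std::shared_ptr<const SketchFileNameGenerator> fileNameGenerator_;
};
```

In this sketch, a caller that wants task-aware names would construct the handle as `SketchInsertTableHandle(std::make_shared<TaskAwareNameGenerator>(3, 42))`, while a caller that passes nothing gets the default generator.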

Zouxxyy · 2026-01-07T09:19:10Z

CC @PingLiuPing, could you take a look? Thanks.

The function toValues removes duplicate values from the vector and returns them in a std::vector. It was used to build an InPredicate. It will also be needed to build NOT IN filters for Iceberg equality delete reads, so this commit moves it from velox/functions/prestosql/InPredicate.cpp to velox/type/Filter.h. This commit also renames it to deDuplicateValues to make it easier to understand. Alchemy-item: (ID = 983) Iceberg staging hub commit 7/18 - 8ff8f5b
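As a rough illustration of what such a deduplication helper does, here is a standalone sketch. It is not the Velox deDuplicateValues implementation or signature: `dedupValuesSketch`, the use of `std::optional` to model nulls, and the sorted output order are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper, not the Velox deDuplicateValues signature.
// Returns the distinct non-null values in ascending order, plus a flag
// that records whether any null was seen in the input.
template <typename T>
std::pair<std::vector<T>, bool> dedupValuesSketch(
    const std::vector<std::optional<T>>& input) {
  std::vector<T> values;
  values.reserve(input.size());
  bool sawNull = false;
  for (const auto& value : input) {
    if (value.has_value()) {
      values.push_back(*value);
    } else {
      sawNull = true;
    }
  }
  // Sort, then drop adjacent duplicates.
  std::sort(values.begin(), values.end());
  values.erase(std::unique(values.begin(), values.end()), values.end());
  return {std::move(values), sawNull};
}
```

The distinct values and the null flag are returned separately so that a caller building an IN or NOT IN filter can decide how nulls should be handled.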

This commit introduces EqualityDeleteFileReader, which is used to read Iceberg splits with equality delete files. The equality delete files are read to construct domain filters or filter functions, which are then evaluated in the base file readers.

When there is only one equality delete field, and that field is an Iceberg identifier field, i.e. a non-floating-point primitive type, the values are converted to a list and used as a NOT IN domain filter, with NULL treated separately. This domain filter is then pushed down to the ColumnReaders to filter out unwanted rows before they are read into Velox vectors.

When the equality delete column is a nested column, e.g. a sub-column in a struct or the key in a map, the column may not be in the base file ScanSpec. These subfields need to be added to and removed from the SchemaWithId and ScanSpec recursively if they were not already in the ScanSpec. A test is also added for this case.

If there is more than one equality delete field, or the field is not an Iceberg identifier field, the values are converted to a typed expression in conjunction-of-disjunctions form. This expression is evaluated as the remaining filter function after the rows are read into the Velox vectors. Note that this only works for Presto for now, as the "neq" function is not registered by Spark. See https://github.com/facebookincubator/issues/12667

Note that this commit only supports integral types. VARCHAR and VARBINARY need to be supported in future commits (see facebookincubator#12664).

Co-authored-by: Naveen Kumar Mahadevuni <Naveen.Mahadevuni@ibm.com>
# Conflicts: # velox/connectors/hive/iceberg/tests/IcebergReadTest.cpp # Conflicts: # velox/dwio/common/ScanSpec.h # Conflicts: # velox/type/Filter.h
Alchemy-item: (ID = 983) Iceberg staging hub commit 8/18 - 117b7f9
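The two filtering strategies described in this commit message can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not the EqualityDeleteFileReader code: `NotInFilterSketch`, `keepsRowSketch`, the `Value` and `Row` aliases, and the choice to treat NULL as equal to NULL are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_set>
#include <vector>

using Value = std::optional<int64_t>; // std::nullopt models SQL NULL
using Row = std::vector<Value>;

// Single delete column: NOT IN over the non-null values, with a separate
// flag recording whether the delete file also contained a NULL.
struct NotInFilterSketch {
  std::unordered_set<int64_t> deleted;
  bool nullDeleted = false;

  bool keeps(const Value& value) const {
    if (!value.has_value()) {
      return !nullDeleted;
    }
    return deleted.count(*value) == 0;
  }
};

// Several delete columns: a row survives only if, for every delete row,
// at least one column differs. That is an AND over delete rows of an OR
// of per-column "not equal" tests. Assumptions: NULL compares equal to
// NULL, and `row` holds the same columns, in the same order, as each
// delete row.
bool keepsRowSketch(const Row& row, const std::vector<Row>& deleteRows) {
  for (const auto& deleteRow : deleteRows) {
    bool anyColumnDiffers = false;
    for (std::size_t c = 0; c < deleteRow.size(); ++c) {
      if (row[c] != deleteRow[c]) {
        anyColumnDiffers = true;
        break;
      }
    }
    if (!anyColumnDiffers) {
      return false; // The row matches this delete row in every column.
    }
  }
  return true;
}
```

The single-column form can be checked one value at a time, which is what makes it suitable for pushing down to column readers. The multi-column form needs all delete columns of a row at once, which is why it runs as a remaining filter after the rows are materialized.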

# Conflicts: # velox/dwio/parquet/writer/Writer.cpp # velox/dwio/parquet/writer/Writer.h # velox/dwio/parquet/writer/arrow/ArrowSchema.cpp # velox/dwio/parquet/writer/arrow/ArrowSchema.h # velox/dwio/parquet/writer/arrow/Metadata.cpp Alchemy-item: (ID = 983) Iceberg staging hub commit 12/18 - 6d3be8c

Zouxxyy requested a review from majetideepak as a code owner January 7, 2026 09:05

Zouxxyy changed the title ~~Enable user defined fileNameGenerator for IcebergInsertTableHandle~~ Add fileNameGenerator to the constructor of IcebergInsertTableHandle. Jan 7, 2026

Zouxxyy changed the title ~~Add fileNameGenerator to the constructor of IcebergInsertTableHandle.~~ Add fileNameGenerator to the constructor of IcebergInsertTableHandle Jan 7, 2026

prestodb-ci force-pushed the main branch from a2e3d76 to f250987 Compare January 8, 2026 00:44

rui-mo and others added 25 commits January 8, 2026 07:37

[OAP][5962] Support struct schema evolution matching by name

0d0ddfb

Alchemy-item: (ID = 917) [OAP] Support struct schema evolution matching by name commit 1/1 - c816a39

[OAP][15173][15343]Allow reading integers into smaller-range types

ba66d01

Alchemy-item: (ID = 883) [OAP] [13620] Allow reading integers into smaller-range types commit 1/1 - 4cae2f5

[OAP][11771]fix: Fix smj result mismatch issue in semi, anit and full…

c7b518a

… outer join Signed-off-by: Yuan <yuanzhou@apache.org> Alchemy-item: (ID = 882) [OAP] [11771] Fix smj result mismatch issue commit 1/1 - ada7dd2

[OAP][14722] Fix memory leak caused by asynchronous prefetch

563ea4a

Alchemy-item: (ID = 954) [OAP] [14722] Fix memory leak caused by asynchronous prefetch commit 1/1 - aedf6b0

Revert "feat: Support iceberg partition transform (facebookincubator#…

4ce8ebc

…15509)" This reverts commit d791d84. Alchemy-item: (ID = 983) Iceberg staging hub commit 1/18 - ec46102

Revert "refactor: Add Iceberg connector (facebookincubator#15581)"

038c328

This reverts commit fd0682b. Alchemy-item: (ID = 983) Iceberg staging hub commit 2/18 - f7d02da

Revert "feat: Add Iceberg partition name generator (facebookincubator…

1ac8d91

…#15461)" This reverts commit 7576f4e. Alchemy-item: (ID = 983) Iceberg staging hub commit 3/18 - d7d8655

Revert "feat: Add support for evaluating Iceberg partition transforms (…

eaaa35f

…facebookincubator#15477)" This reverts commit 1895711. Alchemy-item: (ID = 983) Iceberg staging hub commit 4/18 - aee2a67

Revert "feat: Add Iceberg partition name generation utility (facebook…

d290e8a

…incubator#15443)" This reverts commit 51d4a94. Alchemy-item: (ID = 983) Iceberg staging hub commit 5/18 - 9ed4008

Revert "feat: Add iceberg partition specification (facebookincubator#…

b469d01

…15423)" This reverts commit 600524b. Alchemy-item: (ID = 983) Iceberg staging hub commit 6/18 - 710ba8d

Support insert data into iceberg table.

b63dffb

# Conflicts: # velox/connectors/hive/HiveConnectorUtil.cpp Alchemy-item: (ID = 983) Iceberg staging hub commit 9/18 - fd66e1e

Add iceberg partition transforms.

593093a

Co-authored-by: Chengcheng Jin <Chengcheng.Jin@ibm.com> Alchemy-item: (ID = 983) Iceberg staging hub commit 10/18 - d501ccc

Add NaN statistics to parquet writer.

acc08eb

Alchemy-item: (ID = 983) Iceberg staging hub commit 11/18 - 157015b

Integrate Iceberg data file statistics and adding unit test.

ad7701f

Alchemy-item: (ID = 983) Iceberg staging hub commit 13/18 - 306dd1c

Support write field_id to parquet metadata SchemaElement.

8301a2a

Alchemy-item: (ID = 983) Iceberg staging hub commit 14/18 - 5a76f95

Implement iceberg sort order

8390215

Alchemy-item: (ID = 983) Iceberg staging hub commit 15/18 - ea55d14

Add clustered Iceberg writer mode.

60a6bb3

Alchemy-item: (ID = 983) Iceberg staging hub commit 16/18 - 973a1e4

Fix parquet writer ut

1b42240

Alchemy-item: (ID = 983) Iceberg staging hub commit 17/18 - bfbd027

Add IcebergConnector

d93e450

Alchemy-item: (ID = 983) Iceberg staging hub commit 18/18 - 836264c

adding daily tests

6f936a2

Signed-off-by: Yuan <yuanzhou@apache.org> Alchemy-item: (ID = 906) fix: Adding daily tests commit 1/2 - e2eb2c6

remote gluten daily build

9f12b2e

we can cache ccache on every build even on failure, since ibm/velox is always incremental build Alchemy-item: (ID = 906) fix: Adding daily tests commit 2/2 - 0899ddc

fix: remove website folder to bypass the security issues

3c3e517

Signed-off-by: Yuan <yuanzhou@apache.org> Alchemy-item: (ID = 956) fix: Remove website folder to bypass the security issues commit 1/1 - 42debeb

prestodb-ci force-pushed the main branch from 09a0047 to e008bbb Compare January 27, 2026 12:08

prestodb-ci mentioned this pull request Jan 27, 2026

Rebase branch staging-ea7925753-rebase (e008bbb) with staging-ea7925753-head (ea79257) #1654

Closed

1 task

prestodb-ci force-pushed the main branch from e008bbb to 2a3357c Compare January 27, 2026 12:20

prestodb-ci force-pushed the main branch from 2a3357c to 9d01541 Compare January 30, 2026 20:15

prestodb-ci force-pushed the main branch from 9d01541 to ce9e07a Compare February 6, 2026 20:19

prestodb-ci mentioned this pull request Feb 6, 2026

Rebase branch staging-4e09b5521-rebase with staging-4e09b5521-head (4e09b55) #1684

Closed

1 task

prestodb-ci force-pushed the main branch from ce9e07a to 2c0509b Compare February 6, 2026 21:01

prestodb-ci mentioned this pull request Feb 6, 2026

Rebase branch staging-b468c9568-rebase with staging-b468c9568-head (b468c95) #1685

Closed

1 task

prestodb-ci force-pushed the main branch from 2c0509b to 527a6f1 Compare February 6, 2026 21:41

prestodb-ci mentioned this pull request Feb 6, 2026

Rebase branch staging-b46700901-rebase with staging-b46700901-head (b467009) #1686

Closed

1 task

prestodb-ci force-pushed the main branch from 527a6f1 to 3fc79b1 Compare February 6, 2026 22:21

prestodb-ci mentioned this pull request Feb 7, 2026

Rebase branch staging-34338bfe5-rebase with staging-34338bfe5-head (34338bf) #1687

Closed

1 task

prestodb-ci force-pushed the main branch from 3fc79b1 to d196f5a Compare February 7, 2026 01:10

This was referenced Feb 7, 2026

Rebase branch staging-ad7805bf2-rebase with staging-ad7805bf2-head (ad7805b) #1688

Closed

Rebase branch staging-54f466296-rebase with staging-54f466296-head (54f4662) #1705

Closed

prestodb-ci mentioned this pull request Feb 17, 2026

Rebase branch staging-54f466296-rebase (ca5b6c1) with staging-54f466296-head (54f4662) #1712

Closed

1 task

prestodb-ci force-pushed the main branch from d196f5a to 6eb48ba Compare February 17, 2026 13:08

This was referenced Feb 17, 2026

Rebase branch staging-2a81516b8-rebase with staging-2a81516b8-head (2a81516) #1713

Closed

Rebase branch staging-5716ea842-rebase with staging-5716ea842-head (5716ea8) #1714

Closed

Zouxxyy wants to merge 27 commits into IBM:main from Zouxxyy:dev/update-iceberg-handle

7 participants
