fix: [branch-0.14] backport #3879 - skip Comet columnar shuffle for stages with DPP scans (#3934)

Merged
andygrove merged 1 commit into apache:branch-0.14 from andygrove:backport-3879-to-0.14 on Apr 14, 2026

@andygrove (Member) commented Apr 13, 2026

Which issue does this PR close?

Backport of #3879 to branch-0.14.

Rationale for this change

PR #3879 is a performance optimization that prevents stages with DPP scans from being converted to Comet. This has a huge impact on the TPC-DS benchmark.

What changes are included in this PR?

Cherry-pick of commit acbdeac from main. The changes skip Comet columnar shuffle for stages that contain DPP scans.

How are these changes tested?

Tests are included in the original PR and were cherry-picked along with the fix.

When a scan uses Dynamic Partition Pruning (DPP) and falls back to
Spark, Comet was still wrapping the stage with columnar shuffle,
creating inefficient row-to-columnar transitions:

  CometShuffleWriter → CometRowToColumnar → SparkFilter →
    SparkColumnarToRow → SparkScan

This adds a check in columnarShuffleSupported() that walks the child
plan tree to detect FileSourceScanExec nodes with dynamic pruning
filters. When found, the shuffle is not converted to Comet, allowing
the entire stage to fall back to Spark.
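
The check described above can be illustrated with a toy model. This is only a sketch of the idea, not the code from the PR: `PlanNode`, `has_dynamic_pruning_filter`, `stage_has_dpp_scan`, and the Python `columnar_shuffle_supported` below are hypothetical stand-ins for Spark's physical plan classes and Comet's `columnarShuffleSupported()`, and the toy walks the whole subtree without modelling stage boundaries.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy stand-in for a physical plan operator (hypothetical, not a Spark class)."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)
    # Only meaningful for scan nodes: True when the scan carries a
    # dynamic partition pruning filter (an assumption of this model).
    has_dynamic_pruning_filter: bool = False


def stage_has_dpp_scan(node: PlanNode) -> bool:
    """Return True if any file scan in this subtree carries a DPP filter."""
    if node.name == "FileSourceScanExec" and node.has_dynamic_pruning_filter:
        return True
    return any(stage_has_dpp_scan(child) for child in node.children)


def columnar_shuffle_supported(shuffle_child: PlanNode) -> bool:
    """Toy decision: decline columnar shuffle when the child plan has a DPP scan."""
    return not stage_has_dpp_scan(shuffle_child)


# A stage whose scan uses DPP, and one whose scan does not.
dpp_stage = PlanNode(
    "FilterExec",
    [PlanNode("FileSourceScanExec", has_dynamic_pruning_filter=True)],
)
plain_stage = PlanNode("FilterExec", [PlanNode("FileSourceScanExec")])
```

In this model, `columnar_shuffle_supported(dpp_stage)` is false, so the shuffle above that stage would stay a Spark shuffle and no row-to-columnar transition would be inserted, while `plain_stage` remains eligible for conversion.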

@andygrove changed the title from "fix: backport #3879 - skip Comet columnar shuffle for stages with DPP scans" to "fix: [branch-0.14] backport #3879 - skip Comet columnar shuffle for stages with DPP scans" on Apr 13, 2026

@mbutrovich (Contributor) left a comment

Thanks @andygrove!

@andygrove merged commit 797a76d into apache:branch-0.14 on Apr 14, 2026
162 of 165 checks passed
@andygrove deleted the backport-3879-to-0.14 branch on April 14, 2026 14:19