fix: preserve duplicate GROUPING SETS rows #21058
xiedeyantu wants to merge 6 commits into apache:main from
Conversation
Hi @alamb , there's another one here.

Thanks @xiedeyantu -- this is great. As before, can you please:

I completely agree. I may not be very familiar with DataFusion's contribution process yet. Thank you for your thoughtful suggestions. I might add the necessary information in a few days, as I'm currently on vacation.

Thank you @xiedeyantu
}

#[tokio::test]
async fn duplicate_grouping_sets_are_preserved() -> Result<()> {
Seems like this duplicates the SLT test that this PR already adds, so I think it can be omitted.
It's a little unfortunate that we don't support duplicate grouping sets for 8/16/32 grouping columns. I guess there's no easy way around that?
grouping_function_on_id has this code:
let grouping_id_column = Expr::Column(Column::from(Aggregate::INTERNAL_GROUPING_ID));
// The grouping call is exactly our internal grouping id
if args.len() == group_by_expr_count
    && args
        .iter()
        .rev()
        .enumerate()
        .all(|(idx, expr)| group_by_expr.get(expr) == Some(&idx))
{
    return Ok(cast(grouping_id_column, DataType::Int32));
}

We probably want to mask out the ordinal bits that this PR adds into the grouping ID?
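A minimal sketch of the masking the reviewer suggests. The function name and the bit layout assumed here (null mask in the low `n` bits, duplicate ordinal above them) are illustrative, not DataFusion's actual API:

```rust
// Hypothetical helper: strip the duplicate-ordinal bits out of a raw
// grouping id so GROUPING() sees only the semantic null mask.
fn semantic_grouping_id(raw_grouping_id: u64, group_by_expr_count: usize) -> u64 {
    let n = group_by_expr_count;
    // (1 << n) - 1 keeps the low n bits; n >= 64 would overflow the shift.
    let mask = if n >= 64 { u64::MAX } else { (1u64 << n) - 1 };
    raw_grouping_id & mask
}

fn main() {
    // 3 grouping columns, null mask 0b101, duplicate ordinal 1 stored at bit 3.
    let raw = 0b101u64 | (1u64 << 3);
    assert_eq!(semantic_grouping_id(raw, 3), 0b101);
}
```

With that mask applied, duplicate grouping sets that differ only in their ordinal bits would all report the same GROUPING() value.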
let extra_bits = width_bits - group.len();
if extra_bits == 0 && group_ordinal > 0 {
    return not_impl_err!(
        "Duplicate grouping sets with more than {} grouping columns are not supported",
If group.len() == 64, seems like this will produce an inaccurate error message ("more than 64" vs "64").
    group_by: &PhysicalGroupBy,
    batch: &RecordBatch,
) -> Result<Vec<Vec<ArrayRef>>> {
    let mut group_ordinals: HashMap<Vec<bool>, usize> = HashMap::new();
HashMap<&[bool], usize> instead?
 }

-fn group_id_array(group: &[bool], batch: &RecordBatch) -> Result<ArrayRef> {
+fn group_id_array(
I think this function would benefit from some commentary: you need to read the implementation pretty carefully to understand the encoded bit format.
let group_id = group.iter().fold(0u64, |acc, &is_null| {
    (acc << 1) | if is_null { 1 } else { 0 }
});
let group_id = if group.len() == 64 {
Personally, I would opt for tightening the check on group.len() > 64 to group.len() >= 64 at the top of the function, which would simplify this logic.
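For reference, a self-contained version of the fold in the snippet above (function name is mine, not the PR's): each grouping set's null flags collapse into one u64, with the first flag ending up in the most significant of the used bit positions.

```rust
// Illustrative sketch of the grouping-id bit encoding discussed above.
fn encode_group_id(group: &[bool]) -> u64 {
    group
        .iter()
        .fold(0u64, |acc, &is_null| (acc << 1) | u64::from(is_null))
}

fn main() {
    // Flags (true, false, true) encode to 0b101 = 5.
    assert_eq!(encode_group_id(&[true, false, true]), 5);
    // No null flags encodes to 0.
    assert_eq!(encode_group_id(&[false, false, false]), 0);
}
```

This also shows why `group.len() >= 64` is the natural boundary: with 64 flags the fold already uses every bit of the u64, leaving no room for ordinal bits.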
@neilconway Thank you for the thorough review. I will complete these modifications this weekend and update the problem description accordingly.

@alamb Apologies for the late reply. I have just updated the problem description in the PR.

@neilconway Apologies for the delayed response. I have reviewed all your thoughtful comments and agree that the approach using high-bit flags indeed has significant flaws. I have now refactored the entire logic and updated the PR description; if you have the time, could you please help review it again? Thank you!

@alamb Could you please take a look and let me know if my current revisions are okay?

@neilconway Could you please help me to have a look again?
neilconway left a comment

Sorry for the delayed response, @xiedeyantu!
Thanks for revising this. I'm a bit concerned by the overhead here; we are adding a UInt32 column to every query with grouping sets just to handle a relatively rare situation.
I actually liked the approach you took in the initial PR better; encoding duplicates into the high bits of the grouping ID is a nice approach. We'd just need to take care to mask out the high bits in the GROUPING function, and if possible it would be nice to avoid arbitrary limits like not supporting duplicate grouping sets for queries with 8/16/etc. grouping columns.
Alternatively we could represent grouping IDs as an index into the list of GROUPING SETS. That would provide an ID without concern for duplicates, and then we'd implement the GROUPING function with a CASE or similar construct.
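A hedged sketch of this alternative: give each grouping set its position in the GROUPING SETS list as the execution id, so duplicates are distinct by construction, and recover the null mask for GROUPING() via a lookup (the plan-level equivalent of a CASE expression). All names here are hypothetical, not DataFusion's API:

```rust
// Map a positional grouping-set id back to the GROUPING() null mask.
fn null_mask_for(set_index: usize, sets: &[Vec<bool>]) -> u64 {
    sets[set_index]
        .iter()
        .fold(0u64, |acc, &is_null| (acc << 1) | u64::from(is_null))
}

fn main() {
    // Identical sets at indices 0 and 2 keep distinct execution ids (0 and 2)
    // but map back to the same GROUPING() mask.
    let sets = vec![vec![false, true], vec![true, false], vec![false, true]];
    assert_eq!(null_mask_for(0, &sets), null_mask_for(2, &sets));
    assert_eq!(null_mask_for(1, &sets), 0b10);
}
```

The trade-off is that GROUPING() is no longer a cast of the id itself but a per-row lookup over the (usually small) list of grouping sets.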
datafusion/core/src/dataframe/mod.rs (Outdated)
     .build()?;
 let plan = if is_grouping_set {
-    let grouping_id_pos = plan.schema().fields().len() - 1 - aggr_expr_len;
+    let grouping_id_pos = plan.schema().fields().len() - 2 - aggr_expr_len;
The magic number 2 in a few places is a bit inscrutable. Maybe create a named constant?
@neilconway Thanks for your kind guidance. I have refactored this PR according to your comments. Due to the significant changes, I force-pushed the commit. I think this version is better than the previous one. If you have time, please take a look again. Sorry for making multiple changes.
datafusion/sql/src/unparser/utils.rs (Outdated)
@@ -247,6 +247,11 @@ fn find_agg_expr<'a>(agg: &'a Aggregate, column: &Column) -> Result<Option<&'a E
         )
     }
     Ordering::Greater => {
+        if index < grouping_expr.len() + 1 {

This seems like dead code? i.e., index > grouping_expr.len() implies index >= grouping_expr.len() + 1, so the condition can never hold.
/// Returns the highest duplicate ordinal across all grouping sets.
///
/// The ordinal counts how many times a grouping-set pattern has already
/// appeared before the current occurrence. If the same `Vec<bool>` appears
"the current occurrence" doesn't seem to make sense when you look at the function itself; it is talking about the call site.
/// three times the ordinals are 0, 1, 2 and this function returns 2.
/// Returns 0 when no grouping set is duplicated.
fn max_duplicate_ordinal(groups: &[Vec<bool>]) -> usize {
    let mut counts: HashMap<&Vec<bool>, usize> = HashMap::new();
counts
    .values()
    .copied()
    .max()
    .unwrap_or(1)
    .saturating_sub(1)
counts.into_values().max().unwrap_or(0).saturating_sub(1)
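A runnable sketch combining two suggestions from this review: key the map by `&[bool]` rather than `&Vec<bool>`, and collapse the tail into the suggested one-liner. The surrounding scaffolding is mine; only the fold and the final expression come from the review:

```rust
use std::collections::HashMap;

fn max_duplicate_ordinal(groups: &[Vec<bool>]) -> usize {
    let mut counts: HashMap<&[bool], usize> = HashMap::new();
    for group in groups {
        *counts.entry(group.as_slice()).or_insert(0) += 1;
    }
    // A max count of 1 (or an empty input) means no duplicates: ordinal 0.
    counts.into_values().max().unwrap_or(0).saturating_sub(1)
}

fn main() {
    let groups = vec![
        vec![true, false],
        vec![true, false],
        vec![false, true],
        vec![true, false],
    ];
    // The (true, false) pattern appears three times: ordinals 0, 1, 2.
    assert_eq!(max_duplicate_ordinal(&groups), 2);
    assert_eq!(max_duplicate_ordinal(&[]), 0);
}
```

Note `unwrap_or(0)` rather than the original `unwrap_or(1)`: with `saturating_sub(1)` both yield 0 for an empty input, but 0 reads more directly as "no sets, no duplicates".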
// The grouping call is exactly our internal grouping id — mask the ordinal
// bits (above position `n`) so only the semantic bitmask is visible.
let n = group_by_expr_count;
// (1 << n) - 1 masks the low n bits. Use saturating arithmetic to handle n == 0.
The code doesn't use saturating arithmetic?

Although I don't think wrapping_sub is necessary anyway: n == 0 -> 1 << 0 -> 1, so it won't underflow.
} else {
    (1u64 << n).wrapping_sub(1)
};
let masked_id =
Don't compute this outside the if in which it is used.
@neilconway Thank you very much for your detailed review! I have addressed your comments line by line and included the changes in a single commit. Please take another look to see if everything now aligns with your feedback.
/// Returns the Arrow data type of the `__grouping_id` column.
///
/// The type is chosen to be wide enough to hold both the semantic bitmask
/// (in the low `n` bits) and the duplicate ordinal (in the high bits).
Not clear what n is here based on the context of the function itself.
let n = group_by_expr_count;
// (1 << n) - 1 masks the low n bits.
let semantic_mask: u64 = if n >= 64 { u64::MAX } else { (1u64 << n) - 1 };
As I suggested before, move this inside the if
/// Returns 0 when no grouping set is duplicated.
fn max_grouping_set_duplicate_ordinal(group_expr: &[Expr]) -> usize {
    if let Some(Expr::GroupingSet(GroupingSet::GroupingSets(sets))) = group_expr.first() {
        let mut counts: HashMap<&Vec<Expr>, usize> = HashMap::new();
@@ -5206,6 +5203,41 @@ NULL NULL 1
statement ok
drop table t;

# regression: duplicate grouping sets must not be collapsed into one
I think it's worth having a test case for the situation where adding the duplicate ordinal widens the size of the grouping ID field; that's a bit tricky.
/// The ordinal counts how many times a given grouping set pattern has already
/// appeared before the current occurrence. For example, if the same set
I mentioned this before but "the current occurrence" is not well-defined in this context. Something like this instead:
/// The ordinal for each occurrence of a grouping set pattern is its 0-based index among
/// identical entries. For example, if the same set appears three times, the ordinals are
/// 0, 1, 2 and this function returns 2.
@neilconway I have made the changes based on the comments; could you please review it again? Thank you!
Which issue does this PR close?

- GROUPING SETS rows are incorrectly collapsed during execution #21316.

Rationale for this change

GROUPING SETS with duplicate grouping lists were incorrectly collapsed during execution. The internal grouping id only encoded the semantic null mask, so repeated grouping sets shared the same execution key and were merged, which caused rows to be lost compared with PostgreSQL behavior. For example, with:

PostgreSQL preserves the duplicate grouping set and returns:

Before this fix, DataFusion collapsed the duplicate (deptno, job) grouping set and returned only 4 rows for the same query shape.

What changes are included in this PR?

- Keep __grouping_id as the semantic grouping mask only.
- Add a __grouping_ordinal column so repeated occurrences of the same grouping set produce distinct execution keys.
- Keep GROUPING() semantics unchanged.
- Add tests for the duplicate GROUPING SETS case in:
  - datafusion/core/tests/sql/aggregates/basic.rs
  - datafusion/sqllogictest/test_files/group_by.slt

Are these changes tested?

- cargo fmt --all
- cargo test -p datafusion duplicate_grouping_sets_are_preserved
- cargo test -p datafusion-physical-plan grouping_sets_preserve_duplicate_groups
- cargo test -p datafusion-physical-plan evaluate_group_by_supports_duplicate_grouping_sets_with_eight_columns

Are there any user-facing changes?

Duplicate GROUPING SETS entries now return the correct duplicated result rows, matching PostgreSQL behavior.