Avoid unnecessary DisjunctionMaxBulkScorer overhead #15659

shimpeko · 2026-02-02T22:10:01Z

This change inspects clause bulk scorers up front and only uses DisjunctionMaxBulkScorer if at least one clause provides a non-default BulkScorer. Otherwise, we fall back to the scorer-based path.

c88f933 made DisjunctionMaxQuery use DisjunctionMaxBulkScorer when tieBreakerMultiplier == 0 and scoreMode == TOP_SCORES. However, this bulk path primarily pays off when at least one clause implements a specialized BulkScorer.

When all clauses return DefaultBulkScorer, the bulk windowing and replay logic adds overhead while preventing effective use of minCompetitiveScore and block-max optimizations that the scorer-based DisjunctionMaxScorer supports in TOP_SCORES mode.

In such cases, falling back to the scorer-based path typically results in better performance and restores competitive-score-based skipping.
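To make the selection rule above concrete, here is a minimal sketch of the heuristic. All types in it (`BulkScorer`, `DefaultBulkScorer`, `SpecializedBulkScorer`, `ScoreMode`, `Path`) are hypothetical stand-ins declared inside the example, not Lucene's real classes, and `choosePath` is an illustrative name rather than a method from this PR.

```java
import java.util.List;

/** Toy model of the heuristic; every type here is a hypothetical stand-in, not Lucene's API. */
final class DisMaxPathSketch {
  interface BulkScorer {}

  /** Stands in for the generic wrapper a clause returns when it has nothing specialized. */
  static final class DefaultBulkScorer implements BulkScorer {}

  /** Stands in for any clause-specific bulk scorer. */
  static final class SpecializedBulkScorer implements BulkScorer {}

  enum ScoreMode { COMPLETE, TOP_SCORES }

  enum Path { BULK, SCORER }

  /** Picks the bulk path only if it is eligible and at least one clause is specialized. */
  static Path choosePath(
      float tieBreakerMultiplier, ScoreMode scoreMode, List<BulkScorer> clauseBulkScorers) {
    // Eligibility condition quoted in the description above.
    boolean bulkEligible = tieBreakerMultiplier == 0f && scoreMode == ScoreMode.TOP_SCORES;
    if (!bulkEligible) {
      return Path.SCORER;
    }
    for (BulkScorer bulkScorer : clauseBulkScorers) {
      if (!(bulkScorer instanceof DefaultBulkScorer)) {
        return Path.BULK; // one specialized clause is enough
      }
    }
    return Path.SCORER; // all clauses are default: fall back to the scorer-based path
  }

  public static void main(String[] args) {
    List<BulkScorer> allDefault = List.of(new DefaultBulkScorer(), new DefaultBulkScorer());
    List<BulkScorer> mixed = List.of(new DefaultBulkScorer(), new SpecializedBulkScorer());
    System.out.println(choosePath(0f, ScoreMode.TOP_SCORES, allDefault)); // SCORER
    System.out.println(choosePath(0f, ScoreMode.TOP_SCORES, mixed)); // BULK
  }
}
```

The sketch only models the decision. It says nothing about how the clause bulk scorers are obtained, which is where the real implementation has to be careful.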

Fixes: #15658
Related PR: #14040

shimpeko · 2026-02-03T00:54:24Z

I realized that we cannot call ScorerSupplier.get() more than once. This needs another solution.


shimpeko · 2026-02-03T10:08:48Z

> I realized that we cannot call ScorerSupplier.get() more than once. Need another solution.

I updated the code so that it no longer calls ScorerSupplier.get() more than once.
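As an illustration of the constraint, the sketch below shows one generic way to work with a supplier that may be consumed only once: resolve every supplier a single time, keep the results, and use that same list for both the inspection and the final choice. `SingleUse` and `resolveOnce` are hypothetical names invented for the example. This is not Lucene's ScorerSupplier and not necessarily how the updated PR does it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Generic illustration with hypothetical types; this is not Lucene's ScorerSupplier. */
final class SingleUseSupplierSketch {
  /** A supplier that fails loudly if it is consumed a second time. */
  static final class SingleUse<T> {
    private final Supplier<T> delegate;
    private boolean consumed;

    SingleUse(Supplier<T> delegate) {
      this.delegate = delegate;
    }

    T get() {
      if (consumed) {
        throw new IllegalStateException("get() called more than once");
      }
      consumed = true;
      return delegate.get();
    }
  }

  /** Consumes each supplier exactly once and keeps the results for later reuse. */
  static <T> List<T> resolveOnce(List<SingleUse<T>> suppliers) {
    List<T> resolved = new ArrayList<>(suppliers.size());
    for (SingleUse<T> supplier : suppliers) {
      resolved.add(supplier.get());
    }
    return resolved;
  }

  public static void main(String[] args) {
    List<SingleUse<String>> suppliers =
        List.of(new SingleUse<String>(() -> "title"), new SingleUse<String>(() -> "body"));
    List<String> resolved = resolveOnce(suppliers);
    // Inspect and build from the same resolved list; the suppliers are not touched again.
    boolean anyTitle = resolved.contains("title");
    System.out.println(resolved + " anyTitle=" + anyTitle);
  }
}
```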

uschindler · 2026-02-03T11:20:49Z

Hi, it would be nice not to force-push all the time; it makes reviewing hard. I have no idea whether you have fixed the linter warnings. Squashing the commits is done on our side when merging, so please don't do it yourself.

Otherwise the change looks fine to me, but I'll leave it to @jpountz as he understands the code much better.

uschindler · 2026-02-03T11:23:11Z

Please fix the linter warning by running "./gradlew tidy" and committing the result, but don't squash!

> Task :lucene:core:checkGoogleJavaFormat FAILED
java file(s) have google-java-format violations (run './gradlew tidy' to fix). An overview diff of changes:
== /home/runner/work/lucene/lucene/lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxQuery.java
@@ -181,8 +181,7 @@
 ············}
 
 ············return·new·Weight.DefaultBulkScorer(
-················new·DisjunctionMaxScorer(tieBreakerMultiplier,·scorers,·scoreMode,·Long.MAX_VALUE)
-············);
+················new·DisjunctionMaxScorer(tieBreakerMultiplier,·scorers,·scoreMode,·Long.MAX_VALUE));
 ··········}
 
 ··········@Override

shimpeko · 2026-02-03T11:40:42Z

Thank you for the review, and sorry about the repeated force-pushes — I was trying to clean up the commit history and should have avoided doing that during review. I won’t force-push going forward. Ran tidy and pushed 68ada56. Will wait for @jpountz's review.

jpountz · 2026-02-03T13:25:38Z

> When all clauses return DefaultBulkScorer, the bulk windowing and replay logic adds overhead while preventing effective use of minCompetitiveScore and block-max optimizations

It's true that the windowing+replay logic isn't free, but I remember that it was still better than using a Scorer, which had to keep reordering a heap on every document. As far as block-max optimizations are concerned, DisjunctionMaxBulkScorer tracks the min competitive score and passes it to its sub-clauses whenever it scores a window, so this should work fine. See lucene/lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxBulkScorer.java, line 81 in 7ebdb93:

scorer.setMinCompetitiveScore(topLevelScorable.minCompetitiveScore);
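The per-window propagation described here can be pictured with a toy model. Everything in it is an assumption made for illustration: `Clause`, `topDoc`, the window size of 4, the top-1 "collector", and the per-document skip that stands in for real block-max skipping. It is not the DisjunctionMaxBulkScorer code.

```java
import java.util.List;

/** Toy model with hypothetical types; it is not Lucene's DisjunctionMaxBulkScorer. */
final class WindowedMinScoreSketch {
  static final int WINDOW_SIZE = 4; // arbitrary value for the example

  /** A clause with a fixed score per doc ID; 0 means the doc does not match. */
  static final class Clause {
    final float[] scores;
    float minCompetitiveScore = 0f;
    int skipped = 0; // matches dropped because they were below the threshold

    Clause(float... scores) {
      this.scores = scores;
    }

    void setMinCompetitiveScore(float minScore) {
      this.minCompetitiveScore = minScore;
    }

    /** Folds this clause's scores for docs in [min, max) into windowScores, keeping the max. */
    void scoreWindow(int min, int max, float[] windowScores) {
      for (int doc = min; doc < max; doc++) {
        float score = scores[doc];
        if (score == 0f) {
          continue; // no match
        }
        if (score < minCompetitiveScore) {
          skipped++; // simplified stand-in for block-max skipping
          continue;
        }
        windowScores[doc - min] = Math.max(windowScores[doc - min], score);
      }
    }
  }

  /** Returns the best-scoring doc ID, scoring one window at a time (top-1 collection). */
  static int topDoc(List<Clause> clauses, int maxDoc) {
    float minCompetitiveScore = 0f;
    int best = -1;
    for (int min = 0; min < maxDoc; min += WINDOW_SIZE) {
      int max = Math.min(min + WINDOW_SIZE, maxDoc);
      float[] windowScores = new float[max - min];
      for (Clause clause : clauses) {
        // Pass the current threshold down before the clause scores this window.
        clause.setMinCompetitiveScore(minCompetitiveScore);
        clause.scoreWindow(min, max, windowScores);
      }
      // Replay the window; a better hit raises the threshold for later windows.
      for (int i = 0; i < windowScores.length; i++) {
        if (windowScores[i] > minCompetitiveScore) {
          minCompetitiveScore = windowScores[i];
          best = min + i;
        }
      }
    }
    return best;
  }

  public static void main(String[] args) {
    Clause a = new Clause(1f, 0f, 3f, 0f, 2f, 0f, 1f, 0f);
    Clause b = new Clause(0f, 2f, 0f, 1f, 0f, 5f, 0f, 2f);
    System.out.println(topDoc(List.of(a, b), 8)); // 5
    System.out.println(a.skipped + " " + b.skipped); // 2 1
  }
}
```

In the example run, the first window raises the threshold to 3, so in the second window both clauses can drop their low-scoring matches without changing the result.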

Let's benchmark this change with luceneutil and our dismax tasks (https://github.com/mikemccand/luceneutil/blob/main/tasks/wikinightly.tasks#L257-L276)? DisMaxTerm in particular should be a good task, since TermQuery doesn't have a bulk scorer. I haven't played with this for a while, so it's possible that bulk scoring isn't helping anymore.

shimpeko force-pushed the dismax-bulk-heuristic branch from 56cf380 to ecc05df Compare February 2, 2026 22:43

github-actions bot added the module:core/search label Feb 2, 2026

github-actions bot modified the milestones: 11.0.0, 10.4.0 Feb 2, 2026

shimpeko marked this pull request as draft February 3, 2026 00:54

shimpeko force-pushed the dismax-bulk-heuristic branch 2 times, most recently from a3b772a to 160a235 Compare February 3, 2026 01:48

uschindler requested a review from jpountz February 3, 2026 09:46

shimpeko added 2 commits February 3, 2026 10:05

Add changes.txt

6231d37

shimpeko force-pushed the dismax-bulk-heuristic branch from 160a235 to 6231d37 Compare February 3, 2026 10:07

shimpeko marked this pull request as ready for review February 3, 2026 10:08

uschindler approved these changes Feb 3, 2026


./gradlew tidy --rerun-tasks

68ada56
