Fix annotated highlighter wrapping issue by sjs004 · Pull Request #20682 · opensearch-project/OpenSearch

sjs004 · 2026-02-19T18:45:52Z

Description

This PR fixes a ClassCastException occurring in the mapper-annotated-text plugin when highlighting fields with max_analyzer_offset configured. The fix involves refactoring how UnifiedHighlighter composes analyzers, moving from a hardcoded wrapping logic to an extensible hook-based approach (wrapAnalyzer)

The Problem (Old Flow)
Previously, the UnifiedHighlighter core would fetch the analyzer from the plugin and then forcefully wrap it in a LimitTokenOffsetAnalyzer (an anonymous class UnifiedHighlighter$1) if a maximum offset was set

Old Object Hierarchy:

UnifiedHighlighter$1 (Outer Shell)

└── AnnotatedHighlighterAnalyzer (Inner Layer)

└── BaseAnalyzer (Core Layer)

Because the Outer Shell was UnifiedHighlighter$1, the plugin’s attempt to cast the analyzer back to its own type in loadFieldValues failed

The Solution (New Flow)
Introduced a protected wrapAnalyzer(Analyzer, Integer) hook in the UnifiedHighlighter core. This allows the plugin to control the order of decoration.

By overriding this hook in the plugin and calling super.wrapAnalyzer() first, we ensure that the Core applies the limit wrapper before the plugin applies its own specialized wrapper

New Object Hierarchy:

AnnotatedHighlighterAnalyzer (Outer Shell)

└── UnifiedHighlighter$1 (Inner Layer - Limit Wrapper)

└── BaseAnalyzer (Core Layer)

Key Changes

OpenSearch Core (UnifiedHighlighter.java):

Removed hardcoded wrapping in buildHighlighter
Added protected Analyzer wrapAnalyzer(Analyzer analyzer, Integer fieldMaxAnalyzedOffset) to provide a standard extension point.

Plugin (AnnotatedTextHighlighter.java):

Overrode wrapAnalyzer to ensure AnnotatedHighlighterAnalyzer remains the outermost object
Removed the obsolete getAnalyzer override

Related Issues

Resolves #20669

Check List

Functionality includes testing.
API changes companion pull request created, if applicable.
Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: sjs004 <simarjeet.singh004@gmail.com>

coderabbitai · 2026-02-19T18:46:22Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

The changes refactor analyzer construction in two highlighter classes by introducing a new wrapAnalyzer() helper method. The UnifiedHighlighter now uses this method to conditionally wrap analyzers with LimitedOffsetAnalyzer, while AnnotatedTextHighlighter delegates through it to wrap with AnnotatedHighlighterAnalyzer. This centralizes offset-limiting logic.

Changes

Cohort / File(s)	Summary
Analyzer Wrapper Delegation `plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java`, `server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java`	Both classes now use a `wrapAnalyzer()` method pattern. UnifiedHighlighter introduces the new protected helper method to conditionally apply LimitedOffsetAnalyzer based on fieldMaxAnalyzedOffset. AnnotatedTextHighlighter overrides `wrapAnalyzer()` instead of `getAnalyzer()`, wrapping the core analyzer result with AnnotatedHighlighterAnalyzer. Analyzer initialization changed from direct DocumentMapper access to through mappers().indexAnalyzer().

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description contains only the template structure with a placeholder '[Describe what this change achieves]' and does not include an actual description of the changes, objectives, or reasoning.	Replace the placeholder with a concrete description of what the change achieves, explain why the wrapping approach was refactored, and confirm test coverage.
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Fix annotated highlighter wrapping issue' is directly related to the main change, which refactors the analyzer wrapping logic in the AnnotatedTextHighlighter class.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java (1)
38-38: ⚠️ Potential issue | 🟡 Minor

Unused DocumentMapper import — remove it.

DocumentMapper was referenced by the now-removed getAnalyzer(DocumentMapper) override. No method in the current class uses it.
🧹 Proposed fix
-import org.opensearch.index.mapper.DocumentMapper;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java`
at line 38, The import org.opensearch.index.mapper.DocumentMapper in
AnnotatedTextHighlighter is unused (the getAnalyzer(DocumentMapper) override was
removed); delete that import line and run organize-imports / recompile to ensure
no other references remain so the class compiles cleanly.
server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java (1)
211-213: ⚠️ Potential issue | 🟠 Major

getAnalyzer(DocumentMapper) is dead code and should be removed or deprecated.

buildHighlighter directly accesses the index analyzer at line 164 without calling getAnalyzer. The method is unused and is not overridden by any current subclasses (CustomUnifiedHighlighter or AnnotatedTextHighlighter), but its protected visibility creates a hazard: future subclasses might override it expecting it to be called during highlighting, only to have their override silently ignored. Remove the method or deprecate it clearly to guide future maintainers toward wrapAnalyzer instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java`
around lines 211 - 213, The getAnalyzer(DocumentMapper) method is dead code and
should be removed or clearly deprecated to avoid future subclass confusion:
either delete the protected getAnalyzer(DocumentMapper) method from
UnifiedHighlighter (and remove any unused imports/tests) or mark it `@Deprecated`,
change visibility to private or final if keeping, and update its Javadoc to
point callers to wrapAnalyzer; ensure you reference getAnalyzer,
buildHighlighter (which currently reads the index analyzer directly), and the
subclasses CustomUnifiedHighlighter and AnnotatedTextHighlighter when making the
change so you don’t break expected behavior.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9c62834 and 9b469e7.

📒 Files selected for processing (2)

plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java
server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2026-01-13T17:40:27.167Z

Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.

Applied to files:

server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: gradle-check
GitHub Check: assemble (25, ubuntu-latest)
GitHub Check: precommit (21, ubuntu-24.04-arm)
GitHub Check: precommit (21, macos-15-intel)
GitHub Check: precommit (21, windows-2025, true)
GitHub Check: assemble (25, ubuntu-24.04-arm)
GitHub Check: assemble (21, ubuntu-latest)
GitHub Check: precommit (21, macos-15)
GitHub Check: precommit (21, windows-latest)
GitHub Check: precommit (25, ubuntu-latest)
GitHub Check: precommit (25, macos-15)
GitHub Check: assemble (25, windows-latest)
GitHub Check: precommit (25, macos-15-intel)
GitHub Check: precommit (21, ubuntu-latest)
GitHub Check: precommit (25, ubuntu-24.04-arm)
GitHub Check: precommit (25, windows-latest)
GitHub Check: assemble (21, windows-latest)
GitHub Check: assemble (21, ubuntu-24.04-arm)
GitHub Check: detect-breaking-change
GitHub Check: Analyze (java)

🔇 Additional comments (2)

server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java (1)

215-220: wrapAnalyzer helper is well-structured.

The null check on fieldMaxAnalyzedOffset correctly handles the non-offset-limiting path, and delegation to getLimitedOffsetAnalyzer is clean.

plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java (1)

78-81: LGTM — correct fix for the annotated highlighter wrapping.

super.wrapAnalyzer is applied first to optionally inject LimitedOffsetAnalyzer, then AnnotatedHighlighterAnalyzer wraps the result. When fieldMaxAnalyzedOffset is null the chain is AnnotatedHighlighterAnalyzer(baseAnalyzer) — identical to the old behaviour — and when it is set the chain becomes AnnotatedHighlighterAnalyzer(LimitedOffsetAnalyzer(baseAnalyzer)), which is the intended fix. The cast in loadFieldValues (line 73) remains safe because this override always produces an AnnotatedHighlighterAnalyzer.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java`:
- Around line 164-169: In UnifiedHighlighter, the analyzer returned by
wrapAnalyzer is being overwritten when a derived field type exists; move the
call to wrapAnalyzer so it is applied after resolving derivedFieldType (i.e.,
resolve derivedFieldType via
context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName),
select the analyzer from either the derived field
((DerivedFieldType).getIndexAnalyzer()) or the original indexAnalyzer, and then
call wrapAnalyzer on that chosen analyzer using fieldMaxAnalyzedOffset) so the
offset-limiting wrapper is never discarded; also ensure that when
AnnotatedTextHighlighter will later cast the analyzer (see
AnnotatedTextHighlighter.loadFieldValues) the wrapped analyzer preserves/returns
an AnnotatedHighlighterAnalyzer-compatible instance if required.

---

Outside diff comments:
In
`@plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java`:
- Line 38: The import org.opensearch.index.mapper.DocumentMapper in
AnnotatedTextHighlighter is unused (the getAnalyzer(DocumentMapper) override was
removed); delete that import line and run organize-imports / recompile to ensure
no other references remain so the class compiles cleanly.

In
`@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java`:
- Around line 211-213: The getAnalyzer(DocumentMapper) method is dead code and
should be removed or clearly deprecated to avoid future subclass confusion:
either delete the protected getAnalyzer(DocumentMapper) method from
UnifiedHighlighter (and remove any unused imports/tests) or mark it `@Deprecated`,
change visibility to private or final if keeping, and update its Javadoc to
point callers to wrapAnalyzer; ensure you reference getAnalyzer,
buildHighlighter (which currently reads the index analyzer directly), and the
subclasses CustomUnifiedHighlighter and AnnotatedTextHighlighter when making the
change so you don’t break expected behavior.

coderabbitai · 2026-02-19T18:52:03Z

server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java

+        Analyzer analyzer = fieldContext.context.mapperService().documentMapper().mappers().indexAnalyzer();
+        analyzer = wrapAnalyzer(analyzer, fieldMaxAnalyzedOffset);
        MappedFieldType derivedFieldType = fieldContext.context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName);
        if (derivedFieldType != null) {
            analyzer = ((DerivedFieldType) derivedFieldType).getIndexAnalyzer();
        }


⚠️ Potential issue | 🟡 Minor

wrapAnalyzer result is discarded when a derived field type is present.

wrapAnalyzer at line 165 applies offset limiting, but the derived-field override at line 168 replaces the analyzer unconditionally, silently ignoring the fieldMaxAnalyzedOffset setting for derived fields. For AnnotatedTextHighlighter, the replaced analyzer is also not an AnnotatedHighlighterAnalyzer, which would cause a ClassCastException in loadFieldValues (line 73 of AnnotatedTextHighlighter) — although that specific issue pre-dates this PR.

Consider applying wrapAnalyzer after the derived-field type resolution:

🔧 Proposed fix

- Analyzer analyzer = fieldContext.context.mapperService().documentMapper().mappers().indexAnalyzer(); - analyzer = wrapAnalyzer(analyzer, fieldMaxAnalyzedOffset); MappedFieldType derivedFieldType = fieldContext.context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName); + Analyzer analyzer = fieldContext.context.mapperService().documentMapper().mappers().indexAnalyzer(); if (derivedFieldType != null) { analyzer = ((DerivedFieldType) derivedFieldType).getIndexAnalyzer(); } + analyzer = wrapAnalyzer(analyzer, fieldMaxAnalyzedOffset);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java` around lines 164 - 169, In UnifiedHighlighter, the analyzer returned by wrapAnalyzer is being overwritten when a derived field type exists; move the call to wrapAnalyzer so it is applied after resolving derivedFieldType (i.e., resolve derivedFieldType via context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName), select the analyzer from either the derived field ((DerivedFieldType).getIndexAnalyzer()) or the original indexAnalyzer, and then call wrapAnalyzer on that chosen analyzer using fieldMaxAnalyzedOffset) so the offset-limiting wrapper is never discarded; also ensure that when AnnotatedTextHighlighter will later cast the analyzer (see AnnotatedTextHighlighter.loadFieldValues) the wrapped analyzer preserves/returns an AnnotatedHighlighterAnalyzer-compatible instance if required.

github-actions · 2026-02-19T18:54:17Z

❌ Gradle check result for 9b469e7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: sjs004 <simarjeet.singh004@gmail.com>

github-actions · 2026-02-19T21:32:21Z

✅ Gradle check result for 9a7edd7: SUCCESS

codecov · 2026-02-19T22:01:00Z

Codecov Report

❌ Patch coverage is 42.85714% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.26%. Comparing base (6b557db) to head (9a7edd7).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
...h/subphase/highlight/AnnotatedTextHighlighter.java	0.00%	2 Missing ⚠️
...h/fetch/subphase/highlight/UnifiedHighlighter.java	60.00%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main   #20682      +/-   ##
============================================
- Coverage     73.32%   73.26%   -0.07%     
+ Complexity    72064    72000      -64     
============================================
  Files          5781     5781              
  Lines        329395   329405      +10     
  Branches      47525    47525              
============================================
- Hits         241536   241333     -203     
- Misses        68507    68787     +280     
+ Partials      19352    19285      -67

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Fix annotated highlighter wrapping issue

9b469e7

Signed-off-by: sjs004 <simarjeet.singh004@gmail.com>

sjs004 requested review from a team, Bukhtawar, CEHENKLE, Rishikesh1159, anasalkouz, andrross, ashking94, cwperks, dbwiddis, gbbafna, jed326, kotwanikunal, mch2, msfroh, owaiskazi19, reta, sachinpkale, saratvemulapalli, shwetathareja and sohami as code owners February 19, 2026 18:45

sjs004 marked this pull request as draft February 19, 2026 18:48

github-actions bot added bug Something isn't working Search:Query Capabilities labels Feb 19, 2026

coderabbitai bot reviewed Feb 19, 2026

View reviewed changes

Fix annotated highlighter wrapping issue

9a7edd7

Signed-off-by: sjs004 <simarjeet.singh004@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Fix annotated highlighter wrapping issue#20682

Fix annotated highlighter wrapping issue#20682
sjs004 wants to merge 2 commits intoopensearch-project:mainfrom
sjs004:fix-annotated-highlighter-wrapping

sjs004 commented Feb 19, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Feb 19, 2026 •

edited

Loading

Review skipped

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 19, 2026

Uh oh!

github-actions bot commented Feb 19, 2026

Uh oh!

github-actions bot commented Feb 19, 2026

Uh oh!

codecov bot commented Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

Conversation

sjs004 commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Check List

Uh oh!

coderabbitai bot commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Feb 19, 2026

Uh oh!

github-actions bot commented Feb 19, 2026

Uh oh!

codecov bot commented Feb 19, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

sjs004 commented Feb 19, 2026 •

edited

Loading

coderabbitai bot commented Feb 19, 2026 •

edited

Loading