Fix annotated highlighter wrapping issue#20682
Fix annotated highlighter wrapping issue#20682sjs004 wants to merge 2 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: sjs004 <simarjeet.singh004@gmail.com>
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe changes refactor analyzer construction in two highlighter classes by introducing a new Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java (1)
38-38:⚠️ Potential issue | 🟡 MinorUnused
DocumentMapperimport — remove it.
DocumentMapperwas referenced by the now-removedgetAnalyzer(DocumentMapper)override. No method in the current class uses it.🧹 Proposed fix
-import org.opensearch.index.mapper.DocumentMapper;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java` at line 38, The import org.opensearch.index.mapper.DocumentMapper in AnnotatedTextHighlighter is unused (the getAnalyzer(DocumentMapper) override was removed); delete that import line and run organize-imports / recompile to ensure no other references remain so the class compiles cleanly.server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java (1)
211-213:⚠️ Potential issue | 🟠 Major
getAnalyzer(DocumentMapper)is dead code and should be removed or deprecated.
buildHighlighterdirectly accesses the index analyzer at line 164 without callinggetAnalyzer. The method is unused and is not overridden by any current subclasses (CustomUnifiedHighlighterorAnnotatedTextHighlighter), but its protected visibility creates a hazard: future subclasses might override it expecting it to be called during highlighting, only to have their override silently ignored. Remove the method or deprecate it clearly to guide future maintainers towardwrapAnalyzerinstead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java` around lines 211 - 213, The getAnalyzer(DocumentMapper) method is dead code and should be removed or clearly deprecated to avoid future subclass confusion: either delete the protected getAnalyzer(DocumentMapper) method from UnifiedHighlighter (and remove any unused imports/tests) or mark it `@Deprecated`, change visibility to private or final if keeping, and update its Javadoc to point callers to wrapAnalyzer; ensure you reference getAnalyzer, buildHighlighter (which currently reads the index analyzer directly), and the subclasses CustomUnifiedHighlighter and AnnotatedTextHighlighter when making the change so you don’t break expected behavior.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.javaserver/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.
Applied to files:
server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
🔇 Additional comments (2)
server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java (1)
215-220:wrapAnalyzerhelper is well-structured.The null check on
fieldMaxAnalyzedOffsetcorrectly handles the non-offset-limiting path, and delegation togetLimitedOffsetAnalyzeris clean.plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java (1)
78-81: LGTM — correct fix for the annotated highlighter wrapping.
super.wrapAnalyzeris applied first to optionally injectLimitedOffsetAnalyzer, thenAnnotatedHighlighterAnalyzerwraps the result. WhenfieldMaxAnalyzedOffsetisnullthe chain isAnnotatedHighlighterAnalyzer(baseAnalyzer)— identical to the old behaviour — and when it is set the chain becomesAnnotatedHighlighterAnalyzer(LimitedOffsetAnalyzer(baseAnalyzer)), which is the intended fix. The cast inloadFieldValues(line 73) remains safe because this override always produces anAnnotatedHighlighterAnalyzer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java`:
- Around line 164-169: In UnifiedHighlighter, the analyzer returned by
wrapAnalyzer is being overwritten when a derived field type exists; move the
call to wrapAnalyzer so it is applied after resolving derivedFieldType (i.e.,
resolve derivedFieldType via
context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName),
select the analyzer from either the derived field
((DerivedFieldType).getIndexAnalyzer()) or the original indexAnalyzer, and then
call wrapAnalyzer on that chosen analyzer using fieldMaxAnalyzedOffset) so the
offset-limiting wrapper is never discarded; also ensure that when
AnnotatedTextHighlighter will later cast the analyzer (see
AnnotatedTextHighlighter.loadFieldValues) the wrapped analyzer preserves/returns
an AnnotatedHighlighterAnalyzer-compatible instance if required.
---
Outside diff comments:
In
`@plugins/mapper-annotated-text/src/main/java/org/opensearch/search/fetch/subphase/highlight/AnnotatedTextHighlighter.java`:
- Line 38: The import org.opensearch.index.mapper.DocumentMapper in
AnnotatedTextHighlighter is unused (the getAnalyzer(DocumentMapper) override was
removed); delete that import line and run organize-imports / recompile to ensure
no other references remain so the class compiles cleanly.
In
`@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java`:
- Around line 211-213: The getAnalyzer(DocumentMapper) method is dead code and
should be removed or clearly deprecated to avoid future subclass confusion:
either delete the protected getAnalyzer(DocumentMapper) method from
UnifiedHighlighter (and remove any unused imports/tests) or mark it `@Deprecated`,
change visibility to private or final if keeping, and update its Javadoc to
point callers to wrapAnalyzer; ensure you reference getAnalyzer,
buildHighlighter (which currently reads the index analyzer directly), and the
subclasses CustomUnifiedHighlighter and AnnotatedTextHighlighter when making the
change so you don’t break expected behavior.
| Analyzer analyzer = fieldContext.context.mapperService().documentMapper().mappers().indexAnalyzer(); | ||
| analyzer = wrapAnalyzer(analyzer, fieldMaxAnalyzedOffset); | ||
| MappedFieldType derivedFieldType = fieldContext.context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName); | ||
| if (derivedFieldType != null) { | ||
| analyzer = ((DerivedFieldType) derivedFieldType).getIndexAnalyzer(); | ||
| } |
There was a problem hiding this comment.
wrapAnalyzer result is discarded when a derived field type is present.
wrapAnalyzer at line 165 applies offset limiting, but the derived-field override at line 168 replaces the analyzer unconditionally, silently ignoring the fieldMaxAnalyzedOffset setting for derived fields. For AnnotatedTextHighlighter, the replaced analyzer is also not an AnnotatedHighlighterAnalyzer, which would cause a ClassCastException in loadFieldValues (line 73 of AnnotatedTextHighlighter) — although that specific issue pre-dates this PR.
Consider applying wrapAnalyzer after the derived-field type resolution:
🔧 Proposed fix
- Analyzer analyzer = fieldContext.context.mapperService().documentMapper().mappers().indexAnalyzer();
- analyzer = wrapAnalyzer(analyzer, fieldMaxAnalyzedOffset);
MappedFieldType derivedFieldType = fieldContext.context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName);
+ Analyzer analyzer = fieldContext.context.mapperService().documentMapper().mappers().indexAnalyzer();
if (derivedFieldType != null) {
analyzer = ((DerivedFieldType) derivedFieldType).getIndexAnalyzer();
}
+ analyzer = wrapAnalyzer(analyzer, fieldMaxAnalyzedOffset);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@server/src/main/java/org/opensearch/search/fetch/subphase/highlight/UnifiedHighlighter.java`
around lines 164 - 169, In UnifiedHighlighter, the analyzer returned by
wrapAnalyzer is being overwritten when a derived field type exists; move the
call to wrapAnalyzer so it is applied after resolving derivedFieldType (i.e.,
resolve derivedFieldType via
context.getQueryShardContext().resolveDerivedFieldType(fieldContext.fieldName),
select the analyzer from either the derived field
((DerivedFieldType).getIndexAnalyzer()) or the original indexAnalyzer, and then
call wrapAnalyzer on that chosen analyzer using fieldMaxAnalyzedOffset) so the
offset-limiting wrapper is never discarded; also ensure that when
AnnotatedTextHighlighter will later cast the analyzer (see
AnnotatedTextHighlighter.loadFieldValues) the wrapped analyzer preserves/returns
an AnnotatedHighlighterAnalyzer-compatible instance if required.
|
❌ Gradle check result for 9b469e7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: sjs004 <simarjeet.singh004@gmail.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20682 +/- ##
============================================
- Coverage 73.32% 73.26% -0.07%
+ Complexity 72064 72000 -64
============================================
Files 5781 5781
Lines 329395 329405 +10
Branches 47525 47525
============================================
- Hits 241536 241333 -203
- Misses 68507 68787 +280
+ Partials 19352 19285 -67 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Description
This PR fixes a
ClassCastExceptionoccurring in themapper-annotated-text pluginwhen highlighting fields withmax_analyzer_offsetconfigured. The fix involves refactoring howUnifiedHighlightercomposes analyzers, moving from a hardcoded wrapping logic to an extensible hook-based approach (wrapAnalyzer)The Problem (Old Flow)
Previously, the
UnifiedHighlightercore would fetch the analyzer from the plugin and then forcefully wrap it in aLimitTokenOffsetAnalyzer(an anonymous classUnifiedHighlighter$1) if a maximum offset was setOld Object Hierarchy:
Because the Outer Shell was
UnifiedHighlighter$1, the plugin’s attempt to cast the analyzer back to its own type in loadFieldValues failedThe Solution (New Flow)
Introduced a protected
wrapAnalyzer(Analyzer, Integer)hook in theUnifiedHighlightercore. This allows the plugin to control the order of decoration.By overriding this hook in the plugin and calling
super.wrapAnalyzer()first, we ensure that the Core applies the limit wrapper before the plugin applies its own specialized wrapperNew Object Hierarchy:
Key Changes
Related Issues
Resolves #20669
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.