Skip to content

Fix closed-index reroute handling#20648

Open
srikanthpadakanti wants to merge 5 commits intoopensearch-project:mainfrom
srikanthpadakanti:20200-closed-index-reroute-fix
Open

Fix closed-index reroute handling#20648
srikanthpadakanti wants to merge 5 commits intoopensearch-project:mainfrom
srikanthpadakanti:20200-closed-index-reroute-fix

Conversation

@srikanthpadakanti
Copy link
Contributor

Description

Prevents allocation of shards from closed indices during cluster reroute operations. Added a check in AllocationService.allocateExistingUnassignedShards() to skip shards belonging to closed indices, ensuring they remain unassigned until the index is explicitly reopened.

Related Issues

Resolves #20200

Check List

  • [ X ] Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 17, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR fixes a bug where shard routings for closed indexes were being allocated without the index being reopened. The fix adds guards to skip allocation and batching when shards have an INDEX_CLOSED unassigned reason, along with comprehensive test coverage.

Changes

Cohort / File(s) Summary
Documentation
CHANGELOG.md
Added entry documenting the fix for shard routings of closed indexes being allocated without opening the index.
Core Allocation Logic
server/src/main/java/org/opensearch/gateway/GatewayAllocator.java, server/src/main/java/org/opensearch/gateway/ShardsBatchGatewayAllocator.java
Added guards to skip allocation and batching for unassigned shards with INDEX_CLOSED reason. GatewayAllocator returns early before primary/replica allocation logic; ShardsBatchGatewayAllocator prevents adding closed-index shards to batches.
Test Coverage
server/src/test/java/org/opensearch/gateway/GatewayAllocatorTests.java
Added three new test methods: testIndexClosedShardsNotBatchedInBatchMode (verifies closed shards not batched), testIndexClosedShardsSkippedInNonBatchMode (verifies no allocation attempts for closed shards), and testNonClosedShardsStillBatchedNormally (verifies normal batching still works for open indexes).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main change: fixing how closed-index shard routings are handled during reroute operations to prevent unwanted allocation.
Description check ✅ Passed The PR description includes all required sections from the template: a clear description of changes, a linked issue reference (#20200), and a completed checklist confirming testing was included.
Linked Issues check ✅ Passed The code changes fully address the linked issue #20200 by adding guards in GatewayAllocator and ShardsBatchGatewayAllocator to skip allocation of shards with INDEX_CLOSED reason, preventing unwanted allocation of closed-index shards during reroute.
Out of Scope Changes check ✅ Passed All changes are directly related to fixing closed-index reroute handling: two guard implementations to skip INDEX_CLOSED shards, CHANGELOG entry, and comprehensive test coverage with no unrelated modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Srikanth Padakanti added 2 commits February 17, 2026 12:47
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bbea60 and 9f4ae54.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/gateway/GatewayAllocator.java
  • server/src/main/java/org/opensearch/gateway/ShardsBatchGatewayAllocator.java
  • server/src/test/java/org/opensearch/gateway/GatewayAllocatorTests.java
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.

Applied to files:

  • server/src/main/java/org/opensearch/gateway/GatewayAllocator.java
  • server/src/main/java/org/opensearch/gateway/ShardsBatchGatewayAllocator.java
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/gateway/GatewayAllocatorTests.java (1)
server/src/main/java/org/opensearch/gateway/GatewayAllocator.java (1)
  • GatewayAllocator (74-369)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, macos-15)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, macos-15)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: detect-breaking-change
🔇 Additional comments (8)
CHANGELOG.md (1)

32-32: Changelog entry matches the fix scope.

server/src/main/java/org/opensearch/gateway/ShardsBatchGatewayAllocator.java (1)

23-23: Import aligns with INDEX_CLOSED checks.

server/src/main/java/org/opensearch/gateway/GatewayAllocator.java (2)

46-46: Import supports the new INDEX_CLOSED guard.


183-187: Correctly skips allocation for closed indices.

server/src/test/java/org/opensearch/gateway/GatewayAllocatorTests.java (4)

26-33: Imports support the new test scenarios.


671-745: Good coverage for batch mode with INDEX_CLOSED shards.


747-881: Non-batch mode guard is well validated.


883-946: Control test for OPEN index batching is solid.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@server/src/main/java/org/opensearch/gateway/ShardsBatchGatewayAllocator.java`:
- Around line 404-412: Move the INDEX_CLOSED guard so closed-index shards are
handled before checking currentBatchedShards: in the unassigned.forEach(...)
lambda (the shardRouting loop), first check shardRouting.unassignedInfo() !=
null && shardRouting.unassignedInfo().getReason() ==
UnassignedInfo.Reason.INDEX_CLOSED and if true remove/evict that shardId from
currentBatchedShards (so stale batches are cleared, e.g., by calling the same
logic used by refreshShardBatches) and return; only after that test perform the
existing containsKey(shardRouting.shardId()) / primary checks and add to
newShardsToBatch as before. This ensures previously-batched INDEX_CLOSED shards
are not retained.

…hards

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
@github-actions
Copy link
Contributor

❌ Gradle check result for 24281a0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Contributor

❌ Gradle check result for 5fac3a4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Cluster Manager

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[BUG] Shard routings for closed index are allocated again without opening the index

1 participant