Primary/replica status fails to update after Redis Cluster node status change #3000

Nodes in a Redis Cluster can switch between replica and primary when a primary node becomes unavailable. When this happens, StackExchange.Redis appears not to update its (in-memory?) view of each node's primary/replica status to match the cluster. As a result, a large share of calls (roughly 1/3 in a cluster with 3 primary and 3 replica nodes?) fail with the exception `Command cannot be issued to a replica` (`StackExchange.Redis.RedisCommandException`), even though the cluster has returned to a fully functional state with all primary and replica nodes available.

These failures persist for an indefinite amount of time. On one occasion when we did not intervene manually, they continued for approximately an hour and then stopped on their own without any clear trigger.

The `Command cannot be issued to a replica` exception appears to come from [ExceptionFactory.PrimaryOnly](https://github.com/StackExchange/StackExchange.Redis/blob/84b015e4196875444349b894b5ce51f50d8d33d0/src/StackExchange.Redis/ExceptionFactory.cs#L66C49-L66C57), which according to our stack trace (see below) is thrown by [PhysicalBridge.WriteMessageToServerInsideWriteLock](https://github.com/StackExchange/StackExchange.Redis/blob/84b015e4196875444349b894b5ce51f50d8d33d0/src/StackExchange.Redis/PhysicalBridge.cs#L1538). The exception appears to be thrown by the client library without any attempt to call Redis.

While this is happening, StackExchange.Redis keeps calling [ConnectionMultiplexer.ReconfigureAsync](https://github.com/StackExchange/StackExchange.Redis/blob/84b015e4196875444349b894b5ce51f50d8d33d0/src/StackExchange.Redis/ConnectionMultiplexer.cs#L1405C35-L1405C51), but this does not update the primary/replica status of the Redis Cluster nodes to match the actual nodes. I am deducing that `ConnectionMultiplexer.ReconfigureAsync` is being called from the "Endpoint Summary" logs that are produced. I have tested forcing `ConnectionMultiplexer.ReconfigureAsync` calls every minute, and that does not resolve the problem. Creating a new `ConnectionMultiplexer` and opening an entirely new connection does work, but it should not be required because it is expensive.
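For anyone hitting the same problem, the workaround mentioned above (creating a new `ConnectionMultiplexer`) could be sketched roughly as follows. This is a minimal sketch, not a fix: `RecreatingRedisConnection`, `IsStaleReplicaError` and `TryRecreate` are hypothetical names, not part of StackExchange.Redis, and matching on the exception message text is an assumption that may break if the library changes its wording.

```C#
using System;
using System.Threading;
using StackExchange.Redis;

// Hypothetical wrapper (not part of StackExchange.Redis) that replaces the
// multiplexer when the client-side "replica" rejection is observed.
public sealed class RecreatingRedisConnection : IDisposable
{
    private readonly string _configuration;
    private readonly TimeSpan _minRecreateInterval;
    private readonly object _gate = new object();
    private ConnectionMultiplexer _current;
    private DateTime _lastRecreateUtc = DateTime.MinValue;

    public RecreatingRedisConnection(string configuration, TimeSpan minRecreateInterval)
    {
        _configuration = configuration;
        _minRecreateInterval = minRecreateInterval;
        _current = ConnectionMultiplexer.Connect(configuration);
    }

    public IDatabase GetDatabase() => Volatile.Read(ref _current).GetDatabase();

    // Walks the exception chain, because the RedisCommandException arrives
    // wrapped in a RedisConnectionException (see the stack trace below).
    // Assumption: the message text is stable enough to match on.
    public static bool IsStaleReplicaError(Exception ex)
    {
        for (var e = ex; e != null; e = e.InnerException)
        {
            if (e is RedisCommandException &&
                e.Message.IndexOf("cannot be issued to a replica",
                    StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return true;
            }
        }
        return false;
    }

    // Opens a fresh connection and disposes the old one, at most once per
    // _minRecreateInterval, since reconnecting is expensive.
    public bool TryRecreate()
    {
        ConnectionMultiplexer old;
        lock (_gate)
        {
            if (DateTime.UtcNow - _lastRecreateUtc < _minRecreateInterval)
            {
                return false;
            }
            var fresh = ConnectionMultiplexer.Connect(_configuration);
            old = _current;
            Volatile.Write(ref _current, fresh);
            _lastRecreateUtc = DateTime.UtcNow;
        }
        old.Dispose(); // commands still in flight on the old connection will fail
        return true;
    }

    public void Dispose() => Volatile.Read(ref _current).Dispose();
}
```

A caller would catch exceptions from cache writes, check `IsStaleReplicaError(ex)`, and call `TryRecreate()` before retrying. The interval guard is there so that a burst of failing calls triggers only one reconnect.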

Stack trace:
```
2026-01-27 12:42:18,830 [6616/103] ERROR - Error setting value to Redis cache, key=<REDACTED_KEY>
StackExchange.Redis.RedisConnectionException: InternalFailure on [0]:SETEX <REDACTED_KEY> (BooleanProcessor) ---> StackExchange.Redis.RedisCommandException: Command cannot be issued to a replica: SETEX <REDACTED_KEY>
   at StackExchange.Redis.PhysicalBridge.WriteMessageToServerInsideWriteLock(PhysicalConnection connection, Message message) in /_/src/StackExchange.Redis/PhysicalBridge.cs:line 1573
   --- End of inner exception stack trace ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
```

## Example of reproducing the situation

This was reproduced in a test environment with the following specifications:

- StackExchange.Redis version: `2.9.32`
- Redis version: `7.4.6`
- .NET version: `.NET Framework 4.8.1`
- Redis Cluster setup: 3 primary nodes, 3 replica nodes

### Redis Cluster initial state

These endpoint summaries come from the `Trace` logs of StackExchange.Redis. I have redacted the server addresses.

```
2026-01-27 12:31:57,192 [6616/11] INFO - Endpoint Summary:
2026-01-27 12:31:57,207 [6616/11] INFO - Server summary: <NODE_A>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:31:57,207 [6616/11] INFO - Server summary: <NODE_B>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:31:57,207 [6616/11] INFO - Server summary: <NODE_C>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:31:57,207 [6616/11] INFO - Server summary: <NODE_D>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:31:57,207 [6616/11] INFO - Server summary: <NODE_E>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:31:57,207 [6616/11] INFO - Server summary: <NODE_F>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
```


### SHUTDOWN node NODE_C

```
2026-01-27 12:36:21,306 [6616/103] WARN - RedisConnection.OnConnectionFailed() failure=SocketClosed endPoint=IPEndPoint(<NODE_C>)
```

This was logged by the following custom code where `connection` is `ConnectionMultiplexer`:

```C#
connection.ConnectionFailed += (sender, args) =>
{
    var endPoint = EndPointToString(args.EndPoint);
    Loki.Warn($"RedisConnection.OnConnectionFailed() failure={args.FailureType} endPoint={endPoint}", args.Exception);
};
```

### StackExchange.Redis first attempt to reconfigure

Note that NODE_C is in the `Connecting` state:

```
2026-01-27 12:37:25,963 [6616/64] INFO - Endpoint Summary:
2026-01-27 12:37:25,963 [6616/79] INFO - Server summary: <NODE_A>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:37:25,963 [6616/79] INFO - Server summary: <NODE_B>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:37:25,963 [6616/79] INFO - Server summary: <NODE_C>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: Connecting; sub: n/a;
2026-01-27 12:37:25,963 [6616/79] INFO - Server summary: <NODE_D>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:37:25,963 [6616/79] INFO - Server summary: <NODE_E>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:37:25,963 [6616/79] INFO - Server summary: <NODE_F>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
```

`Command cannot be issued to a replica` exceptions had already started appearing at this stage:

```
2026-01-27 12:36:44,399 [6616/86] ERROR - Error setting value to Redis cache, key=<REDACTED_KEY>
StackExchange.Redis.RedisConnectionException: InternalFailure on [0]:SETEX <REDACTED_KEY> (BooleanProcessor) ---> StackExchange.Redis.RedisCommandException: Command cannot be issued to a replica: SETEX <REDACTED_KEY>
   at StackExchange.Redis.PhysicalBridge.WriteMessageToServerInsideWriteLock(PhysicalConnection connection, Message message) in /_/src/StackExchange.Redis/PhysicalBridge.cs:line 1573
   --- End of inner exception stack trace ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
```
   
### RESTART NODE_C

Note: If the node is restarted too quickly after being shut down, the problem might not reproduce, because the primary/replica status of the nodes does not change.

`2026-01-27 12:45:24,395 [6616/87] INFO - RedisConnection.OnConnectionRestored() endPoint=IPEndPoint(<NODE_C>)`

This was logged by the following custom code where `connection` is `ConnectionMultiplexer`:

```C#
connection.ConnectionRestored += (sender, args) =>
{
    var endPoint = EndPointToString(args.EndPoint);
    _log.Info($"RedisConnection.OnConnectionRestored() endPoint={endPoint}");
};
```

### RESULT

Notice that NODE_C has turned into a replica, but NODE_D also remains a replica. At this stage all connections are `ConnectedEstablished` and the Redis Cluster has returned to a fully operational state with 3 primary and 3 replica nodes, but StackExchange.Redis incorrectly thinks there are 2 primary and 4 replica nodes.

```
2026-01-27 12:45:30,239 [6616/64] INFO - Endpoint Summary:
2026-01-27 12:45:30,239 [6616/64] INFO - Server summary: <NODE_A>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:45:30,239 [6616/64] INFO - Server summary: <NODE_B>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:45:30,239 [6616/64] INFO - Server summary: <NODE_C>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:45:30,239 [6616/64] INFO - Server summary: <NODE_D>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:45:30,239 [6616/64] INFO - Server summary: <NODE_E>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:45:30,239 [6616/64] INFO - Server summary: <NODE_F>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
```

StackExchange.Redis does not recover from this state, which persists for an indefinite amount of time despite multiple reconfiguration attempts. Notice that the view of 2 primary and 4 replica nodes remains more than 6 minutes later:
```
2026-01-27 12:51:59,129 [6616/79] INFO - Endpoint Summary:
2026-01-27 12:51:59,129 [6616/79] INFO - Server summary: <NODE_A>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:51:59,129 [6616/79] INFO - Server summary: <NODE_B>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:51:59,129 [6616/79] INFO - Server summary: <NODE_C>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:51:59,129 [6616/79] INFO - Server summary: <NODE_D>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:51:59,129 [6616/79] INFO - Server summary: <NODE_E>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:51:59,129 [6616/79] INFO - Server summary: <NODE_F>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
```
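To confirm that the mismatch is on the client side, the role the client has cached for each endpoint can be compared with what the node reports about itself via `CLUSTER NODES`. The following is a rough diagnostic sketch: `ClusterRoleCheck`, `Report` and `Describe` are hypothetical names, and I have not verified whether calling `IServer.ClusterNodes()` itself refreshes any of the client's cached topology, so treat the output as a hint and not as proof.

```C#
using System;
using System.Linq;
using StackExchange.Redis;

// Hypothetical diagnostic helper (not part of StackExchange.Redis).
public static class ClusterRoleCheck
{
    // Logs one line per endpoint comparing the client's cached role
    // (IServer.IsReplica) with the node's own view from CLUSTER NODES.
    public static void Report(ConnectionMultiplexer muxer, Action<string> log)
    {
        foreach (var endPoint in muxer.GetEndPoints())
        {
            var server = muxer.GetServer(endPoint);
            if (!server.IsConnected)
            {
                log($"{endPoint}: not connected");
                continue;
            }

            // The entry flagged "myself" is how the node describes its own role.
            var self = server.ClusterNodes()?.Nodes.FirstOrDefault(n => n.IsMyself);
            if (self == null)
            {
                log($"{endPoint}: no cluster information returned");
                continue;
            }

            log(Describe(endPoint.ToString(), server.IsReplica, self.IsReplica));
        }
    }

    public static string Describe(string endPoint, bool clientSaysReplica, bool nodeSaysReplica)
    {
        var verdict = clientSaysReplica == nodeSaysReplica ? "match" : "MISMATCH";
        return $"{endPoint}: client={Role(clientSaysReplica)} node={Role(nodeSaysReplica)} {verdict}";
    }

    private static string Role(bool isReplica) => isReplica ? "replica" : "primary";
}
```

In the state shown above, the expectation would be a `MISMATCH` line for NODE_D, where the client still reports a replica while the node itself reports being a primary.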

### Correct primary/replica status only after restarting the application
Only after the connection is fully reset by an application pool recycle does StackExchange.Redis learn the correct primary/replica status, in which NODE_C has become a replica and NODE_D has become a primary:
```
2026-01-27 12:54:12,764 [9928/9] INFO - Endpoint Summary:
2026-01-27 12:54:12,764 [9928/9] INFO - Server summary: <NODE_A>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:54:12,764 [9928/9] INFO - Server summary: <NODE_B>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:54:12,764 [9928/9] INFO - Server summary: <NODE_C>: Cluster v7.4.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:54:12,764 [9928/9] INFO - Server summary: <NODE_D>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:54:12,764 [9928/9] INFO - Server summary: <NODE_E>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
2026-01-27 12:54:12,764 [9928/9] INFO - Server summary: <NODE_F>: Cluster v7.4.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished;
```

