fix: Resilient Redis cache reads for rate limiting#3455
fix: Resilient Redis cache reads for rate limiting#3455gabrielshanahan wants to merge 2 commits intomainfrom
Conversation
📝 WalkthroughWalkthroughAdds a new Spring component ResilientCacheAccessor to safely wrap Cache.get calls (catches RedisException, logs, evicts key, returns null). Integrates it into RateLimitService. Global rate-limit filters now skip requests whose path equals or starts with /actuator. Tests added/updated. Changes
Sequence DiagramsequenceDiagram
participant Client
participant RateLimitFilter
participant RateLimitService
participant ResilientCacheAccessor
participant Cache
participant Redis
Client->>RateLimitFilter: HTTP request
RateLimitFilter->>RateLimitService: consume/check bucket
RateLimitService->>ResilientCacheAccessor: get(cache, key, Bucket.class)
ResilientCacheAccessor->>Cache: cache.get(key, type)
Cache->>Redis: fetch entry
alt Redis deserialization succeeds
Redis-->>Cache: value
Cache-->>ResilientCacheAccessor: Bucket
ResilientCacheAccessor-->>RateLimitService: Bucket
else RedisException occurs
Redis-->>Cache: throws RedisException
ResilientCacheAccessor->>ResilientCacheAccessor: log warnings
ResilientCacheAccessor->>Cache: cache.evictIfPresent(key)
ResilientCacheAccessor-->>RateLimitService: null
end
RateLimitService->>RateLimitService: create fresh bucket if null
RateLimitService-->>RateLimitFilter: allow/reject
RateLimitFilter-->>Client: response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
26d5b9f to
af106c8
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@backend/security/src/main/kotlin/io/tolgee/security/ratelimit/GlobalUserRateLimitFilter.kt`:
- Around line 53-55: In GlobalUserRateLimitFilter.shouldNotFilter, the current
check request.requestURI.startsWith("/actuator/") misses the bare "/actuator"
path; update the condition to cover both forms (e.g. replace
startsWith("/actuator/") with request.requestURI == "/actuator" ||
request.requestURI.startsWith("/actuator/") or simply startsWith("/actuator"))
so the filter skips both "/actuator" and any subpaths.
backend/security/src/main/kotlin/io/tolgee/security/ratelimit/GlobalUserRateLimitFilter.kt
Show resolved
Hide resolved
af106c8 to
065aa6c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@backend/security/src/main/kotlin/io/tolgee/security/ratelimit/GlobalIpRateLimitFilter.kt`:
- Around line 55-57: In shouldNotFilter, strip the servlet context path from
request.requestURI before performing the actuator checks so endpoints under a
context path still match; use request.contextPath (or request.getContextPath())
to remove the prefix, then compare the resulting path for equals "/actuator",
startsWith "/actuator/", or method == "OPTIONS" to decide skipping; update the
logic in GlobalIpRateLimitFilter.shouldNotFilter to operate on the trimmed URI.
🧹 Nitpick comments (3)
backend/security/src/test/kotlin/io/tolgee/security/ratelimit/GlobalUserRateLimitFilterTest.kt (1)
94-120: Also assert the rate-limit call is skipped.These tests only assert “no exception”; adding a
never()verify ensures actuator requests truly bypass the rate-limiter.🧪 Suggested assertion
assertDoesNotThrow { rateLimitFilter.doFilter(req, res, chain) } + Mockito.verify(rateLimitService, Mockito.never()) + .consumeGlobalUserRateLimitPolicy(any(), any())backend/security/src/test/kotlin/io/tolgee/security/ratelimit/GlobalIpRateLimitFilterTest.kt (1)
89-115: Also assert the rate-limit call is skipped.These tests pass even if the rate-limit call still occurs; consider verifying it’s never invoked to lock in the bypass behavior.
🧪 Suggested assertion
assertDoesNotThrow { rateLimitFilter.doFilter(req, res, chain) } + Mockito.verify(rateLimitService, Mockito.never()) + .consumeGlobalIpRateLimitPolicy(any())backend/data/src/test/kotlin/io/tolgee/security/ratelimit/RateLimitServiceTest.kt (1)
340-364: Strengthen the corrupted-cache test by asserting the accessor call.Right now it only checks that no exception is thrown; verifying the accessor interaction ensures the intended path is exercised.
🧪 Suggested assertion
serviceWithMockAccessor.consumeBucket(testPolicy) + Mockito.verify(mockAccessor) + .get(any<Cache>(), eq("corrupted_test"), eq(Bucket::class.java))
backend/security/src/main/kotlin/io/tolgee/security/ratelimit/GlobalIpRateLimitFilter.kt
Outdated
Show resolved
Hide resolved
065aa6c to
0752976
Compare
|
Can we remove the |
This reverts commit 7808583.
Health checks (actuator endpoints) go through Spring Security's FilterChainProxy on both port 8080 and 8081. When cached Bucket objects can't be deserialized (e.g. after schema changes), rate limit filters throw RedisException, causing health probes to fail and Kubernetes to restart pods in a crash loop. Two defense layers: 1. Exclude /actuator/ from rate limit filters — prevents crash loops even with corrupted cache entries 2. ResilientCacheAccessor — catches RedisException on direct cache.get() calls (not covered by TolgeeCacheErrorHandler which only handles @Cacheable), evicts bad entries, and returns null (cache miss) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
0752976 to
30f4c3e
Compare
Summary
/actuator/health) no longer go throughGlobalIpRateLimitFilter/GlobalUserRateLimitFilter, preventing crash loops when Redis cache entries are corruptedResilientCacheAccessorcomponent — catchesRedisExceptionon directcache.get()calls (not covered byTolgeeCacheErrorHandlerwhich only handles@Cacheable), evicts bad entries, returnsnull(cache miss)RateLimitService— corrupted Bucket entries are treated as cache misses instead of 500 errorsRoot cause (v3.154.0 crash loop)
Bucketclass gainedstrikeCount/lastStrikeAtfields → old Kryo-serialized entries in Redis couldn't deserialize →KryoBufferUnderflowException→ health checks hit rate limit filter → HTTP 500 → K8s restarts pod → crash loop.Defense in depth
/actuator/**RedisException, evicts, returns nullTest plan
ResilientCacheAccessorTest— 5 tests (normal get, null return, RedisException caught + eviction, non-Redis propagation)RateLimitServiceTest— 15 tests including newcorrupted cache entry is treated as cache missserver-app:runStandardTests)rateLimitsentry, verify graceful handling🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests