
High latency & Blocking behavior in RequestExecutor (Connection Pooling + time.sleep) #89

@eduards-vavere

Description


I am currently using okta-jwt-verifier in a FastAPI (AsyncIO) environment. We have observed a consistent latency baseline of ~250ms per token verification request, along with occasional event-loop starvation and timeouts under load.

Upon investigating the source code, I found three design patterns in RequestExecutor that seem to cause these performance issues in async environments.

1. Lack of Connection Pooling (Latency)

In request_executor.py, the fire_request method creates a new aiohttp session (wrapped in AsyncCacheControl) inside a context manager for every single call.

https://github.com/okta/okta-jwt-verifier-python/blob/master/src/okta_jwt_verifier/request_executor.py#L29

# src/okta_jwt_verifier/request_executor.py
async def fire_request(self, uri, **params):
    # New session created for every call -> No Keep-Alive
    async with AsyncCacheControl(cache=self.cache) as cached_sess:
        async with cached_sess.get(uri, **params) as resp:
            # ...

This forces a full TCP/TLS handshake (~200ms+) for every verification attempt, effectively disabling HTTP Keep-Alive.
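
For reference, a persistent session along these lines avoids the per-call handshake. This is only a rough sketch of suggestion 1 below; PooledRequestExecutor and its methods are placeholder names, not the library's actual API.

# Sketch only: one long-lived aiohttp session shared across requests,
# so the TCP/TLS handshake happens once and connections are kept alive.
import aiohttp

class PooledRequestExecutor:
    def __init__(self, session: aiohttp.ClientSession | None = None):
        # Optionally accept an externally managed session (injection)
        self._session = session

    async def _get_session(self) -> aiohttp.ClientSession:
        if self._session is None or self._session.closed:
            # Created lazily on first use, then reused for every call
            self._session = aiohttp.ClientSession()
        return self._session

    async def fire_request(self, uri, **params):
        session = await self._get_session()
        async with session.get(uri, **params) as resp:
            return await resp.json()

    async def close(self):
        if self._session is not None and not self._session.closed:
            await self._session.close()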

2. Strict Adherence to no-cache Headers (Latency)

The library uses AsyncCacheControl to respect upstream HTTP headers. We noticed that Okta's keys endpoints often return headers like Cache-Control: no-cache or must-revalidate.

Because the library strictly follows these headers, it ignores the local cache and triggers a network call (re-validation) for every verification request. Combined with issue #1 (no connection pooling), this results in significant latency per request.
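
To make suggestion 3 below concrete, a small TTL cache roughly like this would serve the keys from memory for a fixed window and skip the revalidation round-trip entirely. Again a sketch only: TTLKeyCache and cache_ttl are hypothetical names, and the session argument would be the persistent session from the previous sketch.

# Sketch only: keep fetched keys for a fixed TTL and ignore upstream
# Cache-Control headers; the network is hit only when an entry expires.
import time
import aiohttp

class TTLKeyCache:
    def __init__(self, cache_ttl: float = 3600.0):
        self._cache_ttl = cache_ttl
        self._entries = {}  # uri -> (fetched_at, payload)

    async def get(self, uri: str, session: aiohttp.ClientSession, **params) -> dict:
        entry = self._entries.get(uri)
        if entry is not None and time.monotonic() - entry[0] < self._cache_ttl:
            return entry[1]  # still fresh: no network call at all
        async with session.get(uri, **params) as resp:
            resp.raise_for_status()
            payload = await resp.json()
        self._entries[uri] = (time.monotonic(), payload)
        return payload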

3. Blocking time.sleep in Retry Logic (Starvation)

I also noticed that the get method handles rate-limiting wait times using time.sleep(0.1) inside an async method:

https://github.com/okta/okta-jwt-verifier-python/blob/master/src/okta_jwt_verifier/request_executor.py#L46

# src/okta_jwt_verifier/request_executor.py
while self.requests_count >= self.max_requests:
    time.sleep(0.1) # <--- Synchronous Sleep

In an asyncio environment (such as FastAPI or aiohttp), time.sleep blocks the entire event loop, freezing the server for all concurrent requests, not just the current task.
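
The non-blocking variant that suggestion 2 below asks about would look roughly like this inside the same method (with import asyncio added at module level):

# Suspends only this coroutine and yields control back to the event loop,
# so other requests keep being served while we wait for a free slot.
import asyncio

while self.requests_count >= self.max_requests:
    await asyncio.sleep(0.1)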

Suggestions / Questions

  1. Connection Pooling: Would it be possible to refactor RequestExecutor to persist self.session across calls, or allow injecting an external aiohttp.ClientSession?
  2. Async Sleep: Should the retry logic use await asyncio.sleep(0.1) instead of time.sleep(0.1) to avoid blocking the main loop?
  3. TTL Option: Would you be open to adding a cache_ttl option to allow opting in to "eventual consistency" (e.g. caching keys for 1 hour) regardless of upstream no-cache headers?

Environment:

  • Python 3.11
  • FastAPI / Uvicorn
  • okta-jwt-verifier version: 0.3.0
