
High latency & Blocking behavior in RequestExecutor (Connection Pooling + time.sleep) #89

@eduards-vavere

Description


I am currently using okta-jwt-verifier in a FastAPI (AsyncIO) environment. We have observed a consistent latency baseline of ~250ms per token verification request, along with occasional event-loop starvation and timeouts under load.

Upon investigating the source code, I found three design patterns in RequestExecutor that seem to cause these performance issues in async environments.

1. Lack of Connection Pooling (Latency)

In request_executor.py, the fire_request method creates a new aiohttp session (wrapped in AsyncCacheControl) inside a context manager for every single call.

https://github.com/okta/okta-jwt-verifier-python/blob/master/src/okta_jwt_verifier/request_executor.py#L29

# src/okta_jwt_verifier/request_executor.py
async def fire_request(self, uri, **params):
    # New session created for every call -> No Keep-Alive
    async with AsyncCacheControl(cache=self.cache) as cached_sess:
        async with cached_sess.get(uri, **params) as resp:
            # ...

This forces a full TCP/TLS handshake (~200ms+) for every verification attempt, effectively disabling HTTP Keep-Alive.
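
For reference, a persistent session along these lines avoids the per-call handshake. This is only a rough sketch of suggestion 1 below; PooledRequestExecutor and its methods are placeholder names, not the library's actual API.

# Sketch only: one long-lived aiohttp session shared across requests,
# so the TCP/TLS handshake happens once and connections are kept alive.
import aiohttp

class PooledRequestExecutor:
    def __init__(self, session: aiohttp.ClientSession | None = None):
        # Optionally accept an externally managed session (injection)
        self._session = session

    async def _get_session(self) -> aiohttp.ClientSession:
        if self._session is None or self._session.closed:
            # Created lazily on first use, then reused for every call
            self._session = aiohttp.ClientSession()
        return self._session

    async def fire_request(self, uri, **params):
        session = await self._get_session()
        async with session.get(uri, **params) as resp:
            return await resp.json()

    async def close(self):
        if self._session is not None and not self._session.closed:
            await self._session.close()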

2. Strict Adherence to no-cache Headers (Latency)

The library uses AsyncCacheControl to respect upstream HTTP headers. We noticed that Okta's keys endpoints often return headers like Cache-Control: no-cache or must-revalidate.

Because the library strictly follows these headers, it ignores the local cache and triggers a network call (re-validation) for every verification request. Combined with issue #1 (no connection pooling), this results in significant latency per request.
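
To make suggestion 3 below concrete, a small TTL cache roughly like this would serve the keys from memory for a fixed window and skip the revalidation round-trip entirely. Again a sketch only: TTLKeyCache and cache_ttl are hypothetical names, and the session argument would be the persistent session from the previous sketch.

# Sketch only: keep fetched keys for a fixed TTL and ignore upstream
# Cache-Control headers; the network is hit only when an entry expires.
import time
import aiohttp

class TTLKeyCache:
    def __init__(self, cache_ttl: float = 3600.0):
        self._cache_ttl = cache_ttl
        self._entries = {}  # uri -> (fetched_at, payload)

    async def get(self, uri: str, session: aiohttp.ClientSession, **params) -> dict:
        entry = self._entries.get(uri)
        if entry is not None and time.monotonic() - entry[0] < self._cache_ttl:
            return entry[1]  # still fresh: no network call at all
        async with session.get(uri, **params) as resp:
            resp.raise_for_status()
            payload = await resp.json()
        self._entries[uri] = (time.monotonic(), payload)
        return payload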

3. Blocking time.sleep in Retry Logic (Starvation)

I also noticed that the get method handles rate-limiting wait times using time.sleep(0.1) inside an async method:

https://github.com/okta/okta-jwt-verifier-python/blob/master/src/okta_jwt_verifier/request_executor.py#L46

# src/okta_jwt_verifier/request_executor.py
while self.requests_count >= self.max_requests:
    time.sleep(0.1) # <--- Synchronous Sleep

In an asyncio environment (such as FastAPI or aiohttp), time.sleep blocks the entire event loop, freezing the server for all concurrent requests, not just the current task.
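
The non-blocking variant that suggestion 2 below asks about would look roughly like this inside the same method (with import asyncio added at module level):

# Suspends only this coroutine and yields control back to the event loop,
# so other requests keep being served while we wait for a free slot.
import asyncio

while self.requests_count >= self.max_requests:
    await asyncio.sleep(0.1)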

Suggestions / Questions

  1. Connection Pooling: Would it be possible to refactor RequestExecutor to persist self.session across calls, or allow injecting an external aiohttp.ClientSession?
  2. Async Sleep: Should the retry logic use await asyncio.sleep(0.1) instead of time.sleep(0.1) to avoid blocking the main loop?
  3. TTL Option: Would you be open to adding a cache_ttl option to allow opting in to "eventual consistency" (e.g. caching keys for 1 hour) regardless of upstream no-cache headers?

Environment:

  • Python 3.11
  • FastAPI / Uvicorn
  • okta-jwt-verifier version: 0.3.0
