Fix connection contention issue in http_utils.py - reuse ClientSession #591

@cgillum

Summary

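The Python Durable Functions SDK creates a new aiohttp.ClientSession for every HTTP request to the internal RPC endpoint. Because each session owns its own connection pool, no connection is ever reused and every request pays for a fresh TCP handshake. This is an anti-pattern that can cause connection contention and timeouts under concurrent load.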

Problem

In http_utils.py, the post_async_request function creates a new session for each request:

```python
async def post_async_request(url: str, data: Any = None, ...) -> List[Union[int, Any]]:
    async with aiohttp.ClientSession() as session:  # New session per request
        # ...
```

The aiohttp documentation explicitly warns against this pattern:

"Don't create a session per request. Most likely you need a session per application which performs all requests together."

Impact

During a production investigation (ICM 695094479), we observed intermittent ConnectionTimeoutError (~30s) when calling client.start_new() to start orchestrations. The error occurs in aiohttp/connector.py:_wrap_create_connection, indicating TCP connection establishment failures.

Under concurrent load (bursts of 6-9 simultaneous requests), multiple requests compete to establish new TCP connections instead of reusing pooled connections from a shared session.

Proposed Fix

Modify http_utils.py to reuse a single ClientSession with configurable timeout and connection pooling.
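A minimal sketch of what that could look like, not the actual patch. The names `_session`, `get_session`, and `post_with_shared_session`, the parameter names, and the default values are all hypothetical; the real `post_async_request` body is not shown in this issue, so the request and response handling here is an assumption.

```python
import asyncio
from typing import Any, List, Optional, Union

import aiohttp

# Hypothetical module-level cache; not part of the current http_utils.py.
_session: Optional[aiohttp.ClientSession] = None


def get_session(total_timeout: float = 300.0,
                connect_timeout: float = 30.0,
                pool_limit: int = 100) -> aiohttp.ClientSession:
    """Return the shared session, creating it on first use.

    Call this from code running inside the event loop. There is no await
    between the check and the assignment, so coroutines on a single loop
    cannot interleave here.
    """
    global _session
    if _session is None or _session.closed:
        _session = aiohttp.ClientSession(
            timeout=aiohttp.ClientTimeout(total=total_timeout,
                                          sock_connect=connect_timeout),
            connector=aiohttp.TCPConnector(limit=pool_limit),
        )
    return _session


async def post_with_shared_session(url: str, data: Any = None) -> List[Union[int, Any]]:
    """POST JSON through the shared session and return [status, body text]."""
    session = get_session()
    # Only the response is scoped to this call; the session and its pooled
    # connections stay open for the next request.
    async with session.post(url, json=data) as response:
        return [response.status, await response.text()]
```

With this shape, a burst of concurrent calls draws from one connection pool instead of each call opening and tearing down its own connector.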

Considerations

  1. Thread safety - May need to use an async lock when initializing the session
  2. Session lifecycle - Need to handle session cleanup on worker shutdown
  3. Connection limits - The TCPConnector limits should be tuned appropriately
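The three considerations above can be sketched together as a small holder class. This is an illustration under assumptions, not the SDK's design: `SharedSession` and its parameters are hypothetical, it assumes all callers run on one event loop, and the limit values are placeholders rather than tuned numbers. Where `close()` would be invoked on worker shutdown is left open, since the issue does not identify a shutdown hook.

```python
import asyncio
from typing import Optional

import aiohttp


class SharedSession:
    """Hypothetical holder for one ClientSession per event loop."""

    def __init__(self, max_connections: int = 100, max_per_host: int = 0) -> None:
        self._max_connections = max_connections
        self._max_per_host = max_per_host  # 0 means no per-host cap in aiohttp
        self._session: Optional[aiohttp.ClientSession] = None
        # On Python < 3.10, construct this object inside the running loop
        # so the lock binds to the right loop.
        self._lock = asyncio.Lock()

    async def get(self) -> aiohttp.ClientSession:
        # Fast path: session already exists and is usable.
        if self._session is not None and not self._session.closed:
            return self._session
        async with self._lock:
            # Re-check: another coroutine may have created it while we waited.
            if self._session is None or self._session.closed:
                connector = aiohttp.TCPConnector(
                    limit=self._max_connections,
                    limit_per_host=self._max_per_host,
                )
                self._session = aiohttp.ClientSession(connector=connector)
            return self._session

    async def close(self) -> None:
        """Release pooled connections; safe to call more than once."""
        async with self._lock:
            if self._session is not None and not self._session.closed:
                await self._session.close()
            self._session = None
```

The lock matters mainly if initialization ever awaits between the check and the assignment; a purely synchronous check-and-create on a single loop does not race. An `asyncio.Lock` does not protect against access from multiple threads or multiple event loops, which would need a separate session per loop.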

Additional Context

  • The 30s timeout matches aiohttp's default sock_connect timeout
  • Similar issues have been reported in the aiohttp community

Labels: P1 (Priority 1)
