Add built-in connection health monitoring and automatic reconnection support #448

@maxkoretskyi

Proposed changes

Add built-in support for:

  1. Connection health monitoring

    • Application-level heartbeat that detects silent disconnects (e.g., network path failures, NAT timeouts, server crashes), where no WebSocket close frame is received and the connection appears open but is actually dead
  2. Automatic reconnection (optional), with configurable:

    • Max retry attempts
    • Backoff strategy (exponential, linear)
    • Retry delays
  3. KeepAlive auto-send

    • Optional automatic KeepAlive message sending at configurable intervals (default 8s for Agent, 10s for STT)
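
To make the reconnection options concrete, here is a minimal sketch of how the retry delay could be derived from them. The names `ReconnectOptions`, `BackoffStrategy`, and `retryDelay` are hypothetical and do not exist in the SDK; the cap `maxDelayMs` is an extra assumption not listed above.

```typescript
// Hypothetical option names for illustration; not part of the SDK today.
type BackoffStrategy = "exponential" | "linear";

interface ReconnectOptions {
  maxRetries: number; // give up after this many attempts
  strategy: BackoffStrategy;
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // assumed upper bound on any single delay
}

// Returns the delay before retry number `attempt` (1-based),
// or null once the retry budget is exhausted.
function retryDelay(attempt: number, opts: ReconnectOptions): number | null {
  if (attempt < 1 || attempt > opts.maxRetries) return null;
  const raw =
    opts.strategy === "exponential"
      ? opts.baseDelayMs * Math.pow(2, attempt - 1) // base, 2x, 4x, ...
      : opts.baseDelayMs * attempt; // base, 2x, 3x, ...
  return Math.min(raw, opts.maxDelayMs);
}
```

A caller would schedule the next connection attempt after `retryDelay(n, opts)` milliseconds and stop retrying when it returns `null`. Production code would probably also add random jitter so that many clients do not reconnect in lockstep.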

Context

Currently, the SDK provides low-level primitives (disconnect(), reconnect(), keepAlive()) but requires developers to implement their own:

  • KeepAlive interval timers
  • Silent disconnect detection (no ping/pong or heartbeat mechanism)
  • Reconnection logic with backoff
  • Timestamp offset tracking across reconnections

This is error-prone and leads to duplicated boilerplate across applications. Long-running streaming sessions are particularly vulnerable to silent network failures that go undetected for 30-90+ seconds.
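
As an example of the boilerplate involved, timestamp offset tracking might look roughly like the sketch below. It assumes that timestamps restart at zero on every new connection and, as a simplification, uses the end of the last received result as the offset; tracking the amount of audio actually sent would be more accurate. `TimestampOffsetTracker` is a hypothetical name, not an SDK class.

```typescript
// Hypothetical helper: maps connection-relative timestamps (seconds)
// onto a single session-wide timeline across reconnections.
class TimestampOffsetTracker {
  private offsetSec = 0; // added to every timestamp on the current connection
  private lastEndSec = 0; // latest session-relative end time seen so far

  // Convert a connection-relative [start, end] pair to session-relative.
  adjust(startSec: number, endSec: number): { start: number; end: number } {
    const start = startSec + this.offsetSec;
    const end = endSec + this.offsetSec;
    this.lastEndSec = Math.max(this.lastEndSec, end);
    return { start, end };
  }

  // Call once after each successful reconnect, before new results arrive.
  onReconnect(): void {
    this.offsetSec = this.lastEndSec;
  }
}
```

Every application with long-running sessions currently has to write and test something like this itself.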

Possible Implementation

  • Reference implementation: the Socket.IO ping/pong mechanism. Socket.IO sends periodic ping packets and expects a pong response within a timeout. If no pong arrives in time, the connection is considered dead and is closed. This is a well-established pattern for detecting silent disconnects.
  • The official docs recommend these patterns in "Recovering From Connection Errors"
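
A minimal sketch of the ping/pong bookkeeping, independent of any transport, could look like this. `HeartbeatMonitor` and its methods are hypothetical names, not Socket.IO or SDK APIs; the caller is assumed to own the timers and the actual ping messages, and passes the current time in so the logic stays easy to test.

```typescript
// Hypothetical helper: tracks whether a pong is overdue.
class HeartbeatMonitor {
  private readonly pongTimeoutMs: number;
  private awaitingPongSince: number | null = null;

  constructor(pongTimeoutMs: number) {
    this.pongTimeoutMs = pongTimeoutMs;
  }

  // Record that a ping went out; keeps the oldest unanswered ping time.
  pingSent(nowMs: number): void {
    if (this.awaitingPongSince === null) this.awaitingPongSince = nowMs;
  }

  // Any pong clears the outstanding ping.
  pongReceived(): void {
    this.awaitingPongSince = null;
  }

  // True when a ping has gone unanswered for longer than the timeout.
  isDead(nowMs: number): boolean {
    return (
      this.awaitingPongSince !== null &&
      nowMs - this.awaitingPongSince > this.pongTimeoutMs
    );
  }
}
```

In use, an interval timer would call `pingSent(Date.now())` when sending each ping and check `isDead(Date.now())` on the next tick; when it returns `true`, the client would close the socket and hand over to the reconnection logic.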
