Head monitor service resilience improvements #8741

@jimmygchen

Description

Follow-up improvements for the beacon head monitor service introduced in #7892. The current implementation is a solid MVP, but several enhancements were identified during review that would improve resilience and robustness.

Improvements

1. Per-stream error handling and retry

Currently, if a beacon node (BN) is restarted after the validator client (VC) starts, the event stream for that node is never re-established until the VC itself restarts. The service only restarts when all streams fail.

The current error handling returns from the entire function on any stream error:

            return Err("Head monitoring service channel closed".into());
        }
    }
    Ok(event) => {
        warn!(
            event_kind = event.topic_name(),

A more resilient approach would handle errors per-stream independently and retry failed streams, so a flaky BN doesn't disrupt monitoring of healthy BNs.
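One way to picture per-stream retry is a small supervisor loop per BN. The sketch below is not the service's actual code: `StreamExit`, `supervise_stream`, and `next_backoff` are hypothetical names, and `run_once` stands in for "connect to one BN and drain its event stream". It only illustrates separating a fatal condition (the channel closing) from a per-stream failure that is retried with capped exponential backoff.

```rust
use std::time::Duration;

/// Hypothetical outcome of one attempt to run a single BN's event stream.
#[derive(Debug)]
enum StreamExit {
    /// The receiver of head events is gone, so this stream should stop for good.
    ChannelClosed,
    /// The stream to this BN failed; only this stream should be retried.
    StreamError(String),
}

/// Doubles the delay, capped at `max`.
fn next_backoff(current: Duration, max: Duration) -> Duration {
    std::cmp::min(current.saturating_mul(2), max)
}

/// Runs one stream until the channel closes, retrying after stream errors.
/// `sleep` is injected so the loop can be exercised without real waiting.
/// Returns the number of retries performed.
fn supervise_stream<F, S>(mut run_once: F, mut sleep: S, initial: Duration, max: Duration) -> u32
where
    F: FnMut() -> StreamExit,
    S: FnMut(Duration),
{
    let mut backoff = initial;
    let mut retries = 0;
    loop {
        match run_once() {
            // Fatal for this task: nobody is listening any more.
            StreamExit::ChannelClosed => return retries,
            // Non-fatal: wait, then reconnect to the same BN.
            StreamExit::StreamError(_reason) => {
                sleep(backoff);
                backoff = next_backoff(backoff, max);
                retries += 1;
            }
        }
    }
}
```

With one such loop spawned per BN, a failure on one node would delay only that node's reconnection. A fuller version would probably also reset the backoff after a stream has been healthy for some time.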

2. Handle dynamic beacon node list updates

When update_candidates_list is called via API with a new BN list, stale SSE connections are maintained and the index mapping can become outdated. For example, if connected to nodes [A, B, C] and the list changes to [B, C, D], head events from B would still report index 1 (its original position) even though B is now at index 0 in the candidate list.

Options discussed:

  • Trigger service restart when candidate list changes
  • Key cache on endpoint/ID rather than index
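As a rough illustration of the second option, the sketch below keys cached heads by endpoint and resolves the index only at read time, against whatever the candidate list currently is. All names here (`HeadInfo`, `EndpointKeyedHeadCache`, and its methods) are hypothetical and do not reflect the actual BeaconHeadCache API; endpoints are plain strings for simplicity.

```rust
use std::collections::HashMap;

/// Hypothetical head record; the real cache's fields may differ.
#[derive(Debug, Clone, PartialEq)]
struct HeadInfo {
    slot: u64,
    block_root: String,
}

/// Hypothetical cache keyed by endpoint instead of list position.
#[derive(Default)]
struct EndpointKeyedHeadCache {
    heads: HashMap<String, HeadInfo>,
}

impl EndpointKeyedHeadCache {
    /// Records the latest head seen from `endpoint`.
    fn insert(&mut self, endpoint: &str, head: HeadInfo) {
        self.heads.insert(endpoint.to_string(), head);
    }

    /// Drops entries for endpoints that are no longer candidates.
    fn retain_candidates(&mut self, candidates: &[String]) {
        self.heads.retain(|endpoint, _| candidates.contains(endpoint));
    }

    /// Pairs each cached head with its index in the current candidate list.
    fn heads_by_current_index(&self, candidates: &[String]) -> Vec<(usize, HeadInfo)> {
        candidates
            .iter()
            .enumerate()
            .filter_map(|(i, ep)| self.heads.get(ep).map(|h| (i, h.clone())))
            .collect()
    }
}
```

In the [A, B, C] to [B, C, D] example, a head cached for B would be reported at index 0 after the update, and the entry for A would be dropped by `retain_candidates` without needing a service restart.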

The cache purge mechanism exists but only runs on service restart.

3. Unit tests

Add unit tests for:

  • BeaconHeadCache methods
  • first_success_or_fallback function (similar to existing first_success_should_try_nodes_in_order test)

Metadata

Labels

enhancement (New feature or request)
val-client (Relates to the validator client binary)
