Description
Follow-up improvements for the beacon head monitor service introduced in #7892. The current implementation is a solid MVP, but several enhancements were identified during review that would improve resilience and robustness.
Improvements
1. Per-stream error handling and retry
Currently, if a beacon node is restarted after the VC starts, the event stream for that node is never re-established until the VC itself restarts. The service only restarts when all streams fail.
The current error handling returns from the entire function on any stream error:
lighthouse/validator_client/beacon_node_fallback/src/beacon_head_monitor.rs
Lines 173 to 178 in 0e9dab3
        return Err("Head monitoring service channel closed".into());
    }
}
Ok(event) => {
    warn!(
        event_kind = event.topic_name(),
A more resilient approach would handle errors per-stream independently and retry failed streams, so a flaky BN doesn't disrupt monitoring of healthy BNs.
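One way to picture per-stream handling is a small piece of bookkeeping that tracks failures for each BN separately and produces a reconnect delay for the failing stream only. The sketch below is a minimal illustration under assumptions, not the Lighthouse implementation: `StreamSupervisor`, `StreamStatus`, `on_error`, and `on_reconnected` are hypothetical names, and the doubling backoff with a cap is one possible retry policy among several.

```rust
use std::time::Duration;

/// Hypothetical per-stream state; not a Lighthouse type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamStatus {
    /// The event stream is believed to be healthy.
    Connected,
    /// The stream failed `failures` times in a row and is waiting to reconnect.
    Backoff { failures: u32 },
}

/// Hypothetical bookkeeping that tracks each BN stream independently.
pub struct StreamSupervisor {
    statuses: Vec<StreamStatus>,
    base: Duration,
    max: Duration,
}

impl StreamSupervisor {
    /// All streams start as `Connected`. `base` is the first retry delay and
    /// `max` caps the doubling backoff.
    pub fn new(num_nodes: usize, base: Duration, max: Duration) -> Self {
        Self {
            statuses: vec![StreamStatus::Connected; num_nodes],
            base,
            max,
        }
    }

    /// Record an error on one stream and return how long to wait before
    /// reconnecting it. Other streams are left untouched.
    /// Panics if `node` is out of range.
    pub fn on_error(&mut self, node: usize) -> Duration {
        let failures = match self.statuses[node] {
            StreamStatus::Connected => 1,
            StreamStatus::Backoff { failures } => failures.saturating_add(1),
        };
        self.statuses[node] = StreamStatus::Backoff { failures };
        // Delay doubles per consecutive failure: base, 2*base, 4*base, ... up to max.
        let exponent = (failures - 1).min(16);
        self.base.saturating_mul(2u32.pow(exponent)).min(self.max)
    }

    /// Mark a stream as healthy again, which resets its backoff.
    pub fn on_reconnected(&mut self, node: usize) {
        self.statuses[node] = StreamStatus::Connected;
    }

    /// Current status of a single stream.
    pub fn status(&self, node: usize) -> StreamStatus {
        self.statuses[node]
    }
}
```

In a real service, each stream task would sleep for the returned delay and then resubscribe, returning an error from the whole function only when the shared channel is closed.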
2. Handle dynamic beacon node list updates
When update_candidates_list is called via API with a new BN list, stale SSE connections are maintained and the index mapping can become outdated. For example, if connected to nodes [A, B, C] and the list changes to [B, C, D], head events from B would still report index 1 (its original position) even though B is now at index 0 in the candidate list.
Options discussed:
- Trigger service restart when candidate list changes
- Key cache on endpoint/ID rather than index
The cache purge mechanism exists but only runs on service restart:
lighthouse/validator_client/beacon_node_fallback/src/beacon_head_monitor.rs
Lines 97 to 98 in 0e9dab3
};
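To illustrate the second option (keying the cache on endpoint rather than index), here is a minimal sketch under assumptions. `EndpointKeyedHeadCache` and its methods are hypothetical and do not reflect the existing `BeaconHeadCache` API; endpoints are represented as plain strings, and the cached value is reduced to a slot number for brevity. The idea is that the index is resolved against the current candidate list at read time, and stale entries are purged whenever the list changes rather than only on service restart.

```rust
use std::collections::HashMap;

/// Hypothetical cache keyed by endpoint identifier instead of list position.
#[derive(Debug, Default)]
pub struct EndpointKeyedHeadCache {
    latest_slot: HashMap<String, u64>,
}

impl EndpointKeyedHeadCache {
    /// Record a head event for an endpoint, keeping the highest slot seen.
    pub fn record_head(&mut self, endpoint: &str, slot: u64) {
        let entry = self
            .latest_slot
            .entry(endpoint.to_string())
            .or_insert(slot);
        *entry = (*entry).max(slot);
    }

    /// Drop entries for endpoints that are no longer in the candidate list.
    /// Intended to run whenever the list is updated.
    pub fn retain_candidates(&mut self, candidates: &[String]) {
        self.latest_slot
            .retain(|endpoint, _| candidates.iter().any(|c| c == endpoint));
    }

    /// Latest slot recorded for an endpoint, if any.
    pub fn latest_slot(&self, endpoint: &str) -> Option<u64> {
        self.latest_slot.get(endpoint).copied()
    }

    /// Resolve an endpoint to its position in the *current* candidate list,
    /// so a reordered list never yields a stale index.
    pub fn current_index(candidates: &[String], endpoint: &str) -> Option<usize> {
        candidates.iter().position(|c| c == endpoint)
    }
}
```

With the example from above, a head event recorded for B while the list is [A, B, C] resolves to index 1, and after the list changes to [B, C, D] the same entry resolves to index 0 while the entry for A is purged.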
3. Unit tests
Add unit tests for:
- BeaconHeadCache methods
- first_success_or_fallback function (similar to existing first_success_should_try_nodes_in_order test)
Additional Info
- Discussion: Use events API to eager send attestations #7892
- Related to parallel BN handling: Consider flag for making VC first_success parallel #7908