fix: apply CGC immediately during network bootstrap phase #8697
qu0b wants to merge 1 commit into sigp:bal-devnet-2
pawanjay176
left a comment
I don't think this is safe.
`register_validators` is called post-genesis, so there could be an inconsistent state in which a block is received and the DA check passes for cgc=4 before `register_validators` is called.
So nobody has the full data for slot 1.
Can we not run as supernodes for kurtosis instead?
|
I don't quite understand what you're describing here. How can there be an inconsistent state with this PR? From your description it sounds like `register_validators` is called twice: once after genesis and once after receiving a block? |
|
How can the check pass for cgc=4 if the LH node has 128 validators? In the current implementation cgc=4 because of the delay, so nobody has the full data in an LH-only network. |
Because until the I think the proper fix would be allow |
Replace the `epoch <= 1 || is_before_peerdas` special-case with a simpler approach: start the VC preparation service before genesis wait so CGC registrations arrive at the BN before the first block.

Changes:
- Move preparation_service.start_update_service() before wait_for_genesis()
- Fix BN HTTP handler to use now_or_genesis() for the CGC slot read (chain.slot() returns Err pre-genesis, killing the registration path)
- In register_validators(), apply CGC at epoch 0 when slot == 0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
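To illustrate why the slot read matters in the second change, here is a minimal toy model of a slot clock. It is a sketch only: `SketchClock`, its fields, and the use of plain `u64` seconds and slots are assumptions made for illustration and are not Lighthouse's actual types or signatures. It shows a `slot()` read that fails before genesis, and a `now_or_genesis()` style fallback that clamps to slot 0 so a pre-genesis registration can still proceed.

```rust
/// Toy slot clock for illustration only (hypothetical, not Lighthouse's `SlotClock`).
struct SketchClock {
    /// Genesis time in seconds since some epoch.
    genesis_secs: u64,
    /// Slot duration in seconds; assumed non-zero.
    seconds_per_slot: u64,
}

impl SketchClock {
    /// Current slot, or an error if `now_secs` is before genesis.
    /// This mirrors the failure mode described in the commit message.
    fn slot(&self, now_secs: u64) -> Result<u64, &'static str> {
        if now_secs < self.genesis_secs {
            return Err("before genesis");
        }
        Ok((now_secs - self.genesis_secs) / self.seconds_per_slot)
    }

    /// Current slot, falling back to the genesis slot (0) before genesis.
    fn now_or_genesis(&self, now_secs: u64) -> u64 {
        self.slot(now_secs).unwrap_or(0)
    }
}

/// Hypothetical registration path: with the fallback, a pre-genesis
/// registration is attributed to slot 0 instead of being dropped.
fn registration_slot(clock: &SketchClock, now_secs: u64) -> u64 {
    clock.now_or_genesis(now_secs)
}
```

With a clock whose genesis is at 100 seconds and 12-second slots, `slot(50)` is an error while `registration_slot(&clock, 50)` yields slot 0, which is the case the third change then handles by applying CGC at epoch 0.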
1818723 to 83b000a
|
@pawanjay176 I let claude cook up something based on your suggestion |
Summary
When validators register during early network bootstrap (epoch 0-1) or before PeerDAS activates, apply the custody group count (CGC) immediately instead of with the standard 30-second delay.
Problem
The standard delay exists to give nodes time to subscribe to new subnets and avoid inconsistent column counts within an epoch. However, during network bootstrap, this delay causes issues:
CGC = spec.custody_requirement (4 custody groups)
This was observed in bal-devnet-2 testing, where pure Lighthouse networks without the `--supernode` flag would consistently split by epoch 4-5.
Solution
Apply CGC immediately when:
- `current_epoch <= 1` (bootstrap phase), OR
- `current_epoch < fulu_fork_epoch` (pre-PeerDAS)
For established networks (epoch 2+), the standard delay is preserved to ensure smooth subnet subscription coordination.
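The condition above can be sketched as a small predicate. This is a rough illustration rather than the code in `custody_context.rs`: the function name, the plain `u64` epochs, and the handling of an unscheduled Fulu fork (`None` treated as pre-PeerDAS) are all assumptions.

```rust
/// Hypothetical helper: returns true when the CGC should take effect
/// immediately rather than after the standard delay.
fn apply_cgc_immediately(current_epoch: u64, fulu_fork_epoch: Option<u64>) -> bool {
    // Bootstrap phase: epochs 0 and 1.
    let is_bootstrap = current_epoch <= 1;
    // Pre-PeerDAS: Fulu has not activated yet.
    // Assumption: an unscheduled fork (None) counts as pre-PeerDAS.
    let is_before_peerdas = fulu_fork_epoch.map_or(true, |fork| current_epoch < fork);
    is_bootstrap || is_before_peerdas
}
```

With `fulu_fork_epoch` set to 0, as in the test configuration, this returns true for epochs 0 and 1 and false from epoch 2 onward, where the standard delay applies.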
Test Results
Tested on local Kurtosis devnet with `preset: minimal`, `fulu_fork_epoch: 0`, `gloas_fork_epoch: 1`.
Changes
- `beacon_node/beacon_chain/src/custody_context.rs`: Modified `register_validators()` to check for bootstrap phase before applying delay
🤖 Generated with Claude Code