[Fix] Make `prepare_next_quorum_block` atomic #3070

kaimast · 2026-01-05T21:24:02Z

This will avoid certain race conditions when preparing a block template.

Concretely, in the current design the following is possible:

There is a thread t that calls Ledger::prepare_next_beacon_block which reads the value of current_block
There is another thread t' that calls Ledger::advance_to_next_block, which updates current_block` and the state root.
If t now continues execution and reads the state root, it will return a block and not an error, but the block will be invalid, i.e. the state root is incorrect.

This is problematic for two reasons:

Ledger implements Sync, which implies concurrent calls to its functions should be executed atomically and not read partial/inconsistent data.
Such invalid blocks are harder to catch on the snarkOS side, as they appear to have the correct height and inconsistencies in the state root will be detected only later when the block is executed.

While this PR will ensure calls to the Ledger are atomic (formally, the Ledger being linearizable), it does not ensure that multiple calls to these functions do not interleave in unintended ways (formally, serializable). The latter is implemented in snarkOS and out of the scope of snarkVM, while the former makes the implementation for in snarkOS a lot easier.

P.S.: There is this great blog post by Peter Bailis describing the two formal guarantees mentioned in the previous paragraph, in case you want to learn more about the formalism behind it.

Other minor changes

The PR also reduces log verbosity for sequential operations, as they add a decent amount of noise otherwise.

vicsn · 2026-01-06T14:55:55Z

Can you explain in the PR README with which test setup you saw what issue come up and how you validated it is resolved?

Can you try to run a network while calling GET /<network>/block/latest on repeat?

All existing calls to latest_block() in snarkVM and snarkOS seem to happen outside of the block generation path, so may incur some delay but no obvious deadlock.

ljedrz

It's not clear to me how this fixes things, as prepare_advance_to_next_quorum_block appears to only reuse the block obtained at the beginning.

If I'm not mistaken, the specific intent is to make it impossible for concurrent calls to try_advance_to_next_block to have a different previous_block; however, I'd argue that those calls should not happen concurrently at all, which this change doesn't prevent, as multiple concurrent readers are permitted.

kaimast · 2026-01-09T18:16:59Z

If I'm not mistaken, the specific intent is to make it impossible for concurrent calls to try_advance_to_next_block to have a different previous_block; however, I'd argue that those calls should not happen concurrently at all, which this change doesn't prevent, as multiple concurrent readers are permitted.

That is actually not the goal. I simply do not want any concurrent updates or reads to the ledger while the function is executing. I added some more comments to the code and will update the PR description in a second.
At a high-level my reasoning is that Ledger is Sync, which implies that it can be called to from concurrent threads, without ending up in a corrupted state, but it is possible to create invalid block templates if there are certain concurrent invocations in the current design.

ljedrz

Ah, I initially missed the broader context.

LGTM; a read guard should suffice to provide the desired guarantees, while not being too restrictive for any concurrent work. In the future we might want to reorganize the related logic so that mistakes are less likely to happen due to the structure itself, but it's a step in the right direction, and it's well-documented.

vicsn · 2026-01-21T19:54:39Z

Let's first finish ProvableHQ/snarkOS#4069 before merging

raychu86 · 2026-02-04T21:39:18Z

ledger/src/check_next_block.rs

Is there a deadlock condition here? latest_timestamp also tries to get the read lock. It should be handled the same way latest_epoch_hash is fetched.

latest_timestamp() is also used in Ledger::num_remaining_solutions, which may have a conflicting .read() as well.

kaimast force-pushed the fix/atomic_prepare_next_block branch from f2c2142 to bbb39e1 Compare January 5, 2026 22:42

kaimast mentioned this pull request Jan 5, 2026

[Fix] Pause BFT block advancement during sync ProvableHQ/snarkOS#4039

Draft

kaimast force-pushed the fix/atomic_prepare_next_block branch from bbb39e1 to 2a42f7e Compare January 6, 2026 00:41

kaimast marked this pull request as ready for review January 6, 2026 00:42

kaimast requested review from ljedrz and vicsn January 6, 2026 03:55

kaimast force-pushed the fix/atomic_prepare_next_block branch from 2a42f7e to 93e9807 Compare January 6, 2026 19:27

ljedrz reviewed Jan 7, 2026

View reviewed changes

ljedrz approved these changes Jan 12, 2026

View reviewed changes

kaimast mentioned this pull request Jan 21, 2026

[Feature] Add Ledger::try_get_* for transmissions #3117

Draft

kaimast force-pushed the fix/atomic_prepare_next_block branch from 0c6001d to 50d9138 Compare February 4, 2026 05:28

raychu86 reviewed Feb 4, 2026

View reviewed changes

kaimast and others added 6 commits February 9, 2026 09:38

fix(ledger): lock current_block during speculation

be33656

logs(synth/vm): reduce verbosity for sequential ops

e69c0d5

misc(ledger): fix typo

5d1d527

docs(ledger): improve documentation on Ledger atomicity

41f8e42

fix(ledger): directly fetch latest round and epoch hash

bf31d55

feat(ledger): return CheckBlockError for prepare_next_*_block

e12a4e4

kaimast force-pushed the fix/atomic_prepare_next_block branch from 991ff32 to e12a4e4 Compare February 9, 2026 17:49

kaimast mentioned this pull request Feb 9, 2026

[Fix] Update snarkVM to support atomic prepare_next_block ProvableHQ/snarkOS#4096

Draft

fix(ledger): prevent a potential deadlock in is_solution_limit_reached

9ba0df7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Fix] Make `prepare_next_quorum_block` atomic #3070

[Fix] Make `prepare_next_quorum_block` atomic #3070

kaimast commented Jan 5, 2026 •

edited

Loading

Uh oh!

vicsn commented Jan 6, 2026 •

edited

Loading

Uh oh!

ljedrz left a comment

Uh oh!

kaimast commented Jan 9, 2026

Uh oh!

ljedrz left a comment

Uh oh!

vicsn commented Jan 21, 2026

Uh oh!

raychu86 Feb 4, 2026

Uh oh!

raychu86 Feb 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[Fix] Make prepare_next_quorum_block atomic #3070

Are you sure you want to change the base?

[Fix] Make prepare_next_quorum_block atomic #3070

Conversation

kaimast commented Jan 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other minor changes

Uh oh!

vicsn commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ljedrz left a comment

Choose a reason for hiding this comment

Uh oh!

kaimast commented Jan 9, 2026

Uh oh!

ljedrz left a comment

Choose a reason for hiding this comment

Uh oh!

vicsn commented Jan 21, 2026

Uh oh!

raychu86 Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

raychu86 Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[Fix] Make `prepare_next_quorum_block` atomic #3070

[Fix] Make `prepare_next_quorum_block` atomic #3070

kaimast commented Jan 5, 2026 •

edited

Loading

vicsn commented Jan 6, 2026 •

edited

Loading