-
Notifications
You must be signed in to change notification settings - Fork 1.5k
[Fix] Make prepare_next_quorum_block atomic
#3070
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: staging
Are you sure you want to change the base?
Conversation
f2c2142 to
bbb39e1
Compare
bbb39e1 to
2a42f7e
Compare
|
Can you explain in the PR README with which test setup you saw what issue come up and how you validated it is resolved? Can you try to run a network while calling All existing calls to |
2a42f7e to
93e9807
Compare
ljedrz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me how this fixes things, as prepare_advance_to_next_quorum_block appears to only reuse the block obtained at the beginning.
If I'm not mistaken, the specific intent is to make it impossible for concurrent calls to try_advance_to_next_block to have a different previous_block; however, I'd argue that those calls should not happen concurrently at all, which this change doesn't prevent, as multiple concurrent readers are permitted.
That is actually not the goal. I simply do not want any concurrent updates or reads to the ledger while the function is executing. I added some more comments to the code and will update the PR description in a second. |
ljedrz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I initially missed the broader context.
LGTM; a read guard should suffice to provide the desired guarantees, while not being too restrictive for any concurrent work. In the future we might want to reorganize the related logic so that mistakes are less likely to happen due to the structure itself, but it's a step in the right direction, and it's well-documented.
|
Let's first finish ProvableHQ/snarkOS#4069 before merging |
0c6001d to
50d9138
Compare
ledger/src/check_next_block.rs
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a deadlock condition here? latest_timestamp also tries to get the read lock. It should be handled the same way latest_epoch_hash is fetched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
latest_timestamp() is also used in Ledger::num_remaining_solutions, which may have a conflicting .read() as well.
991ff32 to
e12a4e4
Compare
This will avoid certain race conditions when preparing a block template.
Concretely, in the current design the following is possible:
Ledger::prepare_next_beacon_blockwhich reads the value ofcurrent_blockLedger::advance_to_next_block, which updatescurrent_block` and the state root.This is problematic for two reasons:
LedgerimplementsSync, which implies concurrent calls to its functions should be executed atomically and not read partial/inconsistent data.While this PR will ensure calls to the
Ledgerare atomic (formally, the Ledger being linearizable), it does not ensure that multiple calls to these functions do not interleave in unintended ways (formally, serializable). The latter is implemented in snarkOS and out of the scope of snarkVM, while the former makes the implementation for in snarkOS a lot easier.P.S.: There is this great blog post by Peter Bailis describing the two formal guarantees mentioned in the previous paragraph, in case you want to learn more about the formalism behind it.
Other minor changes
The PR also reduces log verbosity for sequential operations, as they add a decent amount of noise otherwise.