-
Notifications
You must be signed in to change notification settings - Fork 3.9k
[pipeline] introduce decryption pipeline stage #18313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e48b887 to
e22d1fe
Compare
| to_truncate | ||
| } else { | ||
| encrypted_txns | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Encrypted transactions beyond 10 are silently dropped
The code truncates encrypted transactions to a maximum of 10, and any transactions beyond this limit are permanently lost. After partition separates encrypted from unencrypted transactions, truncation discards excess encrypted transactions. These discarded transactions never appear in decrypted_txns (due to truncation) nor in unencrypted_txns (because they were encrypted), resulting in silent data loss. The TODO comment suggests this is known but shipping code that drops transactions is a critical issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not yet "shipping" this code.
|
|
||
| let (encrypted_txns, unencrypted_txns): (Vec<_>, Vec<_>) = input_txns | ||
| .into_iter() | ||
| .partition(|txn| txn.is_encrypted_txn()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Transaction ordering changed by partition and concat
The partition operation separates transactions into encrypted and unencrypted groups, then concat rejoins them with all decrypted transactions before unencrypted ones. This reorders transactions from their original sequence. For example, [unenc1, enc1, unenc2, enc2] becomes [dec1, dec2, unenc1, unenc2]. In blockchain systems, transaction ordering is critical for deterministic execution and consensus. This reordering would cause nodes to disagree on execution results.
Additional Locations (1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good observation, this is expected reordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So are we prioritizing encrypted txns? Why not keep the original order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is to make sure encrypted txns are not cut due to block gas limit overflow. They will be revealed but not executed.
| }; | ||
| let Entry::Occupied(o) = entry else { | ||
| unreachable!("Entry must exist"); | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Coordinator may panic on vacant HashMap entry
In make_coordinator, when processing rand_ready_block or secret_ready_block, the code calls entry(first_block_id).and_modify(...) which does nothing if the key doesn't exist, returning an Entry::Vacant. The subsequent check let Entry::Occupied(o) = entry else { unreachable!(...) } will panic if the entry is vacant. This could occur if a ready signal arrives for a block that was never registered or was already processed and removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is impossible to happen, because hashmap insert happens when sending blocks to the rand and secret managers.
| max_txns_from_block_to_execute, | ||
| block_gas_limit, | ||
| )); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Early return drops channel without sending signal
When secret_share_config is Some but there are no encrypted transactions, the function returns early without sending anything on derived_self_key_share_tx. This causes the receiver (derived_self_key_share_rx) to receive a channel error when the sender is dropped. The channel type is Sender<Option<SecretShare>>, suggesting None should be sent to indicate "no key share needed" rather than dropping the sender and causing an error in secret_sharing_derive_self_fut.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO
ca3df8b to
1f1d14f
Compare
e22d1fe to
5b069d0
Compare
|
|
||
| let (encrypted_txns, unencrypted_txns): (Vec<_>, Vec<_>) = input_txns | ||
| .into_iter() | ||
| .partition(|txn| txn.is_encrypted_txn()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So are we prioritizing encrypted txns? Why not keep the original order?
| }) | ||
| .collect(); | ||
|
|
||
| let encryption_round = block.round(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one optimization to save the number of trusted setup: the encryption_round can be decoupled from block round, and increment only when a block contains encrypted txns. Then the trusted setup size is number of encrypted txn blocks instead of blocks. Not sure how important it is for now, but just a note for potential optimizations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting, I will add this as TODO. We can revisit this when Rex is back.
| ))) | ||
| .unwrap(); | ||
|
|
||
| // TODO(ibalajiarun): improve perf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you plan to break these task in separate phases later? or how do you plan to improve perf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to split into phases here because we only implement slow path now. The digest and eval computation should be done in 3 hops before ordering finishes and decryption share exchange phase finishes. I think we may need to use separate thread pool to compute eval proofs. That's the perf part here mostly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense
| .zip(txn_ciphertexts) | ||
| .map(|(mut txn, ciphertext)| { | ||
| let eval_proof = proofs.get(&ciphertext.id()).unwrap().into(); | ||
| if let Ok(payload) = FPTX::decrypt_individual::<DecryptedPayload>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the performance compared with batch decryption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation is essentially the same as batch decryption. I just moved the par_iter outside because it give me individual Results.
| author: Author, | ||
| epoch: u64, | ||
| validator: Arc<ValidatorVerifier>, | ||
| // wconfig: WeightedConfig, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo here?
5b069d0 to
028fde6
Compare
|
|
||
| let maybe_decryption_key = secret_shared_key_rx.await.unwrap(); | ||
| // TODO(ibalajiarun): account for the case where decryption key is not available | ||
| let decryption_key = maybe_decryption_key.expect("decryption key should be available"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing decryption key causes unhandled panic
When decryption is enabled and the secret share manager fails to produce a key, the code will panic. Line 101 calls .unwrap() on the channel receive which panics if the sender is dropped. Line 103 calls .expect() on Option<SecretSharedKey> which panics if the key is None. The TODO comment explicitly acknowledges this case isn't handled. If the secret sharing protocol fails for any reason, the node will crash rather than gracefully handling the failure.
JoshLind
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 😄 Unblocking.
028fde6 to
9fb40fd
Compare
| let _ = ready_block_tx.send(ordered_blocks).await; | ||
| } | ||
| } | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coordinator task lacks shutdown mechanism causing panic
High Severity
The make_coordinator function spawns a tokio task with an infinite loop using select! on three channels, but provides no shutdown mechanism. When end_epoch is called, it sends stop signals to the rand_manager and secret_share_manager (closing their channels), but the coordinator has no reset channel. When all three channels close and return None, the select! macro has no matching branches and no else clause, causing the task to panic. The coordinator needs a reset/shutdown channel similar to the managers.
Additional Locations (1)
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
418d620 to
2bb0172
Compare
2bb0172 to
9f2fab9
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
✅ Forge suite
|
✅ Forge suite
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
✅ Forge suite
|

Description
Introduces a pipeline phase that communicates with
secret_share_managerfrom #18223 to retrieve a shared secret key for the block and attempts to decrypts the transactions in the block. The pipeline stage returns early if theOption<SecretShareConfig>is None, which will control if the decryption feature is enabled or not.The pipeline stage consolidates multiple secret share derivation steps into the same stage because it keeps the implementation simpler and cleaner. Perf improvements will be made in a future PR, but should be easily done using
tokio::spawn_blockingwithout introducing additional stages.