Skip to content

Comments

Added differential signer state sync#668

Open
ihordiachenko wants to merge 17 commits intomainfrom
feature/diff-signer-state-sync
Open

Added differential signer state sync#668
ihordiachenko wants to merge 17 commits intomainfrom
feature/diff-signer-state-sync

Conversation

@ihordiachenko
Copy link
Collaborator

Implements #745

Key design considerations

  • First heartbeat sends the full state; subsequent heartbeats send diffs only
  • Merge conflicts fall back to a full re-sync
  • Tombstone logic added to handle deletions explicitly
  • Protobuf schemas remain unchanged to preserve backward compatibility
  • Bandwidth savings will be measured in live deployment using trace logs

Copy link
Collaborator

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the first couple of commits, with a fine-toothed comb seeing that this is your first contribution. Don't worry we won't be as pedantic going forward :-)

I'll finish the review first thing tomorrow, but thought you might need the feedback early, so here's an incomplete review :-)

Copy link
Collaborator

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice PR, I like it a lot. I think we might have had a small misunderstanding with the tombstone, in that I had in mind using the last valid version number to block any future updates on it, and sending that back and forth intrinsically making updates fail.

Also I'm not sure the conflicts can even happen, and what I think you were marking as conflict, is actually a normal thing to happen, i.e., a No-op following an old cached value being returned.

#[derive(Debug, Default)]
pub struct SafeMergeResult {
pub changes: Vec<(String, Option<u64>, u64)>,
pub conflict_count: usize,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you are counting overwrites with older versions as conflict, when really its not. The reason is that we may have multiple requests in flight, request A changes field A, and request B changes field B, if the signer B handles request B it'll return the A with the same version as before, which you would consider a conflict, but it is just stale, and merging will just pick the newer value. The staging takes care of ensuring that state is updated correctly, please copy those semantics where possible.

trace!("hsmd hsm_id={} request processor started", hsm_id);
let mut last_sent_sketch = StateSketch::new();
let mut last_seen_sync_epoch = state_sync_epoch.load(Ordering::SeqCst);
enum SyncMode {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need a full sync mode at all? As mentioned before the conflicts you identified are in fact no-ops, and therefore we don't end up falling back to full sync. The Diff mode will still do a full sync when the connection is first established simply because its sketch for what the signer already has is empty, so all kvv tuples are new from its PoV.

@ihordiachenko ihordiachenko changed the title Draft: Added differential signer state sync Added differential signer state sync Feb 18, 2026
Copy link
Collaborator

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent PR, thank you @ihordiachenko 🙏

@ihordiachenko ihordiachenko marked this pull request as ready for review February 24, 2026 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants