
Conversation


@pmantica11 pmantica11 commented Nov 20, 2025

Overview

In this PR, I fix multiple issues that were causing deshredding to fail:

  1. I replaced the try_send with send to ensure that shreds are reliably delivered to the reconstructor thread. (Without this, a lot of shreds never reached the reconstructor thread; see the sketch after this list.)
  2. I removed the "unknown" start hack that was causing segments not to be deshredded. Solana data segments are always clearly delineated, so the hack is unnecessary; it produced false, incomplete half-segments, which in turn prevented other half-segments from being deshredded.
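
A minimal sketch of the behavioral difference behind fix 1, assuming a crossbeam-channel style API (the proxy's actual channel type, element type, and buffer size may differ):

```rust
use crossbeam_channel::{bounded, unbounded, TrySendError};

fn main() {
    // try_send on a small bounded channel: batches are silently dropped as
    // soon as the buffer fills, which is how shreds can go missing.
    let (tx, rx) = bounded::<u32>(2);
    let mut dropped = 0;
    for batch in 0..5 {
        if let Err(TrySendError::Full(_)) = tx.try_send(batch) {
            dropped += 1;
        }
    }
    assert_eq!(dropped, 3); // only 2 of 5 batches fit in the buffer
    drop(rx);

    // send on an unbounded channel: nothing is dropped (it only errors if the
    // receiver is gone), at the cost of unbounded memory under backpressure.
    let (tx, rx) = unbounded::<u32>();
    for batch in 0..5 {
        tx.send(batch).expect("receiver still alive");
    }
    assert_eq!(rx.try_iter().count(), 5);
}
```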

Tests

Verified locally that there were no missing transactions across more than 1000 slots after applying the two fixes above.

@buffalu buffalu requested a review from esemeniuc December 2, 2025 21:58

```diff
     );
 
     if should_reconstruct_shreds {
-        let _ = reconstruct_tx.try_send(packet_batch.clone());
+        let _ = reconstruct_tx.send(packet_batch.clone());
```

@esemeniuc esemeniuc Dec 3, 2025

Can we switch this back to try_send? We don't want to OOM due to an unbounded linked list. Let's increase the buffer size instead.
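
A sketch of the direction this comment points in, assuming a crossbeam-channel bounded sender; the capacity constant, drop counter, and function name are illustrative rather than values from the proxy:

```rust
use crossbeam_channel::{bounded, Sender, TrySendError};
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical capacity; tune it to the observed shred throughput.
const RECONSTRUCT_CHANNEL_CAPACITY: usize = 4096;

static RECONSTRUCT_DROPS: AtomicU64 = AtomicU64::new(0);

/// Forward a packet batch to the reconstructor without blocking the receive
/// path; count (rather than silently ignore) drops so an undersized buffer
/// shows up in metrics instead of as missing transactions.
fn forward_to_reconstructor<T>(tx: &Sender<T>, batch: T) {
    match tx.try_send(batch) {
        Ok(()) => {}
        Err(TrySendError::Full(_)) => {
            RECONSTRUCT_DROPS.fetch_add(1, Ordering::Relaxed);
        }
        Err(TrySendError::Disconnected(_)) => {
            // Reconstructor thread has exited; nothing left to do.
        }
    }
}

fn main() {
    let (tx, _rx) = bounded::<Vec<u8>>(RECONSTRUCT_CHANNEL_CAPACITY);
    forward_to_reconstructor(&tx, vec![0u8; 1232]);
    println!("drops so far: {}", RECONSTRUCT_DROPS.load(Ordering::Relaxed));
}
```

This keeps the non-blocking behavior of try_send while making a too-small buffer visible as a counter rather than as silently missing shreds.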

```rust
) -> usize {
    deshredded_entries.clear();
    slot_fec_indexes_to_iterate.clear();
    let mut data_fec_indexes_to_reconstruct = Vec::new();
```
Collaborator

Let's re-use the vec and clear it instead of allocating a new one each time.
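
A minimal sketch of the re-use pattern being asked for; the function name, parameters, and element types are hypothetical stand-ins for the real signature in deshred.rs:

```rust
// The caller owns all three buffers and passes them in on every call, so the
// backing allocations are kept across calls instead of reallocated per batch.
fn reconstruct_entries(
    deshredded_entries: &mut Vec<Vec<u8>>,
    slot_fec_indexes_to_iterate: &mut Vec<(u64, u32)>,
    data_fec_indexes_to_reconstruct: &mut Vec<(u64, u32)>,
) -> usize {
    deshredded_entries.clear();
    slot_fec_indexes_to_iterate.clear();
    // clear() instead of `Vec::new()`: length goes to zero but capacity stays.
    data_fec_indexes_to_reconstruct.clear();

    // ... reconstruction logic elided ...
    deshredded_entries.len()
}

fn main() {
    let mut entries = Vec::new();
    let mut slot_fec = Vec::new();
    let mut to_reconstruct = Vec::new();
    let n = reconstruct_entries(&mut entries, &mut slot_fec, &mut to_reconstruct);
    assert_eq!(n, 0);
}
```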

```rust
// failed to reconstruct the entries from the index due to never finding the start data boundary.
// To deal with this possible out-of-order scenario, we retry rebuilding the entries for the next
// fec index whenever we encounter a data complete boundary.
if let Some(next_slot) = &state_tracker.data_shreds[index + 1] {
```
Collaborator

I'm a bit confused by this logic: state_tracker.data_shreds only tracks shreds for a single slot, so how does this give you next-slot information?

Collaborator

Also, we only ever push to this vec when shreds come out of order. If they always arrive in order, we never push to it, and line 186 (`for (slot, fec_set_index) in data_fec_indexes_to_reconstruct.iter() {`) wouldn't iterate over anything.

@esemeniuc

Please run the test suite; the new changes decode fewer transactions than before:

```
test deshred::tests::test_reconstruct_live_data_complete_shred ... FAILED

failures:

---- deshred::tests::test_recover_shreds stdout ----
thread 'deshred::tests::test_recover_shreds' panicked at proxy/src/deshred.rs:1019:9:
assertion `left == right` failed
  left: 0
 right: 200
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- deshred::tests::test_reconstruct_live_shreds stdout ----
thread 'deshred::tests::test_reconstruct_live_shreds' panicked at proxy/src/deshred.rs:699:9:
assertion `left == right` failed
  left: 9543
 right: 13580

---- deshred::tests::test_reconstruct_live_data_complete_shred stdout ----
thread 'deshred::tests::test_reconstruct_live_data_complete_shred' panicked at proxy/src/deshred.rs:876:9:
assertion `left == right` failed
  left: 36220
 right: 43170


failures:
    deshred::tests::test_reconstruct_live_data_complete_shred
    deshred::tests::test_reconstruct_live_shreds
    deshred::tests::test_recover_shreds

test result: FAILED. 14 passed; 3 failed; 0 ignored; 0 measured; 0 filtered out; finished in 7.26s

error: test failed, to rerun pass `-p jito-shredstream-proxy --bin jito-shredstream-proxy`
```
