[qs] Introduce BatchV2 #18085

ibalajiarun · 2025-11-11T20:26:49Z

Description

This PR introduces support for BatchV2 in QuorumStore, enabling batches to carry additional metadata (such as batch kind) while maintaining full backward compatibility with BatchV1. The implementation uses BatchInfoExt as a unified type that can represent both V1 and V2 batches, and adds new network message types to handle V2 batches. Sending BatchV2 is configured via a local Quorum Store Config flag.

BatchMsg and related QS messages are used to handle BatchInfo or BatchInfoExt::V1 batches normally. BatchMsgV2 and other V2 messages verify that BatchInfoExt::V2 is used. This is to simplify handling of different message types on the receiver.

ibalajiarun · 2025-11-11T20:27:05Z

JoshLind

Looks reasonable to me 😄

consensus/src/quorum_store/proof_coordinator.rs

cursor

This is the final PR Bugbot will review for you during this billing cycle

Your free Bugbot reviews will reset on December 7

Details

Your team is on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle for each member of your team.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

cursor · 2025-12-02T00:32:45Z

consensus/src/quorum_store/quorum_store_builder.rs

+                    let batch: Batch<BatchInfoExt> = value.try_into().unwrap();
+                    let batch: Batch<BatchInfo> = batch
+                        .try_into()
+                        .expect("Batch retieval requests must be for V1 batch");


Bug: Batch retrieval panics when serving V2 batches

The batch retrieval handler unconditionally attempts to convert Batch<BatchInfoExt> to Batch<BatchInfo> via try_into().expect(), which panics for V2 batches. The batch store now contains both V1 and V2 batches (as PersistedValue<BatchInfoExt>), and a BatchResponse::BatchV2 variant exists for V2 responses. However, when a retrieval request arrives for a V2 batch's digest, the code panics instead of either returning BatchV2 or gracefully handling the incompatibility. This could crash nodes when enable_batch_v2 is enabled and another node requests a V2 batch via the legacy retrieval mechanism.

cursor · 2025-12-02T01:11:10Z

consensus/src/quorum_store/batch_store.rs

+                if !batch_info.is_v2() {
+                    self.generate_signed_batch_info(batch_info.info().clone())
+                        .ok()
+                        .map(|inner| inner.into())


Bug: Signature verification fails due to type conversion mismatch

For V1 batches, generate_signed_batch_info(batch_info.info().clone()) creates a signature over BatchInfo, then .map(|inner| inner.into()) converts the result to SignedBatchInfo<BatchInfoExt> by wrapping the info in BatchInfoExt::V1. However, the signature remains unchanged. When verification occurs via optimistic_verify(self.signer, &self.info, &self.signature), it hashes BatchInfoExt::V1{info: BatchInfo} which has a different BCS serialization than BatchInfo (enum discriminator byte is prepended). This causes signature verification to fail because the signed data doesn't match the verified data. The PR author acknowledged this issue in the discussion.

Additional Locations (1)

consensus/consensus-types/src/proof_of_store.rs#L486-L499

github-actions · 2025-12-02T01:40:11Z

✅ Forge suite `compat` success on `92c0534410a26cafa4509b2dfda12c86811abf0a` ==> `0c49bf4e08275a47c56ed52473bf15632fe46c0d`

Compatibility test results for 92c0534410a26cafa4509b2dfda12c86811abf0a ==> 0c49bf4e08275a47c56ed52473bf15632fe46c0d (PR)
1. Check liveness of validators at old version: 92c0534410a26cafa4509b2dfda12c86811abf0a
compatibility::simple-validator-upgrade::liveness-check : committed: 13395.68 txn/s, latency: 2586.87 ms, (p50: 2700 ms, p70: 2800, p90: 3300 ms, p99: 4200 ms), latency samples: 442300
2. Upgrading first Validator to new version: 0c49bf4e08275a47c56ed52473bf15632fe46c0d
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 5959.69 txn/s, latency: 5707.30 ms, (p50: 6300 ms, p70: 6400, p90: 6500 ms, p99: 6600 ms), latency samples: 203700
3. Upgrading rest of first batch to new version: 0c49bf4e08275a47c56ed52473bf15632fe46c0d
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6000.22 txn/s, latency: 5646.27 ms, (p50: 6200 ms, p70: 6300, p90: 6400 ms, p99: 6500 ms), latency samples: 209780
4. upgrading second batch to new version: 0c49bf4e08275a47c56ed52473bf15632fe46c0d
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 10449.33 txn/s, latency: 3114.16 ms, (p50: 3300 ms, p70: 3400, p90: 3500 ms, p99: 3700 ms), latency samples: 343500
5. check swarm health
Compatibility test for 92c0534410a26cafa4509b2dfda12c86811abf0a ==> 0c49bf4e08275a47c56ed52473bf15632fe46c0d passed
Test Ok

github-actions · 2025-12-02T01:41:36Z

✅ Forge suite `framework_upgrade` success on `92c0534410a26cafa4509b2dfda12c86811abf0a` ==> `0c49bf4e08275a47c56ed52473bf15632fe46c0d`

Compatibility test results for 92c0534410a26cafa4509b2dfda12c86811abf0a ==> 0c49bf4e08275a47c56ed52473bf15632fe46c0d (PR)
Upgrade the nodes to version: 0c49bf4e08275a47c56ed52473bf15632fe46c0d
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2171.19 txn/s, submitted: 2179.28 txn/s, failed submission: 8.09 txn/s, expired: 8.09 txn/s, latency: 1323.71 ms, (p50: 1200 ms, p70: 1500, p90: 1800 ms, p99: 3000 ms), latency samples: 198660
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2200.93 txn/s, submitted: 2207.99 txn/s, failed submission: 7.06 txn/s, expired: 7.06 txn/s, latency: 1320.10 ms, (p50: 1200 ms, p70: 1500, p90: 1800 ms, p99: 2700 ms), latency samples: 199560
5. check swarm health
Compatibility test for 92c0534410a26cafa4509b2dfda12c86811abf0a ==> 0c49bf4e08275a47c56ed52473bf15632fe46c0d passed
Upgrade the remaining nodes to version: 0c49bf4e08275a47c56ed52473bf15632fe46c0d
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2244.73 txn/s, submitted: 2252.11 txn/s, failed submission: 7.38 txn/s, expired: 7.38 txn/s, latency: 1325.48 ms, (p50: 1200 ms, p70: 1500, p90: 1600 ms, p99: 2400 ms), latency samples: 200781
Test Ok

github-actions · 2025-12-02T01:42:18Z

✅ Forge suite `realistic_env_max_load` success on `0c49bf4e08275a47c56ed52473bf15632fe46c0d`

Forge report malformed: Expecting ',' delimiter: line 7 column 1 (char 136)
'{\n  "metrics": [\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "submitted_txn",\n      "value": 5088960.0\n[2025-12-02T01:42:15Z INFO  aptos_forge::report] Test Ok\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "expired_txn",\n      "value": 0.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "avg_tps",\n      "value": 13668.686596280544\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "avg_latency",\n      "value": 2756.7169472740993\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p50_latency",\n      "value": 2700.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p90_latency",\n      "value": 3000.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p99_latency",\n      "value": 3600.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "submitted_txn",\n      "value": 42680.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "expired_txn",\n      "value": 0.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "avg_tps",\n      "value": 100.02417686183487\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "avg_latency",\n      "value": 733.6994047619048\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p50_latency",\n      "value": 700.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p90_latency",\n      "value": 900.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p99_latency",\n      "value": 1000.0\n    }\n  ],\n  "text": "two traffics test: inner traffic : committed: 13668.69 txn/s, latency: 2756.72 ms, (p50: 2700 ms, p70: 2900, p90: 3000 ms, p99: 3600 ms), latency samples: 5088960\\ntwo traffics test : committed: 100.02 txn/s, latency: 733.70 ms, (p50: 700 ms, p70: 800, p90: 900 ms, p99: 1000 ms), latency samples: 1680\\nLatency breakdown for phase 0: [\\"MempoolToBlockCreation: max: 2.263, avg: 2.177\\", \\"ConsensusProposalToOrdered: max: 0.169, avg: 0.166\\", \\"ConsensusOrderedToCommit: max: 0.058, avg: 0.052\\", \\"ConsensusProposalToCommit: max: 0.224, avg: 0.218\\"]\\nMax non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.54s no progress at version 14521 (avg 0.07s) [limit 15].\\nMax epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.30s no progress at version 2410741 (avg 0.30s) [limit 16].\\nTest Ok"\n}'
Trailing Log Lines:
networkchaos.chaos-mesh.org "4-gcp--as-southeast1-to-3-gcp--us-east4-netem" deleted
test CompositeNetworkTest ... ok
Test Statistics: 
two traffics test: inner traffic : committed: 13668.69 txn/s, latency: 2756.72 ms, (p50: 2700 ms, p70: 2900, p90: 3000 ms, p99: 3600 ms), latency samples: 5088960
two traffics test : committed: 100.02 txn/s, latency: 733.70 ms, (p50: 700 ms, p70: 800, p90: 900 ms, p99: 1000 ms), latency samples: 1680
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 2.263, avg: 2.177", "ConsensusProposalToOrdered: max: 0.169, avg: 0.166", "ConsensusOrderedToCommit: max: 0.058, avg: 0.052", "ConsensusProposalToCommit: max: 0.224, avg: 0.218"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.54s no progress at version 14521 (avg 0.07s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.30s no progress at version 2410741 (avg 0.30s) [limit 16].
Test Ok

=== BEGIN JUNIT ===
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="forge" tests="1" failures="0" errors="0" uuid="41dda6b1-0acb-493d-bc6c-7d3c51c5c1cf">
    <testsuite name="local" tests="1" disabled="0" errors="0" failures="0">
        <testcase name="CompositeNetworkTest(network:multi-region-network-emulation(two traffics test)) with ">
        </testcase>
    </testsuite>
</testsuites>
=== END JUNIT ===
[2025-12-02T01:42:15Z INFO  aptos_forge::backend::k8s::cluster_helper] Deleting namespace forge-e2e-pr-18085: Some(NamespaceStatus { conditions: None, phase: Some("Terminating") })
[2025-12-02T01:42:15Z INFO  aptos_forge::backend::k8s::cluster_helper] aptos-node resources for Forge removed in namespace: forge-e2e-pr-18085

test result: ok. 1 passed; 0 soft failed; 0 hard failed; 0 filtered out

Debugging output:
NAME                                         READY   STATUS      RESTARTS   AGE
aptos-node-0-fullnode-eforge5c1f69d2-0       1/1     Running     0          13m
aptos-node-0-validator-0                     1/1     Running     0          13m
aptos-node-1-fullnode-eforge5c1f69d2-0       1/1     Running     0          13m
aptos-node-1-validator-0                     1/1     Running     0          13m
aptos-node-2-fullnode-eforge5c1f69d2-0       1/1     Running     0          13m
aptos-node-2-validator-0                     1/1     Running     0          13m
aptos-node-3-fullnode-eforge5c1f69d2-0       1/1     Running     0          13m
aptos-node-3-validator-0                     1/1     Running     0          13m
aptos-node-4-fullnode-eforge5c1f69d2-0       1/1     Running     0          13m
aptos-node-4-validator-0                     1/1     Running     0          13m
aptos-node-5-validator-0                     1/1     Running     0          13m
aptos-node-6-validator-0                     1/1     Running     0          13m
forge-testnet-deployer-58c6w                 0/1     Completed   0          14m
genesis-aptos-genesis-eforge5c1f69d2-dhf4l   0/1     Completed   0          13m

…#18085)

This was referenced Nov 11, 2025

[types] Introduce new payload for encrypted mempool #17919

Merged

[qs] Make ProofOfStore and SignedBatchInfo generic over BatchInfo #18062

Merged

This was referenced Nov 11, 2025

[qs] introduce BatchInfoExt and BatchSignatureAggregator enums to support extensible BatchInfo #18084

Merged

[api] local config flag to gate enc txn submission and batch validation #18086

Merged

[consensus] Introduce OptQS::V2 Payload #18087

Merged

ibalajiarun force-pushed the balaji/enc-pool-3 branch from 4e6435b to e1a91cd Compare November 11, 2025 20:49

ibalajiarun force-pushed the balaji/enc-pool-4 branch from d3049aa to 52f3018 Compare November 11, 2025 20:49

ibalajiarun changed the base branch from balaji/enc-pool-3 to graphite-base/18085 November 11, 2025 21:37

ibalajiarun force-pushed the graphite-base/18085 branch from e1a91cd to c07419d Compare November 12, 2025 00:20

ibalajiarun force-pushed the balaji/enc-pool-4 branch from 52f3018 to 251c641 Compare November 12, 2025 00:20

ibalajiarun changed the base branch from graphite-base/18085 to balaji/enc-pool-3 November 12, 2025 00:20

ibalajiarun force-pushed the balaji/enc-pool-3 branch from c07419d to eec8298 Compare November 13, 2025 01:26

ibalajiarun force-pushed the balaji/enc-pool-4 branch from 251c641 to a094eaf Compare November 13, 2025 01:26

ibalajiarun force-pushed the balaji/enc-pool-3 branch from eec8298 to ea4f27e Compare November 14, 2025 01:56

ibalajiarun force-pushed the balaji/enc-pool-4 branch 2 times, most recently from f2dbced to 9244123 Compare November 14, 2025 02:23

ibalajiarun changed the title ~~[qs] send proof v2 support with flag~~ [qs] Support V2 SignedBatchInfo and ProofOfStore messages with BlockInfoExt Nov 14, 2025

ibalajiarun marked this pull request as ready for review November 14, 2025 03:57

ibalajiarun requested review from JoshLind, bchocho, gregnazario, sasha8 and zekun000 as code owners November 14, 2025 03:57

JoshLind approved these changes Nov 14, 2025

View reviewed changes

ibalajiarun changed the base branch from balaji/enc-pool-3 to graphite-base/18085 November 16, 2025 05:52

ibalajiarun force-pushed the balaji/enc-pool-4 branch from 9244123 to 397fd1c Compare November 17, 2025 15:18

ibalajiarun force-pushed the graphite-base/18085 branch from ea4f27e to 4f27983 Compare November 17, 2025 15:18

ibalajiarun changed the base branch from graphite-base/18085 to balaji/enc-pool-3 November 17, 2025 15:18

ibalajiarun commented Nov 17, 2025

View reviewed changes

consensus/src/quorum_store/proof_coordinator.rs Show resolved Hide resolved

This comment has been minimized.

Sign in to view

ibalajiarun force-pushed the balaji/enc-pool-4 branch from 8a0a608 to ad4e2c2 Compare December 2, 2025 00:26

cursor bot reviewed Dec 2, 2025

View reviewed changes

This comment has been minimized.

Sign in to view

[qs] BatchV2 support with new messages using BatchInfoExt

0c49bf4

ibalajiarun force-pushed the balaji/enc-pool-4 branch from ad4e2c2 to 0c49bf4 Compare December 2, 2025 01:05

cursor bot reviewed Dec 2, 2025

View reviewed changes

This comment has been minimized.

Sign in to view

ibalajiarun merged commit 0faa04d into main Dec 2, 2025
48 checks passed

ibalajiarun deleted the balaji/enc-pool-4 branch December 2, 2025 01:55

rex1fernando pushed a commit that referenced this pull request Dec 2, 2025

[qs] BatchV2 support with new messages using BatchInfoExt (#18085)

5f1f46c

0o-de-lally pushed a commit to 0LNetworkCommunity/zapatos that referenced this pull request Dec 9, 2025

[qs] BatchV2 support with new messages using BatchInfoExt (aptos-labs…

59bf12c

…#18085)

ibalajiarun mentioned this pull request Dec 13, 2025

[pipeline] introduce decryption pipeline stage #18313

Merged

[qs] Introduce BatchV2 #18085

[qs] Introduce BatchV2 #18085

Conversation

ibalajiarun commented Nov 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

ibalajiarun commented Nov 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JoshLind left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

cursor bot left a comment

Choose a reason for hiding this comment

This is the final PR Bugbot will review for you during this billing cycle

Uh oh!

cursor bot Dec 2, 2025

Choose a reason for hiding this comment

Bug: Batch retrieval panics when serving V2 batches

Uh oh!

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

cursor bot Dec 2, 2025

Choose a reason for hiding this comment

Bug: Signature verification fails due to type conversion mismatch

Uh oh!

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

github-actions bot commented Dec 2, 2025

✅ Forge suite compat success on 92c0534410a26cafa4509b2dfda12c86811abf0a ==> 0c49bf4e08275a47c56ed52473bf15632fe46c0d

Uh oh!

github-actions bot commented Dec 2, 2025

✅ Forge suite framework_upgrade success on 92c0534410a26cafa4509b2dfda12c86811abf0a ==> 0c49bf4e08275a47c56ed52473bf15632fe46c0d

Uh oh!

github-actions bot commented Dec 2, 2025

✅ Forge suite realistic_env_max_load success on 0c49bf4e08275a47c56ed52473bf15632fe46c0d

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ibalajiarun commented Nov 11, 2025 •

edited

Loading

ibalajiarun commented Nov 11, 2025 •

edited

Loading

✅ Forge suite `compat` success on `92c0534410a26cafa4509b2dfda12c86811abf0a` ==> `0c49bf4e08275a47c56ed52473bf15632fe46c0d`

✅ Forge suite `framework_upgrade` success on `92c0534410a26cafa4509b2dfda12c86811abf0a` ==> `0c49bf4e08275a47c56ed52473bf15632fe46c0d`

✅ Forge suite `realistic_env_max_load` success on `0c49bf4e08275a47c56ed52473bf15632fe46c0d`