rfc: QUIC Changes #2494

Open

ben-dz wants to merge 2 commits into main from bdz/quic-rfc

+313 −0

ben-dz commented Dec 19, 2025

Summary of Changes

This RFC defines the changes required to how Solana validators and transaction senders use QUIC in order to support edge filtering.

ben-dz force-pushed the bdz/quic-rfc branch 2 times, most recently from a9dc9c6 to 5c9fead Compare

December 19, 2025 17:03

ben-dz marked this pull request as ready for review

December 19, 2025 17:12

ben-dz force-pushed the bdz/quic-rfc branch from 5c9fead to 0d2b0f5 Compare

December 19, 2025 22:29


          First draft of quic chagnes for edge filtering rfc.

f9ba0f0

ben-dz force-pushed the bdz/quic-rfc branch from 0d2b0f5 to f9ba0f0 Compare

December 19, 2025 22:30

snormore requested a review from Copilot

December 19, 2025 22:32

Copilot AI reviewed

Copilot AI left a comment

Pull request overview

This RFC proposes modifications to how Solana validators and transaction senders use the QUIC protocol, to enable FPGA-based edge filtering in the DoubleZero network. The changes aim to overcome QUIC's encryption, flow control, and packet formatting challenges while minimizing modifications to existing Solana validator and QUIC library code.

Key Changes:

- Introduction of an in-band encrypted session key sharing mechanism via modified HANDSHAKE_DONE packets
- Frame substitution approach using RESET_STREAM to handle dropped traffic while maintaining flow control
- Enforcement of specific packet formatting requirements, including fixed 8-byte CIDs, 1232-byte stream frames, and standardized encryption

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| rfcs/rfcx-quic-changes-for-edge-filtering.md | New RFC document detailing QUIC protocol modifications for FPGA edge filtering support |
| CHANGELOG.md | Added changelog entry for the new RFC |


          Copilot fixes typos

fbfaeed

Co-authored-by: Copilot <[email protected]>

ben-dz force-pushed the bdz/quic-rfc branch from 44d6988 to fbfaeed Compare

December 19, 2025 22:45

alexpyattaev reviewed

alexpyattaev left a comment

Thanks for putting this together. Some initial feedback below:

rfcs/rfcx-quic-changes-for-edge-filtering.md

+                                 └─────────────────────────────────────────────────────────────––––┘
+              ```
+              This initial architecture passes through any non-QUIC traffic, and any traffic for which the FPGA does not have the keys. A future improvement may add a join-time option for a server to drop 1RTT traffic for which the FPGA does not have keys. The drawback of this would be an increased connection spin-up latency, for a benefit of further reducing bad traffic reaching the server.

alexpyattaev Dec 26, 2025

definition of join-time?

Author

ben-dz Dec 30, 2025

I'll update it to say "an option to be selected at the time of connection to DoubleZero"

rfcs/rfcx-quic-changes-for-edge-filtering.md


		### 2. Flow Control

		Assuming that the cryptographic problem is solved, the FPGA needs a way to handle the QUIC connection once it determines that it wants to drop a stream frame due to edge filtering logic. Unlike in UDP, the FPGA cannot drop the packet or frame. First, the client will retry sending until the packet is acknowledged. Second, QUIC's built-in flow control will eventually cause the connection to stall: if the packet is received but has been shortened by dropping the frame, the server will have received less data than the client has sent, so it will stop advancing the `MAX_DATA` window.

alexpyattaev Dec 26, 2025

rm "Unlike in UDP, the FPGA cannot drop the packet or frame." ?
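
To illustrate the stall described in the quoted paragraph, here is a minimal toy model in Python. It is a sketch under simplifying assumptions, not QUIC library code: every frame is treated as its own single-frame stream, acknowledgements and retransmission are ignored, and the 4096-byte window and the names `WINDOW` and `frames_sent_before_stall` are hypothetical.

```python
WINDOW = 4096  # hypothetical connection-level receive window, in bytes


def frames_sent_before_stall(frames, substitute_reset):
    """Count how many frames the client can send before MAX_DATA blocks it.

    frames: list of (length, filtered) pairs, one single-frame stream each.
    substitute_reset: if True, a filtered frame is replaced by a RESET_STREAM
    whose final size lets the server account for the bytes; if False, the
    frame is silently stripped and the server never learns about them.
    """
    client_sent = 0        # bytes the client has charged against MAX_DATA
    server_accounted = 0   # bytes the server knows about
    max_data = WINDOW      # limit most recently advertised by the server
    sent = 0
    for length, filtered in frames:
        if client_sent + length > max_data:
            break  # client is flow-control blocked
        client_sent += length
        if not filtered or substitute_reset:
            server_accounted += length
        # The server only extends the window past bytes it has accounted for.
        max_data = server_accounted + WINDOW
        sent += 1
    return sent


all_filtered = [(1024, True)] * 10
print(frames_sent_before_stall(all_filtered, substitute_reset=False))  # 4
print(frames_sent_before_stall(all_filtered, substitute_reset=True))   # 10
```

In this model, silently stripping frames blocks the client after one window of data, while substituting a frame that carries a final size keeps the window advancing.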

rfcs/rfcx-quic-changes-for-edge-filtering.md


		----

		As a result of this discrepancy between the two validator software stacks, there are two proposed options for the `FINAL_SIZE`: If Agave is changed to match Firedancer with a 2^62 `MAX_DATA`, then the FPGA will always set `FINAL_SIZE` to 4k. If Agave continues to use the `MAX_DATA` backstop, then the FPGA will make a best guess based on offset+len, and it is recommended (but not necessary) that Quinn be modified based on the recommendation in #1 above.

alexpyattaev Dec 26, 2025

agave change is trivial

Author

ben-dz Dec 30, 2025

It's technically trivial to change, but would folks like to see the MAX_DATA backstop remain?

alexpyattaev Dec 30, 2025

It is not required for correct operation => not an issue worth debating here IMO. Just state the fact - MAX_DATA will be messed up and our servers will ignore it =)
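
The two options discussed in this thread can be sketched as a single selection function. This is only an illustration: the function name, the flag `server_ignores_max_data`, and the reading of "4k" as 4096 bytes are assumptions, not something the RFC text specifies.

```python
ASSUMED_4K = 4096  # the RFC text says "4k"; the exact byte value is assumed


def reset_final_size(offset, length, server_ignores_max_data):
    """Pick the final size the FPGA would put in a substituted RESET_STREAM.

    offset, length: position and size of the STREAM frame being replaced.
    """
    if server_ignores_max_data:
        # Option 1: MAX_DATA is effectively unlimited, so a fixed value works.
        return ASSUMED_4K
    # Option 2: best guess from the replaced frame. The true end of the
    # stream is unknown when the replaced frame did not carry FIN.
    return offset + length


print(reset_final_size(1232, 500, server_ignores_max_data=True))   # 4096
print(reset_final_size(1232, 500, server_ignores_max_data=False))  # 1732
```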

rfcs/rfcx-quic-changes-for-edge-filtering.md


		- If Quinn receives a `RESET_STREAM` with a different `FINAL_SIZE` than it has already determined, Quinn must not issue a `FINAL_SIZE_ERROR`. This is slightly different from the previous point, and addresses an edge case where the `RESET_STREAM` replaced a packet without the FIN bit, but that packet has already been received by the server.

		4k Transactions & Fragmentation: Transactions which are fragmented across multiple packets may not be dropped until the last packet depending on the criteria causing the drop. There isn’t a workaround for this other than storing and forwarding, which is not practicable for the amount of traffic a single edge filtering node might be handling. The FPGA must issue the `RESET_STREAM` as soon as it knows that a drop is desired.

alexpyattaev Dec 26, 2025

not clear how fragmented would be handled

Author

ben-dz Dec 30, 2025

I'm not 100% on how we're going to handle it in the FPGA either, other than that we have to find a way to do so since fragmentation is inevitable. I have a few different ideas, but I think the handling system will partly depend on the fragmentation rules settled on. I'm coming around to a possible store-forward system being able to be performant enough if we place enough limits on fragmentation.

The point here is that the FPGA will issue the RESET_STREAM as soon as it knows the transaction is bad, which could mean substituting for any frame in a multi-frame stream. So in making modifications to a receiving validator ahead of edge filtering support, software needs to account for three possibilities:

1. FPGA subs RESET_STREAM for the first frame, but delivers the rest of the frames.
2. FPGA subs RESET_STREAM for a middle frame, having delivered some frames already, and delivering some additional one(s) after.
3. FPGA subs RESET_STREAM for the last frame, having already delivered all preceding data in the stream.

The first and last are the most likely scenarios.

alexpyattaev Dec 30, 2025

delivering anything for a stream after reset is a protocol breaking change. hacking the server to ignore it is possible but not desirable. I'd prefer if FPGA zeroed out the payload or just dropped the last fragment. Granted it would not help with bandwidth much, but the server is welcome to blacklist the peer to save on bandwidth.
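
For readers following this thread, the three substitution cases can be modelled in a few lines of Python. This is a hedged sketch: the function names are hypothetical, frames are assumed to arrive in order, and the "tolerant receiver" shown is exactly the server-side behaviour being debated here, not something the RFC or any QUIC library currently implements.

```python
def substitute_reset(frame_count, bad_index):
    """Model what the server sees when the FPGA replaces one frame.

    Returns (events, case), where events is the in-order list of frame types
    delivered for the stream and case is "first", "middle" or "last".
    """
    if not 0 <= bad_index < frame_count:
        raise ValueError("bad_index must refer to a frame in the stream")
    events = ["RESET_STREAM" if i == bad_index else "STREAM"
              for i in range(frame_count)]
    if bad_index == 0:
        case = "first"
    elif bad_index == frame_count - 1:
        case = "last"
    else:
        case = "middle"
    return events, case


def frames_accepted(events):
    """Tolerant receiver: count STREAM frames seen before the reset and
    silently discard any that arrive after it."""
    accepted = 0
    for event in events:
        if event == "RESET_STREAM":
            break
        accepted += 1
    return accepted


events, case = substitute_reset(frame_count=3, bad_index=1)
print(case, events, frames_accepted(events))
# middle ['STREAM', 'RESET_STREAM', 'STREAM'] 1
```

The "first" and "middle" cases are the ones that deliver STREAM data after a reset, which is the behaviour the last comment objects to.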

rfcs/rfcx-quic-changes-for-edge-filtering.md


		<br>

		Packet Fragmentation: Stream frames must be 1232 bytes, except for the last frame in a stream. If a frame is shorter than 1232 bytes and does not have the FIN flag, then the FPGA must replace it with a `RESET_STREAM` to prevent abuse of the connection.

alexpyattaev Dec 26, 2025

this is way more than needed to parse tx header. why?

Author

ben-dz Dec 30, 2025

imo, it makes reassembly and managing buffers simpler all around- max data until you don't have that much left, then the rest. On the Rx side, you have four windows into which new data gets slotted, and it always lands at one of those four addresses.

It's similar to the approach USB uses, but USB has to do it that way because there's no equivalent of the FIN flag. The short frame defines the end. We don't have to do it, but it does make for a clean RX side.

alexpyattaev Jan 1, 2026

problem is, this requires client compliance
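
A minimal sketch of the fragmentation rule quoted above, together with the fixed-slot reassembly idea from the author's reply. The 1232-byte frame size and the four slots come from the thread; the function names, the return strings, and the choice to reject misaligned offsets with an exception are assumptions made for illustration.

```python
FRAME_SIZE = 1232  # required STREAM frame length, from the RFC text
SLOTS = 4          # "four windows" on the receive side, from the thread


def classify_frame(length, fin):
    """Apply the quoted rule: a short frame without FIN gets replaced."""
    if length < FRAME_SIZE and not fin:
        return "RESET_STREAM"
    return "FORWARD"


def slot_for_offset(offset):
    """Map a stream offset to one of the fixed reassembly slots.

    Assumes every non-final frame is exactly FRAME_SIZE bytes, so valid
    offsets are multiples of FRAME_SIZE.
    """
    if offset % FRAME_SIZE != 0:
        raise ValueError("offset is not aligned to the frame size")
    slot = offset // FRAME_SIZE
    if slot >= SLOTS:
        raise ValueError("offset is beyond the last reassembly slot")
    return slot


print(classify_frame(1232, fin=False))  # FORWARD
print(classify_frame(200, fin=True))    # FORWARD
print(classify_frame(200, fin=False))   # RESET_STREAM
print(slot_for_offset(2464))            # 2
```

As the last comment notes, this only works if senders actually produce frames in this pattern.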

rfcs/rfcx-quic-changes-for-edge-filtering.md


		> 💡 This ensures the FPGA has the most information possible as early as possible, that transactions are broken up into a predictable pattern, and that a malicious sender does not break a transaction into tiny pieces to get around filtering. Currently Agave allows smaller frames, but only four total fragments. Since there are already rules about fragmentation, this adjustment of those rules allows the Edge Filtration to be more useful and efficient. This is already usually met by a normal sender.

		> 👉 There is an option to instead enforce some smaller (but reasonable) minimum size for the first frame in a stream. For example, requiring all the signatures, or signatures + header of a transaction to be in the first frame. However once some size constraint must be enforced, we might as well enforce something that will make both Validator and FPGA code paths more efficient (thus optimizing for processor time, rather than network bandwidth).

alexpyattaev Dec 26, 2025

far more reasonable!

rfcs/rfcx-quic-changes-for-edge-filtering.md


		Coalescing: Short header packets must not be coalesced, even after a long-header packet. If the UDP datagram contains a QUIC short header packet, then that must be the first and only thing in the packet.

		Frame Ordering: Stream Frames must be the first thing in a QUIC packet.

alexpyattaev Dec 26, 2025

this may be very hard to enforce...

rfcs/rfcx-quic-changes-for-edge-filtering.md


		Frame Ordering: Stream Frames must be the first thing in a QUIC packet.

		Single Stream Frame per Packet: There must be only one Stream frame in a QUIC packet.

alexpyattaev Dec 26, 2025

smallest TX is <200 bytes. epic waste of IOPS...

Author

ben-dz Dec 30, 2025

This was one that had been suggested (don't recall by who) back in September and everyone seemed on board with it. In a bare-metal embedded world or an FPGA world, the added processor overhead is workable, but perhaps running on top of Linux makes that concerning?

Were transactions in separate UDP packets before QUIC?
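
The three layout rules quoted in the last few threads (no coalescing of short-header packets, Stream frame first, one Stream frame per packet) can be expressed as one check. This is an illustrative sketch only: the datagram is modelled as already-parsed Python tuples, and the function name and the string labels for header and frame types are hypothetical.

```python
def datagram_ok(packets):
    """Check a UDP datagram against the quoted layout rules.

    packets: list of (header, frames) tuples in datagram order, where header
    is "long" or "short" and frames is a list of frame type names.
    """
    for header, frames in packets:
        # Coalescing: a short-header packet must be alone in the datagram.
        if header == "short" and len(packets) != 1:
            return False
        stream_positions = [i for i, f in enumerate(frames) if f == "STREAM"]
        # Single Stream frame per packet.
        if len(stream_positions) > 1:
            return False
        # Frame ordering: a Stream frame, if present, must come first.
        if stream_positions and stream_positions[0] != 0:
            return False
    return True


print(datagram_ok([("short", ["STREAM", "ACK"])]))                 # True
print(datagram_ok([("short", ["ACK", "STREAM"])]))                 # False
print(datagram_ok([("long", ["CRYPTO"]), ("short", ["STREAM"])]))  # False
print(datagram_ok([("short", ["STREAM", "STREAM"])]))              # False
```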

rfcs/rfcx-quic-changes-for-edge-filtering.md

+              Session secrets will be passed to the FPGA encrypted by an FPGA pubkey so that any other snoopers of network traffic cannot intercept them. If the FPGA's private key is compromised, then it can be rotated, and the validator software updated to match the new key. Since session secrets are ephemeral, previously captured secrets have no future value. Any validator or sender with concerns that a particular session may have been compromised needs only to disconnect and reconnect to establish new secrets.
+              ### FPGA Access to Transactions
+              Some in the Solana community may be concerned that the DoubleZero FPGA will have access to transaction data as it passes through. The Solana Core Dev community has agreed that since DoubleZero is a trusted contributor to the Solana ecosystem, this is acceptable. A developer of Validator software would have similar access to transaction flow. Additionally, until recently the transactions were not encrypted in the first place, and the change to QUIC for TPU was for the purpose of flow control, not encryption. Any validator who does not wish to allow the DoubleZero FPGA this access can choose not to opt into edge filtering.

alexpyattaev Dec 26, 2025

A developer of Validator software has no access to TX flow, the operator does. Devs do not distribute binaries, only source.

Author

ben-dz Dec 30, 2025 (edited)

Fair. I will just cut this. My point was more that there were other places someone could add code to do untoward things, but not really relevant.
