Conversation

@xermicus
Member

This PR changes the blob parser to only fail if the blob is smaller than what the header says.

Required for implementing what Solidity calls "full/perfect verification" (where the compiler needs to append arbitrary data to the blob).

Signed-off-by: xermicus <cyrill@parity.io>
@xermicus xermicus requested a review from koute November 27, 2025 19:00
@koute
Collaborator

koute commented Jan 5, 2026

Why not include the metadata in the blob itself? That's why we have sections.

@xermicus
Member Author

xermicus commented Jan 5, 2026

I'm not sure how that would work: since the blob contains the metadata hash, it's a chicken-and-egg problem. There's also the aspect that while we could include arbitrary data elsewhere, doing so breaks Ethereum-style workflows, which we don't want.

@koute
Collaborator

koute commented Jan 5, 2026

You wouldn't hash the metadata. The blob can already contain stuff which probably shouldn't be hashed because it doesn't affect the execution (e.g. the debug info).

Basically, you'd define a method like this maybe:

fn each_chunk_for_hash(&self, mut callback: impl FnMut(&[u8])) { ... }

This would skip the metadata and optional stuff which doesn't affect execution, and then to calculate a hash you'd do:

let mut hasher = Hasher::new();
blob.each_chunk_for_hash(|chunk| hasher.update(chunk));
let hash = hasher.finish();

And you'd use this hash to verify that the build is reproducible.

Also, on the plus side, you could use this same hash to build a repository of debug info blobs for programs deployed on-chain. People would strip debug info when uploading on-chain, and upload the full blob somewhere else, and since both hashes would then match it'd be easy to find the matching blob with debug info based on the on-chain blob.
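A minimal sketch of what such a method could look like, using a toy blob type and the standard library's DefaultHasher as a stand-in for whatever hasher would actually be used. The section kinds and types here are illustrative assumptions, not PolkaVM's real API:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Illustrative section kinds; these are NOT PolkaVM's real section names.
enum SectionKind { Code, Metadata, DebugInfo }

struct Section { kind: SectionKind, data: Vec<u8> }
struct Blob { sections: Vec<Section> }

impl Blob {
    /// Feed only execution-relevant chunks to the callback, skipping
    /// sections (metadata, debug info) that don't affect execution.
    fn each_chunk_for_hash(&self, mut callback: impl FnMut(&[u8])) {
        for section in &self.sections {
            match section.kind {
                SectionKind::Metadata | SectionKind::DebugInfo => {}
                SectionKind::Code => callback(&section.data),
            }
        }
    }
}

/// Hash a blob over only the chunks yielded by each_chunk_for_hash.
fn blob_hash(blob: &Blob) -> u64 {
    let mut hasher = DefaultHasher::new();
    blob.each_chunk_for_hash(|chunk| hasher.write(chunk));
    hasher.finish()
}

fn main() {
    // An on-chain blob with debug info stripped...
    let on_chain = Blob { sections: vec![
        Section { kind: SectionKind::Code, data: vec![1, 2, 3] },
    ]};
    // ...and the full blob kept off-chain, with extra sections.
    let full = Blob { sections: vec![
        Section { kind: SectionKind::Code, data: vec![1, 2, 3] },
        Section { kind: SectionKind::Metadata, data: vec![7] },
        Section { kind: SectionKind::DebugInfo, data: vec![9, 9] },
    ]};
    // Both hash identically, so the full blob can be looked up
    // from the on-chain blob's hash.
    assert_eq!(blob_hash(&on_chain), blob_hash(&full));
    println!("hashes match");
}
```

The key property is that adding or removing non-hashed sections never changes the hash, which is what makes the debug-info-repository idea work.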

@xermicus
Member Author

xermicus commented Jan 5, 2026

The requirement here is that we need to append CBOR data the same way as solc does.
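For context, solc's convention is to append CBOR-encoded metadata to the bytecode, with the final two bytes storing the big-endian length of the CBOR payload. A sketch of splitting off that trailer (the data in main is a toy example, not real contract bytecode):

```rust
/// Split a solc-style blob into (code, cbor_payload). The last two bytes
/// are a big-endian u16 giving the length of the CBOR payload that
/// immediately precedes them.
fn split_cbor_trailer(blob: &[u8]) -> Option<(&[u8], &[u8])> {
    if blob.len() < 2 {
        return None;
    }
    let len = u16::from_be_bytes([blob[blob.len() - 2], blob[blob.len() - 1]]) as usize;
    let total = len + 2;
    if blob.len() < total {
        return None;
    }
    let split = blob.len() - total;
    // Code portion, then the CBOR payload (excluding the two length bytes).
    Some((&blob[..split], &blob[split..blob.len() - 2]))
}

fn main() {
    // Toy blob: 4 "code" bytes + 3 CBOR bytes + 2 length bytes (0x0003).
    let blob: [u8; 9] = [0xAA, 0xBB, 0xCC, 0xDD, 0x01, 0x02, 0x03, 0x00, 0x03];
    let (code, cbor) = split_cbor_trailer(&blob).unwrap();
    assert_eq!(code, &[0xAA_u8, 0xBB, 0xCC, 0xDD][..]);
    assert_eq!(cbor, &[0x01_u8, 0x02, 0x03][..]);
    println!("trailer split ok");
}
```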

@koute
Collaborator

koute commented Jan 5, 2026

In that case I'd prefer if you'd just truncate the blob you pass to from_bytes yourself, which you could do at zero cost by making ArcBytes::subslice public (or, if you can do it before the conversion to ArcBytes, you need no changes at all).

ProgramParts::from_bytes(bytes.subslice(0..ProgramBlob::blob_length(&bytes)))
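A sketch of doing that truncation downstream, with a toy header format standing in for the real one. Here blob_length is a hypothetical stand-in for ProgramBlob::blob_length (assumed to read the declared length out of the header); in this toy format the first four bytes are a little-endian u32 total length:

```rust
/// Stand-in for ProgramBlob::blob_length: in this TOY format the first
/// four bytes are a little-endian u32 giving the total blob length.
fn blob_length(bytes: &[u8]) -> Option<usize> {
    let header: [u8; 4] = bytes.get(..4)?.try_into().ok()?;
    Some(u32::from_le_bytes(header) as usize)
}

/// Truncate to the declared length, dropping any appended trailer,
/// before handing the bytes to the parser.
fn truncate_blob(bytes: &[u8]) -> Option<&[u8]> {
    let len = blob_length(bytes)?;
    bytes.get(..len)
}

fn main() {
    // An 8-byte blob (length header included) with 3 bytes appended.
    let mut bytes: Vec<u8> = vec![8, 0, 0, 0, 1, 2, 3, 4];
    bytes.extend_from_slice(&[0xDE, 0xAD, 0xBE]);

    let blob = truncate_blob(&bytes).unwrap();
    assert_eq!(blob.len(), 8);
    assert_eq!(blob, &bytes[..8]);
    println!("truncated to declared length");
}
```

This keeps the appended-trailer handling entirely in downstream code; the core parser only ever sees a blob of exactly the declared length.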

xermicus and others added 2 commits January 13, 2026 11:11
Signed-off-by: Cyrill Leutwiler <bigcyrill@hotmail.com>
@xermicus
Member Author

Thanks for the suggestion.

  • Reverted the original change
  • Made the API public
  • Changed the caller to pass a slice as defined by the blob length

Comment on lines +5233 to +5236
let blob_length = Self::blob_length(&bytes).ok_or(ProgramParseError(ProgramParseErrorKind::Other(
"blob does not contain the blob length",
)))?;
let bytes = bytes.subslice(0..blob_length as usize);
Collaborator

In general I meant doing this outside of PolkaVM code in downstream code; that's why I said to make subslice public, so that you don't need any special support for it here. :P

The .polkavm blob is essentially supposed to be self-contained; if you want to append/prepend data to it then you're welcome to do so, but I don't want to be responsible for handling that data. I'm probably going to e.g. also supply official APIs to write .polkavm blobs in the future (including an API to losslessly convert back and forth), and I just don't want to deal with all of the miscellaneous variations of adding extra metadata that explicitly avoid the intended way of adding such metadata (as a dedicated section inside the file), especially since I have no way of knowing how to handle such grafted metadata correctly.

And the proliferation of different PVM "blob" formats is not a theoretical issue, we already have these different formats:

  • Vanilla .polkavm blob as my tooling builds it
  • JAM service blob (metadata header + minimal PVM header + PVM code sections)
  • CoreVM service blob (metadata header same as JAM service blob + full .polkavm blob)
  • The contracts blob (full .polkavm blob + appended data)

The .polkavm blob was explicitly designed to be extensible so that this isn't necessary! But instead everyone just invents their own flavor, so I'm drawing a line in the sand - I can provide APIs to make the grafting of extra metadata possible, but I don't want to have to deal with it in my core crates.

Member Author

@xermicus xermicus Jan 13, 2026


I see the point, but ignoring extra data wouldn't be part of the format, would it? The way I see it, the only thing it does is make the parser a bit more resilient. It also saves us from maintaining a custom version of polkatool.

Isn't it more like how an ELF file with trailing data can still be parsed by readelf and objdump, and still executes fine?
