Allow trailing data in PVM blobs #355
base: master
Conversation
Signed-off-by: xermicus <cyrill@parity.io>
Why not include the metadata in the blob itself? That's why we have sections.

I'm not sure how that would work: since the blob contains the metadata hash, it's a chicken-and-egg problem. There's also the aspect that while we could include arbitrary data elsewhere, this breaks Ethereum-style workflows, which we don't want.
You wouldn't hash the metadata. The blob can already contain stuff which probably shouldn't be hashed because it doesn't affect the execution (e.g. the debug info). Basically, you'd define a method which skips the metadata and any optional stuff that doesn't affect execution, and then calculate the hash over only what's left. You'd use this hash to verify that the build is reproducible.

Also, on the plus side, you could use this same hash to build a repository of debug info blobs for programs deployed on-chain. People would strip debug info when uploading on-chain and upload the full blob somewhere else, and since both hashes would then match, it'd be easy to find the matching blob with debug info based on the on-chain blob.
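(For illustration only, a minimal sketch of that hashing scheme: the `Section` type, the section kind values and the use of SHA-256 below are assumptions made for the example, not the actual .polkavm layout or the polkavm API.)

```rust
use sha2::{Digest, Sha256};

// Hypothetical section representation; the real .polkavm section layout and
// kind numbers are different and defined by the polkavm crates.
struct Section {
    kind: u8,
    payload: Vec<u8>,
}

// Assumed (illustrative) kinds for sections that don't affect execution.
const SECTION_METADATA: u8 = 0x80;
const SECTION_DEBUG_INFO: u8 = 0x81;

/// Hash only the sections that affect execution, so a stripped on-chain blob
/// and the full blob with debug info produce the same hash.
fn execution_hash(sections: &[Section]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for section in sections {
        if matches!(section.kind, SECTION_METADATA | SECTION_DEBUG_INFO) {
            continue; // skipped: does not affect execution
        }
        // Domain-separate each section by its kind and length before hashing its payload.
        hasher.update([section.kind]);
        hasher.update((section.payload.len() as u64).to_le_bytes());
        hasher.update(&section.payload);
    }
    hasher.finalize().into()
}
```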
The requirement here is that we need to append CBOR data the same way as

In that case I'd prefer if you'd just truncate the
Signed-off-by: Cyrill Leutwiler <bigcyrill@hotmail.com>
Thanks for the suggestion.
let blob_length = Self::blob_length(&bytes).ok_or(ProgramParseError(ProgramParseErrorKind::Other(
    "blob does not contain the blob length",
)))?;
let bytes = bytes.subslice(0..blob_length as usize);
In general I meant doing this outside of PolkaVM code in downstream code; that's why I said to make subslice public, so that you don't need any special support for it here. :P
The .polkavm blob is essentially supposed to be self-contained; if you want to append/prepend data to it then you're welcome to do so, but I don't want to be responsible for handling that data. I'm probably going to e.g. also supply official APIs to write .polkavm blobs in the future (including an API to losslessly convert back and forth), and I just don't want to deal with all of the miscellaneous variations of adding extra metadata that explicitly avoids the intended way of adding such metadata (as a dedicated section inside the file), especially since I have no way of knowing how to handle such grafted metadata correctly.
And the proliferation of different PVM "blob" formats is not a theoretical issue; we already have these different formats:
- Vanilla .polkavm blob as my tooling builds it
- JAM service blob (metadata header + minimal PVM header + PVM code sections)
- CoreVM service blob (metadata header same as JAM service blob + full .polkavm blob)
- The contracts blob (full .polkavm blob + appended data)
The .polkavm blob was explicitly designed to be extensible so that this isn't necessary! But instead everyone just invents their own flavor, so I'm drawing a line in the sand - I can provide APIs to make the grafting of extra metadata possible, but I don't want to have to deal with it in my core crates.
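(A minimal sketch of what that downstream handling could look like, assuming some way to read the length recorded in the blob header; `declared_blob_length` below is a placeholder, not an existing polkavm API.)

```rust
/// Downstream (non-PolkaVM) helper: strip trailing data (e.g. appended CBOR
/// metadata) before handing the bytes to the strict blob parser.
fn strip_trailing_data(bytes: &[u8]) -> Result<&[u8], &'static str> {
    let declared = declared_blob_length(bytes)
        .ok_or("blob does not contain the blob length")?;
    if bytes.len() < declared {
        return Err("blob is smaller than the length recorded in its header");
    }
    // Everything past the declared length is trailing data and gets dropped here.
    Ok(&bytes[..declared])
}

/// Placeholder for reading the length field out of the .polkavm header;
/// this is an assumption for the sketch, not an existing API.
fn declared_blob_length(_bytes: &[u8]) -> Option<usize> {
    unimplemented!("parse the length recorded in the blob header")
}
```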
I see the point, but ignoring extra data wouldn't be part of the format? The way I see it, the only thing it does is make the parser a bit more resilient. It also saves us from maintaining a custom version of polkatool.
Isn't it more like an ELF file with trailing data, which can still be parsed by readelf and objdump, and which you can still execute fine?
This PR changes the blob parser to only fail if the blob is smaller than what the header says.
Required for implementing what Solidity calls "full/perfect verification" (where the compiler needs to append arbitrary data to the blob).
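(Roughly the workflow this is meant to enable, as a sketch with made-up names; the exact verification procedure and helpers here are assumptions, not part of this PR or the polkavm API.)

```rust
/// Append already-encoded CBOR metadata after a finished .polkavm blob; the
/// blob header is untouched, so the extra bytes fall outside the declared
/// length and a relaxed parser simply ignores them.
fn append_metadata(blob: &[u8], cbor_metadata: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(blob.len() + cbor_metadata.len());
    out.extend_from_slice(blob);
    out.extend_from_slice(cbor_metadata);
    out
}

/// Verification side: rebuild the blob from source, cut the on-chain bytes at
/// the declared blob length, and compare only the code portion. Rejects only
/// blobs that are shorter than the declared length, matching the parser change.
fn verify(onchain: &[u8], rebuilt_blob: &[u8], declared_blob_length: usize) -> bool {
    onchain.len() >= declared_blob_length
        && &onchain[..declared_blob_length] == rebuilt_blob
}
```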