Debugging: preserve original Wasm bytecode inside of compiled ELF artifact. by cfallin · Pull Request #12636 · bytecodealliance/wasmtime

cfallin · 2026-02-23T07:02:12Z

This PR adds logic to embed the original core Wasm module(s) from a compilation into a new ELF section, alongside other metadata sections, when guest debugging is enabled. When a component is compiled (with debugging), the core Wasms inside are preserved, accessible by their StaticModuleIndexes.

The need for this support arises from the guest-debugger ecosystem. Consider either a debug component (bytecodealliance/rfcs#45) or a bespoke debugger in native code using Wasmtime's APIs. In either case, the existing APIs to introspect execution state provide Module references for each instance from each stack frame, and PC offsets into these Modules are the way in which breakpoints are configured. The debugger will somehow need to associate these Modules with the original Wasm bytecode, including e.g. any custom sections containing the producer-specific ways of encoding debug metadata, to do something useful. In particular also note that the GDB-stub protocol as extended for Wasm requires read access directly to the Wasm bytecode (it shows up as part of a "memory map" that is viewed by the standard read-remote-memory command); we can't delegate this requirement to the remote end of the stub connection, but have to handle it in the stub server that runs inside Wasmtime (as a component or bespoke).
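To illustrate why the debugger needs the full original bytes, here is a minimal sketch of how a debugger top-half might enumerate the custom sections of a core Wasm module once it has the bytecode in hand. It relies only on the core Wasm binary format (8-byte header, then sections encoded as an id byte, a LEB128 size, and a body). The function names `read_u32_leb` and `custom_sections` are hypothetical and are not part of Wasmtime or of this PR.

```rust
/// Decode an unsigned LEB128 `u32` starting at `*pos`, advancing `*pos`.
/// Returns `None` on truncated or over-long input.
fn read_u32_leb(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift: u32 = 0;
    loop {
        if shift >= 32 {
            return None; // more than five bytes: not a valid u32
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Return `(name, payload)` for every custom section (id 0) in a core Wasm
/// module. Hypothetical helper for illustration; it does no validation
/// beyond what is needed to walk the section list.
fn custom_sections(wasm: &[u8]) -> Option<Vec<(&str, &[u8])>> {
    // Header: magic `\0asm` followed by a 4-byte version field.
    if wasm.len() < 8 || !wasm.starts_with(b"\0asm") {
        return None;
    }
    let mut pos = 8;
    let mut found = Vec::new();
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_u32_leb(wasm, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        let body = wasm.get(pos..end)?;
        if id == 0 {
            // A custom section body starts with a length-prefixed UTF-8 name.
            let mut p = 0;
            let name_len = read_u32_leb(body, &mut p)? as usize;
            let name_end = p.checked_add(name_len)?;
            let name = std::str::from_utf8(body.get(p..name_end)?).ok()?;
            found.push((name, &body[name_end..]));
        }
        pos = end;
    }
    Some(found)
}
```

A debugger could call `custom_sections` on the preserved bytes and look for producer-specific sections by name. None of this works if only the compiled machine code is available, which is the gap this PR closes.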

We have two main choices: carry the original bytecode all the way through the Wasmtime compilation pipeline and present it via Module::bytecode(), ready to use; or say that this task is out-of-scope and that the debugger top-half can find it on disk somehow.
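Under the first option, serving a debugger's memory-read request against a module's bytecode reduces to a bounded slice of the preserved bytes. The sketch below assumes the stub server has already translated the requested address into a module-relative offset; that translation, and the names `read_bytecode` and `hex_encode`, are hypothetical and not taken from this PR. The hex encoding reflects how the GDB remote protocol returns memory contents.

```rust
/// Serve a "read `len` bytes at `offset`" request against a module's
/// preserved bytecode. Reads that run past the end are truncated rather
/// than rejected, so an out-of-range read yields an empty slice.
fn read_bytecode(bytecode: &[u8], offset: u64, len: u64) -> &[u8] {
    let total = bytecode.len() as u64;
    let start = offset.min(total);
    let end = offset.saturating_add(len).min(total);
    // `start <= end <= total`, so both casts and the slice are in range.
    &bytecode[start as usize..end as usize]
}

/// Hex-encode bytes, two lowercase digits per byte, as used for memory
/// contents in GDB remote protocol replies.
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}
```

For example, `hex_encode(read_bytecode(bytes, 0, 4))` on any valid core module yields `0061736d`, the Wasm magic number.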

Unfortunately the latter ("out of scope, find the file") is somewhat at odds with the desired developer experience:

- It means that we need some way of mapping a compiled Wasm artifact back to a source Wasm; absent "here's the full bytecode", that means "here's the path to the full bytecode", but that path is an identifier that may not be universally accessible (consider e.g. capabilities/preopens present for a debugger component) or portable (consider e.g. moving the artifact to a different machine).
  - Or we don't even provide that metadata, and require the user to explicitly specify the same module filename twice -- once to actually run it, and once as an argument to the debugger.
- It means that we should account for stale artifacts and mark the mismatch somehow; e.g. if the user starts debugging with Wasmtime, either from a .cwasm on disk or with one produced in-memory just for this run, and then subsequently rebuilds their source .wasm, we no longer have a reference for it. (The same problem exists one level up if source code is edited, but source to a Wasm producer toolchain is definitely out-of-scope for Wasmtime.)
- It means that special logic is required in the case of components to map a module back to a specific component section (we would essentially have to expose the static module IDs, then require the debugger top-half to re-implement our exact flattening algorithm to find that core module).

The permissions issue alone was enough to convince me that we should do something better than providing a filename (why should we have to authorize the adapter to read the user's filesystem?) but all of the other benefits -- ensuring an exact match and ensuring perfect availability -- are a nice bonus. The main downside is making the .cwasm larger (possibly substantially so), but this overhead is only present when enabling guest-debugging, the data has to be present anyway, and this is likely not a dealbreaker.


crates/wasmtime/src/runtime/code_memory.rs

crates/environ/src/compile/module_artifacts.rs

crates/wasmtime/src/runtime/module.rs

cfallin requested a review from a team as a code owner February 23, 2026 07:02

cfallin requested review from dicej and removed request for a team February 23, 2026 07:02

cfallin force-pushed the guest-debugging-preserve-bytecode branch 3 times, most recently from 25ccdd2 to 008ee12 Compare February 23, 2026 07:32

github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Feb 23, 2026

cfallin force-pushed the guest-debugging-preserve-bytecode branch from 008ee12 to 1dfaefb Compare February 23, 2026 22:20

dicej requested a review from alexcrichton February 23, 2026 23:03

cfallin added 2 commits February 23, 2026 15:37

miri ignore tests with compilation

9971872

cfallin force-pushed the guest-debugging-preserve-bytecode branch from 1dfaefb to 9971872 Compare February 23, 2026 23:37

alexcrichton approved these changes Feb 23, 2026

Review feedback.

97c98a0

cfallin enabled auto-merge February 24, 2026 00:39

cfallin added this pull request to the merge queue Feb 24, 2026

Merged via the queue into bytecodealliance:main with commit c07c94d Feb 24, 2026
45 checks passed

cfallin deleted the guest-debugging-preserve-bytecode branch February 24, 2026 01:20

cfallin merged 3 commits into bytecodealliance:main from cfallin:guest-debugging-preserve-bytecode
