Snapshot-First Sandbox Lifecycle #1268

@simongdavies

Snapshot-First Sandbox Lifecycle — Requirements Document

1 Purpose & Scope

This document specifies the requirements for replacing Hyperlight's current "build from binary + config" sandbox creation model with a snapshot-first lifecycle. Every sandbox — whether created by a direct Hyperlight user, hyperlight-wasm, or hyperlight-js — will be instantiated from a persistent, versioned snapshot image. The existing UninitializedSandbox / evolve() / MultiUseSandbox state machine is eliminated in favour of a unified Sandbox type constructed via a builder pattern.

1.1 Goals

# Goal
G-1 Sub-millisecond cold-start by mmap-loading a pre-built snapshot instead of re-executing guest init code.
G-2 Deterministic, reproducible sandbox state via content-addressed snapshots.
G-3 Composable specialisation through CoW overlay layers (e.g. base runtime → application module → per-request config).
G-4 Distribution and supply-chain security through OCI-compatible artefact packaging, signing, and verification.
G-5 Cross-platform support (Linux KVM/MSHV, Windows WHP) from Day 1.
G-6 Unified, ergonomic public API — a single Sandbox type with a builder, replacing the multi-type state machine.

1.2 Definitions

Term Meaning
Snapshot A serialised, self-describing blob that captures the full guest-visible state of a micro-VM at a point in time (memory, page tables, layout metadata).
Layer A CoW diff that records only the pages that changed between a parent snapshot and a child snapshot.
Base image The bottom-most snapshot in a layer stack; contains a complete memory image.
Overlay Any layer above the base; contains only dirty pages plus metadata.
Snapshot cache An in-process, read-only mmap of a snapshot file shared by all sandboxes that load from it.
Mapped file region A guest-PA range backed by an external file (e.g. a Wasm module) rather than by the snapshot blob.

2 Requirements

R1 — Snapshot Format

ID Requirement
R1.1 A snapshot SHALL be a single-file, self-describing binary format.
R1.2 The format SHALL contain a version header (magic bytes + format version) so that loaders can reject incompatible snapshots early.
R1.3 The format SHALL embed a content hash (blake3, carried forward from the current in-memory Snapshot.hash) that covers all mutable content (memory, layout metadata, entrypoint, stack-top GVA).
R1.4 The format SHALL include: guest physical memory image, SandboxMemoryLayout, page-table pages, entrypoint (NextAction), stack-top GVA. CPU registers (segment registers, control registers, EFER, etc.) SHALL NOT be persisted; the loader SHALL reconstruct them via standard_64bit_defaults(root_pt_gpa) at load time. This is safe because Hyperlight guests run single-threaded in ring 0 and never modify segment descriptors, GDT/IDT, or control register bits.
R1.5 The format SHALL include an architecture tag (e.g. x86_64). Snapshots are hypervisor-agnostic — the guest memory image, page tables, and layout are pure x86_64 guest state and SHALL be loadable on any supported hypervisor (KVM, MSHV, WHP). The tag exists only to reject snapshots from a different ISA should Hyperlight ever support one.
R1.6 The format SHALL NOT contain process-local handles, pointers, or the current sandbox_id counter value.
R1.7 The format SHALL store memory as page-aligned, optionally compressed regions so that individual pages can be loaded on demand in a future phase.
R1.8 The format SHALL NOT embed an external-file descriptor table. The snapshot is a pure memory image; it does not know or care about external files. Responsibility for mapping external files (and verifying their integrity) belongs to the host application via map_external_file (R11.3). When snapshots are distributed via OCI, the OCI manifest (R7.2) is the correct place to declare companion blobs.
R1.9 LoadInfo (unwind tables) SHALL NOT be persisted. Unwind tables are only needed for debugging and profiling (host-side stack walking); they are irrelevant in release/production mode and the guest executes correctly without them. When debug support is required, the loader SHALL reconstruct unwind information by re-parsing the guest binary's ELF headers at load time.
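
Illustrative sketch (non-normative) of the early-rejection behaviour in R1.2 and R1.5. The magic bytes `HLSN`, the version number, the numeric architecture tag, and the 42-byte field layout are all assumptions made for this example; the requirements above do not fix any of them.

```rust
use std::fmt;

// Hypothetical layout: magic (4) | version u32 LE (4) | arch u16 LE (2) | hash (32).
const MAGIC: [u8; 4] = *b"HLSN";
const FORMAT_VERSION: u32 = 1;
const ARCH_X86_64: u16 = 1;
const HEADER_LEN: usize = 42;

#[derive(Debug, PartialEq)]
pub enum HeaderError {
    TooShort { expected: usize, actual: usize },
    BadMagic { actual: [u8; 4] },
    UnsupportedVersion { expected: u32, actual: u32 },
    ArchMismatch { expected: u16, actual: u16 },
}

// Errors carry expected vs. actual values so callers can log them.
impl fmt::Display for HeaderError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            HeaderError::TooShort { expected, actual } => {
                write!(f, "snapshot header truncated: need {} bytes, got {}", expected, actual)
            }
            HeaderError::BadMagic { actual } => {
                write!(f, "not a snapshot file: magic {:?}, expected {:?}", actual, MAGIC)
            }
            HeaderError::UnsupportedVersion { expected, actual } => {
                write!(f, "unsupported format version: expected {}, got {}", expected, actual)
            }
            HeaderError::ArchMismatch { expected, actual } => {
                write!(f, "architecture tag mismatch: expected {}, got {}", expected, actual)
            }
        }
    }
}

#[derive(Debug, PartialEq)]
pub struct SnapshotHeader {
    pub version: u32,
    pub arch: u16,
    pub content_hash: [u8; 32],
}

/// Validates the fixed-size header before any memory region is touched.
pub fn parse_header(bytes: &[u8]) -> Result<SnapshotHeader, HeaderError> {
    if bytes.len() < HEADER_LEN {
        return Err(HeaderError::TooShort { expected: HEADER_LEN, actual: bytes.len() });
    }
    let mut magic = [0u8; 4];
    magic.copy_from_slice(&bytes[0..4]);
    if magic != MAGIC {
        return Err(HeaderError::BadMagic { actual: magic });
    }
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    if version != FORMAT_VERSION {
        return Err(HeaderError::UnsupportedVersion { expected: FORMAT_VERSION, actual: version });
    }
    let arch = u16::from_le_bytes([bytes[8], bytes[9]]);
    if arch != ARCH_X86_64 {
        return Err(HeaderError::ArchMismatch { expected: ARCH_X86_64, actual: arch });
    }
    let mut content_hash = [0u8; 32];
    content_hash.copy_from_slice(&bytes[10..42]);
    Ok(SnapshotHeader { version, arch, content_hash })
}

/// Writes a header in the same hypothetical layout (what a bake step would emit).
pub fn encode_header(content_hash: [u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN);
    out.extend_from_slice(&MAGIC);
    out.extend_from_slice(&FORMAT_VERSION.to_le_bytes());
    out.extend_from_slice(&ARCH_X86_64.to_le_bytes());
    out.extend_from_slice(&content_hash);
    out
}
```

Checking magic, version, and architecture in that order means a loader never interprets later fields of a file it has already decided it cannot read.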

R2 — Snapshot Creation ("Bake")

ID Requirement
R2.1 A CLI tool (hl-snap, or a subcommand of a broader hl CLI) SHALL accept a guest binary + config and produce a snapshot file.
R2.2 The tool SHALL boot the guest, execute its hyperlight_main / init path, wait for the guest to signal readiness, then serialise the snapshot to disk.
R2.3 The tool SHALL support an --overlay flag that takes a parent snapshot, boots from it, executes a caller-supplied specialisation step (e.g. "load this Wasm module"), and writes only the dirty pages as a new layer file.
R2.4 Layers created at bake time SHALL be usable identically to layers created at runtime (see R5).
R2.5 The bake tool SHALL compute and embed the content hash (R1.3) and architecture tag (R1.5).
R2.6 Programmatic bake: the same operation SHALL be available as a Rust library call (SnapshotBuilder::bake(config) -> Result<SnapshotFile>) so that hyperlight-wasm / hyperlight-js can bake snapshots without shelling out.
R2.7 When an external file is mapped into guest PA space during bake (R11.1), the bake tool SHALL NOT absorb the file content into the snapshot blob — the GPA range is left as a hole for the host to fill at load time via map_external_file.

R3 — Snapshot Loading

ID Requirement
R3.1 The runtime SHALL load a snapshot via mmap (MAP_PRIVATE on Linux, CreateFileMapping + MapViewOfFile on Windows) so that the kernel provides CoW semantics and page-level sharing across sandboxes.
R3.2 On load, the runtime SHALL verify the content hash and architecture tag; mismatches SHALL produce a clear, actionable error.
R3.3 The runtime SHALL reconstruct SandboxMemoryLayout and page tables from the snapshot metadata, and derive CPU register state from the page-table root GPA via standard_64bit_defaults(), without re-executing guest code.
R3.4 The runtime SHALL NOT automatically resolve external files. It is the host application's responsibility to call map_external_file(path, gpa, expected_hash) after loading a snapshot. The expected_hash parameter enables integrity verification — if the file's blake3 hash does not match, the call SHALL return an error.
R3.5 The snapshot cache (R3.1) SHALL be process-wide and reference-counted; dropping the last sandbox using a given snapshot file SHALL unmap the region.
R3.6 Loading SHALL NOT require the original guest binary to be present. The only exception is unwind-info reconstruction for debugging/profiling (R1.9), which requires the guest ELF headers.
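
Illustrative sketch (non-normative) of the reference-counting behaviour in R3.5. `SnapshotMapping` is a stand-in that holds a `Vec<u8>` instead of a real mmap, and `SnapshotCache`, `get_or_load`, and `live_entries` are hypothetical names. The cache keeps only `Weak` references, so the mapping is released when the last sandbox drops its `Arc`.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex, Weak};

/// Stand-in for a read-only mapping of a snapshot file.
/// A real implementation would own the mmap and unmap it in `Drop`.
pub struct SnapshotMapping {
    pub path: PathBuf,
    pub bytes: Vec<u8>,
}

/// Cache keyed by snapshot path. It holds weak references only, so it
/// never keeps a mapping alive on its own.
#[derive(Default)]
pub struct SnapshotCache {
    entries: Mutex<HashMap<PathBuf, Weak<SnapshotMapping>>>,
}

impl SnapshotCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the shared mapping for `path`, calling `load` only on a miss.
    pub fn get_or_load<F>(&self, path: &Path, load: F) -> std::io::Result<Arc<SnapshotMapping>>
    where
        F: FnOnce(&Path) -> std::io::Result<Vec<u8>>,
    {
        let mut entries = self.entries.lock().unwrap();
        if let Some(existing) = entries.get(path).and_then(|weak| weak.upgrade()) {
            return Ok(existing); // cache hit: share the existing mapping
        }
        let mapping = Arc::new(SnapshotMapping {
            path: path.to_path_buf(),
            bytes: load(path)?,
        });
        entries.insert(path.to_path_buf(), Arc::downgrade(&mapping));
        Ok(mapping)
    }

    /// Prunes entries whose mapping has been dropped and returns how many remain.
    pub fn live_entries(&self) -> usize {
        let mut entries = self.entries.lock().unwrap();
        entries.retain(|_, weak| weak.strong_count() > 0);
        entries.len()
    }
}
```

A process-wide instance could be placed behind a `static` (for example with `std::sync::OnceLock`); that wiring is omitted here.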

R4 — Snapshot Restore (Fast-Reset)

Restore resets a Sandbox to clean snapshot state between guest calls. The mechanism is platform-specific:

ID Requirement
R4.1 Linux/KVM: restore SHALL use madvise(MADV_DONTNEED) on the MAP_PRIVATE snapshot mapping. The kernel discards dirty CoW pages; on the next guest access, KVM takes an EPT/NPT violation and the host kernel faults in clean pages from the file. The current codebase already uses MADV_DONTNEED on KVM for zeroing anonymous memory, but file-backed MAP_PRIVATE mappings have not been validated in this context. Phase 3 SHALL include a validation spike confirming that KVM does not cache host physical addresses in a way that goes stale after MADV_DONTNEED. If it does, an explicit KVM_SET_USER_MEMORY_REGION re-registration or an alternative approach will be needed.
R4.1a Linux/MSHV: MADV_DONTNEED is NOT safe — MSHV pins guest pages eagerly and would desync from the host view. Restore SHALL use either (a) explicit MSHV un-pin + MADV_DONTNEED + re-pin, or (b) selective memcpy of dirty pages from the snapshot file. The chosen approach SHALL be validated in Phase 3.
R4.1b Windows/WHP: there is no MADV_DONTNEED equivalent for file-mapped views. Restore SHALL unmap and remap the file-backed section through the full 3-way chain: WHvUnmapGpaRange → UnmapViewOfFile2 (surrogate) → UnmapViewOfFile (host) → MapViewOfFileEx (host, FILE_MAP_COPY) → MapViewOfFileNuma2 (surrogate) → WHvMapGpaRange2. The Placeholder API (MEM_REPLACE_PLACEHOLDER, Windows 10 1803+) SHOULD be used for the host view to guarantee address stability.
R4.2 Register state SHALL be reconstructed from the page-table root GPA (via standard_64bit_defaults()), entrypoint, and stack-top GVA — not read from the snapshot. GPRs are zeroed; RIP and RSP are set at dispatch time.
R4.3 Any dynamic mapped regions (slot ≥ 2) that were added after snapshot load SHALL be unmapped on restore; regions present in the snapshot SHALL be re-established.
R4.4 Restore latency SHALL be validated by benchmark per platform; target < 100 µs for a 64 MiB guest on Linux/KVM. Linux/MSHV and Windows/WHP targets TBD after Phase 3 measurement — MSHV may require un-pin/re-pin or dirty-page memcpy (R4.1a), and WHP requires a 6-syscall remap cycle (R4.1b), both of which have unknown latency profiles until validated.
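
Illustrative model (non-normative) of the semantics that R4.1 relies on. It does not call mmap or madvise; it imitates a private copy-on-write view in plain Rust so the expected behaviour can be stated precisely: writes create per-sandbox page copies, and restore discards those copies so reads fall through to the shared snapshot image again. `CowGuestMemory` and its methods are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::Arc;

const PAGE_SIZE: usize = 4096;

/// Models one sandbox's private view of a shared snapshot image.
pub struct CowGuestMemory {
    base: Arc<Vec<u8>>,             // shared, never modified
    dirty: HashMap<usize, Vec<u8>>, // private page copies, keyed by page index
}

impl CowGuestMemory {
    pub fn new(base: Arc<Vec<u8>>) -> Self {
        assert_eq!(base.len() % PAGE_SIZE, 0, "image must be page-aligned");
        CowGuestMemory { base, dirty: HashMap::new() }
    }

    /// Reads prefer the private copy and otherwise fall through to the base.
    pub fn read(&self, addr: usize) -> u8 {
        match self.dirty.get(&(addr / PAGE_SIZE)) {
            Some(page) => page[addr % PAGE_SIZE],
            None => self.base[addr],
        }
    }

    /// The first write to a page copies it from the base (the "CoW fault").
    pub fn write(&mut self, addr: usize, value: u8) {
        let page = addr / PAGE_SIZE;
        let base = &self.base;
        let copy = self
            .dirty
            .entry(page)
            .or_insert_with(|| base[page * PAGE_SIZE..(page + 1) * PAGE_SIZE].to_vec());
        copy[addr % PAGE_SIZE] = value;
    }

    pub fn dirty_pages(&self) -> usize {
        self.dirty.len()
    }

    /// Discards all private copies and returns how many pages were dropped.
    /// This is the effect the KVM path expects from MADV_DONTNEED.
    pub fn restore(&mut self) -> usize {
        let dropped = self.dirty.len();
        self.dirty.clear();
        dropped
    }
}
```

In this model the cost of restore depends on the number of dirty pages rather than on the guest size, which is the property the latency target in R4.4 would need the real mechanism to share.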

R5 — CoW Overlay Layers

ID Requirement
R5.1 The runtime SHALL support a layer stack: [base, overlay_1, overlay_2, …].
R5.2 Each overlay SHALL record only the pages that differ from its parent.
R5.3 On sandbox instantiation, the runtime SHALL compose the stack by mapping the base and then applying overlays in order via memcpy of dirty pages into the mapped base. On Linux/KVM, an optimisation MAY use mmap with MAP_FIXED to replace dirty page ranges directly from the layer file, avoiding the copy — but this is NOT safe on MSHV (pin-page desync risk) and has no equivalent on Windows. The memcpy path SHALL be the cross-platform default.
R5.4 Runtime layer creation: after executing a specialisation step inside a sandbox, the caller SHALL be able to capture a new overlay via sandbox.snapshot_layer() -> Result<LayerFile>.
R5.5 Layers SHALL use the same format version header and content hash scheme as base snapshots (R1.2, R1.3).
R5.6 A layer file SHALL reference its parent by content hash, enabling integrity verification of the full stack.
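
Illustrative sketch (non-normative) of the cross-platform compose path in R5.3 together with the parent-hash chain in R5.6. The `Layer` struct, its field names, and the `compose` function are assumptions for this example. Hashes are treated as opaque 32-byte values and are not recomputed here.

```rust
const PAGE_SIZE: usize = 4096;

pub type ContentHash = [u8; 32];

/// Hypothetical in-memory form of an overlay layer.
pub struct Layer {
    pub parent_hash: ContentHash,
    pub content_hash: ContentHash,
    pub dirty_pages: Vec<(usize, Vec<u8>)>, // (page index, page bytes)
}

#[derive(Debug, PartialEq)]
pub enum ComposeError {
    /// The layer names a different parent than the one beneath it in the stack.
    ParentMismatch { layer: usize, expected: ContentHash, actual: ContentHash },
    /// A dirty page is not exactly one page long or lies outside the image.
    BadPage { layer: usize, page: usize },
}

/// Copies each overlay's dirty pages over a private copy of the base, in order.
pub fn compose(
    base: &[u8],
    base_hash: ContentHash,
    layers: &[Layer],
) -> Result<Vec<u8>, ComposeError> {
    let mut image = base.to_vec();
    let mut parent = base_hash;
    for (index, layer) in layers.iter().enumerate() {
        if layer.parent_hash != parent {
            return Err(ComposeError::ParentMismatch {
                layer: index,
                expected: layer.parent_hash,
                actual: parent,
            });
        }
        for (page, bytes) in &layer.dirty_pages {
            let start = match page.checked_mul(PAGE_SIZE) {
                Some(s)
                    if bytes.len() == PAGE_SIZE
                        && image.len() >= PAGE_SIZE
                        && s <= image.len() - PAGE_SIZE =>
                {
                    s
                }
                _ => return Err(ComposeError::BadPage { layer: index, page: *page }),
            };
            image[start..start + PAGE_SIZE].copy_from_slice(bytes);
        }
        // The next overlay must name this layer as its parent.
        parent = layer.content_hash;
    }
    Ok(image)
}
```

Because each layer is checked against the hash of the layer beneath it, a stack assembled in the wrong order is rejected before any page is copied from the offending layer.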

R6 — Memory Sizing & Heap Growth

ID Requirement
R6.1 The snapshot format SHALL record the heap bounds at capture time.
R6.2 At load time, the caller MAY request a larger heap than the snapshot's original heap. The runtime SHALL append zero-initialised pages beyond the snapshot heap boundary ("grow-only append").
R6.3 Shrinking the heap below the snapshot size SHALL be rejected.
R6.4 Grow-only append SHALL NOT invalidate the snapshot content hash (the hash covers only original content; appended pages are deterministically zero).
R6.5 The guest-side allocator SHALL discover the extended heap via the SandboxMemoryLayout fields updated by the host before first entry.
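
Illustrative sketch (non-normative) of the grow-only rule in R6.2 and R6.3. `HeapGrowth::AppendPages` follows the name used in the migration sketch in section 4; the `Keep` and `TotalBytes` variants, the error type, and the page-alignment check are assumptions added for this example. The snapshot heap size is assumed to be page-aligned.

```rust
const PAGE_SIZE: u64 = 4096;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HeapGrowth {
    /// Keep the heap exactly as captured (hypothetical variant).
    Keep,
    /// Append this many zero pages.
    AppendPages(u64),
    /// Request an absolute heap size in bytes (hypothetical variant).
    TotalBytes(u64),
}

#[derive(Debug, PartialEq)]
pub enum HeapGrowthError {
    Shrink { snapshot_heap: u64, requested: u64 },
    Unaligned { requested: u64 },
}

/// Number of zero pages to append after the snapshot's heap boundary.
pub fn pages_to_append(snapshot_heap: u64, growth: HeapGrowth) -> Result<u64, HeapGrowthError> {
    match growth {
        HeapGrowth::Keep => Ok(0),
        HeapGrowth::AppendPages(pages) => Ok(pages),
        HeapGrowth::TotalBytes(requested) => {
            if requested % PAGE_SIZE != 0 {
                return Err(HeapGrowthError::Unaligned { requested });
            }
            if requested < snapshot_heap {
                // Shrinking below the captured heap is rejected.
                return Err(HeapGrowthError::Shrink { snapshot_heap, requested });
            }
            Ok((requested - snapshot_heap) / PAGE_SIZE)
        }
    }
}

/// Applies the growth to an in-memory heap image. Existing bytes are left
/// untouched, so a hash over the original content stays valid.
pub fn grow_heap(heap: &mut Vec<u8>, growth: HeapGrowth) -> Result<u64, HeapGrowthError> {
    let pages = pages_to_append(heap.len() as u64, growth)?;
    let new_len = heap.len() + (pages * PAGE_SIZE) as usize;
    heap.resize(new_len, 0); // appended pages are zero-filled
    Ok(pages)
}
```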

R7 — OCI Distribution

ID Requirement
R7.1 Snapshot files and layer files SHALL be publishable as OCI artefacts with dedicated media types (e.g. application/vnd.hyperlight.snapshot.v1, application/vnd.hyperlight.layer.v1).
R7.2 A manifest SHALL reference the base snapshot and ordered layers. Additional blobs (e.g. external files the host needs to map) MAY be included at the publisher's discretion but are not required by the format.
R7.3 A dedicated CLI tool (or hl subcommand) SHALL support push, pull, inspect operations against any OCI-compliant registry.
R7.4 The runtime SHALL support pulling a snapshot directly from a registry at sandbox creation time (with local caching).
R7.5 Signing and verification SHALL integrate with cosign / notation; the CLI SHALL support sign and verify subcommands.

R8 — Metrics, Logging & Tracing

ID Requirement
R8.1 The runtime SHALL emit metrics for: snapshot load time, restore time, bake time, layer compose time, OCI pull time, cache hit/miss ratio.
R8.2 Structured log events SHALL be emitted on snapshot version mismatch, hash verification failure, and architecture tag mismatch, including the expected vs. actual values.

R9 — Security

ID Requirement
R9.1 Snapshot files loaded from disk SHALL be hash-verified.
R9.2 OCI-pulled snapshots SHALL require signature verification by default; an explicit opt-out flag SHALL be necessary to skip it.
R9.3 The snapshot format parser SHALL be fuzz-tested (extend existing fuzz/ targets).
R9.4 External file content SHALL be hash-verified against the caller-supplied expected_hash in map_external_file before mapping into guest PA space.
R9.5 Mapped file regions SHALL be mapped read-only at the host level (PROT_READ / PAGE_READONLY). Guest-side permissions (read, execute, etc.) are controlled by the hypervisor's second-level page tables, not by host mmap flags. The runtime SHALL reject any attempt to map an external file as host-writable.

R10 — Unified Sandbox API (Direct Hyperlight Users)

ID Requirement
R10.1 The public API SHALL expose a single Sandbox type (replacing UninitializedSandbox and MultiUseSandbox). Sandbox is always ready to execute guest calls.
R10.2 Sandbox SHALL be constructed via a SandboxBuilder that accepts: a snapshot source (file path, OCI reference, or in-memory bytes), optional overlay layers, optional heap-growth parameter, host function registrations, and config overrides.
R10.3 Legacy-compat only (will be removed in a future release): SandboxBuilder SHALL also accept a GuestBinary + SandboxConfiguration as input and internally perform a one-shot bake-and-load, so that existing call-sites can migrate incrementally. This path SHALL be marked #[deprecated] from Day 1 with a message directing users to pre-bake snapshots.
R10.4 UninitializedSandbox, MultiUseSandbox, and the evolve() transition between them SHALL be removed from the public API.
R10.5 Host functions SHALL be registered on the builder (not on a transient uninitialised sandbox). The builder SHALL validate that all host functions referenced by the guest's import table are registered before returning a Sandbox.
R10.6 Sandbox SHALL expose at minimum: call<T>(&self, name, args) -> Result<T>, snapshot(&self) -> Result<SnapshotFile>, snapshot_layer(&self) -> Result<LayerFile>. Cleanup is handled by Drop.
R10.7 The Callable trait and call_guest_function_by_name pattern SHALL be preserved on Sandbox.
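
Illustrative sketch (non-normative) of the build-time validation in R10.5. This is a toy model, not the proposed API surface: the guest's import table is passed in as a list of names (a real builder would read it from the snapshot), host functions are simplified to `Fn(&str) -> String`, and `BuildError` and `call_host` are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

type HostFn = Box<dyn Fn(&str) -> String>;

#[derive(Debug, PartialEq)]
pub enum BuildError {
    /// Names the guest imports that have no registered host function.
    MissingHostFunctions(Vec<String>),
}

pub struct SandboxBuilder {
    guest_imports: BTreeSet<String>, // stand-in for the guest's import table
    host_functions: HashMap<String, HostFn>,
}

pub struct Sandbox {
    host_functions: HashMap<String, HostFn>,
}

impl SandboxBuilder {
    pub fn new(guest_imports: &[&str]) -> Self {
        SandboxBuilder {
            guest_imports: guest_imports.iter().map(|name| name.to_string()).collect(),
            host_functions: HashMap::new(),
        }
    }

    /// Registration happens on the builder, before any sandbox exists.
    pub fn with_host_function<F>(mut self, name: &str, f: F) -> Self
    where
        F: Fn(&str) -> String + 'static,
    {
        self.host_functions.insert(name.to_string(), Box::new(f));
        self
    }

    /// Fails if any guest import is unregistered, so a `Sandbox` is always callable.
    pub fn build(self) -> Result<Sandbox, BuildError> {
        let missing: Vec<String> = self
            .guest_imports
            .iter()
            .filter(|name| !self.host_functions.contains_key(*name))
            .cloned()
            .collect();
        if !missing.is_empty() {
            return Err(BuildError::MissingHostFunctions(missing));
        }
        Ok(Sandbox { host_functions: self.host_functions })
    }
}

impl Sandbox {
    /// Dispatches a host call by name, as the guest would trigger it.
    pub fn call_host(&self, name: &str, arg: &str) -> Option<String> {
        self.host_functions.get(name).map(|f| f(arg))
    }
}
```

Reporting every missing name at once, rather than failing on the first, lets a caller fix all registrations in one pass.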

R11 — Mapped File Regions

External files (e.g. Wasm modules) can be mapped read-only into the guest's physical address space without being absorbed into the snapshot. This is the only supported mapping mode — writable mappings are not supported.

ID Requirement
R11.1 Sandbox (and the bake tool) SHALL support mapping external files into guest PA space as read-only. The host maps the file with MAP_PRIVATE and read-only protection on Linux, and with CreateFileMapping(PAGE_READONLY) + MapViewOfFileEx on Windows. The guest sees the file at a fixed GPA. The snapshot does NOT absorb the file bytes; the host re-maps the file after loading the snapshot. On restore, the mapping is re-established from the original file.
R11.2 The current map_file_cow() (Linux-only, MAP_PRIVATE, guest flags READ|EXECUTE) SHALL be preserved under the new map_external_file API, maintaining the existing hyperlight-wasm MAPPED_BINARY_VA flow.
R11.3 The current map_region() SHALL be unified with map_file_cow() under a single map_external_file(path, gpa, expected_hash) API on SandboxBuilder (for bake-time) and Sandbox (for runtime).
R11.4 Drop for Sandbox (and the VM backend) SHALL unmap all dynamic regions.
R11.5 The snapshot format does not track external files (R1.8). It is the host application's responsibility to map any required external files after loading the snapshot, using map_external_file. The OCI manifest (R7.2) MAY declare companion blobs to assist tooling and distribution.
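
Illustrative sketch (non-normative) of the checks that R9.4, R9.5, and R11.3 place in front of a mapping. To stay dependency-free it operates on bytes already read from the file and uses a 64-bit FNV-1a digest as a stand-in; the requirements call for blake3, and FNV-1a is not a substitute for it in practice. `map_external_bytes`, `HostAccess`, `MappedRegion`, and the GPA page-alignment check are assumptions for this example.

```rust
const PAGE_SIZE: u64 = 4096;

/// Stand-in digest (FNV-1a, 64-bit). NOT blake3 and NOT collision-resistant;
/// it is used only so the example has no external dependencies.
pub fn stand_in_hash(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in bytes {
        hash ^= *byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HostAccess {
    ReadOnly,
    ReadWrite,
}

#[derive(Debug, PartialEq)]
pub enum MapError {
    WritableNotSupported,
    UnalignedGpa(u64),
    HashMismatch { expected: u64, actual: u64 },
}

#[derive(Debug, PartialEq)]
pub struct MappedRegion {
    pub gpa: u64,
    pub len: u64,
}

/// Runs every check before any mapping would be created.
pub fn map_external_bytes(
    file: &[u8],
    gpa: u64,
    expected_hash: u64,
    access: HostAccess,
) -> Result<MappedRegion, MapError> {
    if access != HostAccess::ReadOnly {
        return Err(MapError::WritableNotSupported); // host-writable is rejected
    }
    if gpa % PAGE_SIZE != 0 {
        return Err(MapError::UnalignedGpa(gpa)); // assumed requirement
    }
    let actual = stand_in_hash(file);
    if actual != expected_hash {
        return Err(MapError::HashMismatch { expected: expected_hash, actual });
    }
    // A real implementation would create the read-only host mapping here.
    Ok(MappedRegion { gpa, len: file.len() as u64 })
}
```

Verifying the digest before the mapping exists means a file that fails verification never becomes visible to the guest.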

3 Implementation Phases

| Phase | Title | Key Deliverables | Dependencies |
|---|---|---|---|
| 1 | Unified Sandbox Type & Builder | SandboxBuilder, Sandbox, remove UninitializedSandbox/evolve()/MultiUseSandbox. Internal bake-and-load behind builder. All existing tests green against new API. | |
| 2 | Snapshot Serialisation | Versioned binary format, SnapshotFile serialiser/deserialiser, content-hash verification, architecture tag. Fuzz target for parser. | |
| 3 | mmap-Based Load & Restore | mmap/CreateFileMapping load path. Platform-specific restore: MADV_DONTNEED (KVM), MSHV un-pin/re-pin or dirty-page memcpy (MSHV), full unmap/remap cycle with Placeholder API (WHP). Benchmark harness per platform. | Phase 2 |
| 4 | Mapped File Regions Refactor | Unify map_file_cow/map_region → map_external_file (read-only). expected_hash verification on map. Windows parity. | Phase 2 |
| 5 | Bake CLI & Library | hl snap bake command, SnapshotBuilder::bake() library API, --overlay support, mapped-file-aware bake. | Phases 2, 4 |
| 6 | CoW Overlay Layers | Layer format, layer compose on load, snapshot_layer() runtime API, parent-hash chaining. | Phase 3 |
| 7 | Heap Grow-Only Append | Append zero pages at load, update SandboxMemoryLayout, guest-side discovery. | Phase 3 |
| 8 | OCI Distribution | Media types, manifest schema, hl snap push/pull/inspect, registry pull at sandbox creation, local cache. | Phase 5 |
| 9 | Signing & Verification | cosign/notation integration, hl snap sign/verify, default-on verification for OCI pulls. | Phase 8 |

4 Migration Guide (Sketch)

Before (current API)

let mut usbox = UninitializedSandbox::new(
    GuestBinary::FilePath(path),
    Some(config),
    None, None, None,
)?;
usbox.register_host_function("HostFunc", |call| { /* … */ })?;
let mut sandbox = usbox.evolve(Noop)?;
let result = sandbox.call_guest_function_by_name::<String>("GuestFunc", args)?;

After (proposed API)

// One-shot from binary (legacy-compat, internally bakes + loads)
let sandbox = SandboxBuilder::new()
    .with_guest_binary(GuestBinary::FilePath(path))
    .with_config(config)
    .with_host_function("HostFunc", |call| { /* … */ })
    .build()?;

// From pre-baked snapshot (fast path)
let sandbox = SandboxBuilder::new()
    .with_snapshot("path/to/app.hlsnap")
    .with_layer("path/to/overlay.hllayer")
    .with_heap_growth(HeapGrowth::AppendPages(1024))
    .with_host_function("HostFunc", |call| { /* … */ })
    .build()?;

let result: String = sandbox.call("GuestFunc", args)?;

5 Open Questions

| # | Question | Options | Status |
|---|---|---|---|
| Q-1 | Snapshot file extension | .hlsnap / .hls / .hyperlight | Open |
| Q-2 | Layer file extension | .hllayer / .hll | Open |
| Q-3 | Compression algorithm for on-disk pages | zstd / lz4 / none | Open |
| Q-4 | OCI media type namespace | application/vnd.hyperlight.* / application/vnd.microsoft.hyperlight.* | Open |
| Q-5 | Transitional deprecation release before full removal of old API | Yes (one release with #[deprecated]) / No (straight removal) | Open |
