Skip to content

feat: add address allocation model for pointer-to-integer semantics #1002

@Stevengre

Description

@Stevengre

Problem

PR #812 added PtrLocal transmute-to-usize support, making the interior-mut3 test pass. However, the implementation uses the pointer metadata PTR_OFFSET (typically 0) as a stand-in for a real memory address, producing incorrect semantics:

Confirmed fail tests:

  • ptr-transmute-nonzero-fail: a stack pointer transmuted to usize yields 0, but real Rust stack addresses are never null
  • ptr-transmute-two-locals-fail: two different local variable pointers both transmute to 0, making them indistinguishable

Root cause: KMIR uses symbolic references (PtrLocal(offset, place(local(N), ...))) to represent pointers with no numeric address concept. When Rust code converts a pointer to an integer (transmute, ptr as usize), KMIR cannot produce a meaningful value.

Industry Comparison

Miri: Lazy Address Allocation + Provenance Tracking

Miri pointers are abstract (AllocId, offset) pairs. Concrete addresses are only assigned on demand (at ptr-to-int cast time).

Core mechanism:

  • alloc_addresses/mod.rs maintains base_addr: FxHashMap<AllocId, u64>
  • ptr→int: addr_from_alloc_id() lazily assigns a base address, returns base + offset, adds AllocId to the exposed set
  • int→ptr: searches the exposed set for an allocation whose address range contains the target integer, recovering provenance
  • Inserts random gaps between allocations to simulate real allocator behavior

Source: alloc_addresses/mod.rs

Kani: CBMC Object Encoding, No Provenance

Kani compiles Rust to CBMC's GOTO programs, inheriting CBMC's object-based memory model.

Core mechanism:

  • Pointers are bitvectors: top N bits (default 16) = object-id, remaining bits = offset
  • ptr→int and int→ptr are both bitvector reinterpretations (codegen_misc_castcast_to)
  • Does not distinguish PointerExposeAddress, Transmute, IntToInt — all go through the same path
  • Does not track provenance — Issue #1274 decided not to handle it

Limitation: Cast integers are not real addresses but CBMC's internal (object-id << N | offset) encoding. Integer arithmetic mixes object-id and offset bits.

Source: rvalue.rs
Docs: Kani Rust Feature Support

Verus: Strict Provenance + Ghost Permission Tokens

Verus follows Rust's Strict Provenance model (RFC 3559), using ghost types for memory permission tracking.

Core mechanism:

  • Pointer = PtrData<T> = (address: usize, provenance: Provenance, metadata)
  • Memory access requires PointsTo<T> ghost permission token
  • ptr→int: cast_ptr_to_usize extracts the address component (no preconditions)
  • int→ptr: requires expose_provenance() to obtain an IsExposed ghost token, then with_exposed_provenance() to reconstruct

Source: raw_ptr.rs
PR: verus#1244 (Ptrs)

Comparison Summary

Dimension Miri Kani Verus KMIR (current)
Pointer repr (AllocId, offset) bitvector (obj_id, offset) (addr, provenance, meta) PtrLocal(offset, place, ...)
ptr→int lazy base + offset bitvector cast extract addr PTR_OFFSET (usually 0)
int→ptr search exposed set bitvector cast needs IsExposed token unsupported
Provenance Stacked/Tree Borrows none ghost token none
Address uniqueness yes (random gaps) yes (unique obj-id) yes (abstract) no (all zero)

Rust-level reference: unsafe-code-guidelines#286

Proposed Approach: Miri-style Lazy Address Allocation

Add address mapping cells to the K configuration:

<addressMap> .Map </addressMap>        // local(N) -> (base_addr, size)
<nextAddress> 4096:Int </nextAddress>  // start from non-zero to avoid NULL
<exposedSet> .Set </exposedSet>        // allocations with exposed provenance

ptr→int rule:

rule <k> #cast(PtrLocal(_, place(LOCAL, _), _, metadata(_, PTR_OFFSET, _)),
               castKindTransmute, _, TY_TARGET)
      => #intAsType(BASE +Int (PTR_OFFSET *Int ELEM_SIZE), ...) ... </k>
     <addressMap> ... LOCAL |-> (BASE, _) ... </addressMap>
  requires #isIntType(lookupTy(TY_TARGET))

// Lazy allocation: if LOCAL doesn't have an address yet
rule <k> #cast(PtrLocal(_, place(LOCAL, _), _, _), castKindTransmute, _, TY_TARGET)
      => #allocAddress(LOCAL, SIZE) ~> #cast(...) ... </k>
     <addressMap> MAP </addressMap>
  requires #isIntType(lookupTy(TY_TARGET))
   andBool notBool LOCAL in_keys(MAP)

Pros:

  • Most complete semantics: different allocations get different addresses, supports roundtrip
  • Lazy — no overhead for programs that never do ptr→int
  • Can support both PointerExposeAddress and with_exposed_provenance

Cons:

  • Requires new configuration cells
  • Concrete vs symbolic address decision needed for symbolic execution
  • Stack frame lifetime management (cleanup on function return)

Related PRs / Issues (can be closed)

  • PR #812 — current PTR_OFFSET workaround, superseded by address model
  • PR #877 — skips alignment checking block; with real addresses, alignment checks execute normally
  • Issue #638 — "Extend pointer emulation to pass alignment check" — exactly what the address model achieves

Acceptance Criteria

  • ptr-transmute-nonzero-fail.rs renamed to ptr-transmute-nonzero.rs (test passes)
  • ptr-transmute-two-locals-fail.rs renamed to ptr-transmute-two-locals.rs (test passes)
  • interior-mut3.rs continues to pass
  • Support castKindPointerExposeAddress (ptr as usize)
  • Support castKindPointerWithExposedProvenance (usize as *const T) via exposed set
  • Different allocations produce different ptr→int results
  • All ptr→int results are non-zero
  • ptr→int→ptr roundtrip recovers the original allocation
  • No regressions: all existing prove tests pass

Metadata

Metadata

Assignees

No one assigned

    Labels

    status:readyClear and approved by human

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions