-
Notifications
You must be signed in to change notification settings - Fork 5
feat: add address allocation model for pointer-to-integer semantics #1002
Description
Problem
PR #812 added PtrLocal transmute-to-usize support, making the interior-mut3 test pass. However, the implementation uses the pointer metadata PTR_OFFSET (typically 0) as a stand-in for a real memory address, producing incorrect semantics:
Confirmed fail tests:
ptr-transmute-nonzero-fail: a stack pointer transmuted tousizeyields0, but real Rust stack addresses are never nullptr-transmute-two-locals-fail: two different local variable pointers both transmute to0, making them indistinguishable
Root cause: KMIR uses symbolic references (PtrLocal(offset, place(local(N), ...))) to represent pointers with no numeric address concept. When Rust code converts a pointer to an integer (transmute, ptr as usize), KMIR cannot produce a meaningful value.
Industry Comparison
Miri: Lazy Address Allocation + Provenance Tracking
Miri pointers are abstract (AllocId, offset) pairs. Concrete addresses are only assigned on demand (at ptr-to-int cast time).
Core mechanism:
alloc_addresses/mod.rsmaintainsbase_addr: FxHashMap<AllocId, u64>- ptr→int:
addr_from_alloc_id()lazily assigns a base address, returnsbase + offset, adds AllocId to theexposedset - int→ptr: searches the
exposedset for an allocation whose address range contains the target integer, recovering provenance - Inserts random gaps between allocations to simulate real allocator behavior
Source: alloc_addresses/mod.rs
Kani: CBMC Object Encoding, No Provenance
Kani compiles Rust to CBMC's GOTO programs, inheriting CBMC's object-based memory model.
Core mechanism:
- Pointers are bitvectors: top N bits (default 16) = object-id, remaining bits = offset
- ptr→int and int→ptr are both bitvector reinterpretations (
codegen_misc_cast→cast_to) - Does not distinguish
PointerExposeAddress,Transmute,IntToInt— all go through the same path - Does not track provenance — Issue #1274 decided not to handle it
Limitation: Cast integers are not real addresses but CBMC's internal (object-id << N | offset) encoding. Integer arithmetic mixes object-id and offset bits.
Source: rvalue.rs
Docs: Kani Rust Feature Support
Verus: Strict Provenance + Ghost Permission Tokens
Verus follows Rust's Strict Provenance model (RFC 3559), using ghost types for memory permission tracking.
Core mechanism:
- Pointer =
PtrData<T>=(address: usize, provenance: Provenance, metadata) - Memory access requires
PointsTo<T>ghost permission token - ptr→int:
cast_ptr_to_usizeextracts theaddresscomponent (no preconditions) - int→ptr: requires
expose_provenance()to obtain anIsExposedghost token, thenwith_exposed_provenance()to reconstruct
Source: raw_ptr.rs
PR: verus#1244 (Ptrs)
Comparison Summary
| Dimension | Miri | Kani | Verus | KMIR (current) |
|---|---|---|---|---|
| Pointer repr | (AllocId, offset) |
bitvector (obj_id, offset) |
(addr, provenance, meta) |
PtrLocal(offset, place, ...) |
| ptr→int | lazy base + offset |
bitvector cast | extract addr |
PTR_OFFSET (usually 0) |
| int→ptr | search exposed set | bitvector cast | needs IsExposed token |
unsupported |
| Provenance | Stacked/Tree Borrows | none | ghost token | none |
| Address uniqueness | yes (random gaps) | yes (unique obj-id) | yes (abstract) | no (all zero) |
Rust-level reference: unsafe-code-guidelines#286
Proposed Approach: Miri-style Lazy Address Allocation
Add address mapping cells to the K configuration:
<addressMap> .Map </addressMap> // local(N) -> (base_addr, size)
<nextAddress> 4096:Int </nextAddress> // start from non-zero to avoid NULL
<exposedSet> .Set </exposedSet> // allocations with exposed provenanceptr→int rule:
rule <k> #cast(PtrLocal(_, place(LOCAL, _), _, metadata(_, PTR_OFFSET, _)),
castKindTransmute, _, TY_TARGET)
=> #intAsType(BASE +Int (PTR_OFFSET *Int ELEM_SIZE), ...) ... </k>
<addressMap> ... LOCAL |-> (BASE, _) ... </addressMap>
requires #isIntType(lookupTy(TY_TARGET))
// Lazy allocation: if LOCAL doesn't have an address yet
rule <k> #cast(PtrLocal(_, place(LOCAL, _), _, _), castKindTransmute, _, TY_TARGET)
=> #allocAddress(LOCAL, SIZE) ~> #cast(...) ... </k>
<addressMap> MAP </addressMap>
requires #isIntType(lookupTy(TY_TARGET))
andBool notBool LOCAL in_keys(MAP)Pros:
- Most complete semantics: different allocations get different addresses, supports roundtrip
- Lazy — no overhead for programs that never do ptr→int
- Can support both
PointerExposeAddressandwith_exposed_provenance
Cons:
- Requires new configuration cells
- Concrete vs symbolic address decision needed for symbolic execution
- Stack frame lifetime management (cleanup on function return)
Related PRs / Issues (can be closed)
- PR #812 — current
PTR_OFFSETworkaround, superseded by address model - PR #877 — skips alignment checking block; with real addresses, alignment checks execute normally
- Issue #638 — "Extend pointer emulation to pass alignment check" — exactly what the address model achieves
Acceptance Criteria
-
ptr-transmute-nonzero-fail.rsrenamed toptr-transmute-nonzero.rs(test passes) -
ptr-transmute-two-locals-fail.rsrenamed toptr-transmute-two-locals.rs(test passes) -
interior-mut3.rscontinues to pass - Support
castKindPointerExposeAddress(ptr as usize) - Support
castKindPointerWithExposedProvenance(usize as *const T) via exposed set - Different allocations produce different ptr→int results
- All ptr→int results are non-zero
- ptr→int→ptr roundtrip recovers the original allocation
- No regressions: all existing prove tests pass