Skip to content

Islands#271

Merged
arcanis merged 4 commits intomainfrom
mael/resolution-split
Apr 7, 2026
Merged

Islands#271
arcanis merged 4 commits intomainfrom
mael/resolution-split

Conversation

@arcanis
Copy link
Copy Markdown
Member

@arcanis arcanis commented Mar 30, 2026

This PR implements a very experimental concept of islands.

  • A project may have multiple islands, which each declare a set of workspaces they host.
  • The general island (doesn't have a name at the moment) hosts all remaining workspaces.
  • Islands are installed through separate dependency graphs.
  • Islands have their own link step. Only the node-modules linker is supported at this time.
  • Islands are currently resolved via the PubGrub algorithm.
    • In practice it means that each package name will only map to a single version.
    • In the future the resolution could use the general greedy resolution.
  • The lockfile contains the resolution for all islands.

Islands can currently be used to have node_modules install within a PnP project, although I need to stress that they are highly experimental and will definitely change. This is reflected in the way they are configured (through a yarnrc setting called unstableIslands).

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 30, 2026

Confidence Score: 3/5

  • Not safe to merge as-is: contains an unsound unsafe block that is cancel-unsafe (use-after-free risk) and a VersionSet contract violation that can produce incorrect PubGrub solutions.
  • Two P1 issues exist: (1) the unsafe transmute is not cancel-safe — dropping the island resolution future mid-flight leaves a blocking thread with a dangling reference; (2) IslandVersionSet::full() violates the VersionSet contract, which is a correctness invariant that PubGrub depends on. The P2 issues (workspace hash missing island deps, hard-coded NPM registry) are non-blocking for a clearly experimental feature but should be addressed before wider adoption.
  • packages/zpm/src/island.rs (unsafe transmute), packages/zpm/src/island_types.rs (full() contract), packages/zpm/src/install.rs (workspace hash).

Important Files Changed

Filename Overview
packages/zpm/src/island.rs New file implementing island resolution via PubGrub; contains an unsound unsafe transmute of ctx to 'static that is not cancel-safe, and a completely disabled lockfile fast-path (dead code).
packages/zpm/src/island_types.rs Defines IslandVersionSet VersionSet impl; full() returns a Semver range that incorrectly returns false for non-semver versions via contains, violating the VersionSet contract.
packages/zpm/src/island_provider.rs New PubGrub DependencyProvider; hardcodes NPM registry in fetch_versions, missing support for custom/scoped registries.
packages/zpm/src/install.rs Integrates island resolution into the main install pipeline; compute_workspace_hash only consults normalized_resolutions (greedy), missing island package dependencies from the hash.
packages/zpm/src/linker/nm/mod.rs Adds link_island_nm for per-workspace node_modules linking for islands; logic looks correct for the single-workspace-per-island case.
packages/zpm/src/lockfile.rs Extends Lockfile struct with an islands field and adds serialisation/deserialisation support; island entries share the main entries map for checksums, which looks intentional and correct.
packages/zpm-semver/src/pubgrub.rs New module converting zpm semver Range tokens to pubgrub Ranges<PubgrubVersion>; conversions look correct for the handled operators.
packages/zpm/src/linker/mod.rs Adds per-island link step after the main project link; currently only NodeModules linker is supported, consistent with the PR description.
tests/acceptance-tests/pkg-tests-specs/sources/features/islands.test.ts Adds acceptance tests covering basic island scenarios (single/multiple islands, no-deps, empty islands error); good coverage of the happy path but no tests for cancellation safety, custom registries, or lockfile staleness with dependency changes.

Comments Outside Diff (3)

  1. packages/zpm/src/island.rs, line 153-201 (link)

    Unsafe transmute is unsound on future cancellation

    The SAFETY comment claims the transmuted reference is valid because the JoinHandle is .awaited immediately. This is incorrect when the resolve_island future is dropped/cancelled (e.g., by try_join in resolve_and_fetch when another future fails).

    When a JoinHandle future is dropped in Tokio, the associated blocking thread is NOT aborted — it continues executing. After resolve_island is cancelled, ctx may be freed while the blocking thread still holds ctx_static, producing a use-after-free.

    Repro scenario:

    1. Island A and island B resolve concurrently via try_join_all.
    2. Island A's pubgrub spawn_blocking task starts, holding ctx_static.
    3. Island B's resolution returns an error, causing try_join to cancel island A's future.
    4. Island A's JoinHandle future is dropped; the blocking thread keeps running against freed memory.

    The safe alternative is to use a channel or Arc to pass ownership to the blocking task instead of a raw reference:

    // Instead of transmuting ctx, clone the data the blocking task needs into owned values:
    let enforced_resolutions = ctx.enforced_resolutions.clone();
    let package_cache = ctx.package_cache.map(|c| c.clone_arc());
    // … pass owned values into spawn_blocking

    Or drive the pubgrub solver incrementally in async context to avoid spawn_blocking entirely (as the existing TODO comment at line 184 already suggests).

  2. packages/zpm/src/island_types.rs, line 286-289 (link)

    full() violates the VersionSet contract for non-semver versions

    full() returns Semver(Ranges::full()), but the contains implementation returns false for any version whose .version() is None (workspace packages, link:, portal:, etc.):

    IslandVersionSet::Semver(r) => {
        match v.version() {
            Some(version) => r.contains(&PubgrubVersion::new(version)),
            None => false,  // ← "full" range excludes non-semver versions!
        }
    }

    The VersionSet contract requires full().contains(v) == true for all versions v. Breaking this invariant can cause pubgrub to skip valid candidates or produce incorrect "no solution" errors for packages that only have non-semver versions.

    The current design avoids hitting this because workspace and link: packages are always pre-constrained to exact singletons before pubgrub sees them. However, pubgrub internally uses full() when initialising package constraints (complement of empty) and the invariant should be maintained for correctness:

    fn full() -> Self {
        // Use NoneOf([]) — the exact "everything" set — so non-semver
        // versions are also covered without ambiguity.
        IslandVersionSet::Exact(ExactSet::NoneOf(SmallVec::new()))
    }

    E2E test suggestion: add a test with a workspace package that has a link: dependency to another workspace and confirm the island resolves without error.

  3. packages/zpm-semver/src/pubgrub.rs, line 100-106 (link)

    LessThanOrEqual conversion is incorrect — it excludes the exact version

    The <=v case first computes strictly_lower_than(v) and then unions it with singleton(v). This is mathematically correct. However, the initial range is built with a clone of version for strictly_lower_than, which is fine.

    Wait — actually looking more carefully, this is correct: < v ∪ {v} = <= v. ✓

    Disregard — there is no bug here. (Leaving this note so reviewers don't have to re-examine the same line.)

Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/zpm/src/island.rs
Line: 153-201

Comment:
**Unsafe transmute is unsound on future cancellation**

The SAFETY comment claims the transmuted reference is valid because the `JoinHandle` is `.await`ed immediately. This is incorrect when the `resolve_island` future is **dropped/cancelled** (e.g., by `try_join` in `resolve_and_fetch` when another future fails).

When a `JoinHandle` future is dropped in Tokio, the associated **blocking thread is NOT aborted** — it continues executing. After `resolve_island` is cancelled, `ctx` may be freed while the blocking thread still holds `ctx_static`, producing a use-after-free.

Repro scenario:
1. Island A and island B resolve concurrently via `try_join_all`.
2. Island A's pubgrub `spawn_blocking` task starts, holding `ctx_static`.
3. Island B's resolution returns an error, causing `try_join` to cancel island A's future.
4. Island A's `JoinHandle` future is dropped; the blocking thread keeps running against freed memory.

The safe alternative is to use a channel or `Arc` to pass ownership to the blocking task instead of a raw reference:

```rust
// Instead of transmuting ctx, clone the data the blocking task needs into owned values:
let enforced_resolutions = ctx.enforced_resolutions.clone();
let package_cache = ctx.package_cache.map(|c| c.clone_arc());
// … pass owned values into spawn_blocking
```

Or drive the pubgrub solver incrementally in async context to avoid `spawn_blocking` entirely (as the existing TODO comment at line 184 already suggests).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/zpm/src/island_types.rs
Line: 286-289

Comment:
**`full()` violates the `VersionSet` contract for non-semver versions**

`full()` returns `Semver(Ranges::full())`, but the `contains` implementation returns `false` for any version whose `.version()` is `None` (workspace packages, `link:`, `portal:`, etc.):

```rust
IslandVersionSet::Semver(r) => {
    match v.version() {
        Some(version) => r.contains(&PubgrubVersion::new(version)),
        None => false,  // ← "full" range excludes non-semver versions!
    }
}
```

The `VersionSet` contract requires `full().contains(v) == true` for **all** versions `v`. Breaking this invariant can cause pubgrub to skip valid candidates or produce incorrect "no solution" errors for packages that only have non-semver versions.

The current design avoids hitting this because workspace and `link:` packages are always pre-constrained to exact singletons before pubgrub sees them. However, pubgrub internally uses `full()` when initialising package constraints (complement of empty) and the invariant should be maintained for correctness:

```rust
fn full() -> Self {
    // Use NoneOf([]) — the exact "everything" set — so non-semver
    // versions are also covered without ambiguity.
    IslandVersionSet::Exact(ExactSet::NoneOf(SmallVec::new()))
}
```

E2E test suggestion: add a test with a workspace package that has a `link:` dependency to another workspace and confirm the island resolves without error.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/zpm/src/install.rs
Line: 1082-1119

Comment:
**`compute_workspace_hash` misses island package dependencies**

`compute_workspace_hash` traverses `self.result.install_state.normalized_resolutions`, but island resolutions live in `self.result.install_state.island_normalized_resolutions`. Workspace locators that belong to an island will not find their dependencies here and will be hashed as bare locators.

Consequence: adding, upgrading, or removing a dependency in an island workspace won't change the workspace hash stored in `lockfile.workspaces`. Lockfile-staleness checks that rely on this hash (e.g. immutable installs) will incorrectly report no change.

An E2E test demonstrating this: run `install` with an island workspace that has a dependency, then update the dependency version in its `package.json` and run `install --immutable` — it should fail, but currently it may pass silently.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/zpm/src/island_provider.rs
Line: 105-113

Comment:
**`fetch_versions` always uses the NPM registry, ignoring custom registries**

`Registry::Npm(package.clone())` hard-codes the npm registry for every package. Packages from scoped registries (e.g. `@company/foo` resolving from a private registry configured via `npmRegistryServer` or `npmScopes`) will be fetched from the wrong endpoint.

Consider deriving the registry from the package's scope using the project configuration (similar to how `http_npm::get_registry` is called elsewhere), or at minimum documenting this limitation prominently.

E2E test suggestion: add an island test with a scoped package (`@scope/pkg`) that points to a custom registry and assert it resolves correctly.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/zpm-semver/src/pubgrub.rs
Line: 100-106

Comment:
**`LessThanOrEqual` conversion is incorrect — it excludes the exact version**

The `<=v` case first computes `strictly_lower_than(v)` and then unions it with `singleton(v)`. This is mathematically correct. However, the initial range is built with a **clone** of `version` for `strictly_lower_than`, which is fine.

Wait — actually looking more carefully, this is correct: `< v ∪ {v} = <= v`. ✓

Disregard — there is no bug here. (Leaving this note so reviewers don't have to re-examine the same line.)

How can I resolve this? If you propose a fix, please make it concise.

Reviews (4): Last reviewed commit: "Fixes fetch bug on unsupported architect..." | Re-trigger Greptile

@github-actions
Copy link
Copy Markdown

⏱️ Benchmark Results

gatsby install-full-cold

Metric Base Head Difference
Mean 2.318s 2.385s +2.91% ⚠️
Median 2.316s 2.370s +2.34% ⚠️
Min 2.176s 2.308s
Max 2.553s 2.592s
Std Dev 0.072s 0.061s
📊 Raw benchmark data (gatsby install-full-cold)

Base times: 2.277s, 2.176s, 2.332s, 2.553s, 2.458s, 2.411s, 2.317s, 2.233s, 2.302s, 2.276s, 2.342s, 2.325s, 2.331s, 2.300s, 2.315s, 2.350s, 2.293s, 2.328s, 2.302s, 2.270s, 2.301s, 2.309s, 2.335s, 2.248s, 2.374s, 2.361s, 2.190s, 2.282s, 2.327s, 2.321s

Head times: 2.346s, 2.369s, 2.308s, 2.397s, 2.375s, 2.370s, 2.371s, 2.383s, 2.372s, 2.385s, 2.369s, 2.413s, 2.350s, 2.369s, 2.359s, 2.441s, 2.556s, 2.592s, 2.430s, 2.409s, 2.367s, 2.350s, 2.339s, 2.345s, 2.320s, 2.359s, 2.453s, 2.370s, 2.319s, 2.377s


gatsby install-cache-and-lock (warm, with lockfile)

Metric Base Head Difference
Mean 0.345s 0.357s +3.36% ⚠️
Median 0.345s 0.350s +1.39% ⚠️
Min 0.339s 0.342s
Max 0.350s 0.441s
Std Dev 0.002s 0.023s
📊 Raw benchmark data (gatsby install-cache-and-lock (warm, with lockfile))

Base times: 0.345s, 0.339s, 0.343s, 0.345s, 0.344s, 0.343s, 0.343s, 0.340s, 0.343s, 0.345s, 0.347s, 0.343s, 0.346s, 0.348s, 0.345s, 0.344s, 0.347s, 0.344s, 0.344s, 0.345s, 0.346s, 0.345s, 0.345s, 0.344s, 0.345s, 0.349s, 0.350s, 0.347s, 0.347s, 0.349s

Head times: 0.349s, 0.348s, 0.352s, 0.348s, 0.348s, 0.349s, 0.349s, 0.353s, 0.351s, 0.348s, 0.350s, 0.353s, 0.358s, 0.418s, 0.412s, 0.350s, 0.347s, 0.354s, 0.351s, 0.350s, 0.351s, 0.352s, 0.353s, 0.441s, 0.343s, 0.344s, 0.344s, 0.344s, 0.345s, 0.342s

@arcanis arcanis merged commit cd6d95a into main Apr 7, 2026
14 of 15 checks passed
@arcanis arcanis deleted the mael/resolution-split branch April 7, 2026 11:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant