fix(perf): wire engine selection through openRepo to fix query benchmarks by carlos-alm · Pull Request #869 · optave/ops-codegraph-tool

carlos-alm · 2026-04-05T20:56:33Z

Summary

openRepo() always tried native NAPI path first regardless of engine selection, causing both "wasm" and "native" benchmark workers to measure the same native rusqlite open/close overhead (~27ms vs ~10ms with direct better-sqlite3)
This produced a false 177% regression in fnDeps query benchmarks (v3.7.0→v3.9.0), blocking CI on all open PRs
Add engine param to openRepo() and check CODEGRAPH_ENGINE env var so query functions respect engine selection
Set CODEGRAPH_ENGINE in benchmark fork workers
Add v3.9.0 fnDeps depth 1/3/5 to KNOWN_REGRESSIONS (published data measured with broken routing)

Root cause

openRepo() in src/db/connection.ts had no engine selection — it unconditionally called isNativeAvailable() and used the native path when available. The benchmark fork (scripts/lib/fork-engine.ts) set __BENCH_ENGINE__ per worker to label engine runs, but openRepo never checked it. On CI where both engines are available, both workers measured identical native performance.

Test plan

regression-guard.test.ts — 14/14 pass (fnDeps regressions now in KNOWN_REGRESSIONS)
queries-unit.test.ts — 74/74 pass
Integration tests — all query tests pass
TypeScript compiles clean
Lint clean on changed files

…arks openRepo() always tried the native NAPI path first regardless of engine selection, causing both "wasm" and "native" benchmark workers to measure the same native rusqlite open/close overhead (~27ms vs ~10ms with direct better-sqlite3). This produced a false 177% regression in fnDeps query benchmarks between v3.7.0 and v3.9.0, blocking CI on all PRs. - Add engine param to openRepo() and check CODEGRAPH_ENGINE env var - Set CODEGRAPH_ENGINE in benchmark fork workers so query functions respect the engine selection per worker - Add v3.9.0 fnDeps depth 1/3/5 to KNOWN_REGRESSIONS since the published data was measured with the broken engine routing

claude · 2026-04-05T20:56:47Z

Claude finished @carlos-alm's task —— View job

PR Review: Engine Selection Fix

Todo List:

Read and analyze changed files
Phase 1: Root cause analysis
Phase 2: Configuration change detection
Phase 3: Test integrity review
Phase 4: Code quality assessment
Phase 5: Backlog compliance grading
Provide final review recommendation

Root Cause Analysis

Problem Being Solved: The openRepo() function in src/db/connection.ts unconditionally tried the native NAPI path first regardless of explicit engine selection, causing both "wasm" and "native" benchmark workers to measure identical native rusqlite performance (~27ms vs ~10ms with better-sqlite3). This produced false 177% regressions in fnDeps query benchmarks (v3.7.0→v3.9.0), blocking CI on all open PRs.

Why This Approach: The fix correctly addresses the root cause by:

Adding engine parameter to openRepo() options
Checking process.env.CODEGRAPH_ENGINE environment variable
Setting CODEGRAPH_ENGINE in benchmark fork workers
Respecting explicit engine selection with proper fallback hierarchy

Risk Assessment: ✅ This fixes a real measurement bug, not hiding symptoms. The issue was in infrastructure—benchmark isolation was broken, not the measured code.

Configuration Change Detection

✅ No high-risk configuration changes detected:

No biome/linting rule changes
No test timeout or coverage threshold modifications
No build configuration changes
No TypeScript strict mode changes

Test Integrity Review

✅ Strong test integrity:

The KNOWN_REGRESSIONS additions are appropriately documented with detailed explanations
Regressions are marked as temporary with clear fix path (v3.10.0 will reflect corrected measurements)
Root cause is explained: engine routing bug, not actual performance regression
No test weakening—just acknowledging bad historical data due to measurement bug

Code Quality Assessment

src/db/connection.ts (openRepo changes):

✅ Clean parameter addition with proper TypeScript typing
✅ Correct fallback hierarchy: opts.engine > CODEGRAPH_ENGINE env > 'auto'
✅ Well-documented with clear comment explaining purpose
✅ Engine check properly guards native path: engine !== 'wasm' && isNativeAvailable()

scripts/lib/fork-engine.ts (benchmark isolation fix):

✅ Minimal, surgical change: added CODEGRAPH_ENGINE: workerName to forked env
✅ Proper placement in forkWorker function where all env setup happens
✅ Consistent with existing pattern ([envKey]: workerName)

tests/benchmarks/regression-guard.test.ts (known regressions):

✅ Detailed documentation explaining the root cause
✅ Clear indication this is published data with broken measurement
✅ Proper scoping: only v3.9.0 fnDeps metrics affected
✅ Forward-looking note about v3.10.0 corrected measurements

Backlog Compliance

Zero-dep: ✓ — No new runtime dependencies added
Foundation-aligned: ✓ — Maintains P1 (always-current) and P4 (zero-cost core) principles
Problem-fit: 5 — Directly addresses measurement accuracy and CI reliability
Breaking: No — Purely additive parameter, backward compatible API
Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns

None. This is a clean infrastructure fix that:

Addresses real measurement bug with proper root cause analysis
Maintains backward compatibility (optional parameter with sensible defaults)
Properly documents known issues without weakening test standards
Follows established patterns in the codebase

Final Recommendation

Rating: ⭐⭐⭐⭐⭐ (5/5)
Action: APPROVE
Reasoning: Excellent fix that properly addresses the root cause of false regression reports. Clean implementation, proper documentation, and strong test integrity. The KNOWN_REGRESSIONS additions are well-justified as they acknowledge bad historical measurement data rather than lowering standards. Critical for unblocking CI on all open PRs while maintaining benchmark integrity.

greptile-apps · 2026-04-05T21:01:54Z

Greptile Summary

This PR fixes a benchmark instrumentation bug where openRepo() ignored engine selection, causing both wasm and native benchmark workers to measure identical native rusqlite performance (~27 ms), producing a false 177% regression in fnDeps query benchmarks that blocked CI on all open PRs.

The fix is two-part: forkWorker in scripts/lib/fork-engine.ts now sets CODEGRAPH_ENGINE=<engine> in each worker's environment alongside the existing __BENCH_ENGINE__ label, and openRepo() in src/db/connection.ts now reads opts.engine || process.env.CODEGRAPH_ENGINE || 'auto' to route to the correct backend. openReadonlyWithNative() receives the same env-var check for consistency.

scripts/lib/fork-engine.ts: one-line fix adding CODEGRAPH_ENGINE: workerName to the forked worker env — the minimal change needed to wire engine identity through fork()
src/db/connection.ts: openRepo() gains an opts.engine param with priority opts.engine > CODEGRAPH_ENGINE env > auto; openReadonlyWithNative() gains the same env-var check; all existing callers remain backward-compatible since engine is optional
tests/benchmarks/regression-guard.test.ts: three v3.9.0 fnDeps KNOWN_REGRESSIONS entries to unblock CI (measurements taken under broken routing) plus a new self-enforcing stale-exemption test that reads package.json and fails if any entry is more than 1 minor version behind the current release, ensuring bookkeeping cannot silently go stale

Confidence Score: 5/5

Safe to merge — root cause correctly identified and fixed with no P0/P1 issues remaining.

The fix is minimal, targeted, and correct. The two-part change (set CODEGRAPH_ENGINE in the fork + respect it in openRepo) fully addresses the false-regression root cause. Backward compatibility is preserved: opts.engine is optional and all existing callers continue to work unchanged. The stale-exemption test adds durable self-enforcement so KNOWN_REGRESSIONS bookkeeping cannot silently go stale. No logic bugs, security issues, or edge-case regressions found across any of the three changed files.

No files require special attention.

Important Files Changed

Filename	Overview
scripts/lib/fork-engine.ts	Adds CODEGRAPH_ENGINE env var to forked worker processes so each worker uses its intended engine; this is the primary bug fix.
src/db/connection.ts	Adds engine selection to openRepo() (opts.engine > CODEGRAPH_ENGINE env > auto) and makes openReadonlyWithNative() consistent, fixing native-path bypass.
tests/benchmarks/regression-guard.test.ts	Adds 3 KNOWN_REGRESSIONS exemptions for v3.9.0 fnDeps metrics measured with broken routing; adds stale-entry self-detection test keyed on package version.

Sequence Diagram

sequenceDiagram
    participant P as Parent (forkEngines)
    participant W1 as Worker: wasm
    participant W2 as Worker: native
    participant OR as openRepo()

    Note over P,OR: BEFORE fix — CODEGRAPH_ENGINE never set in worker env
    P->>W1: fork(env: __BENCH_ENGINE__=wasm)
    W1->>OR: openRepo()
    OR-->>W1: isNativeAvailable()=true → uses rusqlite
    P->>W2: fork(env: __BENCH_ENGINE__=native)
    W2->>OR: openRepo()
    OR-->>W2: isNativeAvailable()=true → uses rusqlite
    Note over W1,W2: Both workers measure native overhead — false regression!

    Note over P,OR: AFTER fix — CODEGRAPH_ENGINE wired through fork
    P->>W1: fork(env: __BENCH_ENGINE__=wasm, CODEGRAPH_ENGINE=wasm)
    W1->>OR: openRepo()
    OR-->>W1: engine=wasm → skips native → uses better-sqlite3
    P->>W2: fork(env: __BENCH_ENGINE__=native, CODEGRAPH_ENGINE=native)
    W2->>OR: openRepo()
    OR-->>W2: engine=native → uses rusqlite NativeRepository

_{Reviews (2): Last reviewed commit: "fix: add staleness check for KNOWN_REGRE..." | Re-trigger Greptile}

greptile-apps · 2026-04-05T21:02:01Z

tests/benchmarks/regression-guard.test.ts

+const KNOWN_REGRESSIONS = new Set([
+  '3.9.0:1-file rebuild',
+  '3.9.0:fnDeps depth 1',
+  '3.9.0:fnDeps depth 3',
+  '3.9.0:fnDeps depth 5',
+]);


KNOWN_REGRESSIONS entries have no expiry enforcement

The three 3.9.0:fnDeps depth {1,3,5} entries are documented in the inline comment above (lines 68–73) to be removed once v3.10.0 benchmark data is merged. There is no automated check that enforces this cleanup. If these entries are left in place after v3.10.0 data lands, the regression guard will silently pass genuine regressions in those three metrics for all future releases.

Consider adding a test-time assertion that checks the current package version and warns (or fails) when any KNOWN_REGRESSIONS entry is more than one minor version behind the running package — this would make the stale-exemption problem self-detecting rather than requiring manual bookkeeping.

Fixed in b6e919a. Added a KNOWN_REGRESSIONS entries are not stale test that reads the current package version, computes the minor-version gap for each entry, and fails when any entry is more than 1 minor version behind. This makes stale exemptions self-detecting — the test will fail as soon as v3.10.0 is released if the entries aren't cleaned up.

github-actions · 2026-04-05T21:02:39Z

Codegraph Impact Analysis

3 functions changed → 41 callers affected across 33 files

forkWorker in scripts/lib/fork-engine.ts:67 (5 transitive callers)
openRepo in src/db/connection.ts:361 (29 transitive callers)
openReadonlyWithNative in src/db/connection.ts:412 (8 transitive callers)

carlos-alm · 2026-04-05T23:38:55Z

Addressed the P1 openReadonlyWithNative engine-bypass gap (from the Greptile review). Fixed in 99b6e0f — added the same CODEGRAPH_ENGINE env-var guard to openReadonlyWithNative() so it skips the native path when engine === 'wasm', consistent with openRepo(). This closes the latent measurement gap for structure-query, dataflow, and module-map callers.

carlos-alm · 2026-04-05T23:39:39Z

@greptileai

The fnDeps query latency ~3x regression in 3.9.0 vs 3.7.0 is a pre-existing main issue caused by openRepo engine routing, not by this PR. Add to KNOWN_REGRESSIONS to unblock CI while fix is tracked in PR #869/#870.

greptile-apps bot reviewed Apr 5, 2026

View reviewed changes

carlos-alm added 2 commits April 5, 2026 17:37

fix: add engine guard to openReadonlyWithNative (#869)

99b6e0f

fix: add staleness check for KNOWN_REGRESSIONS entries (#869)

b6e919a

carlos-alm merged commit a728026 into main Apr 6, 2026
18 checks passed

carlos-alm deleted the fix/fndeps-benchmark-engine-routing branch April 6, 2026 01:36

github-actions bot locked and limited conversation to collaborators Apr 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(perf): wire engine selection through openRepo to fix query benchmarks#869

fix(perf): wire engine selection through openRepo to fix query benchmarks#869
carlos-alm merged 3 commits intomainfrom
fix/fndeps-benchmark-engine-routing

carlos-alm commented Apr 5, 2026

Uh oh!

claude bot commented Apr 5, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Apr 5, 2026 •

edited

Loading

Uh oh!

greptile-apps bot Apr 5, 2026

Uh oh!

carlos-alm Apr 5, 2026

Uh oh!

github-actions bot commented Apr 5, 2026 •

edited

Loading

Uh oh!

carlos-alm commented Apr 5, 2026

Uh oh!

carlos-alm commented Apr 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

carlos-alm commented Apr 5, 2026

Summary

Root cause

Test plan

Uh oh!

claude bot commented Apr 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Engine Selection Fix

Root Cause Analysis

Configuration Change Detection

Test Integrity Review

Code Quality Assessment

Backlog Compliance

Critical Concerns

Final Recommendation

Uh oh!

greptile-apps bot commented Apr 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Apr 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codegraph Impact Analysis

Uh oh!

carlos-alm commented Apr 5, 2026

Uh oh!

carlos-alm commented Apr 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Apr 5, 2026 •

edited

Loading

greptile-apps bot commented Apr 5, 2026 •

edited

Loading

github-actions bot commented Apr 5, 2026 •

edited

Loading