fix(perf): wire engine selection through openRepo to fix query benchmarks#869
fix(perf): wire engine selection through openRepo to fix query benchmarks#869carlos-alm merged 3 commits intomainfrom
Conversation
…arks openRepo() always tried the native NAPI path first regardless of engine selection, causing both "wasm" and "native" benchmark workers to measure the same native rusqlite open/close overhead (~27ms vs ~10ms with direct better-sqlite3). This produced a false 177% regression in fnDeps query benchmarks between v3.7.0 and v3.9.0, blocking CI on all PRs. - Add engine param to openRepo() and check CODEGRAPH_ENGINE env var - Set CODEGRAPH_ENGINE in benchmark fork workers so query functions respect the engine selection per worker - Add v3.9.0 fnDeps depth 1/3/5 to KNOWN_REGRESSIONS since the published data was measured with the broken engine routing
|
Claude finished @carlos-alm's task —— View job PR Review: Engine Selection Fix
|
Greptile SummaryThis PR fixes a benchmark instrumentation bug where The fix is two-part:
Confidence Score: 5/5Safe to merge — root cause correctly identified and fixed with no P0/P1 issues remaining. The fix is minimal, targeted, and correct. The two-part change (set CODEGRAPH_ENGINE in the fork + respect it in openRepo) fully addresses the false-regression root cause. Backward compatibility is preserved: opts.engine is optional and all existing callers continue to work unchanged. The stale-exemption test adds durable self-enforcement so KNOWN_REGRESSIONS bookkeeping cannot silently go stale. No logic bugs, security issues, or edge-case regressions found across any of the three changed files. No files require special attention. Important Files Changed
Sequence DiagramsequenceDiagram
participant P as Parent (forkEngines)
participant W1 as Worker: wasm
participant W2 as Worker: native
participant OR as openRepo()
Note over P,OR: BEFORE fix — CODEGRAPH_ENGINE never set in worker env
P->>W1: fork(env: __BENCH_ENGINE__=wasm)
W1->>OR: openRepo()
OR-->>W1: isNativeAvailable()=true → uses rusqlite
P->>W2: fork(env: __BENCH_ENGINE__=native)
W2->>OR: openRepo()
OR-->>W2: isNativeAvailable()=true → uses rusqlite
Note over W1,W2: Both workers measure native overhead — false regression!
Note over P,OR: AFTER fix — CODEGRAPH_ENGINE wired through fork
P->>W1: fork(env: __BENCH_ENGINE__=wasm, CODEGRAPH_ENGINE=wasm)
W1->>OR: openRepo()
OR-->>W1: engine=wasm → skips native → uses better-sqlite3
P->>W2: fork(env: __BENCH_ENGINE__=native, CODEGRAPH_ENGINE=native)
W2->>OR: openRepo()
OR-->>W2: engine=native → uses rusqlite NativeRepository
Reviews (2): Last reviewed commit: "fix: add staleness check for KNOWN_REGRE..." | Re-trigger Greptile |
| const KNOWN_REGRESSIONS = new Set([ | ||
| '3.9.0:1-file rebuild', | ||
| '3.9.0:fnDeps depth 1', | ||
| '3.9.0:fnDeps depth 3', | ||
| '3.9.0:fnDeps depth 5', | ||
| ]); |
There was a problem hiding this comment.
KNOWN_REGRESSIONS entries have no expiry enforcement
The three 3.9.0:fnDeps depth {1,3,5} entries are documented in the inline comment above (lines 68–73) to be removed once v3.10.0 benchmark data is merged. There is no automated check that enforces this cleanup. If these entries are left in place after v3.10.0 data lands, the regression guard will silently pass genuine regressions in those three metrics for all future releases.
Consider adding a test-time assertion that checks the current package version and warns (or fails) when any KNOWN_REGRESSIONS entry is more than one minor version behind the running package — this would make the stale-exemption problem self-detecting rather than requiring manual bookkeeping.
There was a problem hiding this comment.
Fixed in b6e919a. Added a KNOWN_REGRESSIONS entries are not stale test that reads the current package version, computes the minor-version gap for each entry, and fails when any entry is more than 1 minor version behind. This makes stale exemptions self-detecting — the test will fail as soon as v3.10.0 is released if the entries aren't cleaned up.
Codegraph Impact Analysis3 functions changed → 41 callers affected across 33 files
|
|
Addressed the P1 |

Summary
openRepo()always tried native NAPI path first regardless of engine selection, causing both "wasm" and "native" benchmark workers to measure the same native rusqlite open/close overhead (~27ms vs ~10ms with direct better-sqlite3)engineparam toopenRepo()and checkCODEGRAPH_ENGINEenv var so query functions respect engine selectionCODEGRAPH_ENGINEin benchmark fork workersKNOWN_REGRESSIONS(published data measured with broken routing)Root cause
openRepo()insrc/db/connection.tshad no engine selection — it unconditionally calledisNativeAvailable()and used the native path when available. The benchmark fork (scripts/lib/fork-engine.ts) set__BENCH_ENGINE__per worker to label engine runs, butopenReponever checked it. On CI where both engines are available, both workers measured identical native performance.Test plan
regression-guard.test.ts— 14/14 pass (fnDeps regressions now in KNOWN_REGRESSIONS)queries-unit.test.ts— 74/74 pass