Skip to content

Conversation

@roryabraham
Copy link
Contributor

See the full proposal here: https://expensify.slack.com/archives/C05LX9D6E07/p1770841028162209

This is a very, very messy draft. I have not reviewed all this code and it is not the quality I would hold myself to for production code.

roryabraham and others added 20 commits February 9, 2026 12:59
Add vitest, @vitest/browser, @vitest/browser-playwright, playwright,
and tinybench as dev dependencies. Configure vitest for browser-mode
benchmarks running in headless Chromium with real IndexedDB via the
IDBKeyValProvider storage backend.

Add npm scripts: bench, bench:compare, bench:save.

Co-authored-by: Cursor <[email protected]>
Create data generators that produce realistic Onyx store data modeled
after App's ONYXKEYS structure, with four configurable tiers:

- small:   50 reports, 500 actions, 50 txns (~1 MB)
- modest:  250 reports, 2.5k actions, 250 txns (~5 MB)
- heavy:   1k reports, 10k actions, 1k txns (~20 MB)
- extreme: 5k reports, 50k actions, 5k txns (~100 MB)

Includes factory functions for reports, report actions, transactions,
policies, and personal details with realistic field shapes. Also adds
shared setup/teardown helpers for initializing and seeding Onyx in
benchmark runs.

Co-authored-by: Cursor <[email protected]>
Add tinybench-based benchmarks for all perf-sensitive Onyx methods,
each running across four data tiers (small/modest/heavy/extreme):

- set.bench.ts:     set(), multiSet(), setCollection()
- merge.bench.ts:   merge(), mergeCollection(), update() (mixed ops)
- connect.bench.ts: connect() registration, collection subscribers,
                    notification throughput
- init.bench.ts:    init() with initialKeyStates
- clear.bench.ts:   clear() at each scale

Add scripts/compareBenchmarks.sh which automates baseline-vs-current
branch comparison using vitest's --outputJson and --compare flags.

Co-authored-by: Cursor <[email protected]>
Document the benchmark suite, data tiers, how to run benchmarks,
and how to compare performance across branches.

Co-authored-by: Cursor <[email protected]>
Create SQLiteQueries.ts with all SQL query strings as named constants,
shared by both native and web providers. Move the native SQLiteProvider
from a flat file to SQLiteProvider/index.native.ts to prepare for the
web SQLiteProvider in the same directory. Refactor native provider to
import queries from the shared module instead of inlining SQL strings.

Co-authored-by: Cursor <[email protected]>
Introduce a DirtyMap class that coalesces rapid successive writes to the
same key, deferring persistence via requestIdleCallback. This allows
set/multiSet to return near-instantly by staging values in memory while
the actual storage flush happens asynchronously in batches. Reads check
the dirty map first for consistency. Merge operations flush pending
writes before delegating to the provider to ensure correct semantics.

Benchmarks show 94-99% improvement on set/multiSet/merge operations
compared to the baseline IDB implementation.

Co-authored-by: Cursor <[email protected]>
Implement the web SQLiteProvider backed by @sqlite.org/sqlite-wasm using
the opfs-sahpool VFS for OPFS persistence without COOP/COEP headers. All
database operations run in a dedicated Web Worker to keep the main thread
free. The provider uses prepared statements for all queries and shares
SQL constants with the native provider via SQLiteQueries.ts.

Replace localStorage-based InstanceSync with BroadcastChannel for more
reliable cross-tab communication. After the worker persists a batch, it
broadcasts changed keys so other tabs can update their caches.

Update web platform selection to prefer SQLiteProvider when OPFS and
Workers are available, with automatic fallback to IDBKeyValProvider for
older browsers.

Co-authored-by: Cursor <[email protected]>
Replace the single-type DirtyMap with a two-type staging layer that
distinguishes between SET entries (full values, flushed via multiSet)
and MERGE entries (accumulated patches, flushed via multiMerge). This
preserves JSON_PATCH efficiency on SQLite while eliminating the
flushNow() call that caused mergeCollection() and update() to regress.

All write operations (set, merge, multiSet, multiMerge) now return
immediately after staging in the DirtyMap. Merges on keys with a
pending SET apply the patch in-memory; merges on keys with a pending
MERGE accumulate patches. On flush, SET entries use reference identity
to handle concurrent mutations, while MERGE entries are removed at
flush start to prevent double-application.

Benchmarks show mergeCollection() improved from +228% regression to
-90% improvement at modest scale (56ms -> 5.6ms), and from +2% to
-99% at extreme scale (1.91s -> 27.8ms).

Co-authored-by: Cursor <[email protected]>
- scripts/generateBenchReport.ts: parse Vitest benchmark JSON, output color-coded HTML
- scripts/benchAndReport.sh: run benchmarks and generate report (single run, branch compare, or multi-config)
- npm run bench:report and bench:report:compare
- Add tsx dev dependency for report generator
- Document in README; add bench-results.html to .gitignore

Co-authored-by: Cursor <[email protected]>
After each benchmark run, detect results with high variance (RME > 50%
on operations > 1ms, or extreme outlier spikes shifting mean > 2ms/20%)
and re-run only the affected benchmark files. Merge improved results and
repeat up to --max-retries times (default 3).

Criteria carefully tuned to avoid false positives:
- Sub-ms operations (clear, connect register) are excluded from outlier
  detection since GC spikes don't affect their cross-run stability
- Low sample count alone doesn't trigger re-runs (slow operations like
  multiSet 5000 inherently get ~10 samples but are consistent)

Integrated into benchAndReport.sh (on by default, --no-stabilize to skip).

Co-authored-by: Cursor <[email protected]>
tinybench already handles statistical rigor by running each benchmark
for a time budget and collecting many samples. The stabilization layer
added complexity without clear value — it produced false positives,
caused the multi-config pipeline to abort (exit code 1 from noisy
results killed the script under set -e), and re-runs consistently
returned the same results.

Also fixes benchAndReport.sh argument parsing (IFS→parameter expansion)
and simplifies run_bench() to call vitest directly.

Co-authored-by: Cursor <[email protected]>
Three-config comparison:
- Baseline: Rory-Benchmarks branch (no DirtyMap, no workers, IDB only)
- DM+IDB: DirtyMap + workers + IndexedDB
- DM+SQLite: DirtyMap + workers + SQLite WASM

Results show 93-99% improvements across write operations.

Co-authored-by: Cursor <[email protected]>
Replace the SQLiteProvider-specific worker architecture with a single
unified worker that supports both SQLite and IDB backends:

- Create SQLiteProvider/index.web.ts: StorageProvider implementation for
  SQLite WASM, reshaped from the old worker.ts handle* functions
- Create lib/storage/worker.ts: provider-agnostic worker that dynamically
  imports the appropriate StorageProvider based on init message
- Create WorkerStorageProvider.ts: generic main-thread proxy replacing
  both SQLiteProvider/index.ts and IDBKeyValProvider web usage
- Update platforms/index.ts: use WorkerStorageProvider with backend
  selection based on OPFS availability
- Remove old SQLiteProvider/index.ts (web proxy) and worker.ts

Both IDB and SQLite now run off the main thread in the same unified
worker, with BroadcastChannel cross-tab sync handled at the worker level.

Co-authored-by: Cursor <[email protected]>
Update the Architecture Overview diagram, Proposed Solution section, and
Phase 3 implementation description to document the new provider-agnostic
unified worker that runs both SQLite and IDB backends off the main thread.

Co-authored-by: Cursor <[email protected]>
- Rename DirtyMap to WriteBuffer throughout the codebase
- Upgrade worker.ts to broadcast actual values (not just key names) on
  BroadcastChannel after persistence, using typed message formats:
  set (full values), merge (raw patches), remove, clear
- Rewrite InstanceSync to handle value-bearing messages directly:
  set -> onStorageKeyChanged(key, value) with no storage read,
  merge -> fastMerge(cachedValue, patch) against the in-memory cache,
  remove/clear -> notify with null
- Remove all InstanceSync send-side calls from storage/index.ts since
  the unified worker handles broadcasting for all backends
- Remove handlesBroadcast flag (unnecessary given unified worker)
- Update PROPOSAL_DRAFT.md with Phase 5 documentation and revised
  cross-tab sync description

Co-authored-by: Cursor <[email protected]>
…cture

Replace the custom C++ SQLite/WASM build with official packages:
- Web: @sqlite.org/sqlite-wasm with opfs-sahpool VFS (IDB fallback)
- Native: react-native-nitro-sqlite with sync APIs

Key changes:
- Restore SQLiteProvider implementations for web and native from git history
- Shared SQLiteQueries.ts as single source of truth for all SQL
- Unified worker.ts with dynamic backend selection and graceful WASM fallback
- NativeFlushWorker using react-native-worklets-core Worker Runtime
- NativeBufferStore simplified to pure shared_mutex-protected AnyMap HybridObject
- Value-bearing BroadcastChannel messages for cross-tab sync
- WriteBuffer with SET/MERGE entry coalescing via BufferStore interface
- Updated PROPOSAL_DRAFT.md with revised architecture and Glaze future note

Co-authored-by: Cursor <[email protected]>
Root cause: the web worker's onmessage handler was async but each incoming
message spawned an independent invocation. When init took time (dynamic
WASM/IDB imports), data operations arrived and called provider!.method()
on a null provider, causing silent hangs.

Three fixes:
1. worker.ts: Replace concurrent onmessage with a serial message queue.
   Messages are processed one at a time, guaranteeing init completes
   before any data operation touches the provider.

2. WorkerStorageProvider.init(): Return the init Promise so callers
   can await worker readiness (previously returned void/undefined).

3. storage/index.ts: Properly await async init() return values before
   resolving the init gate that unblocks all storage operations.

Also:
- Widen StorageProvider.init type to () => void | Promise<void>
- Add idb-keyval to vitest optimizeDeps.include to prevent mid-test
  Vite dependency reloads
- Regenerate benchmark comparison with all 5 files (init + clear now work)

Co-authored-by: Cursor <[email protected]>
@roryabraham roryabraham self-assigned this Feb 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant