[POC] Onyx + wasm = :hyperfastparrot: #733
Closed
+16,655
−759
Conversation
Add vitest, @vitest/browser, @vitest/browser-playwright, playwright, and tinybench as dev dependencies. Configure vitest for browser-mode benchmarks running in headless Chromium with real IndexedDB via the IDBKeyValProvider storage backend. Add npm scripts: bench, bench:compare, bench:save. Co-authored-by: Cursor <[email protected]>
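A minimal sketch of what the browser-mode benchmark configuration described above might look like. The file name, include glob, and option values here are assumptions, not the PR's exact config.

```typescript
// vitest.config.ts — hypothetical sketch of headless-Chromium benchmark setup
import {defineConfig} from 'vitest/config';

export default defineConfig({
    test: {
        benchmark: {
            // Pick up the *.bench.ts suites described in this PR
            include: ['benchmarks/**/*.bench.ts'],
        },
        browser: {
            // Run in a real browser so benchmarks hit real IndexedDB/OPFS
            enabled: true,
            headless: true,
            provider: 'playwright',
            name: 'chromium',
        },
    },
});
```

Running in a real browser (rather than a jsdom environment) is what lets the benchmarks exercise the actual IndexedDB storage backend.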
Create data generators that produce realistic Onyx store data modeled after App's ONYXKEYS structure, with four configurable tiers:
- small: 50 reports, 500 actions, 50 txns (~1 MB)
- modest: 250 reports, 2.5k actions, 250 txns (~5 MB)
- heavy: 1k reports, 10k actions, 1k txns (~20 MB)
- extreme: 5k reports, 50k actions, 5k txns (~100 MB)
Includes factory functions for reports, report actions, transactions, policies, and personal details with realistic field shapes. Also adds shared setup/teardown helpers for initializing and seeding Onyx in benchmark runs.
Co-authored-by: Cursor <[email protected]>
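A sketch of how the tier table and a generator might be laid out. The counts come from the commit above; the field shapes and function names are illustrative assumptions.

```typescript
// Hypothetical sketch of the tier table and a report factory.
// Field shapes below are assumptions modeled loosely on Onyx report data.
type Tier = 'small' | 'modest' | 'heavy' | 'extreme';

const TIERS: Record<Tier, {reports: number; actions: number; txns: number}> = {
    small: {reports: 50, actions: 500, txns: 50},
    modest: {reports: 250, actions: 2500, txns: 250},
    heavy: {reports: 1000, actions: 10000, txns: 1000},
    extreme: {reports: 5000, actions: 50000, txns: 5000},
};

function makeReport(id: number) {
    return {
        reportID: String(id),
        reportName: `Report ${id}`,
        lastVisibleActionCreated: new Date(2024, 0, 1 + (id % 28)).toISOString(),
        participants: {[`${id}`]: {notificationPreference: 'always'}},
    };
}

// Produce a keyed collection sized for the requested tier.
function generateReports(tier: Tier): Record<string, ReturnType<typeof makeReport>> {
    const out: Record<string, ReturnType<typeof makeReport>> = {};
    for (let i = 0; i < TIERS[tier].reports; i++) {
        out[`report_${i}`] = makeReport(i);
    }
    return out;
}
```

Deterministic IDs (rather than random data) keep benchmark runs comparable across branches.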
Add tinybench-based benchmarks for all perf-sensitive Onyx methods,
each running across four data tiers (small/modest/heavy/extreme):
- set.bench.ts: set(), multiSet(), setCollection()
- merge.bench.ts: merge(), mergeCollection(), update() (mixed ops)
- connect.bench.ts: connect() registration, collection subscribers,
notification throughput
- init.bench.ts: init() with initialKeyStates
- clear.bench.ts: clear() at each scale
Add scripts/compareBenchmarks.sh which automates baseline-vs-current
branch comparison using vitest's --outputJson and --compare flags.
Co-authored-by: Cursor <[email protected]>
Document the benchmark suite, data tiers, how to run benchmarks, and how to compare performance across branches. Co-authored-by: Cursor <[email protected]>
Create SQLiteQueries.ts with all SQL query strings as named constants, shared by both native and web providers. Move the native SQLiteProvider from a flat file to SQLiteProvider/index.native.ts to prepare for the web SQLiteProvider in the same directory. Refactor native provider to import queries from the shared module instead of inlining SQL strings. Co-authored-by: Cursor <[email protected]>
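A sketch of what a shared query-constants module could look like. The table and column names here are assumptions for illustration; only the idea of named SQL constants shared by both providers comes from the commit.

```typescript
// Hypothetical sketch of SQLiteQueries.ts: every SQL string lives here once,
// so the native and web providers cannot drift apart.
const SQLiteQueries = {
    CREATE_TABLE:
        'CREATE TABLE IF NOT EXISTS keyvaluepairs (record_key TEXT PRIMARY KEY, valueJSON TEXT NOT NULL)',
    SELECT_VALUE: 'SELECT valueJSON FROM keyvaluepairs WHERE record_key = ?',
    UPSERT_VALUE: 'INSERT OR REPLACE INTO keyvaluepairs (record_key, valueJSON) VALUES (?, ?)',
    // Merge path: apply a JSON patch in-database instead of read-modify-write in JS
    MERGE_VALUE:
        'INSERT INTO keyvaluepairs (record_key, valueJSON) VALUES (?, JSON(?)) ' +
        'ON CONFLICT(record_key) DO UPDATE SET valueJSON = JSON_PATCH(valueJSON, JSON(?))',
    DELETE_KEY: 'DELETE FROM keyvaluepairs WHERE record_key = ?',
} as const;
```

Keeping merges as `JSON_PATCH` statements is what the later WriteBuffer commit is careful to preserve.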
Introduce a DirtyMap class that coalesces rapid successive writes to the same key, deferring persistence via requestIdleCallback. This allows set/multiSet to return near-instantly by staging values in memory while the actual storage flush happens asynchronously in batches. Reads check the dirty map first for consistency. Merge operations flush pending writes before delegating to the provider to ensure correct semantics. Benchmarks show 94-99% improvement on set/multiSet/merge operations compared to the baseline IDB implementation. Co-authored-by: Cursor <[email protected]>
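The write-coalescing idea can be sketched as follows. This is a simplified model, not the PR's implementation: the `Provider` shape is an assumption, and the `requestIdleCallback` fallback to `setTimeout` is added so the sketch also runs outside the browser.

```typescript
// Hypothetical sketch of DirtyMap-style write coalescing.
type Provider = {multiSet: (pairs: Array<[string, unknown]>) => Promise<void>};

// Prefer requestIdleCallback; fall back to a macrotask where it is unavailable.
const scheduleIdle = (cb: () => void): void => {
    const ric = (globalThis as any).requestIdleCallback as ((fn: () => void) => void) | undefined;
    if (ric) ric(cb);
    else setTimeout(cb, 0);
};

class DirtyMap {
    private dirty = new Map<string, unknown>();
    private flushScheduled = false;
    private pendingFlush: Promise<void> = Promise.resolve();

    constructor(private provider: Provider) {}

    // Stage the value and return immediately; persistence is deferred to idle time.
    // Rapid writes to the same key overwrite each other in the map, so only the
    // latest value is flushed.
    set(key: string, value: unknown): void {
        this.dirty.set(key, value);
        if (this.flushScheduled) return;
        this.flushScheduled = true;
        this.pendingFlush = new Promise((resolve) => {
            scheduleIdle(() => {
                const batch = Array.from(this.dirty.entries());
                this.dirty.clear();
                this.flushScheduled = false;
                this.provider.multiSet(batch).then(resolve);
            });
        });
    }

    // Reads check the dirty map first so staged values stay consistent pre-flush.
    get(key: string): unknown {
        return this.dirty.get(key);
    }

    // Merge-style callers can await this to flush pending writes before delegating.
    flushed(): Promise<void> {
        return this.pendingFlush;
    }
}
```

The key property is that `set()` does no I/O at all; the provider sees one batched `multiSet` per idle period.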
Implement the web SQLiteProvider backed by @sqlite.org/sqlite-wasm using the opfs-sahpool VFS for OPFS persistence without COOP/COEP headers. All database operations run in a dedicated Web Worker to keep the main thread free. The provider uses prepared statements for all queries and shares SQL constants with the native provider via SQLiteQueries.ts. Replace localStorage-based InstanceSync with BroadcastChannel for more reliable cross-tab communication. After the worker persists a batch, it broadcasts changed keys so other tabs can update their caches. Update web platform selection to prefer SQLiteProvider when OPFS and Workers are available, with automatic fallback to IDBKeyValProvider for older browsers. Co-authored-by: Cursor <[email protected]>
Replace the single-type DirtyMap with a two-type staging layer that distinguishes between SET entries (full values, flushed via multiSet) and MERGE entries (accumulated patches, flushed via multiMerge). This preserves JSON_PATCH efficiency on SQLite while eliminating the flushNow() call that caused mergeCollection() and update() to regress. All write operations (set, merge, multiSet, multiMerge) now return immediately after staging in the DirtyMap. Merges on keys with a pending SET apply the patch in-memory; merges on keys with a pending MERGE accumulate patches. On flush, SET entries use reference identity to handle concurrent mutations, while MERGE entries are removed at flush start to prevent double-application. Benchmarks show mergeCollection() improved from +228% regression to -90% improvement at modest scale (56ms -> 5.6ms), and from +2% to -99% at extreme scale (1.91s -> 27.8ms). Co-authored-by: Cursor <[email protected]>
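The two-entry staging scheme can be sketched like this. It is a simplified model: `fastMerge` here is a shallow merge stand-in, and the `drain()` API is an assumption standing in for the real flush path.

```typescript
// Hypothetical sketch of the SET/MERGE staging layer.
type Obj = Record<string, unknown>;
type Entry = {type: 'set'; value: Obj} | {type: 'merge'; patches: Obj[]};

// Simplified stand-in for Onyx's fastMerge (shallow only).
const fastMerge = (target: Obj, patch: Obj): Obj => ({...target, ...patch});

class WriteBuffer {
    private entries = new Map<string, Entry>();

    set(key: string, value: Obj): void {
        // A full SET replaces anything previously staged for this key.
        this.entries.set(key, {type: 'set', value});
    }

    merge(key: string, patch: Obj): void {
        const existing = this.entries.get(key);
        if (existing?.type === 'set') {
            // Merge onto a pending SET: apply the patch in memory, so the key
            // still flushes as a single full value via multiSet.
            existing.value = fastMerge(existing.value, patch);
        } else if (existing?.type === 'merge') {
            // Merge onto a pending MERGE: accumulate patches for multiMerge,
            // preserving JSON_PATCH efficiency on the SQLite side.
            existing.patches.push(patch);
        } else {
            this.entries.set(key, {type: 'merge', patches: [patch]});
        }
    }

    // Split staged entries into the two flush paths and clear the buffer,
    // so patches cannot be applied twice.
    drain(): {sets: Array<[string, Obj]>; merges: Array<[string, Obj[]]>} {
        const sets: Array<[string, Obj]> = [];
        const merges: Array<[string, Obj[]]> = [];
        for (const [key, entry] of this.entries) {
            if (entry.type === 'set') sets.push([key, entry.value]);
            else merges.push([key, entry.patches]);
        }
        this.entries.clear();
        return {sets, merges};
    }
}
```

Because merges never force a synchronous flush, all four write methods can return immediately after staging.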
- scripts/generateBenchReport.ts: parse Vitest benchmark JSON, output color-coded HTML
- scripts/benchAndReport.sh: run benchmarks and generate report (single run, branch compare, or multi-config)
- npm run bench:report and bench:report:compare
- Add tsx dev dependency for report generator
- Document in README; add bench-results.html to .gitignore
Co-authored-by: Cursor <[email protected]>
After each benchmark run, detect results with high variance (RME > 50% on operations > 1ms, or extreme outlier spikes shifting the mean > 2ms/20%) and re-run only the affected benchmark files. Merge improved results and repeat up to --max-retries times (default 3).
Criteria carefully tuned to avoid false positives:
- Sub-ms operations (clear, connect register) are excluded from outlier detection since GC spikes don't affect their cross-run stability
- Low sample count alone doesn't trigger re-runs (slow operations like multiSet 5000 inherently get ~10 samples but are consistent)
Integrated into benchAndReport.sh (on by default, --no-stabilize to skip).
Co-authored-by: Cursor <[email protected]>
tinybench already handles statistical rigor by running each benchmark for a time budget and collecting many samples. The stabilization layer added complexity without clear value — it produced false positives, caused the multi-config pipeline to abort (exit code 1 from noisy results killed the script under set -e), and re-runs consistently returned the same results. Also fixes benchAndReport.sh argument parsing (IFS→parameter expansion) and simplifies run_bench() to call vitest directly. Co-authored-by: Cursor <[email protected]>
Three-config comparison:
- Baseline: Rory-Benchmarks branch (no DirtyMap, no workers, IDB only)
- DM+IDB: DirtyMap + workers + IndexedDB
- DM+SQLite: DirtyMap + workers + SQLite WASM
Results show 93-99% improvements across write operations.
Co-authored-by: Cursor <[email protected]>
Replace the SQLiteProvider-specific worker architecture with a single unified worker that supports both SQLite and IDB backends:
- Create SQLiteProvider/index.web.ts: StorageProvider implementation for SQLite WASM, reshaped from the old worker.ts handle* functions
- Create lib/storage/worker.ts: provider-agnostic worker that dynamically imports the appropriate StorageProvider based on init message
- Create WorkerStorageProvider.ts: generic main-thread proxy replacing both SQLiteProvider/index.ts and IDBKeyValProvider web usage
- Update platforms/index.ts: use WorkerStorageProvider with backend selection based on OPFS availability
- Remove old SQLiteProvider/index.ts (web proxy) and worker.ts
Both IDB and SQLite now run off the main thread in the same unified worker, with BroadcastChannel cross-tab sync handled at the worker level.
Co-authored-by: Cursor <[email protected]>
Update the Architecture Overview diagram, Proposed Solution section, and Phase 3 implementation description to document the new provider-agnostic unified worker that runs both SQLite and IDB backends off the main thread. Co-authored-by: Cursor <[email protected]>
- Rename DirtyMap to WriteBuffer throughout the codebase
- Upgrade worker.ts to broadcast actual values (not just key names) on BroadcastChannel after persistence, using typed message formats: set (full values), merge (raw patches), remove, clear
- Rewrite InstanceSync to handle value-bearing messages directly: set -> onStorageKeyChanged(key, value) with no storage read, merge -> fastMerge(cachedValue, patch) against the in-memory cache, remove/clear -> notify with null
- Remove all InstanceSync send-side calls from storage/index.ts since the unified worker handles broadcasting for all backends
- Remove handlesBroadcast flag (unnecessary given unified worker)
- Update PROPOSAL_DRAFT.md with Phase 5 documentation and revised cross-tab sync description
Co-authored-by: Cursor <[email protected]>
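The value-bearing message scheme can be sketched as a discriminated union plus a receive handler that updates the in-memory cache directly. The exact message shapes and `fastMerge` used by the PR are assumptions; the point shown is that no storage read is needed on receipt.

```typescript
// Hypothetical sketch of typed cross-tab sync messages.
type SyncMessage =
    | {type: 'set'; key: string; value: unknown}
    | {type: 'merge'; key: string; patch: Record<string, unknown>}
    | {type: 'remove'; key: string}
    | {type: 'clear'};

// Simplified stand-in for Onyx's fastMerge (shallow only).
const fastMerge = (target: unknown, patch: Record<string, unknown>) => ({
    ...(typeof target === 'object' && target !== null ? (target as Record<string, unknown>) : {}),
    ...patch,
});

// Apply a broadcast message straight to the cache — the value travels in the
// message, so the receiving tab never touches storage.
function applySyncMessage(cache: Map<string, unknown>, msg: SyncMessage): void {
    switch (msg.type) {
        case 'set':
            cache.set(msg.key, msg.value);
            break;
        case 'merge':
            cache.set(msg.key, fastMerge(cache.get(msg.key), msg.patch));
            break;
        case 'remove':
            cache.delete(msg.key);
            break;
        case 'clear':
            cache.clear();
            break;
    }
}
```

In the browser, these messages would arrive via a `BroadcastChannel` `onmessage` handler after the worker persists a batch.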
…cture

Replace the custom C++ SQLite/WASM build with official packages:
- Web: @sqlite.org/sqlite-wasm with opfs-sahpool VFS (IDB fallback)
- Native: react-native-nitro-sqlite with sync APIs
Key changes:
- Restore SQLiteProvider implementations for web and native from git history
- Shared SQLiteQueries.ts as single source of truth for all SQL
- Unified worker.ts with dynamic backend selection and graceful WASM fallback
- NativeFlushWorker using react-native-worklets-core Worker Runtime
- NativeBufferStore simplified to pure shared_mutex-protected AnyMap HybridObject
- Value-bearing BroadcastChannel messages for cross-tab sync
- WriteBuffer with SET/MERGE entry coalescing via BufferStore interface
- Updated PROPOSAL_DRAFT.md with revised architecture and Glaze future note
Co-authored-by: Cursor <[email protected]>
Root cause: the web worker's onmessage handler was async, but each incoming message spawned an independent invocation. When init took time (dynamic WASM/IDB imports), data operations arrived and called provider!.method() on a null provider, causing silent hangs.
Three fixes:
1. worker.ts: Replace concurrent onmessage with a serial message queue. Messages are processed one at a time, guaranteeing init completes before any data operation touches the provider.
2. WorkerStorageProvider.init(): Return the init Promise so callers can await worker readiness (previously returned void/undefined).
3. storage/index.ts: Properly await async init() return values before resolving the init gate that unblocks all storage operations.
Also:
- Widen StorageProvider.init type to () => void | Promise<void>
- Add idb-keyval to vitest optimizeDeps.include to prevent mid-test Vite dependency reloads
- Regenerate benchmark comparison with all 5 files (init + clear now work)
Co-authored-by: Cursor <[email protected]>
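The serial-queue fix described above can be sketched as follows: each message fully completes (including its awaits) before the next one starts, so a slow async init can never be overtaken by a data operation. The message shape and handler signature are assumptions for illustration.

```typescript
// Hypothetical sketch of the serial message queue inside the worker.
type WorkerMessage = {id: number; method: string; args: unknown[]};

class SerialQueue {
    private queue: WorkerMessage[] = [];
    private processing = false;

    constructor(private handle: (msg: WorkerMessage) => Promise<void>) {}

    // Called from onmessage: enqueue and start the pump if it is idle.
    push(msg: WorkerMessage): void {
        this.queue.push(msg);
        if (!this.processing) {
            this.processing = true;
            void this.pump();
        }
    }

    private async pump(): Promise<void> {
        while (this.queue.length > 0) {
            const msg = this.queue.shift()!;
            // Await the handler so the next message cannot start until this one
            // finishes — e.g. init's dynamic imports complete before any get/set.
            await this.handle(msg);
        }
        this.processing = false;
    }
}
```

Contrast with the buggy version: an `async onmessage` handler is invoked once per message with no ordering between invocations, so a data operation could run while init was still awaiting its imports.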
See the full proposal here: https://expensify.slack.com/archives/C05LX9D6E07/p1770841028162209
This is a very, very messy draft. I have not reviewed all this code and it is not the quality I would hold myself to for production code.