perf: Zero-copy binary refactor and CLOCK replacer (#17)
📝 Walkthrough

This PR restructures benchmark setup to avoid repeated initialization within timed loops, converts operator table ownership from unique to shared pointers, replaces the list-based LRU eviction in `LRUReplacer` with CLOCK-based eviction, refactors `HeapTable` storage from delimiter-separated strings to a binary payload format, adds performance regression testing infrastructure, disables diagnostic logging, and introduces type-query methods for `Value`.
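The CLOCK-based eviction the walkthrough mentions can be sketched as follows. This is a minimal illustration under assumed names (`ClockReplacer`, `ref_bits_`, `victim`), not the PR's actual `LRUReplacer` implementation.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Minimal CLOCK replacer sketch: each frame has a reference bit; the hand
// sweeps the ring, clearing set bits and evicting the first unset frame.
class ClockReplacer {
public:
    explicit ClockReplacer(std::size_t num_frames)
        : ref_bits_(num_frames, false), in_use_(num_frames, false), hand_(0) {}

    // Mark a frame as recently used (e.g. called on unpin).
    void record_access(std::size_t frame_id) {
        in_use_[frame_id] = true;
        ref_bits_[frame_id] = true;
    }

    // Remove a frame from the eviction candidates (e.g. called on pin).
    void pin(std::size_t frame_id) { in_use_[frame_id] = false; }

    // Sweep the clock hand; returns the evicted frame, if any.
    std::optional<std::size_t> victim() {
        for (std::size_t scanned = 0; scanned < 2 * ref_bits_.size(); ++scanned) {
            std::size_t f = hand_;
            hand_ = (hand_ + 1) % ref_bits_.size();
            if (!in_use_[f]) continue;
            if (ref_bits_[f]) {
                ref_bits_[f] = false;   // second chance
            } else {
                in_use_[f] = false;
                return f;               // evict this frame
            }
        }
        return std::nullopt;            // nothing evictable
    }

private:
    std::vector<bool> ref_bits_;
    std::vector<bool> in_use_;
    std::size_t hand_;
};
```

Compared with list-based LRU, CLOCK replaces the linked list and per-access reordering with a single reference bit per frame, trading exact recency order for O(1) bookkeeping on each access.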
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
benchmarks/execution_bench.cpp (1)
Lines 31-57: ⚠️ Potential issue | 🟠 Major. Destroy the storage stack before deleting the benchmark directory.
`remove_all(test_dir)` runs while `table`, `bpm`, and `disk_manager` are still alive. If `BufferPoolManager` flushes dirty pages in its destructor, it will target a path you already removed, and on platforms with mandatory file locking the cleanup can fail outright. Put the storage objects in an inner scope and remove the directory afterwards.

Suggested fix:
```diff
 static void BM_ExecutionSeqScan(benchmark::State& state) {
   std::string test_dir = "./bench_exec_scan_" + std::to_string(state.range(0));
   std::filesystem::remove_all(test_dir);
   std::filesystem::create_directories(test_dir);
-
-  StorageManager disk_manager(test_dir);
-  BufferPoolManager bpm(2000, disk_manager);
-
-  Schema schema;
-  schema.add_column("id", common::ValueType::TYPE_INT64);
-  schema.add_column("data", common::ValueType::TYPE_TEXT);
-
-  auto table = std::make_shared<HeapTable>("scan_table", bpm, schema);
-  table->create();
-  SetupBenchTable(*table, state.range(0));
-
-  for (auto _ : state) {
-    auto scan_op = std::make_unique<SeqScanOperator>(table);
-    scan_op->init();
-    scan_op->open();
-    Tuple tuple;
-    while (scan_op->next(tuple)) {
-      benchmark::DoNotOptimize(tuple);
+  {
+    StorageManager disk_manager(test_dir);
+    BufferPoolManager bpm(2000, disk_manager);
+
+    Schema schema;
+    schema.add_column("id", common::ValueType::TYPE_INT64);
+    schema.add_column("data", common::ValueType::TYPE_TEXT);
+
+    auto table = std::make_shared<HeapTable>("scan_table", bpm, schema);
+    table->create();
+    SetupBenchTable(*table, state.range(0));
+
+    for (auto _ : state) {
+      auto scan_op = std::make_unique<SeqScanOperator>(table);
+      scan_op->init();
+      scan_op->open();
+      Tuple tuple;
+      while (scan_op->next(tuple)) {
+        benchmark::DoNotOptimize(tuple);
+      }
     }
   }
-
+
   state.SetItemsProcessed(state.iterations() * state.range(0));
   std::filesystem::remove_all(test_dir);
 }
```

Apply the same scoping pattern to `BM_ExecutionHashJoin` (also applies to lines 62-100).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmarks/execution_bench.cpp` around lines 31 - 57, The benchmark removes test_dir while storage objects are still alive; wrap creation of StorageManager, BufferPoolManager, Schema, HeapTable and related setup (the block that creates test_dir, StorageManager, BufferPoolManager, schema, table, SetupBenchTable, and the scan loop using SeqScanOperator) in an inner scope so those objects (StorageManager, BufferPoolManager, HeapTable, etc.) are destructed before calling std::filesystem::remove_all(test_dir); apply the same inner-scope pattern to the BM_ExecutionHashJoin benchmark to ensure destructors run before deleting the directory.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@include/executor/operator.hpp`:
- Around line 114-115: The call sites still construct HeapTable with
std::make_unique but the operator constructors now take std::shared_ptr; update
the three locations that construct a HeapTable for IndexScanOperator and
SeqScanOperator to use std::make_shared<storage::HeapTable>(...) instead of
std::make_unique<storage::HeapTable>(...), ensuring the produced shared_ptr is
passed to the IndexScanOperator and SeqScanOperator constructors (refer to the
IndexScanOperator and SeqScanOperator calls around the join setup in
query_executor.cpp).
In `@include/storage/buffer_pool_manager.hpp`:
- Around line 104-107: The public accessor get_storage_manager() exposes a
mutable StorageManager& allowing callers to bypass BufferPoolManager invariants
(page_table_, replacer_, latches) — remove the accessor or change its signature
to return a const StorageManager& (or provide a narrow const-safe facade) so
callers cannot call mutating methods like read_page/write_page/allocate_page
directly; update any call sites (if any) to use BufferPoolManager's controlled
APIs instead and ensure the class header only exposes non-mutating operations or
an opaque const interface to StorageManager.
In `@src/storage/heap_table.cpp`:
- Around line 281-314: The deserializer advances cursor through the buffer
(data) without verifying remaining buffer size, so add explicit bounds checks
before each read: ensure cursor < data_len before reading the type byte, ensure
cursor + 8 <= data_len before memcpy of the 8-byte numeric payload, and ensure
cursor + 4 <= data_len before reading the uint32_t length and then ensure cursor
+ len <= data_len before constructing the string; use the existing buffer-length
parameter available to this function (e.g. data_len/tuple_size) and return an
error or throw if any check fails; update the loop around
schema_.column_count(), the branches handling common::ValueType, and the
string-deserialization path that calls common::Value::make_text to use these
guards.
- Around line 120-137: The loop in heap_table.cpp serializes all numeric values
as double causing precision loss for large integers; change the numeric branch
to distinguish integer vs floating numeric types (use val.is_integer() or check
val.type() for the integer kind) and serialize integers as int64_t via
val.to_int64() (8 bytes) while keeping floating types as double via
val.to_float64(); keep writing the uint8_t type tag to payload so
deserialization (the code that reads the numeric back, currently casting double
to int64_t around line 27) can read the correct 8-byte representation and
reconstruct int64_t for integer tags and double for floating tags. Ensure you
update the corresponding deserialization logic to check the stored type tag and
memcpy/interpret 8 bytes as int64_t for integer types and as double for floating
types.
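The two heap-table findings above (type-tagged 8-byte numeric payloads and bounds-checked deserialization) can be combined into one sketch. This is illustrative only; the tag values and helper names (`Tag`, `serialize_int64`, `serialize_text`, `read_int64`) are assumptions, not the actual `heap_table.cpp` code.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Illustrative type tags; the real common::ValueType enum differs.
enum class Tag : uint8_t { Int64 = 0, Float64 = 1, Text = 2 };

// Integers are written as a 1-byte tag plus a raw 8-byte int64_t payload,
// so large values are not squeezed through double and lose no precision.
void serialize_int64(std::vector<uint8_t>& out, int64_t v) {
    out.push_back(static_cast<uint8_t>(Tag::Int64));
    uint8_t buf[8];
    std::memcpy(buf, &v, 8);
    out.insert(out.end(), buf, buf + 8);
}

// Strings are written as a 1-byte tag, a uint32_t length, then the bytes.
void serialize_text(std::vector<uint8_t>& out, const std::string& s) {
    out.push_back(static_cast<uint8_t>(Tag::Text));
    uint32_t len = static_cast<uint32_t>(s.size());
    uint8_t buf[4];
    std::memcpy(buf, &len, 4);
    out.insert(out.end(), buf, buf + 4);
    out.insert(out.end(), s.begin(), s.end());
}

// Bounds-checked read of one int64 value; returns nullopt on truncation
// instead of walking the cursor past the end of the buffer.
std::optional<int64_t> read_int64(const uint8_t* data, size_t data_len, size_t& cursor) {
    if (cursor >= data_len) return std::nullopt;                      // tag byte
    if (data[cursor] != static_cast<uint8_t>(Tag::Int64)) return std::nullopt;
    ++cursor;
    if (cursor + 8 > data_len) return std::nullopt;                   // 8-byte payload
    int64_t v;
    std::memcpy(&v, data + cursor, 8);
    cursor += 8;
    return v;
}
```

The string path would need the same guards: check `cursor + 4 <= data_len` before reading the length, then `cursor + len <= data_len` before constructing the string.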
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1f5febec-0aef-4c5b-a527-8f1e70e486a8
📒 Files selected for processing (9)
- benchmarks/execution_bench.cpp
- include/executor/operator.hpp
- include/storage/buffer_pool_manager.hpp
- include/storage/heap_table.hpp
- include/storage/lru_replacer.hpp
- src/executor/operator.cpp
- src/storage/heap_table.cpp
- src/storage/lru_replacer.cpp
- tests/buffer_pool_tests.cpp
```cpp
explicit SeqScanOperator(std::shared_ptr<storage::HeapTable> table, Transaction* txn = nullptr,
                         LockManager* lock_manager = nullptr);
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Find remaining scan/index operator call sites that still create HeapTable with unique_ptr.
rg -n -C3 'make_unique<storage::HeapTable>|make_shared<storage::HeapTable>|SeqScanOperator|IndexScanOperator' src include
```

Repository: poyrazK/cloudSQL
Length of output: 9868
Three call sites still pass `std::unique_ptr<storage::HeapTable>` to operators expecting `std::shared_ptr`, so they no longer compile.
The signature change to SeqScanOperator and IndexScanOperator requires all callers to pass std::make_shared<storage::HeapTable> instead of std::make_unique. Three locations in src/executor/query_executor.cpp still need updating:
- Lines 789–793 (IndexScanOperator)
- Lines 805–807 (SeqScanOperator, first join table)
- Lines 851–853 (SeqScanOperator, second join table)
Replace all three instances of std::make_unique<storage::HeapTable>(...) with std::make_shared<storage::HeapTable>(...) to resolve the compilation errors.
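The ownership change can be illustrated with a minimal sketch. The types below are hypothetical stand-ins, not the real `storage::HeapTable` and `SeqScanOperator`; they only show why `make_unique` at the call site no longer matches a `shared_ptr` parameter.

```cpp
#include <memory>

// Stand-in for storage::HeapTable.
struct HeapTable { /* ... */ };

// Stand-in operator: after the refactor it shares ownership of the table,
// so several operators (e.g. both sides of a join) can reference one table.
struct SeqScanOperator {
    explicit SeqScanOperator(std::shared_ptr<HeapTable> table)
        : table_(std::move(table)) {}
    std::shared_ptr<HeapTable> table_;
};

void example() {
    // Before: std::make_unique<HeapTable>() cannot convert to shared_ptr
    // at this call site. After:
    auto table = std::make_shared<HeapTable>();
    SeqScanOperator scan(table);   // caller and operator now share ownership
}
```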
🧹 Nitpick comments (3)
.github/workflows/ci.yml (3)
Lines 150-153: Add retry logic and either use or remove ccache. Two issues:
- Missing retry logic for `apt-get install`, unlike the `build` job (line 45), which could cause intermittent CI failures.
- `ccache` is installed but never utilized: no cache action or `CCACHE_DIR` export. Either add caching or remove it from the install list.

♻️ Proposed fix:
```diff
       - name: Install dependencies
         run: |
           sudo apt-get update
-          sudo apt-get install -y cmake clang ninja-build ccache
+          sudo apt-get install -y cmake clang ninja-build || (sleep 10 && sudo apt-get update && sudo apt-get install -y cmake clang ninja-build)
```

Alternatively, if you want to use ccache, add a cache step similar to lines 47-53 and export `CCACHE_DIR` in the configure step.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/ci.yml around lines 150 - 153, The "Install dependencies" step runs apt-get without retries and installs ccache but doesn't use it; update that step to mirror the build job's retry wrapper for apt-get install (retry loop with sleeps) to avoid transient CI failures, and either remove "ccache" from the apt-get list or add ccache usage: create a cache step for ccache (similar to the build job's cache lines) and ensure you export CCACHE_DIR in the configure/build steps so ccache is actually used.
Lines 155-162: Consider specifying the compiler explicitly for reproducible benchmarks. The `build` job uses a matrix to specify the compiler, but this job relies on the system default. For consistent benchmark comparisons across CI runs (especially if the runner image changes), consider pinning the compiler.

♻️ Optional: specify compiler
```diff
           cmake .. -G Ninja \
             -DCMAKE_BUILD_TYPE=Release \
+            -DCMAKE_CXX_COMPILER=clang++ \
             -DBUILD_BENCHMARKS=ON \
             -DBUILD_TESTS=OFF
```
Verify each finding against the current code and only fix it if needed. In @.github/workflows/ci.yml around lines 155 - 162, The CI CMake configure step currently relies on system defaults (cmake .. -G Ninja ... -DCMAKE_BUILD_TYPE=Release -DBUILD_BENCHMARKS=ON -DBUILD_TESTS=OFF); update that CMake invocation to pin compilers by adding -DCMAKE_C_COMPILER and -DCMAKE_CXX_COMPILER (using the matrix-provided compiler variables in the workflow) so the build job produces reproducible benchmarks—modify the configure command inside the "Configure CMake (Release)" step to pass those two CMake variables referencing the job/matrix compiler names.
Lines 169-174: Add a timeout and capture benchmark results as artifacts. The project tracks performance baselines (see `docs/performance/REPORT_V1.md`), making artifact capture valuable for regression detection. Without a timeout, a hung benchmark (e.g., a port binding failure in `network_bench`) will block CI for up to 6 hours. Google Benchmark v1.8.3 is in use and supports JSON output via built-in flags.

♻️ Proposed improvement:
```diff
       - name: Run Benchmarks
+        timeout-minutes: 15
         run: |
           cd build
-          ./storage_bench
-          ./execution_bench
-          ./network_bench
+          ./storage_bench --benchmark_out=storage_results.json --benchmark_out_format=json
+          ./execution_bench --benchmark_out=execution_results.json --benchmark_out_format=json
+          ./network_bench --benchmark_out=network_results.json --benchmark_out_format=json
+
+      - name: Upload Benchmark Results
+        uses: actions/upload-artifact@v4
+        with:
+          name: benchmark-results
+          path: build/*_results.json
```
Verify each finding against the current code and only fix it if needed. In @.github/workflows/ci.yml around lines 169 - 174, Wrap each benchmark invocation (storage_bench, execution_bench, network_bench) with a timeout (e.g., GNU timeout) to prevent CI hangs, pass Google Benchmark flags --benchmark_format=json and --benchmark_out=<name>.json to produce machine-readable results (use distinct names like storage_bench.json, execution_bench.json, network_bench.json), and add a subsequent GitHub Actions upload-artifact step to publish those JSON files as CI artifacts for baseline/regression tracking; ensure the timeout value is reasonable for your suites and apply the same pattern to all three binaries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 25b6b0bd-9cf2-4678-9bd9-bd94a9424971
📒 Files selected for processing (3)
- .github/workflows/ci.yml
- src/network/rpc_client.cpp
- src/network/rpc_server.cpp
✅ Files skipped from review due to trivial changes (2)
- src/network/rpc_server.cpp
- src/network/rpc_client.cpp
Actionable comments posted: 2
🧹 Nitpick comments (1)
scripts/check_perf_regression.py (1)
Lines 1-3: Add shebang for executable script. The file is marked executable but lacks a shebang line. While the CI invokes it explicitly with `python3`, adding a shebang improves portability for local use.

Proposed fix:
```diff
+#!/usr/bin/env python3
 import json
 import sys
 import os
```
Verify each finding against the current code and only fix it if needed. In `@scripts/check_perf_regression.py` around lines 1 - 3, This script lacks a shebang; add a Unix shebang line (#!/usr/bin/env python3) as the very first line of the file before the import statements so the executable bit works for local execution, save the file, and ensure the file mode remains executable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/check_perf_regression.py`:
- Around line 29-35: Guard against division by zero when computing change: check
the value of old_time (from base_map[name]) before computing change = (new_time
- old_time) / old_time; if old_time is zero or None, handle it explicitly (e.g.,
log or print a warning referencing name and old_time and skip computing
change/continue, or set change to math.inf or a sentinel) and then format the
output accordingly when printing name, old_time, new_time, and change; update
the block that reads base_map, old_time, new_time, change and the print
statement to implement this check and handling.
- Around line 10-17: The JSON load currently swallows all errors and returns
True on failure; change it so a failure to load current_file fails the check
(return False or re-raise) while a missing baseline_file is handled gracefully:
split the try/except into two blocks—one that opens/parses current_file and on
any exception prints the error and returns False (or raise), and a second that
tries to open/parses baseline_file but catches FileNotFoundError to set
baseline=None and only treats other exceptions as failures (print error + return
False). Reference the current_file and baseline_file loading logic in the
existing try/except and update the error messages accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9b6541bc-b4a9-42d3-b0ed-69363064cfbf
📒 Files selected for processing (6)
- .github/workflows/ci.yml
- benchmarks/execution_bench.cpp
- include/common/value.hpp
- scripts/check_perf_regression.py
- src/executor/query_executor.cpp
- src/storage/heap_table.cpp
🚧 Files skipped from review as they are similar to previous changes (3)
- benchmarks/execution_bench.cpp
- .github/workflows/ci.yml
- src/storage/heap_table.cpp
This PR finalizes Phase 3 of the performance optimizations.

Changes:
- Added a `last_page_id_hint` to solve the O(N^2) free space search bottleneck.
- Replaced the string-based tuple format (`std::to_string`, `std::stringstream`, `std::getline`) with a zero-copy, direct memory-mapped binary layout on disk.