
Conversation

@514Ben (Contributor) commented Feb 3, 2026

Summary

Creates comprehensive documentation for transferring QlikView QVD files to ClickHouse using MooseStack workflows (Linear: ENG-2099).

Changes

New Guide: guides/qvd-to-clickhouse.mdx

Covers:

  • Overview of QVD files and migration benefits (performance, cost, flexibility, scalability)
  • Required dependencies (pyqvd, fsspec, pandas, tenacity, moose-lib)
  • Auto-generating Pydantic models from QVD schema using QvdIntrospector and QvdModelGenerator
  • Batch insertion with QvdBatchInserter (includes retry logic, progress tracking, error handling)
  • Complete workflow example with task and workflow configuration
  • Configuration options:
    • Cloud storage support (S3, Azure Blob, GCS) via fsspec
    • File tracking for incremental processing
  • Best practices for batch sizes, memory management, idempotency, and logging
  • Links to related documentation (Workflows, Data Warehouses, Insert Data, Retries/Timeouts)

Navigation Update

Added entry to src/config/navigation.ts:

  • Icon: IconDatabaseImport
  • Status: draft (for internal review)
  • Languages: Python-only (since pyqvd is Python-specific)

Code Examples

All code examples are based on the reference implementation at registry/pipeline-registry/qvd_to_clickhouse and demonstrate proper MooseStack workflow patterns.

Related Documentation

🤖 Generated with Claude Code


Note

Medium Risk
Mostly documentation and navigation changes, but it also alters vercel.json build env (NEXT_PUBLIC_NAV_VARIANT) which could change what content is visible in deployed docs if misconfigured.

Overview
Adds a new draft guide, guides/qvd-to-clickhouse, documenting a MooseStack workflow-based pipeline for syncing QVD files into ClickHouse (setup, config/env vars, model generation, running/monitoring, and troubleshooting).

Updates the Improving the Performance of Your Dashboards guide with clearer local setup guidance (remote ClickHouse seeding, Boreal connection-string instructions), MCP-focused copilot wording, and new progress/help callouts.

Exposes the new guide via draft navigation and sets Vercel build env NEXT_PUBLIC_NAV_VARIANT: "draft" to support draft navigation behavior.

Written by Cursor Bugbot for commit 4739016. This will update automatically on new commits.

Creates comprehensive documentation for transferring QlikView QVD files to ClickHouse using MooseStack workflows. Covers auto-generating Pydantic models, batch insertion with retry logic, cloud storage configuration, and best practices.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@coderabbitai coderabbitai bot (Contributor) commented Feb 3, 2026

Summary by CodeRabbit

  • Documentation
    • Added new guide: "QVD to ClickHouse" for transferring data using MooseStack workflows
    • Enhanced "Performant Dashboards" guide with clearer setup instructions, remote database guidance, and production deployment steps
    • Added community support resource section

Walkthrough

Adds a new MDX guide describing a MooseStack workflow to transfer QlikView QVD files to ClickHouse, updates an existing dashboard guide with UI/import and deployment clarifications, registers the new guide in navigation as a draft (IconDatabaseImport, languages: ["python"]), and sets NEXT_PUBLIC_NAV_VARIANT="draft" in vercel.json. No runtime code changes.

Changes

Cohort / File(s) Summary
Docs: QVD → ClickHouse guide
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
New MDX guide covering prerequisites, QVD introspection, Pydantic model generation, fsspec-backed sources, batch/incremental insertion patterns, retry/error handling, REST monitoring endpoints, scheduling, and example workflows (local and production).
Docs: Performant Dashboards
apps/framework-docs-v2/content/guides/performant-dashboards.mdx
Added UI component imports, clarified remote ClickHouse / MCP guidance, expanded Boreal connection instructions, added “Need help?” callout and enhanced local→production workflow callouts and environment examples.
Navigation
apps/framework-docs-v2/src/config/navigation.ts
Added a draft navigation entry for guides/qvd-to-clickhouse with title, description, IconDatabaseImport, languages ["python"], and status: "draft".
Deployment config
apps/framework-docs-v2/vercel.json
Added build environment variable NEXT_PUBLIC_NAV_VARIANT: "draft" under build.env.
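
Based on the summary above, the vercel.json change presumably amounts to the following fragment (a sketch; any other existing keys in the file are omitted):

```json
{
  "build": {
    "env": {
      "NEXT_PUBLIC_NAV_VARIANT": "draft"
    }
  }
}
```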

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • callicles

Poem

QVDs hum, schemas bright,
MooseStack guides them through the night.
ClickHouse waits in tidy rows,
Batches flow where data goes.
A draft appears — the docs take flight.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Title check — ✅ Passed: Title directly summarizes the main change: adding a new guide for QVD to ClickHouse data transfer.
  • Description check — ✅ Passed: Description comprehensively documents all changes including the new guide content, navigation updates, and configuration modifications.
  • Linked Issues check — ✅ Passed: All ENG-2099 requirements met: guide created with QVD overview, dependencies, model generation, batch insertion, configuration options, navigation entry with draft status and IconDatabaseImport.
  • Out of Scope Changes check — ✅ Passed: Updates to performant-dashboards.mdx and vercel.json are tangential documentation improvements and build configuration, not directly addressing ENG-2099 but reasonably scoped to documentation quality.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@vercel bot commented Feb 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

  • Project: docs-v2 | Deployment: Ready | Actions: Preview, Comment | Updated: Feb 3, 2026 8:28pm (UTC)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`:
- Around line 205-255: The constructor's max_retries is ignored because the
_insert_batch method is decorated with a hardcoded
`@retry`(stop=stop_after_attempt(3)); either remove the unused constructor
parameter or wire max_retries into the retry logic by deleting the static
decorator on _insert_batch and wrapping the insert logic in a Tenacity Retrying/
retry(...) call that uses stop=stop_after_attempt(self.max_retries) (or call
Retrying(...).call(self._do_insert_batch, records) from inside _insert_batch),
so the retry count comes from the instance attribute; update references to
_insert_batch and the constructor parameter accordingly.
- Around line 49-109: The Python code block containing the QvdIntrospector class
(the triple-backtick header currently reading ```python
filename="qvd_introspector.py" copy) needs the documentation test directive
added so the snippet is validated; update the code fence header to include `@test`
(for example: ```python `@test` filename="qvd_introspector.py" copy) for this and
any other Python blocks in the file that lack it so the snippet (including
QvdIntrospector, _load, _extract_schema, _sanitize_name, get_schema,
get_row_count) will be picked up by the docs test runner.
- Around line 91-98: The _sanitize_name function can raise IndexError when the
input normalizes to an empty string; after building sanitized in _sanitize_name,
add a guard that if sanitized == "" (or contains only underscores) set a safe
fallback (e.g., "_" or "_field") before accessing sanitized[0], then apply the
existing check to prefix with "_" if the first character is a digit and finally
return the lowercased result; update references to _sanitize_name accordingly.
- Around line 461-488: In _read_batches, ensure the temporary file (tmp_path) is
removed even if processing raises: after creating the NamedTemporaryFile (with
delete=False) assign tmp_path, then wrap the QvdReader/df processing and
batching logic in a try block and perform os.unlink(tmp_path) in a finally
block; make sure tmp_path is defined before the try and that os is imported so
cleanup always runs regardless of exceptions.
- Around line 231-243: The current _read_batches method loads the entire QVD
into memory via QvdReader(self.file_path).to_pandas(), which can OOM for large
files; change it to stream chunks using pyqvd’s chunking API (QvdTable.from_qvd)
and yield each chunk converted to records. Replace the QvdReader usage in
_read_batches with an iteration like for tbl in
QvdTable.from_qvd(self.file_path, chunk_size=self.batch_size): and for each tbl
yield tbl.to_pandas().to_dict(orient="records"), preserving the existing column
sanitization step (apply self._sanitize_name to tbl column names) so behavior of
file_path, batch_size and _sanitize_name remains consistent.
- Around line 544-559: The is_file_processed function currently builds SQL with
string interpolation which allows injection; change it to use a parameterized
query API instead of f-string: replace the processed_files_table.query call in
is_file_processed with a parameterized client.query.execute (or the table
client's parameterized method) passing the SQL with named placeholders for
file_path and status and a params dict (e.g., {"path": file_path,
"status":"success"}), then return based on the result length; keep
mark_file_processed unchanged. Ensure you reference the same function name
is_file_processed and the table/query call when making the change.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6006b93 and 4920c5b.

📒 Files selected for processing (2)
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
  • apps/framework-docs-v2/src/config/navigation.ts
🧰 Additional context used
📓 Path-based instructions (5)
apps/framework-docs-v2/content/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

apps/framework-docs-v2/content/**/*.mdx: Use {{ include "shared/path.mdx" }} directives to reuse content fragments, which are processed via processIncludes() during build
Validate code snippets in documentation with the @test directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation

Files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

Guide MDX files in content/guides/ must include frontmatter with title and description fields

Files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run linting checks before submitting PRs for TypeScript/JavaScript code

**/*.{ts,tsx,js,jsx}: Group imports by external dependencies, internal modules, and types; use named exports from barrel files (index.ts)
Use camelCase for variables and functions, PascalCase for types/classes/components, UPPER_SNAKE_CASE for constants in TypeScript/JavaScript
Prefix unused variables with underscore (e.g., _unusedParam) to bypass linting errors in TypeScript/JavaScript
Format TypeScript/JavaScript code with Prettier using experimentalTernaries: true; auto-formats on commit via Husky + lint-staged

Files:

  • apps/framework-docs-v2/src/config/navigation.ts
apps/framework-docs-v2/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

Component library must use shadcn/ui components located in src/components/ui/

Files:

  • apps/framework-docs-v2/src/config/navigation.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Prefer interfaces for objects, types for unions/intersections in TypeScript; explicit return types on public APIs

Files:

  • apps/framework-docs-v2/src/config/navigation.ts
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: The moose query layer represents a two-step OLAP migration pattern: Step 1 (not shown) shifts just-in-time joins to write-time via Materialized Views; Step 2 (this demo) accelerates endpoint creation with the query layer. This architecture assumes MVs/OLAP tables already exist and focuses on the read-time query selection and filtering layer
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: The moose query layer represents a two-step OLAP migration pattern: Step 1 (not shown) shifts just-in-time joins to write-time via Materialized Views; Step 2 (this demo) accelerates endpoint creation with the query layer. This architecture assumes MVs/OLAP tables already exist and focuses on the read-time query selection and filtering layer

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build CLI Binaries (linux-arm64-glibc, ubuntu-22-8-core, aarch64-unknown-linux-gnu)
  • GitHub Check: Package and Publish Templates
🔇 Additional comments (1)
apps/framework-docs-v2/src/config/navigation.ts (1)

1253-1262: Navigation entry looks good.

Correct slug, draft status, icon, and language scoping are consistent with guide onboarding patterns.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 91 to 98
```python
    def _sanitize_name(self, name: str) -> str:
        """Convert QVD field names to valid Python identifiers."""
        # Replace spaces and special characters with underscores
        sanitized = "".join(c if c.isalnum() else "_" for c in name)
        # Ensure it starts with a letter or underscore
        if sanitized[0].isdigit():
            sanitized = f"_{sanitized}"
        return sanitized.lower()
```

⚠️ Potential issue | 🟡 Minor

Guard against empty/invalid field names in _sanitize_name.

sanitized[0] will throw if the field name normalizes to an empty string. Add a fallback to avoid IndexError.

Proposed fix
-        if sanitized[0].isdigit():
+        if not sanitized:
+            sanitized = "_field"
+        if sanitized[0].isdigit():
             sanitized = f"_{sanitized}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:

```python
    def _sanitize_name(self, name: str) -> str:
        """Convert QVD field names to valid Python identifiers."""
        # Replace spaces and special characters with underscores
        sanitized = "".join(c if c.isalnum() else "_" for c in name)
        # Ensure it starts with a letter or underscore
        if sanitized[0].isdigit():
            sanitized = f"_{sanitized}"
        return sanitized.lower()
```

After:

```python
    def _sanitize_name(self, name: str) -> str:
        """Convert QVD field names to valid Python identifiers."""
        # Replace spaces and special characters with underscores
        sanitized = "".join(c if c.isalnum() else "_" for c in name)
        # Ensure it starts with a letter or underscore
        if not sanitized:
            sanitized = "_field"
        if sanitized[0].isdigit():
            sanitized = f"_{sanitized}"
        return sanitized.lower()
```
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 91
- 98, The _sanitize_name function can raise IndexError when the input normalizes
to an empty string; after building sanitized in _sanitize_name, add a guard that
if sanitized == "" (or contains only underscores) set a safe fallback (e.g., "_"
or "_field") before accessing sanitized[0], then apply the existing check to
prefix with "_" if the first character is a digit and finally return the
lowercased result; update references to _sanitize_name accordingly.
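
A quick standalone check of the guarded sanitizer, assuming the `_field` fallback proposed above (shown here as a free function rather than a method, for brevity):

```python
def sanitize_name(name: str) -> str:
    """Convert QVD field names to valid Python identifiers."""
    # Replace spaces and special characters with underscores
    sanitized = "".join(c if c.isalnum() else "_" for c in name)
    # Guard: an empty input would otherwise raise IndexError on sanitized[0]
    if not sanitized:
        sanitized = "_field"
    # Ensure it starts with a letter or underscore
    if sanitized[0].isdigit():
        sanitized = f"_{sanitized}"
    return sanitized.lower()

print(sanitize_name("Order ID"))   # order_id
print(sanitize_name("2024Sales"))  # _2024sales
print(sanitize_name(""))           # _field (no IndexError)
```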

Comment on lines 205 to 255

```python
    def __init__(
        self,
        file_path: str,
        table: OlapTable,
        model_class: Type[BaseModel],
        batch_size: int = 10000,
        max_retries: int = 3
    ):
        """
        Initialize the batch inserter.

        Args:
            file_path: Path to the QVD file (local or fsspec-compatible URI)
            table: MooseStack OlapTable instance
            model_class: Pydantic model class for validation
            batch_size: Number of records per batch
            max_retries: Maximum retry attempts for failed batches
        """
        self.file_path = file_path
        self.table = table
        self.model_class = model_class
        self.batch_size = batch_size
        self.max_retries = max_retries
        self._total_inserted = 0
        self._total_failed = 0

    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file in batches using pyqvd."""
        reader = QvdReader(self.file_path)
        df = reader.to_pandas()

        # Sanitize column names to match Pydantic model
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")

    def _sanitize_name(self, name: str) -> str:
        """Convert column names to valid Python identifiers."""
        sanitized = "".join(c if c.isalnum() else "_" for c in name)
        if sanitized[0].isdigit():
            sanitized = f"_{sanitized}"
        return sanitized.lower()

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    def _insert_batch(self, records: list) -> Dict[str, int]:
```

⚠️ Potential issue | 🟠 Major

Wire max_retries into the retry strategy or remove it. The @retry decorator hardcodes stop_after_attempt(3) and ignores the constructor’s max_retries. Pass self.max_retries into a dynamic retry setup (e.g. use a Retrying(stop=stop_after_attempt(self.max_retries), …) inside the method) or drop the unused parameter.

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 205
- 255, The constructor's max_retries is ignored because the _insert_batch method
is decorated with a hardcoded `@retry`(stop=stop_after_attempt(3)); either remove
the unused constructor parameter or wire max_retries into the retry logic by
deleting the static decorator on _insert_batch and wrapping the insert logic in
a Tenacity Retrying/ retry(...) call that uses
stop=stop_after_attempt(self.max_retries) (or call
Retrying(...).call(self._do_insert_batch, records) from inside _insert_batch),
so the retry count comes from the instance attribute; update references to
_insert_batch and the constructor parameter accordingly.
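
One way to honor the instance's max_retries is a plain retry loop inside the method. This sketch substitutes a stdlib-only loop for Tenacity's Retrying so it runs without the dependency, and `_do_insert` is a hypothetical placeholder for the real table insert call:

```python
import time

class BatchInserter:
    def __init__(self, max_retries: int = 3, base_delay: float = 0.0):
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.attempts = 0  # exposed so the behavior is observable

    def _do_insert(self, records: list) -> int:
        # Placeholder for the real table.insert(...); fails twice, then succeeds
        self.attempts += 1
        if self.attempts < 3:
            raise ConnectionError("transient failure")
        return len(records)

    def insert_batch(self, records: list) -> int:
        """Retry _do_insert up to self.max_retries times with exponential backoff."""
        for attempt in range(1, self.max_retries + 1):
            try:
                return self._do_insert(records)
            except ConnectionError:
                if attempt == self.max_retries:
                    raise  # exhausted: surface the error to the caller
                time.sleep(self.base_delay * 2 ** (attempt - 1))

inserter = BatchInserter(max_retries=5)
print(inserter.insert_batch([{"id": 1}, {"id": 2}]))  # 2 (succeeds on attempt 3)
```

The same shape works with Tenacity by iterating `for attempt in Retrying(stop=stop_after_attempt(self.max_retries), ...)` instead of the manual range loop.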

Comment on lines 231 to 243
```python
    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file in batches using pyqvd."""
        reader = QvdReader(self.file_path)
        df = reader.to_pandas()

        # Sanitize column names to match Pydantic model
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")
```


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

pyqvd QvdReader streaming chunk read API

💡 Result:

PyQvd’s “streaming”/chunked read is exposed via a chunk_size argument (number of rows/records per chunk). When you set it, the read returns an iterator yielding QvdTable chunks rather than one in-memory table. [1][2]

Recommended public APIs

  • From a file path

    from pyqvd import QvdTable
    for chunk_tbl in QvdTable.from_qvd("data.qvd", chunk_size=100_000):
        ...

    This yields QvdTable objects per chunk. [2]

  • From a binary stream

    from pyqvd import QvdTable
    with open("data.qvd", "rb") as f:
        for chunk_tbl in QvdTable.from_stream(f, chunk_size=100_000):
            ...

    Note: chunk iteration requires a seekable stream; non-seekable streams will fail. [2]

Lower-level reader class

  • pyqvd.io.QvdFileReader(source: str | BinaryIO, chunk_size: int | None) with read() -> QvdTable | Iterator[QvdTable]. [1]

Sources: [1] https://pyqvd.readthedocs.io/stable/api/io.html, [2] https://pyqvd.readthedocs.io/latest/guide/introduction.html


Stream QVD in chunks instead of loading the entire file.
Use pyqvd’s chunk_size API, e.g.

for tbl in QvdTable.from_qvd(self.file_path, chunk_size=self.batch_size):
    yield tbl.to_pandas().to_dict(orient="records")

to yield batches and avoid materializing the full DataFrame.

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 231
- 243, The current _read_batches method loads the entire QVD into memory via
QvdReader(self.file_path).to_pandas(), which can OOM for large files; change it
to stream chunks using pyqvd’s chunking API (QvdTable.from_qvd) and yield each
chunk converted to records. Replace the QvdReader usage in _read_batches with an
iteration like for tbl in QvdTable.from_qvd(self.file_path,
chunk_size=self.batch_size): and for each tbl yield
tbl.to_pandas().to_dict(orient="records"), preserving the existing column
sanitization step (apply self._sanitize_name to tbl column names) so behavior of
file_path, batch_size and _sanitize_name remains consistent.
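
The generator shape being suggested (yield one batch of record dicts per chunk, never the whole table) can be exercised without pyqvd by substituting an in-memory row source; this stdlib-only sketch mirrors the chunk_size pattern:

```python
from typing import Generator, Iterable

def read_batches(rows: Iterable[dict], batch_size: int) -> Generator[list, None, None]:
    """Yield fixed-size batches of records, mirroring chunked QVD reads."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch  # one chunk's worth of records
            batch = []
    if batch:
        yield batch  # final partial chunk

rows = [{"id": i} for i in range(7)]
batches = list(read_batches(rows, batch_size=3))
print([len(b) for b in batches])  # [3, 3, 1]
```

With pyqvd, each `tbl` yielded by `QvdTable.from_qvd(path, chunk_size=...)` would play the role of one batch, converted via `tbl.to_pandas().to_dict(orient="records")` after column sanitization.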

Comment on lines 461 to 488
```python
    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file from any fsspec-compatible filesystem."""
        # Open file using fsspec (supports s3://, gs://, az://, etc.)
        with fsspec.open(
            self.file_path,
            mode="rb",
            **self.storage_options
        ) as f:
            # Write to temp file for pyqvd compatibility
            import tempfile
            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
                tmp.write(f.read())
                tmp_path = tmp.name

        # Process from temp file
        reader = QvdReader(tmp_path)
        df = reader.to_pandas()
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")

        # Cleanup
        import os
        os.unlink(tmp_path)
```

⚠️ Potential issue | 🟡 Minor

Temp file cleanup can leak on exceptions.

If anything fails after NamedTemporaryFile, os.unlink(tmp_path) is skipped. Wrap processing in try/finally to ensure cleanup.

Proposed fix
-        # Process from temp file
-        reader = QvdReader(tmp_path)
-        df = reader.to_pandas()
-        df.columns = [self._sanitize_name(col) for col in df.columns]
-
-        for start in range(0, len(df), self.batch_size):
-            end = min(start + self.batch_size, len(df))
-            batch_df = df.iloc[start:end]
-            yield batch_df.to_dict(orient="records")
-
-        # Cleanup
-        import os
-        os.unlink(tmp_path)
+        try:
+            # Process from temp file
+            reader = QvdReader(tmp_path)
+            df = reader.to_pandas()
+            df.columns = [self._sanitize_name(col) for col in df.columns]
+
+            for start in range(0, len(df), self.batch_size):
+                end = min(start + self.batch_size, len(df))
+                batch_df = df.iloc[start:end]
+                yield batch_df.to_dict(orient="records")
+        finally:
+            # Cleanup
+            import os
+            os.unlink(tmp_path)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:

```python
    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file from any fsspec-compatible filesystem."""
        # Open file using fsspec (supports s3://, gs://, az://, etc.)
        with fsspec.open(
            self.file_path,
            mode="rb",
            **self.storage_options
        ) as f:
            # Write to temp file for pyqvd compatibility
            import tempfile
            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
                tmp.write(f.read())
                tmp_path = tmp.name

        # Process from temp file
        reader = QvdReader(tmp_path)
        df = reader.to_pandas()
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")

        # Cleanup
        import os
        os.unlink(tmp_path)
```

After:

```python
    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file from any fsspec-compatible filesystem."""
        # Open file using fsspec (supports s3://, gs://, az://, etc.)
        with fsspec.open(
            self.file_path,
            mode="rb",
            **self.storage_options
        ) as f:
            # Write to temp file for pyqvd compatibility
            import tempfile
            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
                tmp.write(f.read())
                tmp_path = tmp.name

        try:
            # Process from temp file
            reader = QvdReader(tmp_path)
            df = reader.to_pandas()
            df.columns = [self._sanitize_name(col) for col in df.columns]

            for start in range(0, len(df), self.batch_size):
                end = min(start + self.batch_size, len(df))
                batch_df = df.iloc[start:end]
                yield batch_df.to_dict(orient="records")
        finally:
            # Cleanup
            import os
            os.unlink(tmp_path)
```
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 461
- 488, In _read_batches, ensure the temporary file (tmp_path) is removed even if
processing raises: after creating the NamedTemporaryFile (with delete=False)
assign tmp_path, then wrap the QvdReader/df processing and batching logic in a
try block and perform os.unlink(tmp_path) in a finally block; make sure tmp_path
is defined before the try and that os is imported so cleanup always runs
regardless of exceptions.
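
The try/finally cleanup pattern from the suggestion, reduced to a runnable stdlib-only demo (the read of the temp file stands in for the QvdReader processing step):

```python
import os
import tempfile

def process_with_temp(data: bytes) -> int:
    """Copy data to a temp file, process it, and always clean up."""
    with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
        tmp.write(data)
        tmp_path = tmp.name  # defined before the try, so finally can see it
    try:
        with open(tmp_path, "rb") as f:
            return len(f.read())  # stand-in for QvdReader(tmp_path) processing
    finally:
        os.unlink(tmp_path)  # runs even if processing raises

size = process_with_temp(b"hello qvd")
print(size)  # 9
```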

Comment on lines 544 to 559
```python
def is_file_processed(file_path: str) -> bool:
    """Check if a file has already been processed successfully."""
    result = processed_files_table.query(
        f"SELECT * FROM qvd_processed_files WHERE file_path = '{file_path}' AND status = 'success'"
    )
    return len(result) > 0

def mark_file_processed(file_path: str, record_count: int, status: str, error: str = None):
    """Record file processing result."""
    processed_files_table.insert([{
        "file_path": file_path,
        "processed_at": datetime.now(),
        "record_count": record_count,
        "status": status,
        "error_message": error
    }])
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

moose-lib OlapTable.query parameterized queries

💡 Result:

In moose-lib, you don’t typically call an OlapTable.query(...) method directly—you run parameterized queries via the provided query client, and you reference the OlapTable in a safe, structured way.

TypeScript (@514labs/moose-lib): use the `sql` tagged template

Interpolate tables and values through the sql tag (don’t string-concatenate). Example: (docs.fiveonefour.com)

async ({ limit = 10 }, { client, sql }) => {
  const result = await client.query.execute(
    sql`SELECT * FROM ${clickhouseTable} LIMIT ${limit}`
  );
  return await result.json();
}

Python (moose_lib): use typed {name: Type} placeholders + params dict

Pass parameters separately using the format-string placeholders (and use Identifier for table names). Example: (docs.fiveonefour.com)

def handler(client, params):
    return client.query.execute(
        "SELECT * FROM {table: Identifier} LIMIT {limit: Int32}",
        {"table": clickhouse_table.name, "limit": params.limit},
    )

If you share which language you’re using (TS vs Python) and what you want to parameterize (WHERE clauses, IN lists, dates, etc.), I can show the exact pattern for that case.


Use parameterized queries instead of string interpolation for the file_path filter
Replace the raw SQL string in is_file_processed with a parameterized call, for example:

```python
result = client.query.execute(
    "SELECT * FROM qvd_processed_files WHERE file_path = {path: String} AND status = {status: String}",
    {"path": file_path, "status": "success"},
)
```
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 544
- 559, The is_file_processed function currently builds SQL with string
interpolation which allows injection; change it to use a parameterized query API
instead of f-string: replace the processed_files_table.query call in
is_file_processed with a parameterized client.query.execute (or the table
client's parameterized method) passing the SQL with named placeholders for
file_path and status and a params dict (e.g., {"path": file_path,
"status":"success"}), then return based on the result length; keep
mark_file_processed unchanged. Ensure you reference the same function name
is_file_processed and the table/query call when making the change.
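The fixed lookup can be sketched end to end with a stand-in client, since the exact moose-lib client surface isn't reproduced in this thread; `_FakeClient` and its `query.execute(sql, params)` shape are assumptions that only mirror the parameterized pattern suggested above:

```python
from typing import Any, Dict, List


class _FakeQueryAPI:
    """Records the SQL and params it receives; crudely filters canned rows."""

    def __init__(self, rows: List[Dict[str, Any]]):
        self.rows = rows
        self.last_call = None

    def execute(self, sql: str, params: Dict[str, Any]) -> List[Dict[str, Any]]:
        self.last_call = (sql, params)
        # A real client binds params server-side; here we just match keys.
        return [
            r for r in self.rows
            if all(r.get(k) == v for k, v in params.items() if k in r)
        ]


class _FakeClient:
    def __init__(self, rows: List[Dict[str, Any]]):
        self.query = _FakeQueryAPI(rows)


def is_file_processed(client, file_path: str) -> bool:
    """Parameterized check: file_path never touches the SQL string itself."""
    result = client.query.execute(
        "SELECT * FROM qvd_processed_files WHERE file_path = {path: String} AND status = {status: String}",
        {"path": file_path, "status": "success"},
    )
    return len(result) > 0


client = _FakeClient([{"path": "a.qvd", "status": "success"}])
```

Whatever client you use, the key property is the same: the SQL text is a constant and untrusted values travel only through the params dict.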


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`:
- Around line 29-109: The Python snippet for QvdIntrospector (file
qvd_introspector.py) is missing the `@test` directive required for doc CI; update
the fenced code block that begins with ```python filename="qvd_introspector.py"
copy to ```python `@test` filename="qvd_introspector.py" copy so the
QvdIntrospector class and its methods (_load, _extract_schema, _sanitize_name,
get_schema, get_row_count) are syntax-checked during documentation builds.
- Around line 544-549: The is_file_processed function currently interpolates
file_path into the SQL string which risks SQL injection; change the
processed_files_table.query call to use a parameterized query or the query API's
parameter binding (e.g., use placeholders like %s or ? or named params depending
on the DB client) and pass file_path as a separate argument, or sanitize/escape
it via the DB client's parameterization helper, so the query becomes a
parameterized SELECT against qvd_processed_files with file_path and status bound
safely rather than concatenated.
- Around line 205-212: The retry decorator on QvdBatchInserter._insert_batch
currently hardcodes stop_after_attempt(3); change the implementation so the
actual instance max_retries is used by wrapping the insert logic in a nested
function (e.g., inner_insert or _do_insert) that is decorated with
`@retry`(stop=stop_after_attempt(self.max_retries), wait=..., retry=...) or by
programmatically creating a Tenacity Retrying object using self.max_retries, and
then call that inner function from _insert_batch; ensure you reference
QvdBatchInserter._insert_batch and the instance attribute max_retries so each
instance uses its configured retry count.
- Around line 551-559: The function mark_file_processed currently types error as
str but defaults it to None; update the signature to use Optional[str] for the
error parameter (from typing import Optional) and adjust any calls if necessary
to accept None, ensuring the annotation and default value match; reference the
function name mark_file_processed and the error parameter when making the
change.
- Around line 231-242: The _read_batches method currently calls
QvdReader(self.file_path).to_pandas(), which loads the entire file into memory
and contradicts the “chunked reading” claim; to fix, replace the full-load call
with a chunked reader like QvdFileReader(self.file_path,
chunk_size=self.batch_size).read() or use QvdTable.from_qvd(...,
chunk_size=self.batch_size) and iterate over returned chunks, applying
self._sanitize_name to each chunk's columns and yielding
chunk.to_dict(orient="records"); alternatively, if you cannot switch readers,
update the method docstring and wording to remove the “chunked reading” claim
and keep the existing implementation.
- Around line 461-487: In _read_batches, avoid loading the whole remote file
into memory and ensure the temp file is always removed: when writing the fsspec
file to the NamedTemporaryFile (tmp_path) use shutil.copyfileobj(f, tmp) instead
of tmp.write(f.read()), and wrap the processing/reading of tmp_path (the
QvdReader, df creation and batching loop) in a try/finally so that
os.unlink(tmp_path) runs in the finally block even on errors; reference the
_read_batches function, tmp_path variable, QvdReader use, and os.unlink call
when making the change.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6006b93 and 4920c5b.

📒 Files selected for processing (2)
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
  • apps/framework-docs-v2/src/config/navigation.ts
🧰 Additional context used
📓 Path-based instructions (5)
apps/framework-docs-v2/content/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

apps/framework-docs-v2/content/**/*.mdx: Use {{ include "shared/path.mdx" }} directives to reuse content fragments, which are processed via processIncludes() during build
Validate code snippets in documentation with the @test directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation

Files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

Guide MDX files in content/guides/ must include frontmatter with title and description fields

Files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run linting checks before submitting PRs for TypeScript/JavaScript code

**/*.{ts,tsx,js,jsx}: Group imports by external dependencies, internal modules, and types; use named exports from barrel files (index.ts)
Use camelCase for variables and functions, PascalCase for types/classes/components, UPPER_SNAKE_CASE for constants in TypeScript/JavaScript
Prefix unused variables with underscore (e.g., _unusedParam) to bypass linting errors in TypeScript/JavaScript
Format TypeScript/JavaScript code with Prettier using experimentalTernaries: true; auto-formats on commit via Husky + lint-staged

Files:

  • apps/framework-docs-v2/src/config/navigation.ts
apps/framework-docs-v2/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

Component library must use shadcn/ui components located in src/components/ui/

Files:

  • apps/framework-docs-v2/src/config/navigation.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Prefer interfaces for objects, types for unions/intersections in TypeScript; explicit return types on public APIs

Files:

  • apps/framework-docs-v2/src/config/navigation.ts
🧠 Learnings (7)
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Navigation configuration must be manually maintained in `src/config/navigation.ts` and not auto-generated from file structure

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields

Applied to files:

  • apps/framework-docs-v2/src/config/navigation.ts
🔇 Additional comments (1)
apps/framework-docs-v2/src/config/navigation.ts (1)

1253-1262: Navigation entry looks complete and consistent.

Required fields and draft status are set as expected for a new guide.

Based on learnings: Add new guides to guidesNavigationConfig in src/config/navigation.ts with required fields: type, slug, title, icon, languages, and status (start as 'draft').

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 29 to 109
```bash
pip install pyqvd fsspec pandas tenacity moose-lib
```

| Package | Purpose |
|---------|---------|
| `pyqvd` | Read QVD files and extract schema metadata |
| `fsspec` | Abstract filesystem interface for local, S3, GCS, Azure storage |
| `pandas` | DataFrame manipulation for batch processing |
| `tenacity` | Retry logic for resilient batch insertions |
| `moose-lib` | MooseStack workflow and OLAP table integration |

## Auto-Generating Pydantic Models from QVD Schema

QVD files contain embedded schema information. Rather than manually defining Pydantic models, you can introspect the QVD file and generate models automatically.

### QvdIntrospector

The `QvdIntrospector` class reads a QVD file and extracts column names and data types:

```python filename="qvd_introspector.py" copy
from pyqvd import QvdReader
from typing import Dict, Any, List, Tuple

class QvdIntrospector:
    """Introspect QVD file schema to extract column definitions."""

    # Map QVD field types to Python/Pydantic types
    TYPE_MAP = {
        "ASCII": "str",
        "TEXT": "str",
        "INTEGER": "int",
        "REAL": "float",
        "FIX": "float",
        "DUAL": "str",  # Dual values stored as strings
        "DATE": "datetime",
        "TIME": "datetime",
        "TIMESTAMP": "datetime",
        "INTERVAL": "float",
        "MONEY": "float",
    }

    def __init__(self, file_path: str):
        self.file_path = file_path
        self._reader = None
        self._schema = None

    def _load(self):
        """Lazy load the QVD file."""
        if self._reader is None:
            self._reader = QvdReader(self.file_path)
            self._schema = self._extract_schema()

    def _extract_schema(self) -> List[Tuple[str, str]]:
        """Extract column names and types from QVD metadata."""
        columns = []
        for field in self._reader.fields:
            field_name = self._sanitize_name(field.name)
            field_type = self.TYPE_MAP.get(field.type, "str")
            columns.append((field_name, field_type))
        return columns

    def _sanitize_name(self, name: str) -> str:
        """Convert QVD field names to valid Python identifiers."""
        # Replace spaces and special characters with underscores
        sanitized = "".join(c if c.isalnum() else "_" for c in name)
        # Ensure it starts with a letter or underscore
        if sanitized[0].isdigit():
            sanitized = f"_{sanitized}"
        return sanitized.lower()

    def get_schema(self) -> List[Tuple[str, str]]:
        """Return list of (column_name, python_type) tuples."""
        self._load()
        return self._schema

    def get_row_count(self) -> int:
        """Return the number of rows in the QVD file."""
        self._load()
        return self._reader.num_rows
```

⚠️ Potential issue | 🟠 Major

Add @test directives to Python code blocks.

All Python snippets in this guide lack the @test directive required for doc validation. Please add @test to each Python code block so the snippets are syntax-checked in CI.

As per coding guidelines: Validate code snippets in documentation with the @test directive for TypeScript and Python code blocks.

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 29
- 109, The Python snippet for QvdIntrospector (file qvd_introspector.py) is
missing the `@test` directive required for doc CI; update the fenced code block
that begins with ```python filename="qvd_introspector.py" copy to ```python
`@test` filename="qvd_introspector.py" copy so the QvdIntrospector class and its
methods (_load, _extract_schema, _sanitize_name, get_schema, get_row_count) are
syntax-checked during documentation builds.

Comment on lines 205 to 212
```python
def __init__(
    self,
    file_path: str,
    table: OlapTable,
    model_class: Type[BaseModel],
    batch_size: int = 10000,
    max_retries: int = 3
):
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```bash
sed -n '150,300p' apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
```

Repository: 514-labs/moosestack

Length of output: 5102


Use max_retries in the retry decorator
The @retry on QvdBatchInserter._insert_batch uses stop_after_attempt(3), ignoring the max_retries parameter. Refactor to apply stop_after_attempt(self.max_retries) (e.g. wrap the insert logic in a nested function decorated with your instance’s max_retries).

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 205
- 212, The retry decorator on QvdBatchInserter._insert_batch currently hardcodes
stop_after_attempt(3); change the implementation so the actual instance
max_retries is used by wrapping the insert logic in a nested function (e.g.,
inner_insert or _do_insert) that is decorated with
`@retry`(stop=stop_after_attempt(self.max_retries), wait=..., retry=...) or by
programmatically creating a Tenacity Retrying object using self.max_retries, and
then call that inner function from _insert_batch; ensure you reference
QvdBatchInserter._insert_batch and the instance attribute max_retries so each
instance uses its configured retry count.
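The suggested refactor, driving the retry budget from the instance's `max_retries` instead of a hardcoded `stop_after_attempt(3)`, can be sketched without tenacity; the manual loop below stands in for `tenacity.Retrying(stop=stop_after_attempt(self.max_retries))`, and `fail_times` is a hypothetical knob added purely to exercise the retry path:

```python
import time


class QvdBatchInserter:
    """Minimal sketch: only the retry plumbing, not the real insert logic."""

    def __init__(self, batch_size: int = 10000, max_retries: int = 3):
        self.batch_size = batch_size
        self.max_retries = max_retries
        self.attempts = 0  # instrumentation for this example only

    def _insert_batch(self, batch, fail_times: int = 0):
        # Equivalent of tenacity.Retrying(stop=stop_after_attempt(self.max_retries)):
        # the attempt budget comes from the instance, never a hardcoded 3.
        last_exc = None
        for attempt in range(1, self.max_retries + 1):
            self.attempts = attempt
            try:
                if attempt <= fail_times:
                    raise ConnectionError(f"transient failure on attempt {attempt}")
                return len(batch)  # stand-in for table.insert(batch)
            except ConnectionError as exc:
                last_exc = exc
                time.sleep(0)  # real code would back off (e.g. wait_exponential)
        raise last_exc


ins = QvdBatchInserter(max_retries=5)
```

With tenacity, the same effect comes from building the retrier inside the method (a nested `@retry`-decorated function or an explicit `Retrying` object) so it can read `self.max_retries` at call time.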

Comment on lines 231 to 242
```python
def _read_batches(self) -> Generator[list, None, None]:
    """Read QVD file in batches using pyqvd."""
    reader = QvdReader(self.file_path)
    df = reader.to_pandas()

    # Sanitize column names to match Pydantic model
    df.columns = [self._sanitize_name(col) for col in df.columns]

    for start in range(0, len(df), self.batch_size):
        end = min(start + self.batch_size, len(df))
        batch_df = df.iloc[start:end]
        yield batch_df.to_dict(orient="records")
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

pyqvd QvdReader to_pandas chunk streaming

💡 Result:

pyqvd doesn’t stream directly into a single growing pandas DataFrame, but it can read a QVD in record chunks and you can convert each chunk to pandas (to_pandas()) and process/write incrementally.

Chunked read → pandas per chunk (recommended)

```python
from pyqvd import QvdTable

for tbl_chunk in QvdTable.from_qvd("big.qvd", chunk_size=100_000):
    df = tbl_chunk.to_pandas()
    # process df (e.g., write to parquet, aggregate, etc.)
```

chunk_size is “number of records” and from_qvd(..., chunk_size=...) returns an iterator of QvdTable slices. [1]

Same idea via the lower-level reader

```python
from pyqvd.io import QvdFileReader

reader = QvdFileReader("big.qvd", chunk_size=100_000)
for tbl_chunk in reader.read():
    df = tbl_chunk.to_pandas()
```

QvdFileReader(..., chunk_size=...) + read() returns either a single QvdTable or an iterator when chunking is enabled. [2]

If you want a true “generator of chunks” API (alternative library)

qvd-utils explicitly exposes read_in_chunks(..., chunk_size=...) returning a generator of dicts that you can wrap into DataFrames. [3]

Sources: PyQvd core API (from_qvd, to_pandas) [1], PyQvd I/O API (QvdFileReader, chunking) [2], qvd-utils PyPI docs (chunk generator) [3].


Implement streaming or update wording. The current reader.to_pandas() call loads the entire QVD into memory, so it doesn’t perform chunked reading. Use QvdFileReader(self.file_path, chunk_size=self.batch_size).read() (or QvdTable.from_qvd(..., chunk_size=...)) to process in batches, or adjust the description to drop the “chunked reading” claim.

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 231
- 242, The _read_batches method currently calls
QvdReader(self.file_path).to_pandas(), which loads the entire file into memory
and contradicts the “chunked reading” claim; to fix, replace the full-load call
with a chunked reader like QvdFileReader(self.file_path,
chunk_size=self.batch_size).read() or use QvdTable.from_qvd(...,
chunk_size=self.batch_size) and iterate over returned chunks, applying
self._sanitize_name to each chunk's columns and yielding
chunk.to_dict(orient="records"); alternatively, if you cannot switch readers,
update the method docstring and wording to remove the “chunked reading” claim
and keep the existing implementation.
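The chunk-per-batch shape described in the query result can be sketched with the pyqvd reader factored out, so the batching and sanitizing logic is testable on its own; here `chunks` stands in for the iterator returned by `QvdFileReader(path, chunk_size=batch_size).read()`, with each chunk already flattened to records as `chunk.to_pandas().to_dict(orient="records")` would produce:

```python
from typing import Callable, Dict, Generator, Iterable, List


def read_batches(
    chunks: Iterable[List[Dict]],
    sanitize: Callable[[str], str],
) -> Generator[List[Dict], None, None]:
    """Yield one sanitized batch per chunk, never materializing the whole file."""
    for chunk in chunks:
        yield [{sanitize(k): v for k, v in record.items()} for record in chunk]


def sanitize_name(name: str) -> str:
    # Same rule as the guide's _sanitize_name: non-alphanumerics -> underscore.
    out = "".join(c if c.isalnum() else "_" for c in name)
    return (f"_{out}" if out[0].isdigit() else out).lower()


# Two fake chunks stand in for a chunked QVD read.
fake_chunks = [
    [{"Order ID": 1}, {"Order ID": 2}],
    [{"Order ID": 3}],
]
batches = list(read_batches(fake_chunks, sanitize_name))
```

Peak memory is then bounded by one chunk rather than the full DataFrame, which is the point of the review comment.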

Comment on lines 461 to 487
```python
def _read_batches(self) -> Generator[list, None, None]:
    """Read QVD file from any fsspec-compatible filesystem."""
    # Open file using fsspec (supports s3://, gs://, az://, etc.)
    with fsspec.open(
        self.file_path,
        mode="rb",
        **self.storage_options
    ) as f:
        # Write to temp file for pyqvd compatibility
        import tempfile
        with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
            tmp.write(f.read())
            tmp_path = tmp.name

    # Process from temp file
    reader = QvdReader(tmp_path)
    df = reader.to_pandas()
    df.columns = [self._sanitize_name(col) for col in df.columns]

    for start in range(0, len(df), self.batch_size):
        end = min(start + self.batch_size, len(df))
        batch_df = df.iloc[start:end]
        yield batch_df.to_dict(orient="records")

    # Cleanup
    import os
    os.unlink(tmp_path)
```

⚠️ Potential issue | 🟡 Minor

Wrap temp file usage in try/finally and stream copy

Ensure tmp_path is always deleted by wrapping the processing in a try/finally, and replace tmp.write(f.read()) with shutil.copyfileobj(f, tmp) to avoid loading the entire file into memory.

Proposed fix
```diff
-        with fsspec.open(
-            self.file_path,
-            mode="rb",
-            **self.storage_options
-        ) as f:
-            # Write to temp file for pyqvd compatibility
-            import tempfile
-            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
-                tmp.write(f.read())
-                tmp_path = tmp.name
-
-        # Process from temp file
-        reader = QvdReader(tmp_path)
-        df = reader.to_pandas()
-        df.columns = [self._sanitize_name(col) for col in df.columns]
-
-        for start in range(0, len(df), self.batch_size):
-            end = min(start + self.batch_size, len(df))
-            batch_df = df.iloc[start:end]
-            yield batch_df.to_dict(orient="records")
-
-        # Cleanup
-        import os
-        os.unlink(tmp_path)
+        import os
+        import tempfile
+        import shutil
+        with fsspec.open(
+            self.file_path,
+            mode="rb",
+            **self.storage_options
+        ) as f:
+            # Write to temp file for pyqvd compatibility
+            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
+                shutil.copyfileobj(f, tmp)
+                tmp_path = tmp.name
+
+        try:
+            # Process from temp file
+            reader = QvdReader(tmp_path)
+            df = reader.to_pandas()
+            df.columns = [self._sanitize_name(col) for col in df.columns]
+
+            for start in range(0, len(df), self.batch_size):
+                end = min(start + self.batch_size, len(df))
+                batch_df = df.iloc[start:end]
+                yield batch_df.to_dict(orient="records")
+        finally:
+            os.unlink(tmp_path)
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```python
def _read_batches(self) -> Generator[list, None, None]:
    """Read QVD file from any fsspec-compatible filesystem."""
    import os
    import tempfile
    import shutil
    with fsspec.open(
        self.file_path,
        mode="rb",
        **self.storage_options
    ) as f:
        # Write to temp file for pyqvd compatibility
        with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
            shutil.copyfileobj(f, tmp)
            tmp_path = tmp.name

    try:
        # Process from temp file
        reader = QvdReader(tmp_path)
        df = reader.to_pandas()
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")
    finally:
        os.unlink(tmp_path)
```
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 461
- 487, In _read_batches, avoid loading the whole remote file into memory and
ensure the temp file is always removed: when writing the fsspec file to the
NamedTemporaryFile (tmp_path) use shutil.copyfileobj(f, tmp) instead of
tmp.write(f.read()), and wrap the processing/reading of tmp_path (the QvdReader,
df creation and batching loop) in a try/finally so that os.unlink(tmp_path) runs
in the finally block even on errors; reference the _read_batches function,
tmp_path variable, QvdReader use, and os.unlink call when making the change.

Comment on lines 544 to 549
```python
def is_file_processed(file_path: str) -> bool:
    """Check if a file has already been processed successfully."""
    result = processed_files_table.query(
        f"SELECT * FROM qvd_processed_files WHERE file_path = '{file_path}' AND status = 'success'"
    )
    return len(result) > 0
```

⚠️ Potential issue | 🟠 Major

Use parameterized queries instead of string interpolation
Interpolating file_path directly into the SQL string risks injection; switch to a parameterized query or sanitize inputs.

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 544
- 549, The is_file_processed function currently interpolates file_path into the
SQL string which risks SQL injection; change the processed_files_table.query
call to use a parameterized query or the query API's parameter binding (e.g.,
use placeholders like %s or ? or named params depending on the DB client) and
pass file_path as a separate argument, or sanitize/escape it via the DB client's
parameterization helper, so the query becomes a parameterized SELECT against
qvd_processed_files with file_path and status bound safely rather than
concatenated.

Comment on lines 551 to 559
```python
def mark_file_processed(file_path: str, record_count: int, status: str, error: str = None):
    """Record file processing result."""
    processed_files_table.insert([{
        "file_path": file_path,
        "processed_at": datetime.now(),
        "record_count": record_count,
        "status": status,
        "error_message": error
    }])
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Search for typing imports and context around the snippet
rg "from typing import" -n apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
rg "Optional" -n apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
rg -n -C 10 "mark_file_processed" apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
```

Repository: 514-labs/moosestack

Length of output: 1662


🏁 Script executed:

```bash
#!/bin/bash
tail -n +500 apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx | head -n 100
```

Repository: 514-labs/moosestack

Length of output: 2847


Annotate error parameter as Optional[str]
Default value None conflicts with the str annotation; update to Optional[str].

Signature update
```diff
-def mark_file_processed(file_path: str, record_count: int, status: str, error: str = None):
+def mark_file_processed(
+    file_path: str,
+    record_count: int,
+    status: str,
+    error: Optional[str] = None
+):
```
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 551
- 559, The function mark_file_processed currently types error as str but
defaults it to None; update the signature to use Optional[str] for the error
parameter (from typing import Optional) and adjust any calls if necessary to
accept None, ensuring the annotation and default value match; reference the
function name mark_file_processed and the error parameter when making the
change.

Remove implementation code that confuses users:
- Removed QvdIntrospector, QvdModelGenerator, and QvdBatchInserter class implementations
- Removed pip install commands with specific package versions
- Removed all code users would need to write themselves

Replace with usage-focused content:
- How to install the pipeline from the registry
- Configuration via environment variables
- Using the provided init_qvd.py CLI tool
- Running workflows and monitoring via API
- Error handling and troubleshooting

The guide now shows HOW TO USE the pre-built pipeline instead of HOW TO BUILD IT from scratch.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@linear

linear bot commented Feb 3, 2026


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`:
- Around line 159-172: The Pydantic model class QvdItemModel is being
overwritten by assigning QvdItemModel = OlapTable(...); rename the table
variable (for example QvdItemTable or qvd_item_table) so the Pydantic class
QvdItemModel remains intact, update any references that expect the OlapTable to
use the new table name, and ensure the OlapTable columns and name remain the
same.
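The shadowing bug and its fix can be shown with stand-in classes, since the real moose-lib `OlapTable` constructor isn't reproduced in this thread; both class definitions below are hypothetical stubs:

```python
class QvdItemModel:
    """Stand-in for the Pydantic model generated from the QVD schema."""


class OlapTable:
    """Stub with a hypothetical (name, model) constructor."""

    def __init__(self, name: str, model):
        self.name = name
        self.model = model


# Bug: reusing the class name destroys the model.
# QvdItemModel = OlapTable("qvd_items", QvdItemModel)  # don't do this

# Fix: give the table its own name so the model class survives.
qvd_item_table = OlapTable("qvd_items", QvdItemModel)
```

After the rename, any code that validated records against `QvdItemModel` still sees a class, and inserts go through `qvd_item_table`.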
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4920c5b and 8af8c86.

📒 Files selected for processing (1)
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧰 Additional context used
📓 Path-based instructions (2)
apps/framework-docs-v2/content/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

apps/framework-docs-v2/content/**/*.mdx: Use {{ include "shared/path.mdx" }} directives to reuse content fragments, which are processed via processIncludes() during build
Validate code snippets in documentation with the @test directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation

Files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

Guide MDX files in content/guides/ must include frontmatter with title and description fields

Files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : Validate code snippets in documentation with the `test` directive for TypeScript and Python code blocks

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: Applies to examples/nextjs-moose/moose/src/query-examples/**/*.ts : Query example functions must use `eventsModel.toSql()` to generate SQL, ensuring consistent query construction through the defined model

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
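
As a concrete illustration of the navigation-config learning above, the entry this PR adds might look roughly like the following (the field names come from the learning itself; the icon and languages values come from the PR description, while the title, import path, and exact types are assumptions):

```typescript
// Hypothetical shape of the entry added to guidesNavigationConfig
// in src/config/navigation.ts (exact types in that file are assumed).
import { IconDatabaseImport } from "@tabler/icons-react"; // import path assumed

const qvdGuideEntry = {
  type: "guide",
  slug: "qvd-to-clickhouse",
  title: "QVD to ClickHouse",   // title assumed
  icon: IconDatabaseImport,
  languages: ["python"],        // pyqvd is Python-specific
  status: "draft",              // promote to "beta", then drop for public release
};
```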

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

- Add columns prop to CTACards
- Change href to ctaLink in CTACard
- Add ctaLabel and Icon props
- Remove unused BulletPointsCard import
@514Ben
Contributor Author

514Ben commented Feb 3, 2026

Change QvdItemModel = OlapTable(...) to QvdItemTable = OlapTable(...) to prevent shadowing the Pydantic model class.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@514Ben
Contributor Author

514Ben commented Feb 3, 2026

Fixed the example code issue identified by CodeRabbit - changed QvdItemModel = OlapTable(...) to QvdItemTable = OlapTable(...) to avoid overwriting the Pydantic model class.
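
The rename described above can be illustrated with a minimal, self-contained sketch. The `OlapTable` class below is a stub standing in for `moose_lib`'s real API, and the dataclass stands in for the guide's generated Pydantic model; only the naming pattern is the point here:

```python
from dataclasses import dataclass


@dataclass
class QvdItemModel:
    """Stand-in for the guide's Pydantic model generated from the QVD schema."""
    id: int
    name: str


class OlapTable:
    """Minimal stub standing in for moose_lib's OlapTable; not the real API."""
    def __init__(self, name: str, model: type) -> None:
        self.name = name
        self.model = model


# Buggy pattern from the earlier draft -- rebinding the class name shadows it:
#   QvdItemModel = OlapTable("qvd_items", QvdItemModel)  # model class now lost
# Fixed pattern -- bind the table under a distinct name:
QvdItemTable = OlapTable("qvd_items", QvdItemModel)

# The model class remains independently usable for constructing rows:
row = QvdItemModel(id=1, name="widget")
```

With the distinct name, the table handle and the model class stay separately importable instead of the table assignment silently replacing the class.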

Fix critical issues and enhance user guidance:
- Add missing component imports (ConditionalContent, CustomizePanel, etc.)
- Clarify MCP requirement for AI copilot support
- Add community support callout and connection string instructions
- Improve section transitions with progress summaries
- Fix GitHub capitalization and clarify local/remote setup

These changes improve the guide's accuracy and help users successfully
migrate their dashboards to ClickHouse.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
The tracking information is accessible through the API endpoint,
so direct ClickHouse queries are no longer needed in the guide.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/performant-dashboards.mdx`:
- Line 8: The import statement on line 8 includes unused
components—ConditionalContent, CustomizePanel, CustomizeGrid, and
SelectField—which should be removed to eliminate dead imports; update the import
list that currently contains CTACards, CTACard, Callout, BulletPointsCard,
ToggleBlock, ConditionalContent, CustomizePanel, CustomizeGrid, SelectField to
only include the actually used symbols (e.g., CTACards, CTACard, Callout,
BulletPointsCard, ToggleBlock) by deleting the unused component names from that
import.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 28a3258 and 35a8619.

📒 Files selected for processing (2)
  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧰 Additional context used
📓 Path-based instructions (2)
apps/framework-docs-v2/content/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

apps/framework-docs-v2/content/**/*.mdx: Use {{ include "shared/path.mdx" }} directives to reuse content fragments, which are processed via processIncludes() during build
Validate code snippets in documentation with the @test directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation

Files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx

📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)

Guide MDX files in content/guides/ must include frontmatter with title and description fields

Files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧠 Learnings (14)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
📚 Learning: 2026-01-15T19:41:53.563Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/cdp-analytics/CLAUDE.md:0-0
Timestamp: 2026-01-15T19:41:53.563Z
Learning: Applies to examples/cdp-analytics/web-app/src/features/dashboard/*.tsx : Dashboard components should follow React hooks patterns: use useState for local data, useEffect for data fetching, handle errors with fallback data, and wrap visualizations in Card/CardHeader/CardContent from Radix UI

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: Applies to examples/nextjs-moose/components/dashboard/**/*.ts : Dashboard hooks must call server actions (not moose functions directly) to decouple React components from ClickHouse implementation details

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : Use `{{ include "shared/path.mdx" }}` directives to reuse content fragments, which are processed via `processIncludes()` during build

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : Validate code snippets in documentation with the `test` directive for TypeScript and Python code blocks

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: The moose query layer represents a two-step OLAP migration pattern: Step 1 (not shown) shifts just-in-time joins to write-time via Materialized Views; Step 2 (this demo) accelerates endpoint creation with the query layer. This architecture assumes MVs/OLAP tables already exist and focuses on the read-time query selection and filtering layer

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-15T19:41:53.563Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/cdp-analytics/CLAUDE.md:0-0
Timestamp: 2026-01-15T19:41:53.563Z
Learning: For slow dashboard queries (>500ms) that aggregate across multiple tables or need cohort-based breakdowns, implement a materialized view instead of real-time queries

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.

Applied to files:

  • apps/framework-docs-v2/content/guides/performant-dashboards.mdx
  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: Applies to examples/nextjs-moose/moose/src/query-examples/**/*.ts : Query example functions must use `eventsModel.toSql()` to generate SQL, ensuring consistent query construction through the defined model

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2025-12-17T23:59:22.807Z
Learnt from: callicles
Repo: 514-labs/moosestack PR: 3143
File: apps/framework-docs/llm-docs/python/table-setup.md:331-336
Timestamp: 2025-12-17T23:59:22.807Z
Learning: In apps/framework-docs/llm-docs/**/*.md (LLM documentation), prefer exact class and type names (e.g., "MaterializedView", "OlapTable") over lowercase prose variants (e.g., "materialized view") to help LLMs generate correct code with proper imports and syntax.

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')

Applied to files:

  • apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🔇 Additional comments (6)
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx (2)

1-4: Frontmatter is complete.

Title and description fields are present as required.


10-35: Clear overview and feature list.

The introduction effectively explains QVD files, migration benefits, and pipeline capabilities. Structure is well-organized.

apps/framework-docs-v2/content/guides/performant-dashboards.mdx (4)

134-138: Clarifications improve guidance.

Remote ClickHouse context and MCP-compatible editor examples are helpful additions.


722-728: Clear Boreal connection string instructions.

Step-by-step guidance helps users locate credentials.


1133-1142: Helpful progress checkpoint.

Success callout summarizes completed work and provides clear transition to production deployment.


149-151: Slack invite link is active and valid.

The shared invite link returns a 302 redirect to the moose-community workspace, confirming it's currently functional.


---

- import { CTACards, CTACard, Callout, BulletPointsCard, ToggleBlock } from "@/components/mdx";
+ import { CTACards, CTACard, Callout, BulletPointsCard, ToggleBlock, ConditionalContent, CustomizePanel, CustomizeGrid, SelectField } from "@/components/mdx";
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if newly imported components are used in the file
for comp in ConditionalContent CustomizePanel CustomizeGrid SelectField; do
  echo "=== Checking usage of $comp ==="
  rg "<${comp}" apps/framework-docs-v2/content/guides/performant-dashboards.mdx || echo "Not found"
done

Repository: 514-labs/moosestack

Length of output: 266


Remove unused component imports.

The imports ConditionalContent, CustomizePanel, CustomizeGrid, and SelectField are not used anywhere in the file and should be removed from line 8.

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/performant-dashboards.mdx` at line 8,
The import statement on line 8 includes unused components—ConditionalContent,
CustomizePanel, CustomizeGrid, and SelectField—which should be removed to
eliminate dead imports; update the import list that currently contains CTACards,
CTACard, Callout, BulletPointsCard, ToggleBlock, ConditionalContent,
CustomizePanel, CustomizeGrid, SelectField to only include the actually used
symbols (e.g., CTACards, CTACard, Callout, BulletPointsCard, ToggleBlock) by
deleting the unused component names from that import.

514Ben and others added 3 commits February 3, 2026 15:03
Remove draft status to make the guide visible in the guides list
without requiring feature flags.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Keep guide hidden behind show-draft-guides feature flag.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Configure build-time environment variable to include draft guides
in Vercel deployments.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

"env": {
"NEXT_PUBLIC_NAV_VARIANT": "draft"
}
}


vercel.json exposes draft content in production builds

High Severity

The vercel.json change sets NEXT_PUBLIC_NAV_VARIANT to draft in production builds. According to nav-variant.ts, production should default to base (public content only), while draft exposes internal draft guides. This configuration change will expose all draft content (not just this new guide) to users when deployed to production. This appears to be a testing configuration accidentally included for the preview deployment.

Fix in Cursor Fix in Web

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/vercel.json`:
- Line 2: The vercel.json currently runs "installCommand": "pnpm install
--filter docs-v2..." from the app directory; change it to run the monorepo root
install instead by setting the installCommand to "pnpm install" (or removing the
custom installCommand so Vercel runs root install), and if necessary update the
vercel.json rootDirectory value so Vercel deploys from the monorepo root while
still targeting the docs-v2 app; look for the "installCommand" and
"rootDirectory" keys in vercel.json and ensure installs happen at the repo root
to keep workspace deps and lockfile consistent.
- Around line 4-6: Remove the hardcoded NEXT_PUBLIC_NAV_VARIANT="draft" from
vercel.json so draft navigation is not forced across all deployments; instead,
delete or unset the NEXT_PUBLIC_NAV_VARIANT entry in vercel.json and configure
visibility via Vercel Toolbar feature flags (e.g., show-draft-guides,
show-beta-guides, show-hosting-section, show-ai-section, show-copy-as-markdown,
show-linear-integration) or use preview-only environment variables for preview
builds so production deployments never expose draft guides.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 35a8619 and 4739016.

📒 Files selected for processing (1)
  • apps/framework-docs-v2/vercel.json
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`

Applied to files:

  • apps/framework-docs-v2/vercel.json
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release

Applied to files:

  • apps/framework-docs-v2/vercel.json
📚 Learning: 2025-12-16T23:08:31.521Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-16T23:08:31.521Z
Learning: Build TypeScript/JavaScript packages using `pnpm build` before integration

Applied to files:

  • apps/framework-docs-v2/vercel.json
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Always install dependencies from the monorepo root using `pnpm install`, never run `pnpm install` directly from `apps/framework-docs-v2/`

Applied to files:

  • apps/framework-docs-v2/vercel.json


@@ -1,3 +1,8 @@
  {
-   "installCommand": "pnpm install --filter docs-v2..."
+   "installCommand": "pnpm install --filter docs-v2...",
Contributor


⚠️ Potential issue | 🟠 Major

Install dependencies from the monorepo root.
pnpm install --filter docs-v2... runs from the app directory and conflicts with the repo workflow. Use a root install (and adjust Vercel rootDirectory if needed) to keep workspace deps and lockfile consistent.

💡 Suggested change
-  "installCommand": "pnpm install --filter docs-v2...",
+  "installCommand": "pnpm install",

Based on learnings: Always install dependencies from the monorepo root using pnpm install, never run pnpm install directly from apps/framework-docs-v2/.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- "installCommand": "pnpm install --filter docs-v2...",
+ "installCommand": "pnpm install",
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/vercel.json` at line 2, The vercel.json currently runs
"installCommand": "pnpm install --filter docs-v2..." from the app directory;
change it to run the monorepo root install instead by setting the installCommand
to "pnpm install" (or removing the custom installCommand so Vercel runs root
install), and if necessary update the vercel.json rootDirectory value so Vercel
deploys from the monorepo root while still targeting the docs-v2 app; look for
the "installCommand" and "rootDirectory" keys in vercel.json and ensure installs
happen at the repo root to keep workspace deps and lockfile consistent.

Comment on lines +4 to +6
"env": {
"NEXT_PUBLIC_NAV_VARIANT": "draft"
}
Contributor


⚠️ Potential issue | 🟠 Major

Don’t force draft navigation in all builds.
Setting NEXT_PUBLIC_NAV_VARIANT="draft" in vercel.json applies to all deployments and can expose draft guides publicly, bypassing the intended feature-flag workflow. Use Vercel Toolbar flags (or preview-only envs) instead.

💡 Suggested change
-  "build": {
-    "env": {
-      "NEXT_PUBLIC_NAV_VARIANT": "draft"
-    }
-  }
+  "build": {}

Based on learnings: Use Vercel Toolbar feature flags for visibility control: show-draft-guides, show-beta-guides, show-hosting-section, show-ai-section, show-copy-as-markdown, show-linear-integration.

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/vercel.json` around lines 4 - 6, Remove the hardcoded
NEXT_PUBLIC_NAV_VARIANT="draft" from vercel.json so draft navigation is not
forced across all deployments; instead, delete or unset the
NEXT_PUBLIC_NAV_VARIANT entry in vercel.json and configure visibility via Vercel
Toolbar feature flags (e.g., show-draft-guides, show-beta-guides,
show-hosting-section, show-ai-section, show-copy-as-markdown,
show-linear-integration) or use preview-only environment variables for preview
builds so production deployments never expose draft guides.
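
Applying both suggestions above would leave a vercel.json along these lines (a sketch only; any other keys the real file contains are unknown and omitted here):

```json
{
  "installCommand": "pnpm install"
}
```

The build.env block is dropped entirely, so NEXT_PUBLIC_NAV_VARIANT falls back to its environment-based default and draft-guide visibility stays behind the show-draft-guides Toolbar flag.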

