Add QVD to ClickHouse data transfer guide #3437
Creates comprehensive documentation for transferring QlikView QVD files to ClickHouse using MooseStack workflows. Covers auto-generating Pydantic models, batch insertion with retry logic, cloud storage configuration, and best practices. Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Summary by CodeRabbit
Walkthrough
Adds a new MDX guide describing a MooseStack workflow to transfer QlikView QVD files to ClickHouse, updates an existing dashboard guide with UI/import and deployment clarifications, registers the new guide in navigation as a draft (IconDatabaseImport, languages: ["python"]), and sets NEXT_PUBLIC_NAV_VARIANT="draft" in vercel.json. No runtime code changes.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`:
- Around line 205-255: The constructor's max_retries is ignored because the
_insert_batch method is decorated with a hardcoded
`@retry`(stop=stop_after_attempt(3)); either remove the unused constructor
parameter or wire max_retries into the retry logic by deleting the static
decorator on _insert_batch and wrapping the insert logic in a Tenacity Retrying/
retry(...) call that uses stop=stop_after_attempt(self.max_retries) (or call
Retrying(...).call(self._do_insert_batch, records) from inside _insert_batch),
so the retry count comes from the instance attribute; update references to
_insert_batch and the constructor parameter accordingly.
- Around line 49-109: The Python code block containing the QvdIntrospector class
(the triple-backtick header currently reading ```python
filename="qvd_introspector.py" copy) needs the documentation test directive
added so the snippet is validated; update the code fence header to include `@test`
(for example: ```python `@test` filename="qvd_introspector.py" copy) for this and
any other Python blocks in the file that lack it so the snippet (including
QvdIntrospector, _load, _extract_schema, _sanitize_name, get_schema,
get_row_count) will be picked up by the docs test runner.
- Around line 91-98: The _sanitize_name function can raise IndexError when the
input normalizes to an empty string; after building sanitized in _sanitize_name,
add a guard that if sanitized == "" (or contains only underscores) set a safe
fallback (e.g., "_" or "_field") before accessing sanitized[0], then apply the
existing check to prefix with "_" if the first character is a digit and finally
return the lowercased result; update references to _sanitize_name accordingly.
- Around line 461-488: In _read_batches, ensure the temporary file (tmp_path) is
removed even if processing raises: after creating the NamedTemporaryFile (with
delete=False) assign tmp_path, then wrap the QvdReader/df processing and
batching logic in a try block and perform os.unlink(tmp_path) in a finally
block; make sure tmp_path is defined before the try and that os is imported so
cleanup always runs regardless of exceptions.
- Around line 231-243: The current _read_batches method loads the entire QVD
into memory via QvdReader(self.file_path).to_pandas(), which can OOM for large
files; change it to stream chunks using pyqvd’s chunking API (QvdTable.from_qvd)
and yield each chunk converted to records. Replace the QvdReader usage in
_read_batches with an iteration like for tbl in
QvdTable.from_qvd(self.file_path, chunk_size=self.batch_size): and for each tbl
yield tbl.to_pandas().to_dict(orient="records"), preserving the existing column
sanitization step (apply self._sanitize_name to tbl column names) so behavior of
file_path, batch_size and _sanitize_name remains consistent.
- Around line 544-559: The is_file_processed function currently builds SQL with
string interpolation which allows injection; change it to use a parameterized
query API instead of f-string: replace the processed_files_table.query call in
is_file_processed with a parameterized client.query.execute (or the table
client's parameterized method) passing the SQL with named placeholders for
file_path and status and a params dict (e.g., {"path": file_path,
"status":"success"}), then return based on the result length; keep
mark_file_processed unchanged. Ensure you reference the same function name
is_file_processed and the table/query call when making the change.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/src/config/navigation.ts
🧰 Additional context used
📓 Path-based instructions (5)
apps/framework-docs-v2/content/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
apps/framework-docs-v2/content/**/*.mdx: Use `{{ include "shared/path.mdx" }}` directives to reuse content fragments, which are processed via `processIncludes()` during build
Validate code snippets in documentation with the `@test` directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation
Files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Run linting checks before submitting PRs for TypeScript/JavaScript code
**/*.{ts,tsx,js,jsx}: Group imports by external dependencies, internal modules, and types; use named exports from barrel files (index.ts)
Use camelCase for variables and functions, PascalCase for types/classes/components, UPPER_SNAKE_CASE for constants in TypeScript/JavaScript
Prefix unused variables with underscore (e.g., `_unusedParam`) to bypass linting errors in TypeScript/JavaScript
Format TypeScript/JavaScript code with Prettier using `experimentalTernaries: true`; auto-formats on commit via Husky + lint-staged
Files:
apps/framework-docs-v2/src/config/navigation.ts
apps/framework-docs-v2/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
Component library must use shadcn/ui components located in `src/components/ui/`
Files:
apps/framework-docs-v2/src/config/navigation.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Prefer interfaces for objects, types for unions/intersections in TypeScript; explicit return types on public APIs
Files:
apps/framework-docs-v2/src/config/navigation.ts
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: The moose query layer represents a two-step OLAP migration pattern: Step 1 (not shown) shifts just-in-time joins to write-time via Materialized Views; Step 2 (this demo) accelerates endpoint creation with the query layer. This architecture assumes MVs/OLAP tables already exist and focuses on the read-time query selection and filtering layer
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build CLI Binaries (linux-arm64-glibc, ubuntu-22-8-core, aarch64-unknown-linux-gnu)
- GitHub Check: Package and Publish Templates
🔇 Additional comments (1)
apps/framework-docs-v2/src/config/navigation.ts (1)
1253-1262: Navigation entry looks good. Correct slug, draft status, icon, and language scoping are consistent with guide onboarding patterns.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
```python
def _sanitize_name(self, name: str) -> str:
    """Convert QVD field names to valid Python identifiers."""
    # Replace spaces and special characters with underscores
    sanitized = "".join(c if c.isalnum() else "_" for c in name)
    # Ensure it starts with a letter or underscore
    if sanitized[0].isdigit():
        sanitized = f"_{sanitized}"
    return sanitized.lower()
```
Guard against empty/invalid field names in _sanitize_name.
sanitized[0] will throw if the field name normalizes to an empty string. Add a fallback to avoid IndexError.
Proposed fix

```diff
-    if sanitized[0].isdigit():
+    if not sanitized:
+        sanitized = "_field"
+    if sanitized[0].isdigit():
         sanitized = f"_{sanitized}"
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
def _sanitize_name(self, name: str) -> str:
    """Convert QVD field names to valid Python identifiers."""
    # Replace spaces and special characters with underscores
    sanitized = "".join(c if c.isalnum() else "_" for c in name)
    # Ensure it starts with a letter or underscore
    if not sanitized:
        sanitized = "_field"
    if sanitized[0].isdigit():
        sanitized = f"_{sanitized}"
    return sanitized.lower()
```
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 91
- 98, The _sanitize_name function can raise IndexError when the input normalizes
to an empty string; after building sanitized in _sanitize_name, add a guard that
if sanitized == "" (or contains only underscores) set a safe fallback (e.g., "_"
or "_field") before accessing sanitized[0], then apply the existing check to
prefix with "_" if the first character is a digit and finally return the
lowercased result; update references to _sanitize_name accordingly.
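To sanity-check the suggested guard outside the guide, here is a standalone sketch of the sanitizer with the reviewer's `_field` fallback applied (the function name without the leading underscore and `self` is ours, for illustration only):

```python
def sanitize_name(name: str) -> str:
    """Convert a QVD field name to a valid Python identifier."""
    # Replace spaces and special characters with underscores
    sanitized = "".join(c if c.isalnum() else "_" for c in name)
    # Guard: an empty field name normalizes to "", so sanitized[0]
    # below would raise IndexError without this fallback
    if not sanitized:
        sanitized = "_field"
    # Ensure the identifier does not start with a digit
    if sanitized[0].isdigit():
        sanitized = f"_{sanitized}"
    return sanitized.lower()


print(sanitize_name("Order Date"))  # order_date
print(sanitize_name("2023 Sales"))  # _2023_sales
print(sanitize_name(""))            # _field
```

Note that names consisting only of special characters (e.g. `"%%"`) normalize to underscores, which already pass the digit check, so only the truly empty case needs the fallback.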
```python
def __init__(
    self,
    file_path: str,
    table: OlapTable,
    model_class: Type[BaseModel],
    batch_size: int = 10000,
    max_retries: int = 3
):
    """
    Initialize the batch inserter.

    Args:
        file_path: Path to the QVD file (local or fsspec-compatible URI)
        table: MooseStack OlapTable instance
        model_class: Pydantic model class for validation
        batch_size: Number of records per batch
        max_retries: Maximum retry attempts for failed batches
    """
    self.file_path = file_path
    self.table = table
    self.model_class = model_class
    self.batch_size = batch_size
    self.max_retries = max_retries
    self._total_inserted = 0
    self._total_failed = 0

def _read_batches(self) -> Generator[list, None, None]:
    """Read QVD file in batches using pyqvd."""
    reader = QvdReader(self.file_path)
    df = reader.to_pandas()

    # Sanitize column names to match Pydantic model
    df.columns = [self._sanitize_name(col) for col in df.columns]

    for start in range(0, len(df), self.batch_size):
        end = min(start + self.batch_size, len(df))
        batch_df = df.iloc[start:end]
        yield batch_df.to_dict(orient="records")

def _sanitize_name(self, name: str) -> str:
    """Convert column names to valid Python identifiers."""
    sanitized = "".join(c if c.isalnum() else "_" for c in name)
    if sanitized[0].isdigit():
        sanitized = f"_{sanitized}"
    return sanitized.lower()

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
def _insert_batch(self, records: list) -> Dict[str, int]:
```
Wire max_retries into the retry strategy or remove it
The @retry decorator hardcodes stop_after_attempt(3) and ignores the constructor's max_retries. Pass self.max_retries into a dynamic retry setup (e.g. use a Retrying(stop=stop_after_attempt(self.max_retries), …) inside the method) or drop the unused parameter.
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 205
- 255, The constructor's max_retries is ignored because the _insert_batch method
is decorated with a hardcoded `@retry`(stop=stop_after_attempt(3)); either remove
the unused constructor parameter or wire max_retries into the retry logic by
deleting the static decorator on _insert_batch and wrapping the insert logic in
a Tenacity Retrying/ retry(...) call that uses
stop=stop_after_attempt(self.max_retries) (or call
Retrying(...).call(self._do_insert_batch, records) from inside _insert_batch),
so the retry count comes from the instance attribute; update references to
_insert_batch and the constructor parameter accordingly.
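The root cause is that decorator arguments are evaluated once at class-definition time, before any instance (and thus any `self.max_retries`) exists. A dependency-free sketch of the per-instance retry budget the prompt asks for; with Tenacity, the loop body would instead be built via `Retrying(stop=stop_after_attempt(self.max_retries), ...)`. `BatchInserter` and `flaky_sink` here are hypothetical stand-ins, not the guide's class:

```python
import time
from typing import Callable, List


class BatchInserter:
    """Sketch: retry count comes from the instance, not a hardcoded decorator."""

    def __init__(self, sink: Callable[[list], None], max_retries: int = 3):
        self.sink = sink
        self.max_retries = max_retries

    def insert_batch(self, records: list) -> int:
        # Equivalent of constructing Tenacity's
        # Retrying(stop=stop_after_attempt(self.max_retries)) per call,
        # so each instance honors its own configured retry budget.
        last_error = None
        for attempt in range(1, self.max_retries + 1):
            try:
                self.sink(records)
                return attempt
            except ConnectionError as exc:
                last_error = exc
                time.sleep(0)  # real code: exponential backoff between attempts
        raise last_error


# A sink that fails twice, then succeeds: a stand-in for a flaky ClickHouse insert
calls: List[int] = []

def flaky_sink(records: list) -> None:
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("transient failure")

inserter = BatchInserter(flaky_sink, max_retries=5)
print(inserter.insert_batch([{"id": 1}]))  # succeeds on attempt 3
```

With the decorator approach, two inserters constructed with different `max_retries` would silently share the same attempt limit; building the retrier inside the method avoids that.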
```python
def _read_batches(self) -> Generator[list, None, None]:
    """Read QVD file in batches using pyqvd."""
    reader = QvdReader(self.file_path)
    df = reader.to_pandas()

    # Sanitize column names to match Pydantic model
    df.columns = [self._sanitize_name(col) for col in df.columns]

    for start in range(0, len(df), self.batch_size):
        end = min(start + self.batch_size, len(df))
        batch_df = df.iloc[start:end]
        yield batch_df.to_dict(orient="records")
```
🧩 Analysis chain
🌐 Web query:
pyqvd QvdReader streaming chunk read API
💡 Result:
PyQvd’s “streaming”/chunked read is exposed via a chunk_size argument (number of rows/records per chunk). When you set it, the read returns an iterator yielding QvdTable chunks rather than one in-memory table. [1][2]

Recommended public APIs

- From a file path:

      from pyqvd import QvdTable
      for chunk_tbl in QvdTable.from_qvd("data.qvd", chunk_size=100_000): ...

  This yields QvdTable objects per chunk. [2]

- From a binary stream:

      from pyqvd import QvdTable
      with open("data.qvd", "rb") as f:
          for chunk_tbl in QvdTable.from_stream(f, chunk_size=100_000): ...

  Note: chunk iteration requires a seekable stream; non-seekable streams will fail. [2]

Lower-level reader class: pyqvd.io.QvdFileReader(source: str | BinaryIO, chunk_size: int | None) with read() -> QvdTable | Iterator[QvdTable]. [1]

Sources: [1] https://pyqvd.readthedocs.io/stable/api/io.html, [2] https://pyqvd.readthedocs.io/latest/guide/introduction.html
Stream QVD in chunks instead of loading entire file
Use pyqvd’s chunk_size API, e.g.

    for tbl in QvdTable.from_qvd(self.file_path, chunk_size=self.batch_size):
        yield tbl.to_pandas().to_dict(orient="records")

to yield batches and avoid materializing the full DataFrame.
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 231
- 243, The current _read_batches method loads the entire QVD into memory via
QvdReader(self.file_path).to_pandas(), which can OOM for large files; change it
to stream chunks using pyqvd’s chunking API (QvdTable.from_qvd) and yield each
chunk converted to records. Replace the QvdReader usage in _read_batches with an
iteration like for tbl in QvdTable.from_qvd(self.file_path,
chunk_size=self.batch_size): and for each tbl yield
tbl.to_pandas().to_dict(orient="records"), preserving the existing column
sanitization step (apply self._sanitize_name to tbl column names) so behavior of
file_path, batch_size and _sanitize_name remains consistent.
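Why chunked iteration bounds memory can be seen with a library-free sketch: the reader yields fixed-size chunks and the caller ever holds only one of them. `fake_chunks` below is a hypothetical stand-in for `QvdTable.from_qvd(path, chunk_size=...)`:

```python
from typing import Dict, Iterator, List


def fake_chunks(total_rows: int, chunk_size: int) -> Iterator[List[Dict]]:
    """Stand-in for QvdTable.from_qvd(path, chunk_size=...): yields one
    chunk of records at a time instead of materializing all rows."""
    for start in range(0, total_rows, chunk_size):
        end = min(start + chunk_size, total_rows)
        yield [{"id": i} for i in range(start, end)]


def read_batches(total_rows: int, batch_size: int) -> Iterator[List[Dict]]:
    # Peak memory is one chunk, not the whole table: the point of the
    # reviewer's suggested chunk_size change to _read_batches.
    for chunk in fake_chunks(total_rows, batch_size):
        yield chunk


sizes = [len(b) for b in read_batches(25, 10)]
print(sizes)  # [10, 10, 5]
```

The last chunk is simply shorter, so downstream insert logic should not assume every batch has exactly `batch_size` records.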
```python
def _read_batches(self) -> Generator[list, None, None]:
    """Read QVD file from any fsspec-compatible filesystem."""
    # Open file using fsspec (supports s3://, gs://, az://, etc.)
    with fsspec.open(
        self.file_path,
        mode="rb",
        **self.storage_options
    ) as f:
        # Write to temp file for pyqvd compatibility
        import tempfile
        with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
            tmp.write(f.read())
            tmp_path = tmp.name

    # Process from temp file
    reader = QvdReader(tmp_path)
    df = reader.to_pandas()
    df.columns = [self._sanitize_name(col) for col in df.columns]

    for start in range(0, len(df), self.batch_size):
        end = min(start + self.batch_size, len(df))
        batch_df = df.iloc[start:end]
        yield batch_df.to_dict(orient="records")

    # Cleanup
    import os
    os.unlink(tmp_path)
```
Temp file cleanup can leak on exceptions.
If anything fails after NamedTemporaryFile, os.unlink(tmp_path) is skipped. Wrap processing in try/finally to ensure cleanup.
Proposed fix

```diff
-    # Process from temp file
-    reader = QvdReader(tmp_path)
-    df = reader.to_pandas()
-    df.columns = [self._sanitize_name(col) for col in df.columns]
-
-    for start in range(0, len(df), self.batch_size):
-        end = min(start + self.batch_size, len(df))
-        batch_df = df.iloc[start:end]
-        yield batch_df.to_dict(orient="records")
-
-    # Cleanup
-    import os
-    os.unlink(tmp_path)
+    try:
+        # Process from temp file
+        reader = QvdReader(tmp_path)
+        df = reader.to_pandas()
+        df.columns = [self._sanitize_name(col) for col in df.columns]
+
+        for start in range(0, len(df), self.batch_size):
+            end = min(start + self.batch_size, len(df))
+            batch_df = df.iloc[start:end]
+            yield batch_df.to_dict(orient="records")
+    finally:
+        # Cleanup
+        import os
+        os.unlink(tmp_path)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
def _read_batches(self) -> Generator[list, None, None]:
    """Read QVD file from any fsspec-compatible filesystem."""
    # Open file using fsspec (supports s3://, gs://, az://, etc.)
    with fsspec.open(
        self.file_path,
        mode="rb",
        **self.storage_options
    ) as f:
        # Write to temp file for pyqvd compatibility
        import tempfile
        with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
            tmp.write(f.read())
            tmp_path = tmp.name

    try:
        # Process from temp file
        reader = QvdReader(tmp_path)
        df = reader.to_pandas()
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")
    finally:
        # Cleanup
        import os
        os.unlink(tmp_path)
```
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 461
- 488, In _read_batches, ensure the temporary file (tmp_path) is removed even if
processing raises: after creating the NamedTemporaryFile (with delete=False)
assign tmp_path, then wrap the QvdReader/df processing and batching logic in a
try block and perform os.unlink(tmp_path) in a finally block; make sure tmp_path
is defined before the try and that os is imported so cleanup always runs
regardless of exceptions.
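The try/finally shape being requested can be verified with the standard library alone; the sketch below simulates the download-to-temp-file flow with an arbitrary payload instead of a QVD (the function name and the `fail` flag are ours, added to exercise the error path):

```python
import os
import tempfile


def process_via_tempfile(payload: bytes, fail: bool = False) -> int:
    """Write payload to a temp file, process it, and always clean up."""
    with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
        tmp.write(payload)
        tmp_path = tmp.name

    try:
        if fail:
            raise ValueError("simulated processing error")
        # Stand-in for the QvdReader/batching work on tmp_path
        return os.path.getsize(tmp_path)
    finally:
        # Runs on success *and* on exception, so the temp file never leaks
        os.unlink(tmp_path)


print(process_via_tempfile(b"abc"))  # 3
```

Because the generator in the guide yields from inside this region, the finally block also covers the case where the consumer abandons iteration early and the generator is closed.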
```python
def is_file_processed(file_path: str) -> bool:
    """Check if a file has already been processed successfully."""
    result = processed_files_table.query(
        f"SELECT * FROM qvd_processed_files WHERE file_path = '{file_path}' AND status = 'success'"
    )
    return len(result) > 0

def mark_file_processed(file_path: str, record_count: int, status: str, error: str = None):
    """Record file processing result."""
    processed_files_table.insert([{
        "file_path": file_path,
        "processed_at": datetime.now(),
        "record_count": record_count,
        "status": status,
        "error_message": error
    }])
```
🧩 Analysis chain
🌐 Web query:
moose-lib OlapTable.query parameterized queries
💡 Result:
In moose-lib, you don’t typically call an OlapTable.query(...) method directly; you run parameterized queries via the provided query client, and you reference the OlapTable in a safe, structured way.

TypeScript (@514labs/moose-lib): use the sql`...` tagged template
Interpolate tables and values through the sql tag (don’t string-concatenate). Example: (docs.fiveonefour.com)

    async ({ limit = 10 }, { client, sql }) => {
      const result = await client.query.execute(
        sql`SELECT * FROM ${clickhouseTable} LIMIT ${limit}`
      );
      return await result.json();
    }

Python (moose_lib): use typed {name: Type} placeholders + params dict
Pass parameters separately using the format-string placeholders (and use Identifier for table names). Example: (docs.fiveonefour.com)

    def handler(client, params):
        return client.query.execute(
            "SELECT * FROM {table: Identifier} LIMIT {limit: Int32}",
            {"table": clickhouse_table.name, "limit": params.limit},
        )

If you share which language you’re using (TS vs Python) and what you want to parameterize (WHERE clauses, IN lists, dates, etc.), I can show the exact pattern for that case.
Use parameterized queries instead of string interpolation for the file_path filter
Replace the raw SQL string in is_file_processed with a parameterized call, for example:
```python
result = client.query.execute(
    "SELECT * FROM qvd_processed_files WHERE file_path = {path: String} AND status = {status: String}",
    {"path": file_path, "status": "success"},
)
```

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 544
- 559, The is_file_processed function currently builds SQL with string
interpolation which allows injection; change it to use a parameterized query API
instead of f-string: replace the processed_files_table.query call in
is_file_processed with a parameterized client.query.execute (or the table
client's parameterized method) passing the SQL with named placeholders for
file_path and status and a params dict (e.g., {"path": file_path,
"status":"success"}), then return based on the result length; keep
mark_file_processed unchanged. Ensure you reference the same function name
is_file_processed and the table/query call when making the change.
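The principle behind the fix carries over to any driver: bind values, never splice them into SQL. Since moose-lib isn't runnable here, a sketch with stdlib sqlite3 demonstrates the same before/after (sqlite uses `:name` placeholders where ClickHouse uses `{name: Type}`, but the binding mechanics are identical; the table contents are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE qvd_processed_files (file_path TEXT, status TEXT)")
conn.execute("INSERT INTO qvd_processed_files VALUES ('a.qvd', 'success')")


def is_file_processed(file_path: str) -> bool:
    # Values are bound by the driver, never spliced into the SQL string,
    # so a path like "x' OR '1'='1" cannot alter the query's meaning.
    rows = conn.execute(
        "SELECT 1 FROM qvd_processed_files"
        " WHERE file_path = :path AND status = :status",
        {"path": file_path, "status": "success"},
    ).fetchall()
    return len(rows) > 0


print(is_file_processed("a.qvd"))         # True
print(is_file_processed("x' OR '1'='1"))  # False
```

With the original f-string version, the second call would have matched every successful row instead of none, silently skipping files that were never ingested.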
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`:
- Around line 29-109: The Python snippet for QvdIntrospector (file
qvd_introspector.py) is missing the `@test` directive required for doc CI; update
the fenced code block that begins with ```python filename="qvd_introspector.py"
copy to ```python `@test` filename="qvd_introspector.py" copy so the
QvdIntrospector class and its methods (_load, _extract_schema, _sanitize_name,
get_schema, get_row_count) are syntax-checked during documentation builds.
- Around line 544-549: The is_file_processed function currently interpolates
file_path into the SQL string which risks SQL injection; change the
processed_files_table.query call to use a parameterized query or the query API's
parameter binding (e.g., use placeholders like %s or ? or named params depending
on the DB client) and pass file_path as a separate argument, or sanitize/escape
it via the DB client's parameterization helper, so the query becomes a
parameterized SELECT against qvd_processed_files with file_path and status bound
safely rather than concatenated.
- Around line 205-212: The retry decorator on QvdBatchInserter._insert_batch
currently hardcodes stop_after_attempt(3); change the implementation so the
actual instance max_retries is used by wrapping the insert logic in a nested
function (e.g., inner_insert or _do_insert) that is decorated with
`@retry`(stop=stop_after_attempt(self.max_retries), wait=..., retry=...) or by
programmatically creating a Tenacity Retrying object using self.max_retries, and
then call that inner function from _insert_batch; ensure you reference
QvdBatchInserter._insert_batch and the instance attribute max_retries so each
instance uses its configured retry count.
- Around line 551-559: The function mark_file_processed currently types error as
str but defaults it to None; update the signature to use Optional[str] for the
error parameter (from typing import Optional) and adjust any calls if necessary
to accept None, ensuring the annotation and default value match; reference the
function name mark_file_processed and the error parameter when making the
change.
- Around line 231-242: The _read_batches method currently calls
QvdReader(self.file_path).to_pandas(), which loads the entire file into memory
and contradicts the “chunked reading” claim; to fix, replace the full-load call
with a chunked reader like QvdFileReader(self.file_path,
chunk_size=self.batch_size).read() or use QvdTable.from_qvd(...,
chunk_size=self.batch_size) and iterate over returned chunks, applying
self._sanitize_name to each chunk's columns and yielding
chunk.to_dict(orient="records"); alternatively, if you cannot switch readers,
update the method docstring and wording to remove the “chunked reading” claim
and keep the existing implementation.
- Around line 461-487: In _read_batches, avoid loading the whole remote file
into memory and ensure the temp file is always removed: when writing the fsspec
file to the NamedTemporaryFile (tmp_path) use shutil.copyfileobj(f, tmp) instead
of tmp.write(f.read()), and wrap the processing/reading of tmp_path (the
QvdReader, df creation and batching loop) in a try/finally so that
os.unlink(tmp_path) runs in the finally block even on errors; reference the
_read_batches function, tmp_path variable, QvdReader use, and os.unlink call
when making the change.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
- apps/framework-docs-v2/src/config/navigation.ts
🧰 Additional context used
📓 Path-based instructions (5)
apps/framework-docs-v2/content/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
`apps/framework-docs-v2/content/**/*.mdx`: Use `{{ include "shared/path.mdx" }}` directives to reuse content fragments, which are processed via `processIncludes()` during build
Validate code snippets in documentation with the `@test` directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation
Files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Run linting checks before submitting PRs for TypeScript/JavaScript code
**/*.{ts,tsx,js,jsx}: Group imports by external dependencies, internal modules, and types; use named exports from barrel files (index.ts)
Use camelCase for variables and functions, PascalCase for types/classes/components, UPPER_SNAKE_CASE for constants in TypeScript/JavaScript
Prefix unused variables with underscore (e.g., `_unusedParam`) to bypass linting errors in TypeScript/JavaScript
Format TypeScript/JavaScript code with Prettier using `experimentalTernaries: true`; auto-formats on commit via Husky + lint-staged
Files:
apps/framework-docs-v2/src/config/navigation.ts
apps/framework-docs-v2/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
Component library must use shadcn/ui components located in `src/components/ui/`
Files:
apps/framework-docs-v2/src/config/navigation.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Prefer interfaces for objects, types for unions/intersections in TypeScript; explicit return types on public APIs
Files:
apps/framework-docs-v2/src/config/navigation.ts
🧠 Learnings (7)
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Navigation configuration must be manually maintained in `src/config/navigation.ts` and not auto-generated from file structure
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Applied to files:
apps/framework-docs-v2/src/config/navigation.ts
🔇 Additional comments (1)
apps/framework-docs-v2/src/config/navigation.ts (1)
1253-1262: Navigation entry looks complete and consistent. Required fields and draft status are set as expected for a new guide.
Based on learnings: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft').
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
```bash
pip install pyqvd fsspec pandas tenacity moose-lib
```

| Package | Purpose |
|---------|---------|
| `pyqvd` | Read QVD files and extract schema metadata |
| `fsspec` | Abstract filesystem interface for local, S3, GCS, Azure storage |
| `pandas` | DataFrame manipulation for batch processing |
| `tenacity` | Retry logic for resilient batch insertions |
| `moose-lib` | MooseStack workflow and OLAP table integration |

## Auto-Generating Pydantic Models from QVD Schema

QVD files contain embedded schema information. Rather than manually defining Pydantic models, you can introspect the QVD file and generate models automatically.

### QvdIntrospector

The `QvdIntrospector` class reads a QVD file and extracts column names and data types:

```python filename="qvd_introspector.py" copy
from pyqvd import QvdReader
from typing import Dict, Any, List, Tuple


class QvdIntrospector:
    """Introspect QVD file schema to extract column definitions."""

    # Map QVD field types to Python/Pydantic types
    TYPE_MAP = {
        "ASCII": "str",
        "TEXT": "str",
        "INTEGER": "int",
        "REAL": "float",
        "FIX": "float",
        "DUAL": "str",  # Dual values stored as strings
        "DATE": "datetime",
        "TIME": "datetime",
        "TIMESTAMP": "datetime",
        "INTERVAL": "float",
        "MONEY": "float",
    }

    def __init__(self, file_path: str):
        self.file_path = file_path
        self._reader = None
        self._schema = None

    def _load(self):
        """Lazy load the QVD file."""
        if self._reader is None:
            self._reader = QvdReader(self.file_path)
            self._schema = self._extract_schema()

    def _extract_schema(self) -> List[Tuple[str, str]]:
        """Extract column names and types from QVD metadata."""
        columns = []
        for field in self._reader.fields:
            field_name = self._sanitize_name(field.name)
            field_type = self.TYPE_MAP.get(field.type, "str")
            columns.append((field_name, field_type))
        return columns

    def _sanitize_name(self, name: str) -> str:
        """Convert QVD field names to valid Python identifiers."""
        # Replace spaces and special characters with underscores
        sanitized = "".join(c if c.isalnum() else "_" for c in name)
        # Ensure it starts with a letter or underscore
        if sanitized[0].isdigit():
            sanitized = f"_{sanitized}"
        return sanitized.lower()

    def get_schema(self) -> List[Tuple[str, str]]:
        """Return list of (column_name, python_type) tuples."""
        self._load()
        return self._schema

    def get_row_count(self) -> int:
        """Return the number of rows in the QVD file."""
        self._load()
        return self._reader.num_rows
```
Add @test directives to Python code blocks.
All Python snippets in this guide lack the @test directive required for doc validation. Please add @test to each Python code block so the snippets are syntax-checked in CI.
As per coding guidelines: Validate code snippets in documentation with the @test directive for TypeScript and Python code blocks.
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 29
- 109, The Python snippet for QvdIntrospector (file qvd_introspector.py) is
missing the `@test` directive required for doc CI; update the fenced code block
that begins with ```python filename="qvd_introspector.py" copy to ```python
`@test` filename="qvd_introspector.py" copy so the QvdIntrospector class and its
methods (_load, _extract_schema, _sanitize_name, get_schema, get_row_count) are
syntax-checked during documentation builds.
    def __init__(
        self,
        file_path: str,
        table: OlapTable,
        model_class: Type[BaseModel],
        batch_size: int = 10000,
        max_retries: int = 3
    ):
🧩 Analysis chain
🏁 Script executed: `sed -n '150,300p' apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`
Repository: 514-labs/moosestack
Length of output: 5102
Use max_retries in the retry decorator
The @retry on QvdBatchInserter._insert_batch uses stop_after_attempt(3), ignoring the max_retries parameter. Refactor to apply stop_after_attempt(self.max_retries) (e.g. wrap the insert logic in a nested function decorated with your instance’s max_retries).
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 205
- 212, The retry decorator on QvdBatchInserter._insert_batch currently hardcodes
stop_after_attempt(3); change the implementation so the actual instance
max_retries is used by wrapping the insert logic in a nested function (e.g.,
inner_insert or _do_insert) that is decorated with
`@retry`(stop=stop_after_attempt(self.max_retries), wait=..., retry=...) or by
programmatically creating a Tenacity Retrying object using self.max_retries, and
then call that inner function from _insert_batch; ensure you reference
QvdBatchInserter._insert_batch and the instance attribute max_retries so each
instance uses its configured retry count.
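The fix can be sketched without Tenacity installed. This plain-Python stand-in mirrors the shape the prompt describes — the attempt budget is read from the instance's `max_retries` rather than a hardcoded constant — and the failing-then-succeeding `_do_insert` is purely illustrative:

```python
import time

class QvdBatchInserter:
    """Stand-in showing the retry count coming from the instance, not a literal 3."""

    def __init__(self, max_retries: int = 3):
        self.max_retries = max_retries
        self.calls = 0  # counts insert attempts, for illustration only

    def _do_insert(self, batch):
        # Simulated transient failure on the first attempt only.
        self.calls += 1
        if self.calls < 2:
            raise ConnectionError("transient failure")
        return len(batch)

    def _insert_batch(self, batch):
        # Plain-Python equivalent of Tenacity's
        # Retrying(stop=stop_after_attempt(self.max_retries), wait=wait_exponential(...)):
        # the budget comes from self.max_retries at call time.
        for attempt in range(1, self.max_retries + 1):
            try:
                return self._do_insert(batch)
            except ConnectionError:
                if attempt == self.max_retries:
                    raise
                time.sleep(0.001 * 2 ** attempt)  # exponential backoff

inserter = QvdBatchInserter(max_retries=5)
print(inserter._insert_batch([{"id": 1}, {"id": 2}]))  # → 2
```

With Tenacity available, the same thing falls out of building a `Retrying` object inside `_insert_batch` so the decorator is no longer bound at class-definition time.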
    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file in batches using pyqvd."""
        reader = QvdReader(self.file_path)
        df = reader.to_pandas()

        # Sanitize column names to match Pydantic model
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")
🧩 Analysis chain
🌐 Web query:
pyqvd QvdReader to_pandas chunk streaming
💡 Result:
pyqvd doesn’t stream directly into a single growing pandas DataFrame, but it can read a QVD in record chunks and you can convert each chunk to pandas (to_pandas()) and process/write incrementally.
Chunked read → pandas per chunk (recommended)
from pyqvd import QvdTable
for tbl_chunk in QvdTable.from_qvd("big.qvd", chunk_size=100_000):
df = tbl_chunk.to_pandas()
# process df (e.g., write to parquet, aggregate, etc.)chunk_size is “number of records” and from_qvd(..., chunk_size=...) returns an iterator of QvdTable slices. [1]
Same idea via the lower-level reader

    from pyqvd.io import QvdFileReader

    reader = QvdFileReader("big.qvd", chunk_size=100_000)
    for tbl_chunk in reader.read():
        df = tbl_chunk.to_pandas()

QvdFileReader(..., chunk_size=...) + read() returns either a single QvdTable or an iterator when chunking is enabled. [2]
If you want a true “generator of chunks” API (alternative library)
qvd-utils explicitly exposes read_in_chunks(..., chunk_size=...) returning a generator of dicts that you can wrap into DataFrames. [3]
Sources: PyQvd core API (from_qvd, to_pandas) [1], PyQvd I/O API (QvdFileReader, chunking) [2], qvd-utils PyPI docs (chunk generator) [3].
Implement streaming or update wording. The current reader.to_pandas() call loads the entire QVD into memory, so it doesn’t perform chunked reading. Use QvdFileReader(self.file_path, chunk_size=self.batch_size).read() (or QvdTable.from_qvd(..., chunk_size=...)) to process in batches, or adjust the description to drop the “chunked reading” claim.
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 231
- 242, The _read_batches method currently calls
QvdReader(self.file_path).to_pandas(), which loads the entire file into memory
and contradicts the “chunked reading” claim; to fix, replace the full-load call
with a chunked reader like QvdFileReader(self.file_path,
chunk_size=self.batch_size).read() or use QvdTable.from_qvd(...,
chunk_size=self.batch_size) and iterate over returned chunks, applying
self._sanitize_name to each chunk's columns and yielding
chunk.to_dict(orient="records"); alternatively, if you cannot switch readers,
update the method docstring and wording to remove the “chunked reading” claim
and keep the existing implementation.
    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file from any fsspec-compatible filesystem."""
        # Open file using fsspec (supports s3://, gs://, az://, etc.)
        with fsspec.open(
            self.file_path,
            mode="rb",
            **self.storage_options
        ) as f:
            # Write to temp file for pyqvd compatibility
            import tempfile
            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
                tmp.write(f.read())
                tmp_path = tmp.name

        # Process from temp file
        reader = QvdReader(tmp_path)
        df = reader.to_pandas()
        df.columns = [self._sanitize_name(col) for col in df.columns]

        for start in range(0, len(df), self.batch_size):
            end = min(start + self.batch_size, len(df))
            batch_df = df.iloc[start:end]
            yield batch_df.to_dict(orient="records")

        # Cleanup
        import os
        os.unlink(tmp_path)
Wrap temp file usage in try/finally and stream copy
Ensure tmp_path is always deleted by wrapping the processing in a try/finally, and replace tmp.write(f.read()) with shutil.copyfileobj(f, tmp) to avoid loading the entire file into memory.
Proposed fix
    -        with fsspec.open(
    -            self.file_path,
    -            mode="rb",
    -            **self.storage_options
    -        ) as f:
    -            # Write to temp file for pyqvd compatibility
    -            import tempfile
    -            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
    -                tmp.write(f.read())
    -                tmp_path = tmp.name
    -
    -        # Process from temp file
    -        reader = QvdReader(tmp_path)
    -        df = reader.to_pandas()
    -        df.columns = [self._sanitize_name(col) for col in df.columns]
    -
    -        for start in range(0, len(df), self.batch_size):
    -            end = min(start + self.batch_size, len(df))
    -            batch_df = df.iloc[start:end]
    -            yield batch_df.to_dict(orient="records")
    -
    -        # Cleanup
    -        import os
    -        os.unlink(tmp_path)
    +        import os
    +        import tempfile
    +        import shutil
    +        with fsspec.open(
    +            self.file_path,
    +            mode="rb",
    +            **self.storage_options
    +        ) as f:
    +            # Write to temp file for pyqvd compatibility
    +            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
    +                shutil.copyfileobj(f, tmp)
    +                tmp_path = tmp.name
    +
    +        try:
    +            # Process from temp file
    +            reader = QvdReader(tmp_path)
    +            df = reader.to_pandas()
    +            df.columns = [self._sanitize_name(col) for col in df.columns]
    +
    +            for start in range(0, len(df), self.batch_size):
    +                end = min(start + self.batch_size, len(df))
    +                batch_df = df.iloc[start:end]
    +                yield batch_df.to_dict(orient="records")
    +        finally:
    +            os.unlink(tmp_path)

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    def _read_batches(self) -> Generator[list, None, None]:
        """Read QVD file from any fsspec-compatible filesystem."""
        import os
        import tempfile
        import shutil
        with fsspec.open(
            self.file_path,
            mode="rb",
            **self.storage_options
        ) as f:
            # Write to temp file for pyqvd compatibility
            with tempfile.NamedTemporaryFile(suffix=".qvd", delete=False) as tmp:
                shutil.copyfileobj(f, tmp)
                tmp_path = tmp.name

        try:
            # Process from temp file
            reader = QvdReader(tmp_path)
            df = reader.to_pandas()
            df.columns = [self._sanitize_name(col) for col in df.columns]

            for start in range(0, len(df), self.batch_size):
                end = min(start + self.batch_size, len(df))
                batch_df = df.iloc[start:end]
                yield batch_df.to_dict(orient="records")
        finally:
            os.unlink(tmp_path)
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 461
- 487, In _read_batches, avoid loading the whole remote file into memory and
ensure the temp file is always removed: when writing the fsspec file to the
NamedTemporaryFile (tmp_path) use shutil.copyfileobj(f, tmp) instead of
tmp.write(f.read()), and wrap the processing/reading of tmp_path (the QvdReader,
df creation and batching loop) in a try/finally so that os.unlink(tmp_path) runs
in the finally block even on errors; reference the _read_batches function,
tmp_path variable, QvdReader use, and os.unlink call when making the change.
    def is_file_processed(file_path: str) -> bool:
        """Check if a file has already been processed successfully."""
        result = processed_files_table.query(
            f"SELECT * FROM qvd_processed_files WHERE file_path = '{file_path}' AND status = 'success'"
        )
        return len(result) > 0
Use parameterized queries instead of string interpolation
Interpolating file_path directly into the SQL string risks injection; switch to a parameterized query or sanitize inputs.
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 544
- 549, The is_file_processed function currently interpolates file_path into the
SQL string which risks SQL injection; change the processed_files_table.query
call to use a parameterized query or the query API's parameter binding (e.g.,
use placeholders like %s or ? or named params depending on the DB client) and
pass file_path as a separate argument, or sanitize/escape it via the DB client's
parameterization helper, so the query becomes a parameterized SELECT against
qvd_processed_files with file_path and status bound safely rather than
concatenated.
    def mark_file_processed(file_path: str, record_count: int, status: str, error: str = None):
        """Record file processing result."""
        processed_files_table.insert([{
            "file_path": file_path,
            "processed_at": datetime.now(),
            "record_count": record_count,
            "status": status,
            "error_message": error
        }])
🧩 Analysis chain
🏁 Script executed (search for typing imports and context around the snippet):
`rg "from typing import" -n apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`
`rg "Optional" -n apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`
`rg -n -C 10 "mark_file_processed" apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`
Repository: 514-labs/moosestack
Length of output: 1662
🏁 Script executed:
`tail -n +500 apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx | head -n 100`
Repository: 514-labs/moosestack
Length of output: 2847
Annotate error parameter as Optional[str]
Default value None conflicts with the str annotation; update to Optional[str].
Signature update

    -def mark_file_processed(file_path: str, record_count: int, status: str, error: str = None):
    +def mark_file_processed(
    +    file_path: str,
    +    record_count: int,
    +    status: str,
    +    error: Optional[str] = None
    +):

🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx` around lines 551
- 559, The function mark_file_processed currently types error as str but
defaults it to None; update the signature to use Optional[str] for the error
parameter (from typing import Optional) and adjust any calls if necessary to
accept None, ensuring the annotation and default value match; reference the
function name mark_file_processed and the error parameter when making the
change.
Remove implementation code that confuses users:
- Removed QvdIntrospector, QvdModelGenerator, and QvdBatchInserter class implementations
- Removed pip install commands with specific package versions
- Removed all code users would need to write themselves

Replace with usage-focused content:
- How to install the pipeline from the registry
- Configuration via environment variables
- Using the provided init_qvd.py CLI tool
- Running workflows and monitoring via API
- Error handling and troubleshooting

The guide now shows HOW TO USE the pre-built pipeline instead of HOW TO BUILD IT from scratch.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx`:
- Around line 159-172: The Pydantic model class QvdItemModel is being
overwritten by assigning QvdItemModel = OlapTable(...); rename the table
variable (for example QvdItemTable or qvd_item_table) so the Pydantic class
QvdItemModel remains intact, update any references that expect the OlapTable to
use the new table name, and ensure the OlapTable columns and name remain the
same.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧰 Additional context used
📓 Path-based instructions (2)
apps/framework-docs-v2/content/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
`apps/framework-docs-v2/content/**/*.mdx`: Use `{{ include "shared/path.mdx" }}` directives to reuse content fragments, which are processed via `processIncludes()` during build
Validate code snippets in documentation with the `@test` directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation
Files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : Validate code snippets in documentation with the `test` directive for TypeScript and Python code blocks
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: Applies to examples/nextjs-moose/moose/src/query-examples/**/*.ts : Query example functions must use `eventsModel.toSql()` to generate SQL, ensuring consistent query construction through the defined model
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
- Add columns prop to CTACards
- Change href to ctaLink in CTACard
- Add ctaLabel and Icon props
- Remove unused BulletPointsCard import
Change QvdItemModel = OlapTable(...) to QvdItemTable = OlapTable(...) to prevent shadowing the Pydantic model class.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
|
Fixed the example code issue identified by CodeRabbit - changed `QvdItemModel` to `QvdItemTable` when constructing the `OlapTable`.
Fix critical issues and enhance user guidance:
- Add missing component imports (ConditionalContent, CustomizePanel, etc.)
- Clarify MCP requirement for AI copilot support
- Add community support callout and connection string instructions
- Improve section transitions with progress summaries
- Fix GitHub capitalization and clarify local/remote setup

These changes improve the guide's accuracy and help users successfully migrate their dashboards to ClickHouse.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
The tracking information is accessible through the API endpoint, so direct ClickHouse queries are no longer needed in the guide.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/performant-dashboards.mdx`:
- Line 8: The import statement on line 8 includes unused
components—ConditionalContent, CustomizePanel, CustomizeGrid, and
SelectField—which should be removed to eliminate dead imports; update the import
list that currently contains CTACards, CTACard, Callout, BulletPointsCard,
ToggleBlock, ConditionalContent, CustomizePanel, CustomizeGrid, SelectField to
only include the actually used symbols (e.g., CTACards, CTACard, Callout,
BulletPointsCard, ToggleBlock) by deleting the unused component names from that
import.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- apps/framework-docs-v2/content/guides/performant-dashboards.mdx
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧰 Additional context used
📓 Path-based instructions (2)
apps/framework-docs-v2/content/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
apps/framework-docs-v2/content/**/*.mdx: Use `{{ include "shared/path.mdx" }}` directives to reuse content fragments, which are processed via `processIncludes()` during build
Validate code snippets in documentation with the `@test` directive for TypeScript and Python code blocks
TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation
Files:
- apps/framework-docs-v2/content/guides/performant-dashboards.mdx
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
apps/framework-docs-v2/content/guides/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Files:
- apps/framework-docs-v2/content/guides/performant-dashboards.mdx
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🧠 Learnings (14)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
📚 Learning: 2026-01-15T19:41:53.563Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/cdp-analytics/CLAUDE.md:0-0
Timestamp: 2026-01-15T19:41:53.563Z
Learning: Applies to examples/cdp-analytics/web-app/src/features/dashboard/*.tsx : Dashboard components should follow React hooks patterns: use useState for local data, useEffect for data fetching, handle errors with fallback data, and wrap visualizations in Card/CardHeader/CardContent from Radix UI
Applied to files:
apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: Applies to examples/nextjs-moose/components/dashboard/**/*.ts : Dashboard hooks must call server actions (not moose functions directly) to decouple React components from ClickHouse implementation details
Applied to files:
apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : Use `{{ include "shared/path.mdx" }}` directives to reuse content fragments, which are processed via `processIncludes()` during build
Applied to files:
apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : Validate code snippets in documentation with the `test` directive for TypeScript and Python code blocks
Applied to files:
- apps/framework-docs-v2/content/guides/performant-dashboards.mdx
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/**/*.mdx : TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation
Applied to files:
- apps/framework-docs-v2/content/guides/performant-dashboards.mdx
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: The moose query layer represents a two-step OLAP migration pattern: Step 1 (not shown) shifts just-in-time joins to write-time via Materialized Views; Step 2 (this demo) accelerates endpoint creation with the query layer. This architecture assumes MVs/OLAP tables already exist and focuses on the read-time query selection and filtering layer
Applied to files:
apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-15T19:41:53.563Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/cdp-analytics/CLAUDE.md:0-0
Timestamp: 2026-01-15T19:41:53.563Z
Learning: For slow dashboard queries (>500ms) that aggregate across multiple tables or need cohort-based breakdowns, implement a materialized view instead of real-time queries
Applied to files:
apps/framework-docs-v2/content/guides/performant-dashboards.mdx
📚 Learning: 2026-01-28T14:50:00.978Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1402-1406
Timestamp: 2026-01-28T14:50:00.978Z
Learning: When reviewing WebApp implementations that run in a Node.js backend, prefer Node.js-specific types (e.g., Buffer) and avoid browser-focused cross-platform compatibility considerations. This guideline applies to MDX documentation and code examples across the repository where backend/runtime code is discussed; ensure reviewers distinguish server-side context from frontend/browser constraints and validate type usage accordingly.
Applied to files:
- apps/framework-docs-v2/content/guides/performant-dashboards.mdx
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-28T14:55:13.411Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3407
File: apps/framework-docs-v2/content/guides/static-report-generation.mdx:1585-1616
Timestamp: 2026-01-28T14:55:13.411Z
Learning: In documentation and tutorial files under apps/framework-docs-v2/content/guides/, prefer explicit, self-contained code examples that may duplicate small snippets for clarity over abstracting with DRY helpers. Ensure examples are easy to understand in isolation, even if it means a bit of repetition, to aid learning and reduce cognitive load for readers.
Applied to files:
- apps/framework-docs-v2/content/guides/performant-dashboards.mdx
- apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-27T17:36:29.854Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: examples/nextjs-moose/CLAUDE.md:0-0
Timestamp: 2026-01-27T17:36:29.854Z
Learning: Applies to examples/nextjs-moose/moose/src/query-examples/**/*.ts : Query example functions must use `eventsModel.toSql()` to generate SQL, ensuring consistent query construction through the defined model
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2025-12-17T23:59:22.807Z
Learnt from: callicles
Repo: 514-labs/moosestack PR: 3143
File: apps/framework-docs/llm-docs/python/table-setup.md:331-336
Timestamp: 2025-12-17T23:59:22.807Z
Learning: In apps/framework-docs/llm-docs/**/*.md (LLM documentation), prefer exact class and type names (e.g., "MaterializedView", "OlapTable") over lowercase prose variants (e.g., "materialized view") to help LLMs generate correct code with proper imports and syntax.
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Applied to files:
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx
🔇 Additional comments (6)
apps/framework-docs-v2/content/guides/qvd-to-clickhouse.mdx (2)
1-4: Frontmatter is complete. Title and description fields are present as required.
10-35: Clear overview and feature list. The introduction effectively explains QVD files, migration benefits, and pipeline capabilities. Structure is well-organized.
apps/framework-docs-v2/content/guides/performant-dashboards.mdx (4)
134-138: Clarifications improve guidance. Remote ClickHouse context and MCP-compatible editor examples are helpful additions.
722-728: Clear Boreal connection string instructions. Step-by-step guidance helps users locate credentials.
1133-1142: Helpful progress checkpoint. Success callout summarizes completed work and provides clear transition to production deployment.
149-151: Slack invite link is active and valid. The shared invite link returns a 302 redirect to the moose-community workspace, confirming it's currently functional.
```diff
-import { CTACards, CTACard, Callout, BulletPointsCard, ToggleBlock } from "@/components/mdx";
+import { CTACards, CTACard, Callout, BulletPointsCard, ToggleBlock, ConditionalContent, CustomizePanel, CustomizeGrid, SelectField } from "@/components/mdx";
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check if newly imported components are used in the file
for comp in ConditionalContent CustomizePanel CustomizeGrid SelectField; do
  echo "=== Checking usage of $comp ==="
  rg "<${comp}" apps/framework-docs-v2/content/guides/performant-dashboards.mdx || echo "Not found"
done
```

Repository: 514-labs/moosestack
Length of output: 266
Remove unused component imports.
The imports ConditionalContent, CustomizePanel, CustomizeGrid, and SelectField are not used anywhere in the file and should be removed from line 8.
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/content/guides/performant-dashboards.mdx` at line 8,
The import statement on line 8 includes unused components—ConditionalContent,
CustomizePanel, CustomizeGrid, and SelectField—which should be removed to
eliminate dead imports; update the import list that currently contains CTACards,
CTACard, Callout, BulletPointsCard, ToggleBlock, ConditionalContent,
CustomizePanel, CustomizeGrid, SelectField to only include the actually used
symbols (e.g., CTACards, CTACard, Callout, BulletPointsCard, ToggleBlock) by
deleting the unused component names from that import.
Remove draft status to make the guide visible in the guides list without requiring feature flags. Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Keep guide hidden behind show-draft-guides feature flag. Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Configure build-time environment variable to include draft guides in Vercel deployments. Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| "env": { | ||
| "NEXT_PUBLIC_NAV_VARIANT": "draft" | ||
| } | ||
| } |
vercel.json exposes draft content in production builds
High Severity
The vercel.json change sets NEXT_PUBLIC_NAV_VARIANT to draft in production builds. According to nav-variant.ts, production should default to base (public content only), while draft exposes internal draft guides. This configuration change will expose all draft content (not just this new guide) to users when deployed to production. This appears to be testing configuration accidentally included for the preview deployment.
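The variant resolution the finding describes can be sketched like this (illustrative Python; the actual logic is TypeScript in nav-variant.ts and may differ in detail). The point is that an unset or unknown value falls back to base, so production never shows drafts unless explicitly overridden:

```python
def resolve_nav_variant(env):
    """Fall back to 'base' unless a known variant is explicitly set.

    Sketch of the behavior described in the finding; names and accepted
    values are assumptions, not the real nav-variant.ts implementation.
    """
    variant = env.get("NEXT_PUBLIC_NAV_VARIANT")
    return variant if variant in ("base", "beta", "draft") else "base"

print(resolve_nav_variant({}))                                    # production default
print(resolve_nav_variant({"NEXT_PUBLIC_NAV_VARIANT": "draft"}))  # forced by vercel.json
```

Under this model, committing the env var into vercel.json turns the "draft" override on for every deployment, which is exactly the exposure the finding flags.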
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/vercel.json`:
- Line 2: The vercel.json currently runs "installCommand": "pnpm install
--filter docs-v2..." from the app directory; change it to run the monorepo root
install instead by setting the installCommand to "pnpm install" (or removing the
custom installCommand so Vercel runs root install), and if necessary update the
vercel.json rootDirectory value so Vercel deploys from the monorepo root while
still targeting the docs-v2 app; look for the "installCommand" and
"rootDirectory" keys in vercel.json and ensure installs happen at the repo root
to keep workspace deps and lockfile consistent.
- Around line 4-6: Remove the hardcoded NEXT_PUBLIC_NAV_VARIANT="draft" from
vercel.json so draft navigation is not forced across all deployments; instead,
delete or unset the NEXT_PUBLIC_NAV_VARIANT entry in vercel.json and configure
visibility via Vercel Toolbar feature flags (e.g., show-draft-guides,
show-beta-guides, show-hosting-section, show-ai-section, show-copy-as-markdown,
show-linear-integration) or use preview-only environment variables for preview
builds so production deployments never expose draft guides.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
apps/framework-docs-v2/vercel.json
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Add new guides to `guidesNavigationConfig` in `src/config/navigation.ts` with required fields: type, slug, title, icon, languages, and status (start as 'draft')
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Applies to apps/framework-docs-v2/content/guides/**/*.mdx : Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Use Vercel Toolbar feature flags for visibility control: `show-draft-guides`, `show-beta-guides`, `show-hosting-section`, `show-ai-section`, `show-copy-as-markdown`, `show-linear-integration`
Applied to files:
apps/framework-docs-v2/vercel.json
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Guide promotion workflow: start guides with `status: "draft"` in navigation config, then progress to `status: "beta"`, finally remove status field for public release
Applied to files:
apps/framework-docs-v2/vercel.json
📚 Learning: 2025-12-16T23:08:31.521Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-16T23:08:31.521Z
Learning: Build TypeScript/JavaScript packages using `pnpm build` before integration
Applied to files:
apps/framework-docs-v2/vercel.json
📚 Learning: 2026-01-22T19:02:52.316Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-docs-v2/CLAUDE.md:0-0
Timestamp: 2026-01-22T19:02:52.316Z
Learning: Always install dependencies from the monorepo root using `pnpm install`, never run `pnpm install` directly from `apps/framework-docs-v2/`
Applied to files:
apps/framework-docs-v2/vercel.json
```diff
@@ -1,3 +1,8 @@
 {
-  "installCommand": "pnpm install --filter docs-v2..."
+  "installCommand": "pnpm install --filter docs-v2...",
```
Install dependencies from the monorepo root.
pnpm install --filter docs-v2... runs from the app directory and conflicts with the repo workflow. Use a root install (and adjust Vercel rootDirectory if needed) to keep workspace deps and lockfile consistent.
💡 Suggested change

```diff
-  "installCommand": "pnpm install --filter docs-v2...",
+  "installCommand": "pnpm install",
```

Based on learnings: Always install dependencies from the monorepo root using `pnpm install`, never run `pnpm install` directly from `apps/framework-docs-v2/`.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "installCommand": "pnpm install --filter docs-v2...", | |
| "installCommand": "pnpm install", |
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/vercel.json` at line 2, The vercel.json currently runs
"installCommand": "pnpm install --filter docs-v2..." from the app directory;
change it to run the monorepo root install instead by setting the installCommand
to "pnpm install" (or removing the custom installCommand so Vercel runs root
install), and if necessary update the vercel.json rootDirectory value so Vercel
deploys from the monorepo root while still targeting the docs-v2 app; look for
the "installCommand" and "rootDirectory" keys in vercel.json and ensure installs
happen at the repo root to keep workspace deps and lockfile consistent.
| "env": { | ||
| "NEXT_PUBLIC_NAV_VARIANT": "draft" | ||
| } |
Don’t force draft navigation in all builds.
Setting NEXT_PUBLIC_NAV_VARIANT="draft" in vercel.json applies to all deployments and can expose draft guides publicly, bypassing the intended feature-flag workflow. Use Vercel Toolbar flags (or preview-only envs) instead.
💡 Suggested change
- "build": {
- "env": {
- "NEXT_PUBLIC_NAV_VARIANT": "draft"
- }
- }
+ "build": {}Based on learnings: Use Vercel Toolbar feature flags for visibility control: show-draft-guides, show-beta-guides, show-hosting-section, show-ai-section, show-copy-as-markdown, show-linear-integration.
🤖 Prompt for AI Agents
In `@apps/framework-docs-v2/vercel.json` around lines 4 - 6, Remove the hardcoded
NEXT_PUBLIC_NAV_VARIANT="draft" from vercel.json so draft navigation is not
forced across all deployments; instead, delete or unset the
NEXT_PUBLIC_NAV_VARIANT entry in vercel.json and configure visibility via Vercel
Toolbar feature flags (e.g., show-draft-guides, show-beta-guides,
show-hosting-section, show-ai-section, show-copy-as-markdown,
show-linear-integration) or use preview-only environment variables for preview
builds so production deployments never expose draft guides.


Summary
Creates comprehensive documentation for transferring QlikView QVD files to ClickHouse using MooseStack workflows (Linear: ENG-2099).
Changes
New Guide:
`guides/qvd-to-clickhouse.mdx`. Covers:
- Dependencies (`pyqvd`, `fsspec`, `pandas`, `tenacity`, `moose-lib`)
- Schema introspection and model generation via `QvdIntrospector` and `QvdModelGenerator`
- Batch insertion via `QvdBatchInserter` (includes retry logic, progress tracking, error handling)
- Cloud storage access via `fsspec`
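The batch-insert retry behavior can be sketched as follows (a stdlib-only illustration: the actual `QvdBatchInserter` uses `tenacity`, and all function names here are assumed for the example):

```python
import time

def insert_with_retry(insert_fn, batch, max_retries=3, base_delay=0.0):
    """Retry a batch insert with exponential backoff between attempts."""
    for attempt in range(1, max_retries + 1):
        try:
            return insert_fn(batch)
        except ConnectionError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulate a ClickHouse insert that fails twice before succeeding.
calls = []
def flaky_insert(batch):
    calls.append(len(batch))
    if len(calls) < 3:
        raise ConnectionError("transient failure")
    return len(batch)

print(insert_with_retry(flaky_insert, [1, 2, 3]))  # -> 3, after two retries
```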
Navigation Update

Added entry to `src/config/navigation.ts`:
- Icon: `IconDatabaseImport`
- Status: `draft` (for internal review)
- Languages: `["python"]` (`pyqvd` is Python-specific)
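The shape of that entry can be illustrated as a plain dictionary (the real config is TypeScript in `src/config/navigation.ts`; the `type` and `title` values here are assumptions):

```python
# Illustrative only; mirrors the required fields for guidesNavigationConfig
# entries: type, slug, title, icon, languages, and status.
guide_entry = {
    "type": "guide",                  # assumed value
    "slug": "qvd-to-clickhouse",
    "title": "QVD to ClickHouse",     # assumed title
    "icon": "IconDatabaseImport",
    "languages": ["python"],
    "status": "draft",                # start as draft, promote to beta, then remove
}

required = {"type", "slug", "title", "icon", "languages", "status"}
print(required.issubset(guide_entry))  # -> True
```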
All code examples are based on the reference implementation at
registry/pipeline-registry/qvd_to_clickhouseand demonstrate proper MooseStack workflow patterns.Related Documentation
🤖 Generated with Claude Code
Note
Medium Risk
Mostly documentation and navigation changes, but it also alters the `vercel.json` build env (`NEXT_PUBLIC_NAV_VARIANT`), which could change what content is visible in deployed docs if misconfigured.

Overview
Adds a new draft guide, `guides/qvd-to-clickhouse`, documenting a MooseStack workflow-based pipeline for syncing QVD files into ClickHouse (setup, config/env vars, model generation, running/monitoring, and troubleshooting).

Updates the "Improving the Performance of Your Dashboards" guide with clearer local setup guidance (remote ClickHouse seeding, Boreal connection-string instructions), MCP-focused copilot wording, and new progress/help callouts.

Exposes the new guide via draft navigation and sets Vercel build env `NEXT_PUBLIC_NAV_VARIANT: "draft"` to support draft navigation behavior.

Written by Cursor Bugbot for commit 4739016. This will update automatically on new commits. Configure here.