Skip to content

Conversation

@kumar-punit
Copy link
Collaborator

@kumar-punit kumar-punit commented Jan 15, 2026

Add reserved field names (type, subtype, location) that are exclusively managed by NV-Ingest and cannot be used in custom metadata schemas.

Backend:

  • Add reserved field validation in metadata_validation.py
  • Skip reserved fields during schema auto-addition
  • Add 21 tests for reserved field validation

Frontend:

  • Improve error message parsing for Pydantic validation errors
  • Add useCollectionsApi test suite (8 tests)
  • Update component tests with complete mocks

Docs:

  • Document reserved fields in custom-metadata.md
  • Add release note for reserved field validation

Summary by CodeRabbit

  • Documentation

    • Documented reserved metadata field naming constraints: type, subtype, and location are reserved for system use and cannot be used in custom metadata schemas.
  • Bug Fixes

    • Enhanced collection creation error handling to display specific validation error messages instead of generic failure notifications.
  • New Features

    • Added error notifications to the collection creation form to display API validation errors.

✏️ Tip: You can customize this high-level summary in your review settings.

@kumar-punit kumar-punit self-assigned this Jan 15, 2026
@kumar-punit
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Jan 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Jan 15, 2026

Warning

.coderabbit.yaml has a parsing error

The CodeRabbit configuration file in this repository has a parsing error and default settings were used instead. Please fix the error(s) in the configuration file. You can initialize chat with CodeRabbit to get help with the configuration file.

💥 Parsing errors (1)
Validation error: Invalid regex pattern for base branch. Received: "release-**" at "reviews.auto_review.base_branches[2]"
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Walkthrough

The PR introduces reserved metadata fields (type, subtype, location) managed exclusively by NV-Ingest. Changes include backend validation to reject these reserved names in custom schemas, improved API error handling with JSON parsing, frontend state management updates to expose error messages, and comprehensive documentation and test coverage.

Changes

Cohort / File(s) Summary
Documentation
docs/custom-metadata.md, docs/release-notes.md
Added notes documenting reserved field names (type, subtype, location) that cannot be used in custom metadata schemas and will cause validation errors.
Backend Validation Logic
src/nvidia_rag/utils/metadata_validation.py
Registered three new system-managed fields as reserved: type, subtype, and location. Added validation to raise ValueError when custom schemas attempt to use reserved field names.
Backend Integration
src/nvidia_rag/ingestor_server/main.py
Modified collection creation to filter out reserved fields from the user-visible metadata schema. Improved logging formatting in exception handler.
Frontend API Enhancement
frontend/src/api/useCollectionsApi.ts
Enhanced error handling in useCreateCollection to parse JSON response bodies and extract specific error messages from Pydantic validation errors, with fallback to generic message.
Frontend State Management
frontend/src/components/collections/NewCollectionButtons.tsx
Extended useNewCollectionStore public API to expose an error property alongside existing fields for improved error notification display.
Frontend Tests
frontend/src/components/collections/__tests__/NewCollectionButtons.test.tsx, frontend/src/hooks/__tests__/useCollectionsApi.test.tsx
Added test coverage for error handling in collection creation, API response parsing, and store error state integration. Total 335 lines of new tests.
Backend Tests
tests/unit/test_ingestor_server/test_system_managed_fields_integration.py, tests/unit/test_metadata_validation/test_metadata_validator.py
Introduced TestReservedFieldsFiltering and TestReservedFieldNames test suites validating reserved field behavior, rejection of reserved names, and proper field classification (256 lines total). Note: Duplicate test class definition in integration tests.

Sequence Diagram

sequenceDiagram
    participant Client as Client (Browser)
    participant API as API Server
    participant Validator as Metadata Validator
    participant Store as Store/Schema

    Client->>API: Create Collection with metadata_schema
    API->>Validator: Validate custom field names
    alt Field name is reserved (type, subtype, location)
        Validator-->>API: ValueError - Reserved field name
        API-->>Client: 422 Error with detail message
        Client->>Client: Parse error & update store.error
    else Field name is allowed
        Validator-->>API: Validation passes
        API->>Store: Filter out reserved fields
        Store-->>API: User schema (without reserved fields)
        API-->>Client: 200 Success
        Client->>Client: Clear store.error
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

enhancement, documentation

Suggested reviewers

  • shubhadeepd
  • nv-pranjald
  • smasurekar

Poem

🐰 Reserved fields now guard the door,
Type, subtype, location kept in store,
Validation blooms with error cheer,
Tests confirm what's crystal clear!
🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: introducing reserved field validation (type, subtype, location) and improving error handling for API responses.
Docstring Coverage ✅ Passed Docstring coverage is 96.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@docs/release-notes.md`:
- Line 51: The release note entry "Reserved field names `type`, `subtype`, and
`location` for NV-Ingest exclusive use in metadata schemas." is placed under
Release 2.3.2 but should be in Release 2.4.0; locate that exact bullet
(currently at line containing that text) and cut it from the 2.3.2 section and
paste it into the Release 2.4.0 section (ensuring formatting/ordering matches
other bullets), then save the file.

In `@src/nvidia_rag/utils/metadata_validation.py`:
- Around line 501-508: The validation currently checks membership using the
original self.name which allows different-cased reserved names to bypass the
check; update the check to normalize the field name (use self.name.lower()) when
testing against SYSTEM_MANAGED_FIELDS and use the normalized value for lookup
(e.g., replace occurrences of "if self.name in SYSTEM_MANAGED_FIELDS" with a
lowercase-normalized membership and lookup) so names like "Type"/"ID" are
treated as reserved; keep the rest of the ValueError behavior unchanged.

In `@tests/unit/test_ingestor_server/test_system_managed_fields_integration.py`:
- Around line 346-350: The test calls pass an unexpected keyword
'embedding_dimension' to create_collection which causes a TypeError; remove the
embedding_dimension=1024 argument from the create_collection calls in
tests/unit/test_ingestor_server/test_system_managed_fields_integration.py
(around the shown block and the similar call at lines ~429-433) and instead rely
on the create_collection default behavior or pass the correct parameter name
that matches the create_collection signature in your codebase (use the exact
parameter name defined by the function if a dimension must be provided).
🧹 Nitpick comments (4)
frontend/src/components/collections/NewCollectionButtons.tsx (1)

58-60: Consider prioritizing API/store errors over local validation text in the Notification.

slotSubheading={nameError || validationMessage || error} will hide error when any local validation is present; please confirm that’s intended (often API errors should win once present).

Also applies to: 257-265

frontend/src/components/collections/__tests__/NewCollectionButtons.test.tsx (1)

117-127: Inconsistent mock shape: missing error field.

Several test cases override mockUseNewCollectionStore but omit the error field that's included in beforeEach and other tests. While this works because the component may handle undefined gracefully, it creates inconsistency. The following test cases are missing the error: null property:

  • Lines 117-127, 135-145, 153-163, 172-182, 195-205, 212-226, 233-243, 251-261, 268-278, 287-302, 310-325, 333-343, 353-363, 373-383, 455-465

For consistency and to avoid potential issues if the component's error handling changes, consider adding error: null to all mock overrides.

frontend/src/api/useCollectionsApi.ts (1)

67-85: Improved error handling with Pydantic validation support.

The error parsing logic correctly handles multiple response formats and gracefully falls back to a default message. The "Value error, " prefix stripping provides cleaner user-facing messages.

One consideration: when errorData.detail is an array with multiple validation errors, only the first error message is shown. If multiple fields fail validation simultaneously, users won't see all issues at once.

💡 Optional: Show all validation errors

If showing all errors is desired:

          if (Array.isArray(errorData.detail)) {
-            // Extract message from Pydantic validation error
-            const msg = errorData.detail[0]?.msg || errorMessage;
-            // Clean up "Value error, " prefix
-            errorMessage = msg.replace(/^Value error, /, '');
+            // Extract messages from Pydantic validation errors
+            const messages = errorData.detail
+              .map((e: { msg?: string }) => e.msg?.replace(/^Value error, /, ''))
+              .filter(Boolean);
+            errorMessage = messages.length > 0 ? messages.join('; ') : errorMessage;
          } else if (errorData.detail) {
frontend/src/hooks/__tests__/useCollectionsApi.test.tsx (1)

44-49: Consider typing the fetchSpy variable.

Using any type loses type safety benefits.

💡 Optional: Add proper type
-  let fetchSpy: any;
+  let fetchSpy: ReturnType<typeof vi.spyOn>;
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a457b11 and 5ed879b.

📒 Files selected for processing (10)
  • docs/custom-metadata.md
  • docs/release-notes.md
  • frontend/src/api/useCollectionsApi.ts
  • frontend/src/components/collections/NewCollectionButtons.tsx
  • frontend/src/components/collections/__tests__/NewCollectionButtons.test.tsx
  • frontend/src/hooks/__tests__/useCollectionsApi.test.tsx
  • src/nvidia_rag/ingestor_server/main.py
  • src/nvidia_rag/utils/metadata_validation.py
  • tests/unit/test_ingestor_server/test_system_managed_fields_integration.py
  • tests/unit/test_metadata_validation/test_metadata_validator.py
🧰 Additional context used
🧬 Code graph analysis (3)
frontend/src/components/collections/__tests__/NewCollectionButtons.test.tsx (1)
frontend/src/components/collections/NewCollectionButtons.tsx (1)
  • NewCollectionButtons (56-307)
frontend/src/hooks/__tests__/useCollectionsApi.test.tsx (1)
frontend/src/api/useCollectionsApi.ts (3)
  • useCollections (29-39)
  • useCreateCollection (57-92)
  • useDeleteCollection (105-128)
tests/unit/test_metadata_validation/test_metadata_validator.py (1)
src/nvidia_rag/utils/metadata_validation.py (3)
  • MetadataField (440-539)
  • value (2393-2396)
  • MetadataSchema (542-600)
🪛 Ruff (0.14.11)
src/nvidia_rag/utils/metadata_validation.py

505-508: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Unit Tests
  • GitHub Check: Integration Tests
🔇 Additional comments (12)
src/nvidia_rag/utils/metadata_validation.py (1)

127-147: SYSTEM_MANAGED_FIELDS reserved entries look consistent with the PR objective.

Good addition of reserved: True to clearly differentiate NV-Ingest-exclusive fields.

src/nvidia_rag/ingestor_server/main.py (2)

174-177: Logging tweak is fine.


1076-1080: Reserved fields are correctly excluded from auto-added metadata schema.

This matches the goal of preventing type/subtype/location from becoming user-visible schema fields.

tests/unit/test_ingestor_server/test_system_managed_fields_integration.py (1)

326-456: Reserved-field assertions are on the right track.

Nice coverage for “not auto-added” + “reserved flag present” + “properties match expectations”.

frontend/src/components/collections/__tests__/NewCollectionButtons.test.tsx (1)

435-451: Good coverage for reserved field error display.

This test correctly validates that the component renders the store's error message for reserved field validation failures. The test uses a realistic error message that aligns with the backend validation documented in the PR.

docs/custom-metadata.md (2)

227-227: Clear documentation of the reserved field constraint.

Good addition to the Field Name Validation rules. This aligns with the backend validation that rejects type, subtype, and location as custom field names.


240-242: Well-placed note block for reserved fields.

The note effectively documents the reserved field restriction with clear guidance on the consequence (validation error). The placement between Field Name Validation and System-Managed Field Behavior sections is logical.

tests/unit/test_metadata_validation/test_metadata_validator.py (2)

3473-3595: Comprehensive test coverage for reserved field validation.

The TestReservedFieldNames class provides thorough coverage including:

  • Positive tests for reserved field detection
  • Negative tests ensuring non-reserved fields work
  • Error message validation with helpful guidance
  • Case sensitivity behavior verification
  • Whitespace trimming edge case
  • Schema-level validation integration

The tests align well with the implementation in metadata_validation.py.


3568-3578: Verify case-sensitive behavior is intentional.

The tests confirm reserved field names are case-sensitive (Type, LOCATION, SubType are allowed). Please verify this is the intended behavior, as users might accidentally use variations like Type thinking they're avoiding the restriction, leading to potential confusion when NV-Ingest later uses the lowercase type field.

frontend/src/hooks/__tests__/useCollectionsApi.test.tsx (3)

22-42: Well-structured test wrapper setup.

Good practice creating a fresh QueryClient per test with retries disabled for deterministic behavior. The wrapper pattern is clean and reusable.


127-162: Excellent test for reserved field error handling.

This test directly validates the Pydantic validation error parsing and "Value error, " prefix stripping that's central to the PR's improved error handling. The test payload and expected error message align with the reserved field validation feature.


164-197: Good edge case coverage for missing msg field.

Tests the fallback path when the Pydantic error object is malformed or missing the msg field, ensuring users still receive a meaningful error message.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@kumar-punit kumar-punit force-pushed the dev/punit/reserved_metadata_fix branch from 5ed879b to 7a2d78a Compare January 15, 2026 18:15
@kumar-punit kumar-punit added the bug Something isn't working label Jan 15, 2026
@kumar-punit kumar-punit force-pushed the dev/punit/reserved_metadata_fix branch from 7a2d78a to 1e39577 Compare January 19, 2026 05:21
Add reserved field names (type, subtype, location) that are exclusively
managed by NV-Ingest and cannot be used in custom metadata schemas.

Backend:
- Add reserved field validation in metadata_validation.py
- Skip reserved fields during schema auto-addition
- Add 21 tests for reserved field validation

Frontend:
- Improve error message parsing for Pydantic validation errors
- Add useCollectionsApi test suite (8 tests)
- Update component tests with complete mocks

Docs:
- Document reserved fields in custom-metadata.md
- Add release note for reserved field validation
@shubhadeepd shubhadeepd force-pushed the dev/punit/reserved_metadata_fix branch from 1e39577 to 18e37cf Compare January 19, 2026 19:28
@shubhadeepd shubhadeepd enabled auto-merge (squash) January 19, 2026 19:29
@shubhadeepd shubhadeepd merged commit 815ca1b into develop Jan 19, 2026
6 checks passed
@shubhadeepd shubhadeepd deleted the dev/punit/reserved_metadata_fix branch January 19, 2026 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants