Open
Labels
enhancement (New feature or request)
Description
Overview
Implement conversion from JSON Structure schemas to Apache Parquet schemas.
Requirements
This conversion should:
- Lean on the corresponding Avro conversion (avrotoparquet) as precedent for output structure, including use of Jinja templates where applicable
- Cover the full breadth of the JSON Structure Core spec as defined in draft-vasters-json-structure-core-00
- Follow the patterns established by structuretocsharp and structuretopython, including their continued support for Avro schemas
Implementation Guidance
- Review avrotize/avrotoparquet.py for output patterns and template usage
- Review avrotize/structuretocsharp.py and avrotize/structuretopython.py for the JSON Structure handling patterns
- Ensure all JSON Structure Core types are supported:
  - JSON Primitive Types: string, number, boolean, null
  - Extended Primitive Types: binary, int8 through int128, uint8 through uint128, float8/float/double, decimal, date, datetime, time, duration, uuid, uri, jsonpointer
  - Compound Types: object, array, set, map, tuple, any, choice (both tagged and inline unions)
- Support JSON Structure-specific features:
  - Namespaces and definitions
  - Type references ($ref)
  - Extensions ($extends) and add-ins ($offers/$uses)
  - Abstract types
  - Required/optional properties
  - Type annotations (maxLength, precision, scale, contentEncoding, etc.)
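As a starting point for discussion, the primitive-type and required/optional handling listed above could be sketched roughly as follows. This is a minimal illustration only, not the avrotize implementation: the names `PRIMITIVES`, `field_for`, and `object_to_fields` are hypothetical, the output is a plain list of dicts rather than whatever avrotoparquet.py actually emits, and the choice of Parquet physical and logical types per JSON Structure type is an assumption that still needs to be checked against both specs. Types without an obvious Parquet counterpart (null, int128, uint128, float8, duration) and all compound types other than a flat object are deliberately left out. The sketch also assumes `type` is a single string and `required` is a flat list of property names.

```python
from typing import Any, Optional

# Hypothetical mapping (an assumption, not taken from avrotize):
# JSON Structure primitive -> (Parquet physical type, logical annotation).
PRIMITIVES: dict[str, tuple[str, Optional[str]]] = {
    "string": ("BYTE_ARRAY", "STRING"),
    "number": ("DOUBLE", None),
    "boolean": ("BOOLEAN", None),
    "binary": ("BYTE_ARRAY", None),
    "int8": ("INT32", "INT(8, true)"),
    "uint8": ("INT32", "INT(8, false)"),
    "int16": ("INT32", "INT(16, true)"),
    "uint16": ("INT32", "INT(16, false)"),
    "int32": ("INT32", "INT(32, true)"),
    "uint32": ("INT32", "INT(32, false)"),
    "int64": ("INT64", "INT(64, true)"),
    "uint64": ("INT64", "INT(64, false)"),
    "float": ("FLOAT", None),
    "double": ("DOUBLE", None),
    "date": ("INT32", "DATE"),
    "datetime": ("INT64", "TIMESTAMP(MICROS, true)"),
    "time": ("INT64", "TIME(MICROS, true)"),
    "uuid": ("FIXED_LEN_BYTE_ARRAY(16)", "UUID"),
    "uri": ("BYTE_ARRAY", "STRING"),
    "jsonpointer": ("BYTE_ARRAY", "STRING"),
}


def field_for(name: str, schema: dict[str, Any], required: bool) -> dict[str, Any]:
    """Describe one Parquet leaf field for a primitive JSON Structure property."""
    type_name = schema["type"]
    if type_name == "decimal":
        # Carry the precision/scale annotations over; the defaults are assumptions.
        precision = schema.get("precision", 38)
        scale = schema.get("scale", 0)
        physical, logical = "BYTE_ARRAY", f"DECIMAL({precision}, {scale})"
    elif type_name in PRIMITIVES:
        physical, logical = PRIMITIVES[type_name]
    else:
        raise NotImplementedError(f"type not covered by this sketch: {type_name}")
    return {
        "name": name,
        # Required properties become 'required' fields, all others 'optional'.
        "repetition": "required" if required else "optional",
        "physical": physical,
        "logical": logical,
    }


def object_to_fields(obj: dict[str, Any]) -> list[dict[str, Any]]:
    """Map the properties of a flat JSON Structure object to Parquet fields."""
    required = set(obj.get("required", []))
    return [
        field_for(name, prop, name in required)
        for name, prop in obj.get("properties", {}).items()
    ]


# Usage: a small, made-up object schema.
example = {
    "type": "object",
    "name": "Account",
    "properties": {
        "id": {"type": "uuid"},
        "balance": {"type": "decimal", "precision": 10, "scale": 2},
        "nickname": {"type": "string"},
    },
    "required": ["id", "balance"],
}
fields = object_to_fields(example)
```

Raising `NotImplementedError` for uncovered types makes the gaps explicit, which may be useful while the remaining types, $ref resolution, $extends, and compound types are being added.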
References
- JSON Structure Core Spec: https://www.ietf.org/archive/id/draft-vasters-json-structure-core-00.txt
- Avro precedent: avrotize/avrotoparquet.py
- JSON Structure precedents: avrotize/structuretocsharp.py, avrotize/structuretopython.py