feat: introducing AWS Glue schema registry discovery#243
feat: introducing AWS Glue schema registry discovery#243Adrian Januzi (adrian-januzi) merged 18 commits intomainfrom
Conversation
Combine `kcp scan schema-registry` and `kcp scan glue-schema-registry` into a single `kcp scan schema-registry` command with a required `--sr-type` flag (confluent | glue). This reduces CLI surface area and establishes a consistent pattern for registry type selection. - Add --sr-type flag with conditional flag validation in PreRunE - Move glue scanner into schema_registry package (no cross-package imports) - Normalize glue success output from slog.Info to fmt.Printf - Remove glue-schema-registry as a standalone command - Update help text with grouped flags by registry type Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix path traversal in HCL file generation by sanitizing schema names and validating output paths stay within target directory - Fix potential nil panic in generateConfluentSchema by propagating errors - Add bounded concurrency (10 workers) to GetAllSchemasWithVersions - Thread context.Context through Glue service layer for cancellation - Fix emoji in error message and use %w wrapping in glue client - Add --schemas filter flag to migrate-schemas for CLI/UI parity - Add user-facing success message for Glue migration generator - Consolidate schema_registry_cluster_id into GlueSchemaVariables - Extract appendVariableBlocks helper to reduce HCL generation duplication - Use aws.ToString/aws.ToInt64 instead of manual nil checks - Extract useToggleSet hook and memoize formatSchema in React UI - Update docs to reflect unified scan schema-registry --sr-type command Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consideration: Schema version storage in state fileCurrently A potential improvement would be to store schema definitions on disk separately (e.g., Not blocking for this PR, but worth considering before targeting large production registries. |
|
Having tested the above consideration at 20% the scale (103 schemas, ranging between 1-5 versions), the state file doesn't bloat as greatly as implied especially compared to the already existing cost and metrics data. |
Show schema count and version totals during scan so the command doesn't appear to hang. Uses standardised emoji set (🚀/🔍/✅). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tom Underhill (tjunderhill)
left a comment
There was a problem hiding this comment.
Code Review Findings
Overall this is a well-structured PR — clean separation of concerns, good test coverage, and the backward-compatible UnmarshalJSON is nicely done. A few items worth addressing:
Critical
1. Uncommitted fix for MarkFlagRequired in PreRunE
cmd/scan/schema_registry/cmd_scan_schema_registry.go
There is an uncommitted diff in the working tree that replaces cmd.MarkFlagRequired("url") etc. inside preRunScanSchemaRegistry with explicit empty-string checks. The original code calls MarkFlagRequired inside PreRunE, which does not work as expected — cobra's MarkFlagRequired is designed to be called at command setup time (before Execute()), not inside PreRunE. When called in PreRunE, cobra has already parsed flags and will not retroactively enforce the requirement, meaning a missing --url flag would silently pass validation.
The uncommitted change (direct if url == "" { return fmt.Errorf(...) } checks) is the correct fix and should be committed.
Important
2. NewStateFrom does not copy SchemaRegistries
internal/types/state.go
NewStateFrom copies Regions but not SchemaRegistries. Any code path using NewStateFrom (e.g. cmd/discover/discoverer.go) will silently drop all schema registry data when a new discovery run starts. Worth adding:
workingState.SchemaRegistries = fromState.SchemaRegistries3. No context cancellation on error in concurrent schema fetching
internal/services/glue_schema_registry/glue_schema_registry_service.go
GetAllSchemasWithVersions checks firstErr before starting each goroutine's work, but does not cancel the context on error. Already-started goroutines making AWS API calls will continue running after a failure. With 10 concurrent workers and many schemas, this wastes time and API calls. Fix with context.WithCancel:
ctx, cancel := context.WithCancel(ctx)
defer cancel()
// in the error path:
if firstErr == nil {
firstErr = fmt.Errorf(...)
cancel()
}4. slog.Info in hot loop should be slog.Debug
internal/services/glue_schema_registry/glue_schema_registry_service.go:81
Per CLAUDE.md: slog.Info is "operational detail", slog.Debug is "verbose internal detail". The line slog.Info("fetching versions for schema", ...) runs once per schema inside a concurrent loop. For registries with hundreds of schemas this floods the log. Should be slog.Debug.
5. Glue registry lookup in parseMigrateGlueSchemasOpts ignores region
cmd/create_asset/migrate_schemas/cmd_create_asset_migrate_schemas.go
parseMigrateGlueSchemasOpts matches Glue registries only by RegistryName, but UpsertGlueSchemaRegistry matches by RegistryName + Region. If a user has the same registry name in two regions (valid — Glue names are per-account-per-region), the CLI always finds the first one. Consider adding a --region flag or matching on both fields.
Suggestions
formatSchemaCache(SchemaRegistries.tsx): Module-levelMapis never evicted. Unlikely to matter for a single-session desktop tool, but worth noting.- CLI flag rename:
--urlto--cc-sr-rest-endpointis a breaking change for automation. Call it out in release notes if intentional. Latestpointer sharing: In the concurrent goroutine,latest = &versions[latestIdx]shares memory with theVersionsslice. Considerv := versions[latestIdx]; latest = &vto avoid subtle mutation coupling.- Missing error-path test:
GetAllSchemasWithVersionshas no test for whengetSchemaVersionsfails for one schema. Would strengthen confidence in the concurrency pattern.
- Replace MarkFlagRequired in PreRunE with explicit empty-string checks (cobra ignores MarkFlagRequired after flag parsing) - Copy SchemaRegistries in NewStateFrom to prevent data loss on re-discovery - Add context cancellation on error in concurrent schema fetching - Downgrade hot-loop slog.Info to slog.Debug per logging conventions - Detect ambiguous Glue registry name across multiple regions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…uation When the same Glue registry name exists in multiple regions, users can now pass --region to select the correct one. Without --region, the command errors if ambiguous rather than silently picking the first match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tom Underhill (tjunderhill)
left a comment
There was a problem hiding this comment.
LGTM!
Tom Underhill (tjunderhill)
left a comment
There was a problem hiding this comment.
LGTM!
Description
Adds support for discovering and migrating schemas from AWS Glue Schema Registry to Confluent Cloud. This extends KCP's schema migration capabilities beyond Confluent Schema Registry to cover AWS-native schema registries.
Changes Made
scan glue-schema-registrycommand that discovers schemas and versions from an AWS Glue Schema Registry, persisting results to kcp-state.jsonconfluent_schema) from discovered Glue schemas, including subject configs and schema filesSchemaRegistriesstate to support both Confluent SR and AWS Glue SR with backward-compatible JSON unmarshalingTesting
Test Instructions:
kcp scan glue-schema-registry --registry-name <name> --region <region> --state-file kcp-state.jsonkcp-state.jsonunderschema_registries.aws_gluekcp uiand check the Schema Registries view shows Glue registriesChecklist