fix(nextclade): align dataset metadata and qc config with schema #52

Merged
ivan-aksamentov merged 1 commit into main from fix/nextclade-align-metadata-and-qc-schema
Mar 6, 2026


@ivan-aksamentov

Fix two categories of schema defects in the mumps Nextclade dataset configs.

Metadata flags: deprecated, experimental, and enabled are at the top level where they are silently ignored. The experimental flag belongs under attributes. The enabled field is not a valid dataset-level property.

QC scoreWeight (genome only): scoreWeight on qc.missingData and qc.mixedSites is not defined in the schema and is silently ignored. It is valid on snpClusters, frameShifts, and stopCodons (those are left unchanged).

| Field | Action | Datasets |
| --- | --- | --- |
| deprecated: false | Remove (default is false) | genome, sh |
| experimental: true | Move to attributes.experimental | genome, sh |
| enabled: true | Remove (not in schema) | genome, sh |
| qc.missingData.scoreWeight | Remove (not in schema) | genome |
| qc.mixedSites.scoreWeight | Remove (not in schema) | genome |

  • Move experimental into attributes in both datasets
  • Remove deprecated and enabled from both datasets
  • Remove invalid scoreWeight from genome QC config
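
For illustration, the changes listed above can be sketched as a small transformation over a parsed pathogen.json. This is not code from this PR: the helper name `align_pathogen_json` and the sample values are hypothetical, and only the field names come from the description.

```python
import copy

# QC rules where, per this PR, scoreWeight is not defined in the schema.
RULES_WITHOUT_SCORE_WEIGHT = ("missingData", "mixedSites")


def align_pathogen_json(pathogen: dict) -> dict:
    """Return a copy of a parsed pathogen.json with the listed fixes applied.

    Hypothetical helper, written only to illustrate the field changes.
    """
    fixed = copy.deepcopy(pathogen)

    # Top-level `deprecated` and `enabled` are removed.
    fixed.pop("deprecated", None)
    fixed.pop("enabled", None)

    # Top-level `experimental` moves under `attributes`.
    if "experimental" in fixed:
        fixed.setdefault("attributes", {})["experimental"] = fixed.pop("experimental")

    # `scoreWeight` is removed only from missingData and mixedSites;
    # snpClusters, frameShifts, and stopCodons are left unchanged.
    qc = fixed.get("qc", {})
    for rule in RULES_WITHOUT_SCORE_WEIGHT:
        qc.get(rule, {}).pop("scoreWeight", None)

    return fixed


# Made-up input that mimics the defects described in this PR.
before = {
    "deprecated": False,
    "experimental": True,
    "enabled": True,
    "attributes": {"name": "example"},
    "qc": {
        "missingData": {"enabled": True, "scoreWeight": 50},
        "mixedSites": {"enabled": True, "scoreWeight": 50},
        "snpClusters": {"enabled": True, "scoreWeight": 50},
    },
}
after = align_pathogen_json(before)
```

The input is deep-copied so the original dict is left untouched, which makes it easy to diff `before` against `after` when reviewing a dataset.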

⚠️ The existing datasets in nextclade_data are already patched in nextstrain/nextclade_data#437, so no immediate resubmission is needed. But without this upstream fix, the issues will reappear on the next dataset submission from this workflow.

ℹ️ See pathogen.json schema for the PathogenJson, PathogenAttributes, QcRulesConfigMissingData, and QcRulesConfigMixedSites definitions.

👀 Note: #40 adds QC config to the SH dataset and uses the same invalid scoreWeight on missingData and mixedSites. Those fields should be omitted when #40 is updated.
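
A check along these lines could flag the same defects before a config such as the one in #40 is submitted. It is a minimal sketch under assumptions: `find_schema_defects` is a hypothetical name, the sample config is made up, and it encodes only the rules stated in this description rather than validating against the full pathogen.json schema.

```python
def find_schema_defects(pathogen: dict) -> list[str]:
    """List the misplaced or unsupported fields described in this PR.

    Hypothetical helper; it covers only the five fields named above.
    """
    problems = []

    # These flags are silently ignored when placed at the top level.
    for key in ("deprecated", "experimental", "enabled"):
        if key in pathogen:
            problems.append(f"top-level '{key}' is ignored")

    # scoreWeight is not defined for these two QC rules.
    qc = pathogen.get("qc", {})
    for rule in ("missingData", "mixedSites"):
        if "scoreWeight" in qc.get(rule, {}):
            problems.append(f"qc.{rule}.scoreWeight is not in the schema")

    return problems


# Made-up config resembling the QC block that #40 is described as adding.
sh_config = {
    "attributes": {"experimental": True},
    "qc": {
        "missingData": {"scoreWeight": 50},
        "mixedSites": {"scoreWeight": 50},
        "stopCodons": {"scoreWeight": 75},
    },
}
defects = find_schema_defects(sh_config)
```

An empty list means none of the defects from this PR were found; it does not imply the file is otherwise schema-valid.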

Related to: nextstrain/nextclade_data#437 (fix: update mumps dataset schema fields)

@ivan-aksamentov ivan-aksamentov merged commit ec2c00b into main Mar 6, 2026
4 checks passed
@ivan-aksamentov ivan-aksamentov deleted the fix/nextclade-align-metadata-and-qc-schema branch March 6, 2026 18:18