Add dataset validation and config for super resolution#206

Open
iliasmahboub wants to merge 1 commit into ML4SCI:main from iliasmahboub:dataset-validation-config

@iliasmahboub
Addresses #196.

Adds two files to DeepLense_Physics_Informed_Super_Resolution_Anirudh_Shankar/:

  • validate_dataset.py — Pre-training check that verifies class directories
    exist, LR/HR file counts match per class, .npy array shapes are consistent
    with the configured image dimensions and scale factor, and filename indices have no gaps.
    Exits cleanly with a summary report.

  • dataset_config.yaml — Externalizes dataset parameters currently hardcoded
    across training notebooks: data paths per telescope model, DM substructure
    classes, num_samples, batch size, epochs, learning rate, train/val split,
    image shape, and magnification.
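
For reviewers, the kind of checks described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in `validate_dataset.py`: the helper names (`index_gaps`, `shapes_match`, `count_npy`) and the assumption that files end in a numeric index such as `sample_12.npy` are hypothetical.

```python
import re
from pathlib import Path


def index_gaps(filenames):
    """Return the sorted indices missing from a list of numbered .npy filenames.

    Assumes (hypothetically) that each name ends in "<index>.npy".
    """
    indices = set()
    for name in filenames:
        match = re.search(r"(\d+)\.npy$", name)
        if match:
            indices.add(int(match.group(1)))
    if not indices:
        return []
    full_range = set(range(min(indices), max(indices) + 1))
    return sorted(full_range - indices)


def shapes_match(lr_shape, hr_shape, scale):
    """Check that the HR height and width equal the LR ones times `scale`.

    In practice the shapes could come from numpy.load(path, mmap_mode="r").shape,
    which reads the array header without loading the whole file into memory.
    """
    lr_h, lr_w = lr_shape[-2:]
    hr_h, hr_w = hr_shape[-2:]
    return hr_h == lr_h * scale and hr_w == lr_w * scale


def count_npy(directory):
    """Count the .npy files directly inside `directory`."""
    return len(list(Path(directory).glob("*.npy")))
```

Comparing `count_npy(lr_dir)` with `count_npy(hr_dir)` for each class would then cover the LR/HR count check, with `lr_dir` and `hr_dir` standing in for whatever paths the config defines.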

How to test

cd DeepLense_Physics_Informed_Super_Resolution_Anirudh_Shankar
python validate_dataset.py --config dataset_config.yaml

Reports missing directories gracefully when the dataset has not been downloaded.
With data present, validates full file integrity before training.
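
One way the graceful missing-directory behaviour could look is sketched below. The `<root>/<class>/<LR|HR>` layout, the function names, and the exit-code convention are assumptions made for illustration; the actual script may organise things differently.

```python
from pathlib import Path


def find_missing_dirs(root, class_names, resolutions=("LR", "HR")):
    """Collect every expected <root>/<class>/<resolution> directory that is absent."""
    missing = []
    for cls in class_names:
        for res in resolutions:
            path = Path(root) / cls / res
            if not path.is_dir():
                missing.append(path)
    return missing


def report(missing):
    """Print a short summary and return an exit code instead of raising."""
    if missing:
        print(f"{len(missing)} expected directories are missing:")
        for path in missing:
            print(f"  - {path}")
        print("Download the dataset, then re-run the validation.")
        return 1
    print("All expected directories found.")
    return 0
```

Returning a status code rather than raising keeps the output readable when the dataset is simply not present yet, and a caller can pass the result to `sys.exit`.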