@enssow enssow commented Dec 11, 2025

Description

Draft PR to showcase how to enable sharding (experimentation underway to optimise performance)
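
As a rough illustration of why sharding matters for performance: in Zarr v3, an unsharded array stores each chunk as its own storage object, while the sharding codec packs several chunks into one shard object. The sketch below only does the arithmetic. The function name `count_storage_objects` and the shapes are made up for illustration and are not taken from this PR; metadata objects are ignored, and the counts are upper bounds because empty chunks may never be written.

```python
import math


def count_storage_objects(shape, chunks, shards=None):
    """Return (n_chunks, n_objects) for a hypothetical array layout.

    Without shards, every chunk is its own storage object.
    With shards, every shard is one storage object holding several chunks.
    """
    n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))
    if shards is None:
        return n_chunks, n_chunks
    # A shard must contain a whole number of chunks along every axis.
    if any(sh % c != 0 for sh, c in zip(shards, chunks)):
        raise ValueError("shard shape must be a multiple of the chunk shape")
    n_shards = math.prod(math.ceil(s / sh) for s, sh in zip(shape, shards))
    return n_chunks, n_shards


# Illustrative shapes only (not the values used in the experiments).
unsharded = count_storage_objects((1000, 1000), (100, 100))
sharded = count_storage_objects((1000, 1000), (100, 100), shards=(500, 500))
print(unsharded, sharded)
```

With these example shapes the number of chunks stays the same, but the number of objects the store has to manage drops from one per chunk to one per shard.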

Issue Number

DRAFT
Closes #1384

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hedgedoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@github-actions github-actions bot added the infra (Issues related to infrastructure) and performance (Work related to performance improvements) labels Dec 11, 2025
@enssow enssow marked this pull request as draft December 11, 2025 14:22
@enssow enssow mentioned this pull request Dec 11, 2025

@grassesi grassesi left a comment

I think it would be better if zarr_store would be under general.zarr_store but this is just nitpicking at this point. Please do remove the print statements tho.

enssow commented Jan 15, 2026

> I think it would be better if zarr_store would be under general.zarr_store but this is just nitpicking at this point. Please do remove the print statements tho.

Sorry, I think the commit you reviewed was not the most recent one! The prints are all gone :) I did think about putting it under a different category, but the default_forecast_config doesn't have a general section, so it may be best to leave it where it is. Let me know what you think.
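
If the key were ever moved, a lookup that accepts both placements would keep old and new configs working. This is only a sketch on plain dictionaries: the helper name `find_zarr_store` and the example values are hypothetical, and the real config objects in the repository may behave differently.

```python
from collections.abc import Mapping
from typing import Any, Optional


def find_zarr_store(cfg: Mapping[str, Any]) -> Optional[Any]:
    """Return the zarr_store settings from either placement, or None.

    Prefers general.zarr_store (the placement suggested in review) and
    falls back to a top-level zarr_store (the placement in this PR).
    """
    general = cfg.get("general")
    if isinstance(general, Mapping) and "zarr_store" in general:
        return general["zarr_store"]
    return cfg.get("zarr_store")


# Hypothetical config contents, for illustration only.
top_level_cfg = {"zarr_store": {"example_option": 1}}
nested_cfg = {"general": {"zarr_store": {"example_option": 2}}}
print(find_zarr_store(top_level_cfg), find_zarr_store(nested_cfg))
```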

enssow commented Jan 15, 2026

The type-check fails due to an import error with tqdm. This can be resolved by adding tqdm directly to the dependencies in the common package's pyproject.toml. However, this then leads to a further issue:

ERROR Type `object` is not iterable [not-iterable]
   --> packages/common/src/weathergen/common/io.py:412:24
    |
412 |         for dataset in tqdm(item.datasets):
    |                        ^^^^^^^^^^^^^^^^^^^
    |
 INFO 1 error (42 ignored)

This is a known issue with pyrefly.
To discuss what the best resolution is :)
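
One option that could be discussed is to route progress bars through a small typed wrapper, so that the checker only sees an explicitly annotated iterator and any cast lives in one place. This is a sketch, not what the PR does: the helper name `iter_with_progress` is hypothetical, and whether it silences this particular pyrefly error in the repository is an assumption that would need to be checked.

```python
from collections.abc import Iterable, Iterator
from typing import Optional, TypeVar, cast

T = TypeVar("T")


def iter_with_progress(items: Iterable[T], desc: Optional[str] = None) -> Iterator[T]:
    """Yield the items unchanged, showing a tqdm bar when tqdm is installed."""
    try:
        from tqdm import tqdm
    except ImportError:
        # No tqdm available: iterate without a progress bar.
        yield from items
        return
    # The cast states the element type explicitly instead of relying on
    # whatever the type checker infers for the tqdm call.
    yield from cast(Iterable[T], tqdm(items, desc=desc))


# Usage: replaces a direct `for x in tqdm(xs)` loop.
for dataset in iter_with_progress(["a", "b"], desc="datasets"):
    print(dataset)
```

The call site then reads `for dataset in iter_with_progress(item.datasets):`, and the element type follows from the annotation on `item.datasets`.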

@enssow enssow merged commit be69a74 into develop Jan 15, 2026
5 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in WeatherGen-dev Jan 15, 2026
Labels

infra (Issues related to infrastructure)
initiative (Large piece of work covering multiple sprints)
performance (Work related to performance improvements)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

zarr3 compaction

5 participants