Skip to content

add esmf summary parser#12

Merged
edoyango merged 1 commit intomainfrom
esmf-summary
Oct 14, 2025
Merged

add esmf summary parser#12
edoyango merged 1 commit intomainfrom
esmf-summary

Conversation

@edoyango
Copy link
Collaborator

@edoyango edoyango commented Oct 9, 2025

This add's a parser for esmf's text summary which can be parsed "flat" or hierarchically. We probably only need flat for now, but the latter might be useful at some point.

@micaeljtoliveira
Copy link
Member

@edoyango I would suggest you activate the pre-commit hook in your development environment. It will likely save you some time.

Copy link
Member

@micaeljtoliveira micaeljtoliveira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edoyango Thanks for the changes. All good from my side.

@manodeep
Copy link
Collaborator

manodeep commented Oct 9, 2025

@edoyango The linter check is failing - will you please fix that. I am looking through the rest of the code in the meantime

Copy link
Collaborator

@manodeep manodeep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some suggestions trying to reduce the hardcoding. This looks quite substantial and great overall.

The other bit would be that the code quality checks for the linter and coverage need to pass

# check that all keys in correct_dict are in input_dict
if input_dict.keys() < correct_dict.keys():
raise ValueError(f"Missing keys for {region} (depth: {depth}): {set(correct_dict.keys()) - input_dict.keys()}")
extra_keys = set(correct_dict.keys()) - metric_keys
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if metric_keys contains extra keys? (Is that even possible for valid data?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metric_keys is the expected "metric" keys in both input_dict and correct_dict. The function will error if input_dict and correct_dict don't have all the keys in metric_keys. extra_keys would be the region keys.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed extra_keys to region_keys so hopefully it's a bit clearer what they are

Copy link
Collaborator

@minghangli-uni minghangli-uni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I haven’t looked into the code yet, but I’d suggest keeping only hierarchical, since some region names might appear in multiple locations (such as in Runphase or in Init). Otherwise it could lead to unexpected results.

@edoyango
Copy link
Collaborator Author

some region names might appear in multiple locations (such as in Runphase or in Init). Otherwise it could lead to unexpected results.

Are you talking about regions like [ensemble] RunPhase1, [ESM0001] RunPhase1 etc.? In the OM3 output I saw RunPhase1 many times, but they seemed to be prefixed uniquely. Are you saying that they could also have the same prefix too? Do you have an example handy?

@minghangli-uni
Copy link
Collaborator

minghangli-uni commented Oct 10, 2025

For example, [ATM-TO-MED] RunPhase1 exists in both [ensemble] Init 1 and [ensemble] RunPhase1.

@micaeljtoliveira
Copy link
Member

For example, [ATM-TO-MED] RunPhase1 exists in both [ensemble] Init 1 and [ensemble] RunPhase1.

I think that's fine. It's very common case when profiling a code that the same block of code is called from different places, with different data and, therefore, with very different timings. The hierarchical data gives you more information, which is good, but the flat one is still useful in many cases.

@edoyango
Copy link
Collaborator Author

True. Thanks for pointing out the issue.

@manodeep @micaeljtoliveira to either of you have a suggestion for how to handle this? I'm thinking for the flat case, we simply add a _N suffix for now. We could add the bracketed prefixes, but that would probably be too verbose since there's around 10 levels?

@micaeljtoliveira
Copy link
Member

@manodeep @micaeljtoliveira to either of you have a suggestion for how to handle this? I'm thinking for the flat case, we simply add a _N suffix for now. We could add the bracketed prefixes, but that would probably be too verbose since there's around 10 levels?

As I wrote above, I think it's fine to aggregate calls to the same region but from different places in the call-tree together when working with the flat case. We loose some information, but what is left can still be useful.

@codecov
Copy link

codecov bot commented Oct 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (815072f) to head (0d6b41c).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main       #12   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            9        10    +1     
  Lines          285       345   +60     
=========================================
+ Hits           285       345   +60     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@edoyango
Copy link
Collaborator Author

some region names might appear in multiple locations (such as in Runphase or in Init). Otherwise it could lead to unexpected results.

@minghangli-uni @micaeljtoliveira @manodeep I've added some code to aggregate regions that occur more than one, if PETs and PEs are the same (errors if they're not). This assumes regions with identical names is referring to the same code. Can you all have a look before I merge?

Copy link
Member

@micaeljtoliveira micaeljtoliveira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edoyango LGTM, just one very small thing to fix.

Copy link
Collaborator

@minghangli-uni minghangli-uni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @edoyango

Co-authored-by: Manodeep Sinha <manodeep@gmail.com>
Co-authored-by: Micael Oliveira <micael.oliveira@anu.edu.au>
@edoyango edoyango merged commit 580517e into main Oct 14, 2025
8 checks passed
@edoyango edoyango deleted the esmf-summary branch October 14, 2025 22:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants