Fix performance regression in libMesh unstructured mesh tallies#3577
Fix performance regression in libMesh unstructured mesh tallies#3577pshriwise merged 5 commits intoopenmc-dev:developfrom
Conversation
|
Here are some more quantitative performance metrics taken on my personal laptop (Intel Core i7 11800H) using
There is a severe per-thread performance penalty to libMesh tallies when |
pshriwise
left a comment
There was a problem hiding this comment.
Thanks for creating this fix promptly, @nuclearkevin! I agree with the approach to insulate standard LibMesh mesh tallies from adaptive capabilities.
paulromano
left a comment
There was a problem hiding this comment.
Looks good to me too. Thanks @nuclearkevin!
Description
#3185 added support for adaptive libMesh tallies initialized by an external C++ driver application (e.g. Cardinal) to support a project investigating adaptive mesh refinement applied to unstructured mesh tallies in Monte Carlo particle transport. Unfortunately, this addition also introduced a bug which yielded a massive performance penalty during tallying for users of both OpenMC and Cardinal. The crux of the issue is the function
was changed to
Without getting too deep into the weeds of how libMesh is designed and the process behind adaptive mesh refinement, the return value of
n_elem()is cached when the libMesh mesh is constructed, whilen_active_elem()(required for AMR) is not. This necessitates what is essentially a loop over all elements every timen_active_elem()is called, which in the case of transport is every time a libMesh element accumulates a hit due to this function:This Implementation
To fix this performance regression, I've reverted the
LibMeshclass back to it's previous state. All of the adaptivity code necessary for mesh tally AMR in Cardinal has been moved to a new class calledAdaptiveLibMeshwhich inherits fromLibMeshand overrides the few functions required to get adaptivity working.n_active_elem()is cached on construction to avoid any performance hits in transport. This should insulate OpenMC users from any other changes made to support the AMR project in Cardinal, such as #3553.Simpler Implementation
If a simpler fix is desired, I can revert these changes and simply implement the cache for
n_active_elem()inLibMesh. I decided to go all the way in an attempt to guard against future performance regressions.FYI @pshriwise @aprilnovak
Checklist
I have followed the style guidelines for Python source files (if applicable)I have made corresponding changes to the documentation (if applicable)I have added tests that prove my fix is effective or that my feature works (if applicable)