Unify model naming convention #123

rul048 · 2025-12-10T02:59:36Z

Summary

This PR standardizes how MatCalc loads universal MLIPs across multiple backend libraries (MatGL, MACE, GRACE, etc.), addressing Issue #120.

Different backend libraries currently use inconsistent or non-canonical model names (e.g., "small-omat-0", "medium-mpa-0", "GRACE-1L-OMAT-medium-base", "TensorNet-MatPES-PBE-v2025.1-PES"), which leads to ambiguity, user confusion, and repeated conversion logic scattered across the codebase.

To fix this, the PR introduces a unified model naming convention. The canonical identifier format is:

Architecture-Dataset-Functional-Version-Size

Examples:

TensorNet-MatPES-r2SCAN-v2025.1-S
MACE-MP-PBE-0-M
GRACE-OMAT-PBE-0-L
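
As a rough illustration of the format above, a canonical identifier can be split into its five hyphen-separated fields. This is only a sketch: `ModelID` and `parse_model_id` are hypothetical names that do not come from the PR, and it assumes no field contains a hyphen.

```python
from typing import NamedTuple


class ModelID(NamedTuple):
    """Hypothetical container for the five fields of a canonical ID."""

    architecture: str
    dataset: str
    functional: str
    version: str
    size: str


def parse_model_id(model_id: str) -> ModelID:
    """Split Architecture-Dataset-Functional-Version-Size into named fields."""
    parts = model_id.split("-")
    if len(parts) != 5:
        raise ValueError(
            f"expected 5 hyphen-separated fields, got {len(parts)}: {model_id!r}"
        )
    return ModelID(*parts)


# The version field may contain a dot (v2025.1) but no hyphen, so the split is safe.
print(parse_model_id("TensorNet-MatPES-r2SCAN-v2025.1-S"))
```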

The alias conversion table (ID_TO_ALIAS) maps commonly used, case-insensitive abbreviations or aliases (e.g., MACE-MP-0, MACE-MP-PBE-0, MACE-MP-0-M) entered by the user to the canonical identifiers used in MatCalc. The backend name conversion table (ID_TO_NAME) then maps these identifiers to the model names that each backend library actually uses, so that users can write:

load_universal("mace")
load_universal("mace-mp-0")
load_universal("tensornet")
load_universal("grace-oam")

Using an abbreviation loads the most advanced model in that model family.

Major changes:

feature 1: Introduced a unified, backend-agnostic model naming convention (Architecture-Dataset-Functional-Version-Size).
feature 2: Centralized model resolution using ID_TO_NAME (canonical ID -> backend name) and ID_TO_ALIAS (user aliases -> canonical ID).
feature 3: Updated load_universal() to support flexible MACE loading via name.startswith("mace"), enabling both load_universal("mace") (SOTA model) and load_universal("mace-mp-0-m").
feature 4: Updated load_universal() to support flexible GRACE loading via name.startswith("grace"), enabling load_universal("grace") (SOTA model), load_universal("grace-oam-pbe-0-l"), and load_universal("GRACE-2L-OMAT-large-ft-AM").
feature 5: Expanded alias coverage and added tests for MACE and GRACE model resolution.
fix 1: Removed TensorPotential, as it is a backend library name (GRACE) rather than a model name.
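
The prefix-based routing described in features 3 and 4 could look roughly like the sketch below. `backend_for` and its return values are hypothetical and only illustrate the `startswith` check; this is not the actual body of load_universal().

```python
def backend_for(name: str) -> str:
    """Pick a backend family from the start of a model name (illustrative only)."""
    key = name.lower()  # lower-case first so "GRACE-..." and "grace" route alike
    for prefix in ("mace", "grace"):
        if key.startswith(prefix):
            return prefix
    return "other"


for example in ("mace", "mace-mp-0-m", "grace", "GRACE-2L-OMAT-large-ft-AM"):
    print(example, "->", backend_for(example))
```

A bare family name and a fully specified name take the same branch, so the choice of default model can be made inside that branch.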

Todos

If this is work in progress, what else needs to be done?

feature 6: Expand alias coverage to handle additional legacy and inconsistent model names.

Checklist

Google format doc strings added. Check with ruff.
Type annotations included. Check with mypy.
Tests added for new features/fixes.
If applicable, new classes/functions/modules have duecredit @due.dcite decorators to reference relevant papers by DOI (example)

Tip: Install pre-commit hooks to auto-check types and linting before every commit:

pip install -U pre-commit
pre-commit install

shyuep · 2025-12-10T03:23:37Z

Good start. But I don't think we should use model aliases in such a manner. It is very confusing that MACE-mp-0-small maps to "small". It is completely unintuitive. An "alias" is a shortened name for a commonly used model (and you should not have too many of these since commonly used by definition means a few).

This is what I propose:

MatCalc defines the correct way to name a model, e.g., <architecture>-<dataset>-<version>-<S/M/L> (small, medium, large). Example: MACE-MP-0-M.
Model aliases provide shortened names for the most common models. E.g., MACE-MP-0 maps to MACE-MP-0-M. TensorNet-PBE maps to TensorNet-MatPES-PBE-v2025.1. We can be less strict with model aliases since everyone has their pet naming convention.
Within the model loading code, we parse the full name (e.g., MACE-MP-0-M) to load the correct model using the API provided by the model developer. E.g., loading MACE-MP-0 correctly aliases to MACE-MP-0-M, which the relevant section of the code then loads with the medium keyword.
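
A minimal sketch of the last step, assuming the size is always the final hyphen-separated field: the trailing S/M/L letter is translated into the keyword a backend expects. `SIZE_KEYWORDS` and `size_keyword` are hypothetical names, and the small/medium/large keywords follow the wording of this comment rather than any specific backend API.

```python
# Assumed mapping from the size suffix to a backend size keyword.
SIZE_KEYWORDS = {"S": "small", "M": "medium", "L": "large"}


def size_keyword(model_id: str) -> str:
    """Return the size keyword encoded in the last field of a full model name."""
    _, _, size = model_id.rpartition("-")
    try:
        return SIZE_KEYWORDS[size.upper()]
    except KeyError:
        raise ValueError(f"no S/M/L size suffix in {model_id!r}") from None


print(size_keyword("MACE-MP-0-M"))
```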

I would even argue that for <dataset>, we want to be explicit about the functional (even though for most datasets, it is implied).

All names need to be processed in a case-insensitive manner.

@Andrew-S-Rosen Welcome your views too.

Andrew-S-Rosen · 2025-12-10T04:08:44Z

Thanks for the ping and for tackling this, @rul048. While conceptually "simple", I think #120 is incredibly important to address and am glad you are working on this. I agree with your comments, @shyuep.

Matcalc defines the correct way to name a model. E.g., <architecture>-<dataset>-version-S/M/L (small medium large). Example: MACE-MP-0-M

I do not have much of an opinion so long as it is internally consistent and captures all the necessary nuance.

TensorNet-PBE maps to TensorNet-MatPES-PBE-v2025.1. We can be less strict with model aliases since everyone has their pet naming convention.

Presumably, if a TensorNet-MatPES-PBE-v2026.1 were to come out next month, then TensorNet-PBE would be remapped to this newer version in a future release of matgl. On one hand, it makes sense to return the "best" version for the user. On the other hand, the downside is that when the alias is remapped, there will be breaking changes if the user upgrades without reading the CHANGELOG. I don't see a way around that. You get what you ask for with the alias.
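
The trade-off can be made concrete with a small sketch. Both alias tables below are invented for illustration, and the v2026.1 entry is the hypothetical release mentioned above, not a real model. The alias moves between releases while the full name stays pinned.

```python
def resolve(name: str, aliases: dict[str, str]) -> str:
    """Return the alias target if the name is an alias, else the name unchanged."""
    return aliases.get(name.lower(), name)


# Hypothetical alias tables for two successive releases.
aliases_now = {"tensornet-pbe": "TensorNet-MatPES-PBE-v2025.1"}
aliases_later = {"tensornet-pbe": "TensorNet-MatPES-PBE-v2026.1"}

pinned = "TensorNet-MatPES-PBE-v2025.1"
for table in (aliases_now, aliases_later):
    # The alias follows the table; the full name does not change.
    print(resolve("TensorNet-PBE", table), "|", resolve(pinned, table))
```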

Within the model loading code, we parse the full name (e.g., MACE-MP-0-M) to actually load the correct model using the API provided by the model developer. E.g., Loading either MACE-MP-0 will correctly alias to MACE-MP-0-M, which in the relevant section, the code loads with the medium keyword.

Yup.

rul048 · 2025-12-22T02:59:56Z

Compared to the previous version, the latest commit splits model resolution into two explicit stages. It introduces two separate mappings: ID_TO_ALIAS (user-facing aliases -> canonical IDs) and ID_TO_NAME (canonical IDs -> backend-specific model names). This makes the canonical identifier the shared “source of truth” inside MatCalc, while aliases become purely a user-compatibility layer, and backend names become purely an implementation detail.

Along with this change, the PR description now formalizes a simpler canonical identifier format at the MatCalc level (Architecture-Dataset-Functional-Version-Size) rather than treating the canonical name as a variable-length string.
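
The two stages can be sketched as follows. The dictionaries `aliases` and `backend_names` are hypothetical stand-ins for the two mappings, and their entries are illustrative guesses rather than the contents of the actual tables.

```python
# Stage 1 table: user-facing alias (lower-cased) -> canonical ID. Illustrative entries.
aliases = {
    "mace-mp-0": "MACE-MP-PBE-0-M",
    "mace-mp-0-m": "MACE-MP-PBE-0-M",
}

# Stage 2 table: canonical ID -> name the backend library expects. Illustrative entry.
backend_names = {
    "MACE-MP-PBE-0-M": "medium",
}


def to_canonical(user_input: str) -> str:
    """Stage 1: map an alias to its canonical ID; pass other names through."""
    return aliases.get(user_input.lower(), user_input)


def to_backend_name(user_input: str) -> str:
    """Stage 2: look up the backend-specific name for the canonical ID."""
    return backend_names[to_canonical(user_input)]


print(to_canonical("MACE-MP-0"), "->", to_backend_name("MACE-MP-0"))
```

Only the canonical ID crosses the boundary between the two stages, so either table can change without touching the other.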

shyuep · 2025-12-23T22:31:53Z

src/matcalc/utils.py

+
+# Keys must be lowercase and represent canonical identifiers
+# Values are the actual model names passed to the backend libraries.
+ID_TO_NAME = {


I wouldn't bother ensuring IDs are lower case here. It is better to use the proper name capitalization. When checking, we can make it a case-insensitive check.

rul048 · 2025-12-24

Agreed. I've kept the canonical capitalization for the IDs (e.g., TensorNet-MatPES-r2SCAN-v2025.1-S), and the resolution is now case-insensitive when matching user input.
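
One way to keep properly capitalized keys while matching case-insensitively is to build a lower-cased index once. This is a sketch, not the code in the PR: `_LOWER_INDEX` and `canonical_id` are hypothetical names, and the dictionary values are placeholders rather than real backend model names.

```python
# Keys keep their canonical capitalization; values are placeholders.
ID_TO_NAME = {
    "TensorNet-MatPES-r2SCAN-v2025.1-S": "backend-name-placeholder-1",
    "MACE-MP-PBE-0-M": "backend-name-placeholder-2",
}

# Lower-cased key -> canonical key, built once from the table above.
_LOWER_INDEX = {key.lower(): key for key in ID_TO_NAME}


def canonical_id(user_input: str) -> str:
    """Return the properly capitalized ID for any casing of the input."""
    try:
        return _LOWER_INDEX[user_input.lower()]
    except KeyError:
        raise ValueError(f"unknown model identifier: {user_input!r}") from None


print(canonical_id("tensornet-matpes-r2scan-v2025.1-s"))
```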

shyuep · 2025-12-23T22:32:20Z

src/matcalc/utils.py

 }

+# Common aliases and abbreviations will load the most advanced or widely used model.
+ID_TO_ALIAS = {


This is non-intuitive. It should be ALIAS_TO_ID. And it is a many-to-one mapping.
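
A short sketch of what the suggested direction looks like: each alias is a key and several aliases may share one canonical ID. The entries reuse the MACE aliases quoted in the PR description, but their target ID is an assumption made for illustration.

```python
from collections import defaultdict

# Many-to-one: several aliases (keys) point at the same canonical ID (value).
ALIAS_TO_ID = {
    "mace-mp-0": "MACE-MP-PBE-0-M",
    "mace-mp-pbe-0": "MACE-MP-PBE-0-M",
    "mace-mp-0-m": "MACE-MP-PBE-0-M",
}

# Group aliases by target to make the many-to-one shape visible.
aliases_by_id: defaultdict[str, list[str]] = defaultdict(list)
for alias, model_id in ALIAS_TO_ID.items():
    aliases_by_id[model_id].append(alias)

for model_id, alias_list in aliases_by_id.items():
    print(model_id, "<-", sorted(alias_list))
```

Because dictionary keys are unique, an alias in this orientation cannot point at two different IDs.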

Unify model naming convention

f9e835c

rul048 changed the title ~~Unify model naming convention~~ [WIP] Unify model naming convention Dec 10, 2025

rul048 and others added 4 commits December 9, 2025 20:45

Fix the pytest

9ce8db0

Fix the pytest

84db012

pre-commit auto-fixes

dc095a9

Fix the pytest

35deade

Change names from list comprehension to set comprehension. Signed-off-by: Runze Liu <[email protected]>

Andrew-S-Rosen mentioned this pull request Dec 15, 2025

Make it clearer how to pick specific MLIPs and do so in a future-proof manner Quantum-Accelerators/quacc#2780

Open

rul048 and others added 2 commits December 19, 2025 16:46

Resolve ID and alias

7742415

pre-commit auto-fixes

e1be217

rul048 changed the title ~~[WIP] Unify model naming convention~~ Unify model naming convention Dec 22, 2025

shyuep reviewed Dec 23, 2025

rul048 and others added 7 commits December 23, 2025 16:34

Fix

3d21d37

Fix

620e984

Fix

cc684fa

pre-commit auto-fixes

0fdf829

Fix

260035a

Fix

192a2c2

pre-commit auto-fixes

95a9578

rul048 requested a review from shyuep December 25, 2025 05:34

shyuep reviewed Dec 25, 2025

rul048 and others added 7 commits January 5, 2026 15:38

Fix

3df256f

pre-commit auto-fixes

f1379fb

Fix

655f91d

Fix

7dc737d

Merge branch 'materialyzeai:main' into enhance/naming

fa5b3d4

Refactor NEB import handling in _neb.py

4b365a6

Signed-off-by: Runze Liu <[email protected]>

Fix indentation for NEB assignment

f5c4a4b

Signed-off-by: Runze Liu <[email protected]>

pre-commit-ci bot and others added 5 commits January 6, 2026 01:20

pre-commit auto-fixes

b10bb63

Simplify NEB import in _neb.py

0aa1633

Signed-off-by: Runze Liu <[email protected]>

pre-commit auto-fixes

d00089e

Merge branch 'materialyzeai:main' into enhance/naming

cdafd5b

Merge branch 'main' into enhance/naming

4f5f330

Signed-off-by: Runze Liu <[email protected]>
