Pass for Nvidia ModelOpt graph surgery framework #2377

Open
hthadicherla wants to merge 3 commits into microsoft:main from hthadicherla:hthadicherla/graph-surgery-pass

@hthadicherla hthadicherla commented Mar 31, 2026

Describe your changes

Add the NVModelOptGraphSurgery pass to integrate NVIDIA ModelOpt graph surgeries into Olive. The pass supports all surgeries that currently exist in ModelOpt, such as GQA fusion and DQ-Transpose, as well as any surgeries added to ModelOpt in the future.

Changes:

  • New pass: olive/passes/onnx/nvmo_graph_surgery.py
  • Pass registration in olive_config.json
  • Unit tests: test/passes/onnx/test_nvmo_graph_surgery.py
  • Documentation in pass.rst and onnx-transformations.md

Usage

Example (surgery_type is the surgery key in ModelOpt; surgery_params holds the parameters specific to that surgery, shown here for replace-gqa):

{
    "type": "NVModelOptGraphSurgery",
    "config": {
        "surgery_type": "replace-gqa",
        "surgery_params": {
            "hf_model_id": "meta-llama/Llama-2-7b-hf",
            "io_dtype": "float16"
        }
    }
}
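
The same pass entry can also be built programmatically, which guarantees the result is strict JSON (no comments, no missing commas). The following is a minimal sketch using only the Python standard library; the helper name make_surgery_pass is hypothetical and not part of Olive or ModelOpt, and the key layout simply mirrors the example above.

```python
import json


def make_surgery_pass(surgery_type: str, **surgery_params) -> dict:
    """Build a pass entry shaped like the usage example (hypothetical helper)."""
    if not surgery_type:
        raise ValueError("surgery_type must be a non-empty ModelOpt surgery key")
    return {
        "type": "NVModelOptGraphSurgery",
        "config": {
            # Surgery key as registered in ModelOpt, e.g. "replace-gqa".
            "surgery_type": surgery_type,
            # Parameters specific to the chosen surgery, passed through unchanged.
            "surgery_params": dict(surgery_params),
        },
    }


gqa_pass = make_surgery_pass(
    "replace-gqa",
    hf_model_id="meta-llama/Llama-2-7b-hf",
    io_dtype="float16",
)

# Serialize to strict JSON, ready to paste into a workflow config file.
pass_json = json.dumps(gqa_pass, indent=4)
print(pass_json)
```

Because surgery_params is passed through as-is, the sketch does not check that the parameters are valid for a given surgery; that validation is assumed to happen inside the pass or in ModelOpt.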

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

Release note: Added NVModelOptGraphSurgery pass for running NVIDIA ModelOpt graph surgeries on ONNX models.

Signed-off-by: Hrishith Thadicherla <hthadicherla@nvidia.com>