@justinrosner
Contributor

Motivation

This PR adds support for the new scaled wmma instructions available on gfx1250.

This implements: https://github.com/ROCm/rocMLIR-internal/issues/2133

Note: This PR is based on the changes in #2094, so #2094 has to be merged before this one can go in.

Technical Details

Upstream changes needed for this:

rocMLIR changes:

  • Add extra logic in WmmaInsnGroup/AccelEmitter for scaled types
    • For now, we default to the non-scaled wmma instructions when all types are fp8/bf8
  • Updates to AmdArchDb to allow for fp4 wmma types
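
The selection rule described above can be sketched roughly as follows. This is only an illustration and not the code in this PR: `ElemType`, `SCALED_CAPABLE`, and `pick_wmma_variant` are hypothetical names, and the set of scaled-capable types is assumed to be just the ones mentioned here (fp4, fp8, bf8).

```python
from enum import Enum


class ElemType(Enum):
    # Hypothetical element-type tags, for illustration only.
    F16 = "f16"
    FP8 = "fp8"
    BF8 = "bf8"
    FP4 = "fp4"


# Assumption: only these types are candidates for the scaled wmma path.
SCALED_CAPABLE = {ElemType.FP4, ElemType.FP8, ElemType.BF8}
EIGHT_BIT = {ElemType.FP8, ElemType.BF8}


def pick_wmma_variant(a_type: ElemType, b_type: ElemType) -> str:
    """Return "scaled" or "non-scaled" for a pair of operand types."""
    # Any operand outside the scaled-capable set uses the regular path.
    if a_type not in SCALED_CAPABLE or b_type not in SCALED_CAPABLE:
        return "non-scaled"
    # When all types are fp8/bf8, default to the non-scaled instructions.
    if a_type in EIGHT_BIT and b_type in EIGHT_BIT:
        return "non-scaled"
    # Otherwise at least one operand is fp4, so take the scaled path.
    return "scaled"
```

Under these assumptions, an fp8 x bf8 problem stays on the non-scaled instructions, while any combination involving fp4 takes the scaled path.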

Test Plan

  • Nightly CI
  • gfx1250 emulation tests

Test Result

  • Nightly CI
  • gfx1250 emulation tests

Submission Checklist

@justinrosner justinrosner force-pushed the justinr-wmma-instructions branch 4 times, most recently from 9b6ea04 to e4a9494 Compare December 12, 2025 01:22
Base automatically changed from justinr-wmma-instructions to develop December 12, 2025 04:06