Collaborator

@pfultz2 pfultz2 commented Dec 5, 2025

Motivation

Technical Details

This adds the memory_coloring pass to remove memory allocations. It also uses a bundle of 10 to get better results, given the overhead of launching multiple kernels.
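To illustrate why a larger bundle can give a more representative timing, here is a minimal sketch under a simple assumed cost model: each timed measurement pays a fixed overhead once, plus the kernel time for every run in the bundle. The function name `per_run_time` and the microsecond figures are hypothetical and are not taken from this PR or from the MIGraphX API.

```python
def per_run_time(fixed_overhead_us: float, kernel_us: float, bundle: int) -> float:
    """Estimated time attributed to one run when `bundle` runs are timed together.

    Assumed model (not from the PR): one fixed overhead per timed measurement,
    plus `kernel_us` for each run inside the bundle.
    """
    if bundle < 1:
        raise ValueError("bundle must be at least 1")
    total_us = fixed_overhead_us + bundle * kernel_us
    return total_us / bundle


if __name__ == "__main__":
    # Hypothetical numbers: 50 us of fixed overhead, 10 us of real kernel work.
    for bundle in (1, 10):
        print(bundle, per_run_time(50.0, 10.0, bundle))
```

Under this model the overhead share of each run shrinks as `fixed_overhead_us / bundle`, so the per-run estimate moves closer to the true kernel time as the bundle grows.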

Changelog Category

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@pfultz2 pfultz2 requested a review from causten as a code owner December 5, 2025 22:38

Copilot AI left a comment


Pull request overview

This PR optimizes the benchmarking process for GPU kernel tuning (particularly for splitk operations) by improving the compilation and timing of benchmark kernels. The changes add memory optimization passes and adjust timing parameters to get more accurate performance measurements.

Key Changes:

  • Added memory_coloring pass to eliminate redundant memory allocations during benchmarking
  • Increased benchmark bundle size from 1 to 10 to better amortize kernel launch overhead
  • Added eliminate_identity pass and add_return call to properly structure benchmark modules
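The general idea behind a memory coloring pass can be sketched as follows: buffers whose live ranges never overlap are allowed to share the same bytes of one arena, so separate allocations are not needed. This is a generic greedy first-fit illustration, not MIGraphX's actual memory_coloring implementation. The `color_buffers` function and the `(first_use, last_use, size)` buffer description are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical buffer description: name -> (first_use, last_use, size_in_bytes).
Buffer = Tuple[int, int, int]


def color_buffers(buffers: Dict[str, Buffer]) -> Tuple[Dict[str, int], int]:
    """Assign each buffer an offset in a single shared arena.

    Buffers that are live at the same time never share bytes; buffers with
    disjoint live ranges may reuse the same offset. Returns (offsets, arena_size).
    """
    placed: List[Tuple[int, int, int, int]] = []  # (first, last, offset, size)
    offsets: Dict[str, int] = {}
    # Place larger buffers first (a common heuristic, assumed here).
    for name, (first, last, size) in sorted(buffers.items(), key=lambda kv: -kv[1][2]):
        # Byte ranges already taken by buffers whose live range overlaps this one.
        taken = sorted(
            (off, off + sz)
            for f, l, off, sz in placed
            if not (last < f or l < first)
        )
        offset = 0
        for lo, hi in taken:
            if offset + size <= lo:
                break  # the gap before this taken range is large enough
            offset = max(offset, hi)
        offsets[name] = offset
        placed.append((first, last, offset, size))
    arena_size = max((off + sz for _, _, off, sz in placed), default=0)
    return offsets, arena_size


if __name__ == "__main__":
    example = {"a": (0, 1, 64), "b": (2, 3, 64), "c": (1, 2, 32)}
    print(color_buffers(example))
```

In the example, `a` and `b` are never live at the same time, so they can share offset 0, while `c` overlaps both and is placed after them. The arena is 96 bytes instead of the 160 bytes that three separate allocations would need.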


@umangyadav
Member

umangyadav commented Dec 6, 2025

Would it be possible for you to try this PR with ROCm/rocMLIR#2156 in CI?

@codecov

codecov bot commented Dec 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4486   +/-   ##
========================================
  Coverage    92.21%   92.21%           
========================================
  Files          561      561           
  Lines        27228    27228           
========================================
  Hits         25108    25108           
  Misses        2120     2120           

see 1 file with indirect coverage changes

3 participants