Conversation

@csyonghe
Collaborator

No description provided.

@csyonghe csyonghe requested a review from a team as a code owner January 21, 2026 22:27
@csyonghe csyonghe added the pr: non-breaking PRs without breaking changes label Jan 21, 2026
@csyonghe
Collaborator Author

/format

@slangbot
Contributor

🌈 Formatted, please merge the changes from this PR

@csyonghe csyonghe enabled auto-merge January 22, 2026 03:28
Contributor

@kaizhangNV kaizhangNV left a comment

LGTM.

Just one suggestion: hopefully we can turn the flash attention tests into verifiable executable tests, instead of compile-only tests.

should only perform uniform operations for portability. If your code specifies a combination
that is not supported by the device, the behavior is undefined.

Additionally, while only `MemoryScope.Subgroup` (warp-level cooperation) is supported on CUDA,
Contributor

Looks like we can support this on CUDA as well.

@csyonghe csyonghe added this pull request to the merge queue Jan 22, 2026
Merged via the queue into shader-slang:master with commit 849a5d1 Jan 22, 2026
65 of 67 checks passed