@N7Alpha N7Alpha commented Feb 13, 2025

After looking into it, the __CUDA_ARCH__ check did not make sense to me, so I made the __syncthreads() call unconditional. I also left a comment justifying the __syncthreads() call, since, surprisingly, it does not appear in the pseudocode of the original paper.
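
For readers unfamiliar with why a block-wide barrier matters, here is a minimal sketch of the general pattern. It is not the code changed in this pull request: the kernel name `reverse_in_block`, its parameters, and the launch sizes are all hypothetical, and error checking is omitted for brevity. It only shows a barrier placed unconditionally between a shared-memory write and a read of a slot written by a different thread.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration only; not the code changed in this PR.
// Each block reverses its own chunk of `in` through shared memory.
// Assumes the input length is an exact multiple of blockDim.x.
__global__ void reverse_in_block(const int *in, int *out)
{
    extern __shared__ int tile[];
    const int t = threadIdx.x;
    const int base = blockIdx.x * blockDim.x;

    // Each thread writes its own slot.
    tile[t] = in[base + t];

    // Unconditional barrier: every thread in the block reaches it, so all
    // writes to `tile` are complete before any thread reads a slot that
    // was written by another thread.
    __syncthreads();

    // Each thread reads the mirrored slot, written by a different thread.
    out[base + t] = tile[blockDim.x - 1 - t];
}

int main()
{
    constexpr int threads = 128;
    constexpr int blocks = 4;
    constexpr int n = threads * blocks;

    int h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    // Third launch parameter is the dynamic shared-memory size in bytes.
    reverse_in_block<<<blocks, threads, threads * sizeof(int)>>>(d_in, d_out);
    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Compare against the expected per-block reversal computed on the host.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        const int block = i / threads;
        const int t = i % threads;
        const int expected = block * threads + (threads - 1 - t);
        if (h_out[i] != expected) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_out);
    return mismatches != 0;
}
```

Without the barrier, a thread could read its mirrored slot before the owning thread has written it. Keeping the call outside any conditional also ensures every thread in the block reaches it, which `__syncthreads()` requires.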
