-
Notifications
You must be signed in to change notification settings - Fork 22
Add trunc and interleaved loading in neon and fallback
#17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
trunc, wrapping_sub, and interleaved loading.
trunc, wrapping_sub, and interleaved loading.trunc, wrapping_sub, and interleaved loading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments. I am happy with trunc and interleaved loading. However, I think wrapping_sub may be something we don't need to explicitly add. I assume that a followup PR would then need to implement wrapping_* variants of the other math operations.
Maybe it may be useful for @raphlinus to weigh in. Using Highway's design philosophy, they do not separate sub from wrapping_sub. excerpt:
- Ensuring signed integer overflow has well-defined semantics (wraparound).
Link
My question is - do we want to provide explicitly different API's for wrapping and non wrapping operations?
Great work!
Edit: I also want to add a caveat that I'm not even close to a reliably source here, hence why I defer to Raph. I won't block review though. I'll try get a chance to review tomorrow.
|
I'm just not sure that using Other than that, for me it would be fine to merge the two. |
ajakubowicz-canva
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A quick note that I've reviewed trunc and wrapping_sub and that work LGTM.
I just have interleaved to review but ran out of time. If there's urgency, you could split out those instructions and land incrementally. Otherwise I'll complete review after weekend!
ajakubowicz-canva
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the slow review.
I'm happy with trunc and interleaved but on further thought I don't think we should have sub and wrapping_sub. I think there's a high chance if we add wrapping_sub in this PR, we'll want to remove it in the future (or add more explicit subtraction variants).
trunc, wrapping_sub, and interleaved loadingtrunc and interleaved loading in neon and fallback
ajakubowicz-canva
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you :D
Great PR!
This adds a couple of more features required for the SIMD version of
vello_cpu, in particular:truncinstruction.wrapping_subinstruction.