Add more features #14

LaurenzV · 2025-06-22T18:04:16Z

This PR adds a bunch of additional features which already allow us to implement

Basic alpha compositing of fills/strips.
Rendering of blurred rectangles.
Rendering of gradients.

in vello_cpu, using exclusively fearless_simd. The changes include:

Widening and narrowing instructions for u8 and u16.
512-bit vector types
I removed zip/unzip and instead added zip1/zip2 instructions for zipping the lower/upper halves of the vectors.
The fract and msub methods
The ability to reinterpret integer types as u8
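To make the widening, narrowing and zip operations listed above concrete, here is a scalar reference model on plain arrays. It is only a sketch: the function names are hypothetical and are not the fearless_simd API, narrowing is assumed to truncate rather than saturate, and the rounding used in `mul_alpha` is one common choice that vello_cpu may or may not share.

```rust
// Scalar reference model of the lane-wise operations.
// All names here are hypothetical; this is not the fearless_simd API.

/// Widen each u8 lane to u16 by zero-extension.
fn widen(v: [u8; 8]) -> [u16; 8] {
    v.map(u16::from)
}

/// Narrow each u16 lane to u8 by keeping the low 8 bits.
/// Assumption: truncating, not saturating.
fn narrow(v: [u16; 8]) -> [u8; 8] {
    v.map(|x| x as u8)
}

/// Interleave the lower halves of `a` and `b` (NEON-style zip1).
fn zip_low(a: [u8; 8], b: [u8; 8]) -> [u8; 8] {
    let mut out = [0u8; 8];
    for i in 0..4 {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
    out
}

/// Interleave the upper halves of `a` and `b` (NEON-style zip2).
fn zip_high(a: [u8; 8], b: [u8; 8]) -> [u8; 8] {
    let mut out = [0u8; 8];
    for i in 0..4 {
        out[2 * i] = a[4 + i];
        out[2 * i + 1] = b[4 + i];
    }
    out
}

/// Scale each color lane by an alpha lane, both in 0..=255.
/// The product needs 16 bits, which is why the lanes are widened first.
fn mul_alpha(color: [u8; 8], alpha: [u8; 8]) -> [u8; 8] {
    let (c, a) = (widen(color), widen(alpha));
    let mut scaled = [0u16; 8];
    for i in 0..8 {
        // At most 255 * 255 + 127 = 65152, so this cannot overflow u16.
        scaled[i] = (c[i] * a[i] + 127) / 255;
    }
    narrow(scaled)
}
```

The `mul_alpha` helper shows why widening matters for compositing: an 8-bit by 8-bit product does not fit in 8 bits, so the arithmetic is done in 16-bit lanes and narrowed again at the end.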

Test cases pass with both fallback and NEON.

ajakubowicz-canva

So many great improvements in this PR!!! 🎉 I haven't gotten through everything, but from my first pass I think the comments I've left are the major points. Anything else will probably be a nit or can be done in a follow-up.

fearless_simd/src/generated/wasm.rs

ajakubowicz-canva · 2025-06-23T04:32:35Z

fearless_simd_gen/src/ops.rs

    ("min", OpSig::Binary),
    ("min_precise", OpSig::Binary),
    ("madd", OpSig::Ternary),
+    ("msub", OpSig::Ternary),


I don't think we need to add msub, as this should be something LLVM can produce through optimization.

Excerpt from a DM to Raph:

As a guiding principle, I feel like anything that can be accessed through llvm optimizations doesn't need an intrinsic API method [...] and in some cases the right thing to do might be to file issues against llvm to get more optimizations. ~@raphlinus (via DM)

LaurenzV · 2025-06-23

How so? Isn't this the same as for fused multiply-add, where the semantics are not exactly the same as for add + mul, and thus the compiler cannot make this optimization by default?
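As a small illustration of the point about semantics, the sketch below compares an unfused multiply followed by an add with Rust's standard `f32::mul_add`, which rounds only once. The helper names `unfused` and `fused` are made up for this example. The inputs are chosen so that the product lands exactly halfway between two representable f32 values: the unfused version rounds that half-ULP away before the add, while the fused version keeps it.

```rust
/// Multiply, round to f32, then add and round again.
fn unfused(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

/// Fused multiply-add: a * b + c with a single rounding at the end.
fn fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Returns (unfused, fused) results for inputs where they differ.
fn fma_difference() -> (f32, f32) {
    // a = 1 + 2^-12, so a * a = 1 + 2^-11 + 2^-24 exactly.
    let a = 1.0_f32 + 1.0 / 4096.0;
    // c = -(1 + 2^-11), which cancels everything except the 2^-24 term.
    let c = -(1.0_f32 + 1.0 / 2048.0);
    (unfused(a, a, c), fused(a, a, c))
}
```

With these inputs the unfused result is `0.0` and the fused result is `2^-24`. Because the two forms can give different answers, a compiler that preserves the exact floating-point result cannot replace one with the other on its own.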

AndrewJakubowicz · 2025-06-23

I'll need Raph to chime in to answer your question because I'm also unclear about why madd isn't a similar consideration. I won't block on this.

ajakubowicz-canva · 2025-06-25

We discussed this in office hours. I am going to try my best to summarize the discussion. Essentially, we already have a fused multiply-add, and I believe the intent is that a consumer of fearless_simd can express a fused multiply-sub by negating the fused multiply-add. Then LLVM should be able to optimize the negated fused multiply-add into the right underlying SIMD intrinsic.

This is preferred because it keeps our API surface smaller. It also avoids the complexity of choosing which underlying implementation to use on each architecture, so it's much simpler to leverage LLVM in these cases.

Finally, we discussed removing msub and replacing its usages with negated fused multiply-adds.

@LaurenzV Does that summary capture the office hours discussion? Also, thank you for bearing with me on this one 😄
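For reference, the idea in the summary above can be sketched in scalar Rust with the standard `f32::mul_add`. The thread does not spell out which operand order msub used, so both common conventions are shown; the function names are hypothetical and not part of fearless_simd. Negating an operand is exact in floating point, so each result is still rounded only once.

```rust
/// a * b - c, written as a fused multiply-add with a negated addend.
fn msub_via_madd(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, -c)
}

/// c - a * b, written as a fused multiply-add with a negated multiplicand.
fn nmsub_via_madd(a: f32, b: f32, c: f32) -> f32 {
    (-a).mul_add(b, c)
}
```

Whether a given backend lowers these to a single fused multiply-subtract instruction depends on the target and on LLVM, which is the trade-off the summary describes.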

fearless_simd_gen/src/ops.rs

AndrewJakubowicz · 2025-06-23T08:02:54Z

Test cases pass with both, fallback and NEON.

Just clarifying that this relates to the Vello test suite right?
Thank you!

LaurenzV · 2025-06-23T08:03:28Z

Just clarifying that this relates to the Vello test suite right?

Yep.

ajakubowicz-canva · 2025-06-23T08:57:11Z

Just needs a fmt to pass CI.

ajakubowicz-canva

After CI green LGTM.

LaurenzV added 2 commits June 22, 2025 20:00

Implement more functionality

21ab444

Regenerate code

e91387e

LaurenzV requested a review from ajakubowicz-canva June 22, 2025 18:05

LaurenzV added 5 commits June 22, 2025 20:09

Fix signature of shift

5812ee6

Fix fract method body

d6a67fc

Remove narrow from SIMD trait

d4da412

Fix fract for no_std

27f7289

Reformat

a1cee9b

ajakubowicz-canva mentioned this pull request Jun 23, 2025

[WASM] implement mul_u8x16 and mul_i8x16 #12

Merged

ajakubowicz-canva reviewed Jun 23, 2025

Rename zip1/zip2

4a90a45

LaurenzV force-pushed the vello_cpu_part1 branch from 11341f6 to 4a90a45 Compare June 23, 2025 08:04

ajakubowicz-canva approved these changes Jun 23, 2025

Reformat

c5abe28

LaurenzV merged commit b07fbff into main Jun 23, 2025
6 checks passed

LaurenzV deleted the vello_cpu_part1 branch June 23, 2025 10:06

ajakubowicz-canva mentioned this pull request Jun 25, 2025

Remove msub operation #18

Open
