Commit bf99e21
Don't use load/store intrinsics (#185)
This is an interesting one! The remaining performance gap in
QuState/PhastFT#58 seems to come from subpar
performance when loading constants.
I noticed that in Rust's `stdarch`, which defines all the SIMD
intrinsics, the x86 load/store intrinsics lower to raw memory operations
(`ptr::copy_nonoverlapping`). The AArch64 load/store intrinsics, on the
other hand, *do* map to corresponding LLVM intrinsics!
My hypothesis is that the LLVM intrinsics are not lowered until much
later in the compilation pipeline, resulting in far fewer optimization
opportunities and much worse codegen. If this is the case, we should
just use memory operations directly. This also simplifies the code that
we generate by quite a bit.

1 parent d969e5f commit bf99e21
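To illustrate what "use memory operations directly" can look like, here is a minimal sketch, not the code this commit generates. The `F32x4` wrapper and the `load_f32x4`/`store_f32x4` helpers are hypothetical names chosen for the example. The sketch assumes a 128-bit vector of four `f32` lanes and uses `core::ptr::read_unaligned` and `core::ptr::write_unaligned` in place of an architecture-specific load/store intrinsic.

```rust
use core::ptr;

/// Hypothetical 128-bit vector wrapper, for illustration only.
/// It is not the type that fearless_simd actually generates.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(16))]
struct F32x4([f32; 4]);

/// Load four lanes with a plain memory read instead of a load intrinsic.
#[inline(always)]
fn load_f32x4(src: &[f32; 4]) -> F32x4 {
    // SAFETY: `src` is valid for reading 16 bytes, and `read_unaligned`
    // places no alignment requirement on the source pointer.
    unsafe { ptr::read_unaligned(src.as_ptr().cast::<F32x4>()) }
}

/// Store four lanes with a plain memory write instead of a store intrinsic.
#[inline(always)]
fn store_f32x4(v: F32x4, dst: &mut [f32; 4]) {
    // SAFETY: `dst` is valid for writing 16 bytes, and `write_unaligned`
    // places no alignment requirement on the destination pointer.
    unsafe { ptr::write_unaligned(dst.as_mut_ptr().cast::<F32x4>(), v) }
}

fn main() {
    let input = [1.0_f32, 2.0, 3.0, 4.0];
    let v = load_f32x4(&input);

    let mut out = [0.0_f32; 4];
    store_f32x4(v, &mut out);

    // The round trip through the vector type preserves every lane.
    assert_eq!(out, input);
    println!("{:?}", out);
}
```

Because these are ordinary reads and writes, the optimizer sees them as such from the start, which is the property the hypothesis above relies on. Whether that closes the gap on a given target still has to be measured.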
File tree
9 files changed: +1218 -1388 lines changed
- fearless_simd_gen/src
- arch
- fearless_simd/src/generated
Large diffs are not rendered by default.