Take first task group for further execution #154419
zetanumbers wants to merge 1 commit into rust-lang:main
|
r? @jieyouxu
rustbot has assigned @jieyouxu.
|
|
I wonder what variance and noise level you're seeing on your benchmarking machine. BTW, does this just remove some small overhead a few times, or does it translate to good results on bigger benchmarks as well?
|
@rustbot reroll
Here's the baseline compiler benchmarked against itself:
I have run these benchmarks on various changes before and have never seen all-green results like the ones above.
Force-pushed from f0c55d6 to 576a727.
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
|
I wasn't able to reproduce the improvements. Perhaps different scheduling on Windows is the cause? The change seems unlikely to be a regression anyway. Results with 7 threads:
|
|
I did a benchmark run with 7 threads in a Linux VM and that does look like an improvement:
|
|
@bors try @rust-timer queue
|
Finished benchmarking commit (7162208): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (primary 2.4%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
This benchmark run did not return any relevant results for this metric.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 483.836s -> 484.409s (0.12%)
Continuing from #153768 (comment).
I thought that keeping the first group of tasks for immediate execution, instead of pushing it onto rayon's local task queue in par_slice and immediately popping it again, would avoid excessive work stealing that could block the original thread. So I've implemented this change.
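To illustrate the idea, here is a minimal single-threaded sketch. It is not the code from this PR and does not use rayon: `LocalQueue`, `run_all_queued`, and `run_first_held` are hypothetical names, and the queue is a plain `VecDeque` standing in for a worker's local task queue. It only shows the difference in queue traffic between queuing every group and holding the first one back.

```rust
use std::collections::VecDeque;

/// Toy stand-in for a worker's local task queue (NOT rayon's deque).
/// It counts pushes so the two strategies can be compared.
struct LocalQueue<'a, T> {
    tasks: VecDeque<&'a [T]>,
    pushes: usize,
}

impl<'a, T> LocalQueue<'a, T> {
    fn new() -> Self {
        LocalQueue { tasks: VecDeque::new(), pushes: 0 }
    }

    fn push(&mut self, group: &'a [T]) {
        self.pushes += 1;
        self.tasks.push_back(group);
    }

    fn pop(&mut self) -> Option<&'a [T]> {
        self.tasks.pop_back()
    }
}

/// Hypothetical "before": every group goes through the queue,
/// including the one the current thread runs first.
fn run_all_queued(items: &[u64], group_size: usize) -> (u64, usize) {
    let mut queue = LocalQueue::new();
    for group in items.chunks(group_size) {
        queue.push(group);
    }
    let mut sum = 0;
    while let Some(group) = queue.pop() {
        sum += group.iter().sum::<u64>();
    }
    (sum, queue.pushes)
}

/// Hypothetical "after": the first group is held back and run directly,
/// so it never becomes visible in the queue.
fn run_first_held(items: &[u64], group_size: usize) -> (u64, usize) {
    let mut queue = LocalQueue::new();
    let mut groups = items.chunks(group_size);
    let first = groups.next();
    for group in groups {
        queue.push(group);
    }
    let mut sum = 0;
    if let Some(group) = first {
        sum += group.iter().sum::<u64>();
    }
    while let Some(group) = queue.pop() {
        sum += group.iter().sum::<u64>();
    }
    (sum, queue.pushes)
}

fn main() {
    let items: Vec<u64> = (1..=10).collect();
    let (sum_a, pushes_a) = run_all_queued(&items, 3);
    let (sum_b, pushes_b) = run_first_held(&items, 3);
    println!("all queued: sum={sum_a}, pushes={pushes_a}");
    println!("first held: sum={sum_b}, pushes={pushes_b}");
}
```

Both functions compute the same result; the second one simply performs one push fewer. In a real work-stealing pool, the assumption is that a group that is never queued also cannot be stolen by another thread, which is the effect described above.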
Benchmarks with 8 threads: