[wip] experiment with reversing alignment and size validation order#3133
[wip] experiment with reversing alignment and size validation order#3133jswrenn wants to merge 1 commit intoGcf66958135e905f7cd7d2fac87d9f881e5e5185ffrom
Conversation
| Instructions: 2200 | ||
| Total Cycles: 783 | ||
| Total uOps: 2500 |
| Instructions: 2100 | ||
| Total Cycles: 673 | ||
| Total uOps: 2200 |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request focuses on optimizing the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
| Iterations: 100 | ||
| Instructions: 2000 | ||
| Total Cycles: 682 | ||
| Total Cycles: 683 |
There was a problem hiding this comment.
extremely marginal regression
| Instructions: 1300 | ||
| Total Cycles: 405 | ||
| Total uOps: 1300 |
| Total Cycles: 338 | ||
| Total uOps: 1100 | ||
| Instructions: 1000 | ||
| Total Cycles: 404 |
| Instructions: 2000 | ||
| Total Cycles: 604 | ||
| Total uOps: 2000 |
| Instructions: 2500 | ||
| Total Cycles: 1008 | ||
| Total uOps: 2800 |
| Instructions: 2000 | ||
| Total Cycles: 639 | ||
| Total uOps: 2100 | ||
| Total Cycles: 641 |
There was a problem hiding this comment.
marginal regression?
| Instructions: 2800 | ||
| Total Cycles: 910 | ||
| Total uOps: 3100 |
| Instructions: 1300 | ||
| Total Cycles: 407 | ||
| Total uOps: 1400 |
| Iterations: 100 | ||
| Instructions: 2300 | ||
| Total Cycles: 791 | ||
| Total Cycles: 797 |
There was a problem hiding this comment.
marginal regression?
| Instructions: 2500 | ||
| Total Cycles: 1105 | ||
| Total uOps: 2700 |
There was a problem hiding this comment.
Code Review
This pull request experiments with reordering the alignment and size validation checks in validate_cast_and_convert_metadata. While the logical correctness is maintained, the performance impact is mixed. Some benchmarks show significant improvements (e.g., ref_from_bytes_dynamic_size), while others show significant regressions (e.g., ref_from_suffix_with_elems_dynamic_padding). The original order performs the cheaper alignment check before the more expensive size validation, which is a good heuristic for failing fast. The new order might allow for better code generation in some cases by grouping checks differently, but it seems to hurt performance in others. Given the mixed results, it would be beneficial to understand the trade-offs better. Perhaps there's a way to get the benefits without the regressions.
| // Alignment checks go in their own block to avoid introducing variables | ||
| // into the top-level scope. | ||
| { | ||
| // We check alignment for `addr` (for prefix casts) or `addr + | ||
| // bytes_len` (for suffix casts). For a prefix cast, the correctness | ||
| // of this check is trivial - `addr` is the address the object will | ||
| // live at. | ||
| // | ||
| // For a suffix cast, we know that all valid sizes for the type are | ||
| // a multiple of the alignment (and by safety precondition, we know | ||
| // `DstLayout` may only describe valid Rust types). Thus, a | ||
| // validly-sized instance which lives at a validly-aligned address | ||
| // must also end at a validly-aligned address. Thus, if the end | ||
| // address for a suffix cast (`addr + bytes_len`) is not aligned, | ||
| // then no valid start address will be aligned either. | ||
| let offset = match cast_type { | ||
| CastType::Prefix => 0, | ||
| CastType::Suffix => bytes_len, | ||
| }; | ||
|
|
||
| // Addition is guaranteed not to overflow because `offset <= | ||
| // bytes_len`, and `addr + bytes_len <= usize::MAX` is a | ||
| // precondition of this method. Modulus is guaranteed not to divide | ||
| // by 0 because `align` is non-zero. | ||
| #[allow(clippy::arithmetic_side_effects)] | ||
| if (addr + offset) % self.align.get() != 0 { | ||
| return Err(MetadataCastError::Alignment); | ||
| } | ||
| } |
There was a problem hiding this comment.
Moving the alignment check after the size validation has mixed performance results across the benchmarks. While some cases like ref_from_bytes_dynamic_size improve, others like ref_from_suffix_with_elems_dynamic_padding regress significantly.
The original order follows the fail-fast principle by performing the cheaper alignment check first. This reordering seems to be beneficial only in certain scenarios. It might be worth investigating what distinguishes the improved cases from the regressed ones to see if a more nuanced approach is possible, perhaps conditionally ordering the checks based on properties of the DstLayout.
ec58f2e to
799dad9
Compare
67f28a5 to
a08a5c9
Compare
799dad9 to
8bd6bea
Compare
a08a5c9 to
c30e455
Compare
8bd6bea to
0de1acc
Compare
gherrit-pr-id: G74bf0db1b7c30669171b5985ce24849a264746e7
c30e455 to
ac240ed
Compare
0de1acc to
f03564b
Compare
FromZeros#3114Latest Update: v6 — Compare vs v5
📚 Full Patch History
Links show the diff between the row version and the column version.
⬇️ Download this PR
Branch
git fetch origin refs/heads/G74bf0db1b7c30669171b5985ce24849a264746e7 && git checkout -b pr-G74bf0db1b7c30669171b5985ce24849a264746e7 FETCH_HEADCheckout
git fetch origin refs/heads/G74bf0db1b7c30669171b5985ce24849a264746e7 && git checkout FETCH_HEADCherry Pick
git fetch origin refs/heads/G74bf0db1b7c30669171b5985ce24849a264746e7 && git cherry-pick FETCH_HEADPull
Stacked PRs enabled by GHerrit.