Cap iteration count in `ScalarReplacementOfAggregates` by blackms · Pull Request #155290 · rust-lang/rust

blackms · 2026-04-14T14:01:55Z

Problem

ScalarReplacementOfAggregates (compiler/rustc_mir_transform/src/sroa.rs) runs a fixed-point loop { … } that keeps flattening aggregate locals until no further "dead" locals are produced. Inside, compute_flattening calls iter_fields, which normalizes each field type via try_normalize_erasing_regions before feeding it back as a new local.

On the MCVE from the issue:

trait Apply { type Output<T>; }
struct Identity;
impl Apply for Identity { type Output<T> = T; }
struct Thing<A: Apply>(A::Output<Self>);

fn foo<A: Apply>() { let _x: Thing<A>; }
fn main() { foo::<Identity>(); }

normalizing Thing<Identity>'s field type yields Thing<Identity> again — the enclosing struct itself. An E0391 layout-cycle error is emitted up front, but MIR optimization still runs, and every SROA iteration allocates exactly one fresh local of type Thing<Identity> which is itself a fresh flattening candidate. all_dead_locals is therefore non-empty forever, and the loop never terminates.

Reproduced locally with rustc 1.94.1 stable: after the E0391 is printed the compiler hangs indefinitely (killed after 12 s during repro).

Fix

Replace the unbounded loop with for _ in 0..MAX_ITERATIONS where const MAX_ITERATIONS: usize = 16. Per @scottmcm on the issue thread, 2 iterations typically capture all of SROA's value and non-contrived code never exceeds 10, so 16 leaves comfortable headroom while guaranteeing termination on tainted input. The loop body (including the break on empty all_dead_locals) is unchanged, so healthy code reaches its fixed point exactly as before.

I considered adding an early-bail on body.tainted_by_errors / tcx.dcx().has_errors() instead, but opted against it: it's a larger behavioural change (skipping SROA entirely on tainted bodies) and the same non-termination shape could theoretically be reached without the tainting flag being set at pass entry. A structural iteration cap is a simpler, more local invariant and is robust to both.

Test

Added tests/ui/layout/hang-sroa-layout-cycle.rs next to the existing layout-cycle.rs / post-mono-layout-cycle.rs tests. Uses //@ build-fail with //@ compile-flags: -Copt-level=3 (SROA is gated on mir_opt_level >= 2) and asserts the E0391 cycle error via //~^ ERROR. Without the fix compiletest times out on this test; with the fix it completes and matches the expected stderr.

r? mir-opt

rustbot · 2026-04-14T14:01:59Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Kivooeo · 2026-04-14T14:16:16Z

how llm was used for this pr?
am i read this correct?

  In practice, 2
   // iterations capture nearly all the value and more than 10 is never
   // needed for non-contrived code.

it says that 2 is nearly covers all cases and more than 10 is never needed, so you go 16? why?

The SROA pass repeatedly flattens aggregate locals into their field locals, looping until a fixed point is reached. When the MIR body involves a type whose layout is cyclic (e.g. `struct T(T)` reached via an associated type projection), the layout cycle error is emitted but compilation continues into MIR optimization. Inside SROA, `iter_fields` then keeps returning a single field whose normalized type is the enclosing struct itself, so each iteration allocates a fresh replacement local and the fixed-point loop never terminates. Bound the loop to a small constant. In practice two iterations capture essentially all of SROA's value and non-contrived code does not exceed ten, so 10 is sufficient and guarantees termination on tainted input.

blackms · 2026-04-14T14:49:13Z

Thanks for the review @Kivooeo!

how llm was used for this pr?

Full disclosure: I used a Claude-based coding agent team for the initial investigation — reading the SROA pass, forming the root-cause hypothesis, drafting the fix from @scottmcm's comment, and sketching the test. I then validated manually before opening the PR: reproduced the hang locally with rustc 1.94.1 stable on the MCVE (killed after 12 s after the E0391), read the full diff myself, and checked the test placement against sibling files (layout-cycle.rs, post-mono-layout-cycle.rs). Happy to answer any question about the reasoning or rewrite anything if something reads as machine-y.

it says that 2 is nearly covers all cases and more than 10 is never needed, so you go 16? why?

Fair point, I was being overcautious. Dropped the cap to 10 in the force-push just now — that's the number @scottmcm explicitly called out as "never needed" in non-contrived code, so using exactly that value makes the relationship between comment and constant obvious. Comment updated to match.

Also force-pushed a fix for the CI failure: when I renamed the test from hang-sroa-layout-cycle-153205.rs to hang-sroa-layout-cycle.rs I forgot to update the $DIR/… path and line number in the .stderr fixture (the .rs file gained a blank line from the URL rewrite). That's what aarch64-gnu-llvm-21-1 was complaining about; should be green on the new push.

scottmcm · 2026-04-14T22:37:02Z

TBH, I completely made up those numbers. They're an instinct, not something I researched or data-collected.

So it'd probably be good to justify them somehow -- how often does it really loop in practice?

That said, we might as well see what perf thinks about it anyway.

@bors try @rust-timer queue

Cap iteration count in `ScalarReplacementOfAggregates`

rust-bors · 2026-04-15T00:45:59Z

☀️ Try build successful (CI)
Build commit: d96b019 (d96b019342c4bec517214647f0d2880a53721cd1, parent: a5c825cd824ee0ef9463021078a2f464b4cc1a0d)

rust-timer · 2026-04-15T01:26:15Z

Finished benchmarking commit (d96b019): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.5%	[-1.4%, -0.2%]	5
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (secondary -4.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-4.5%	[-4.5%, -4.5%]	1
All ❌✅ (primary)	-	-	0

Cycles

Results (primary -2.6%, secondary 3.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.3%	[3.3%, 3.3%]	1
Improvements ✅ (primary)	-2.6%	[-2.6%, -2.6%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.6%	[-2.6%, -2.6%]	1

Binary size

Results (primary 0.1%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.0%, 0.1%]	15
Regressions ❌ (secondary)	0.1%	[0.0%, 0.1%]	20
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.3%	[-0.3%, -0.3%]	1
All ❌✅ (primary)	0.1%	[0.0%, 0.1%]	15

Bootstrap: 489.296s -> 490.526s (0.25%)
Artifact size: 394.04 MiB -> 394.22 MiB (0.05%)

saethlin · 2026-04-15T02:21:17Z

r? mir-opt

blackms · 2026-04-15T05:46:48Z

Perf came back clean — 0 regressions, 5 secondary improvements around −0.5%, no primary changes. Thanks for running it.

On the cap value: 10 is structural rather than empirical. SROA only keeps looping when all_dead_locals is non-empty, which means some local got flattened into fresh locals that are themselves eligible for flattening. That recursion depth is bounded by the aggregate-nesting depth of the types in the body (tuple-of-struct, struct-of-tuple, etc.), and real code rarely goes past 2–3 levels. Happy to instrument the pass with a counter and run it across a crater sample if you want empirical numbers, but given the try came back clean I'm not sure it buys much beyond what you already have.

jieyouxu · 2026-04-15T06:53:13Z

See #155296 (comment)

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 14, 2026

rustbot assigned saethlin Apr 14, 2026

This comment has been minimized.

Sign in to view

blackms force-pushed the fix-sroa-hang-153205 branch from aa5cf73 to e6d8c3c Compare April 14, 2026 14:48

This comment has been minimized.

Sign in to view

rust-bors bot pushed a commit that referenced this pull request Apr 14, 2026

Auto merge of #155290 - blackms:fix-sroa-hang-153205, r=<try>

d96b019

Cap iteration count in `ScalarReplacementOfAggregates`

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 14, 2026

This comment has been minimized.

Sign in to view

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 15, 2026

rustbot assigned wesleywiser and unassigned saethlin Apr 15, 2026

jieyouxu unassigned wesleywiser Apr 15, 2026

jieyouxu closed this Apr 15, 2026

rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 15, 2026

rust-lang locked and limited conversation to collaborators Apr 15, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Cap iteration count in `ScalarReplacementOfAggregates`#155290

Cap iteration count in `ScalarReplacementOfAggregates`#155290
blackms wants to merge 1 commit intorust-lang:mainfrom
blackms:fix-sroa-hang-153205

blackms commented Apr 14, 2026

Uh oh!

rustbot commented Apr 14, 2026

Uh oh!

Kivooeo commented Apr 14, 2026

Uh oh!

This comment has been minimized.

blackms commented Apr 14, 2026

Uh oh!

scottmcm commented Apr 14, 2026

Uh oh!

This comment has been minimized.

This comment has been minimized.

rust-bors bot commented Apr 15, 2026

Uh oh!

This comment has been minimized.

rust-timer commented Apr 15, 2026

Uh oh!

saethlin commented Apr 15, 2026

Uh oh!

blackms commented Apr 15, 2026

Uh oh!

jieyouxu commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

Uh oh!

Conversation

blackms commented Apr 14, 2026

Problem

Fix

Test

Uh oh!

rustbot commented Apr 14, 2026

Uh oh!

Kivooeo commented Apr 14, 2026

Uh oh!

This comment has been minimized.

blackms commented Apr 14, 2026

Uh oh!

scottmcm commented Apr 14, 2026

Uh oh!

This comment has been minimized.

This comment has been minimized.

rust-bors bot commented Apr 15, 2026

Uh oh!

This comment has been minimized.

rust-timer commented Apr 15, 2026

Overall result: ✅ improvements - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Binary size

Uh oh!

saethlin commented Apr 15, 2026

Uh oh!

blackms commented Apr 15, 2026

Uh oh!

jieyouxu commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants