turbo-tasks: Reduce allocations on cache hits #92756
Stats from current PR: ✅ No significant changes detected; merging this PR will not alter performance (dev server, production build, and bundle size metrics are all within change thresholds). Tests passed.
Rename `MagicAny` to `DynTaskInput` or `RawTaskInputs` ... in another PR
Two changes to reduce heap allocations when calling turbo-tasks functions:

1. Move `persistent_task_type` propagation from `connect_child`/`IncreaseActiveCount` into `initialize_new_task`. This removes the need to thread `task_type` through operations on every call (hit or miss), and lets `connect_child` use `TaskDataCategory::Meta` instead of `All`.
2. Add a fast-path cache lookup (`try_native_call`/`native_call_if_consistent`) that checks the `task_cache` with borrowed args before boxing. The macro-generated code now tries this read-only lookup first for non-self function calls. On a cache hit (~85% of calls), no `Box<dyn MagicAny>` is allocated. On a miss, it falls back to the existing boxed path.

Co-Authored-By: Claude <noreply@anthropic.com>
- Replace redundant closures with `RawVc::TaskOutput` (clippy)
- Return `Err(0)` from `VcStorage::try_native_call` instead of `unreachable!()`, since the testing backend has no task cache
- Fall back to `dynamic_call` (not `native_call`) on cache miss, since `dynamic_call` is the universal entry point all backends implement

Co-Authored-By: Claude <noreply@anthropic.com>
- Extract native_fn before Arc::new(task_type) to avoid an extra .clone()
in the Vacant arms of get_or_create_{persistent,transient}_task
- Add track_cache_miss_by_fn (mirrors track_cache_hit_by_fn)
- Remove explanatory comments about persistent_task_type eagerness
- Remove unused persistence() method instead of suppressing warning
Co-Authored-By: Claude <noreply@anthropic.com>
The `static_block` codegen for method calls (self/this pointer) now uses the same optimized path as free functions: args stay on the stack and we try a read-only cache lookup before boxing. For methods, we additionally check `this.is_resolved()` before taking the fast path, since unresolved self values need a resolution wrapper task.

Co-Authored-By: Claude <noreply@anthropic.com>
Instead of expanding each macro callsite into two code paths (one for cache hit, one for miss), introduce a `StackArg` trait that keeps args on the caller's stack. The backend does a read-only cache lookup with a borrowed `&dyn MagicAny` reference; only on cache miss does `take_box()` move the value to the heap — zero clones, single code path per callsite.

Key changes:
- Add `StackArg` trait + `StackArgSlot<T>` (stack slot) + `OwnedArg` (boxed adapter)
- `dynamic_call`/`native_call` now take `&mut dyn StackArg` instead of `Box<dyn MagicAny>`
- `Backend::get_or_create_*_task` takes components (`native_fn`, `this`, `&mut dyn StackArg`) and does `raw_get` with the borrowed arg before materializing the `Box` on miss
- Remove `try_native_call`, `native_call_if_consistent`, `try_get_or_create_*`
- Macro `static_block` reduces to a single `dynamic_call` with `StackArgSlot`

Co-Authored-By: Claude <noreply@anthropic.com>
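The deferred-boxing idea above can be sketched in a few lines. This is a minimal illustration, not the actual turbo-tasks code: `MagicAny` is reduced here to `Any + Debug`, and the real trait carries additional methods (and was later renamed `StackMagicAny`).

```rust
use std::any::Any;
use std::fmt::Debug;

// Stand-in for turbo-tasks' `MagicAny`; reduced here to `Any + Debug`.
trait MagicAny: Any + Debug {}
impl<T: Any + Debug> MagicAny for T {}

// Args live in a stack slot; the cache lookup borrows them, and only a
// confirmed cache miss moves them to the heap via `take_box`.
trait StackArg {
    fn arg_ref(&self) -> &dyn MagicAny;          // borrow for hash/eq lookup
    fn take_box(&mut self) -> Box<dyn MagicAny>; // heap-allocate on miss only
}

struct StackArgSlot<T: MagicAny>(Option<T>);

impl<T: MagicAny> StackArg for StackArgSlot<T> {
    fn arg_ref(&self) -> &dyn MagicAny {
        self.0.as_ref().expect("argument already taken")
    }
    fn take_box(&mut self) -> Box<dyn MagicAny> {
        Box::new(self.0.take().expect("argument already taken"))
    }
}

fn main() {
    let mut slot = StackArgSlot(Some((42u32, "arg")));
    // Cache-hit path: borrow only, no heap allocation for the argument tuple.
    let _borrowed: &dyn MagicAny = slot.arg_ref();
    // Cache-miss path: the tuple is boxed exactly once.
    let boxed = slot.take_box();
    assert!(format!("{boxed:?}").contains("42"));
}
```

A callsite that hits the cache only ever touches `arg_ref`, so the argument tuple never leaves the stack.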
- Comment 4+5: Restore `persistence()` helper, use it in both `static_block` and `dynamic_block` to reduce diff from canary
- Comment 6: Make `trait_call` take `&mut dyn StackArg` too, so `dynamic_block` (trait dispatch) also uses `StackArgSlot` instead of `Box::new(inputs)` — deferred boxing on trait calls
- Comment 2: Merge `get_or_create_persistent_task` and `get_or_create_transient_task` into shared `get_or_create_task_inner` parameterized by `transient: bool`
- Comment 1: Construct `CachedTaskType` in the transient panic path so `panic_persistent_calling_transient` gets a real task description

Co-Authored-By: Claude <noreply@anthropic.com>
Replace the two-phase lookup (read-lock `raw_get` then write-lock `raw_entry` with re-hash) with a single `raw_entry_with_hash` call that takes the pre-computed hash and a heterogeneous eq closure. The map is sharded so write-lock contention is minimal, and this eliminates redundant hashing on the miss path.

Co-Authored-By: Claude <noreply@anthropic.com>
…ring backing storage read

The single `raw_entry_with_hash` approach held the dashmap write lock while calling `task_by_type` (backing storage). Restore the three-step flow: `raw_get` (read lock) -> `task_by_type` (no lock) -> `raw_entry_with_hash` (write lock), but now the write-lock step reuses the pre-computed hash instead of re-hashing.

Co-Authored-By: Claude <noreply@anthropic.com>
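The three-step flow can be illustrated with a plain `RwLock<HashMap>`. This is a simplified sketch: the real code uses a sharded dashmap whose raw entry API reuses the pre-computed hash, which `std`'s map cannot express, and `create` stands in for the backing-storage read.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Double-checked get-or-create:
//   Step 1: read lock  - fast-path lookup (the common cache-hit path)
//   Step 2: no lock    - expensive fallback (backing storage read)
//   Step 3: write lock - re-check, then insert if still vacant
fn get_or_create(
    cache: &RwLock<HashMap<String, u64>>,
    key: &str,
    mut create: impl FnMut() -> u64,
) -> u64 {
    // Step 1: read-only lookup under the read lock.
    if let Some(&id) = cache.read().unwrap().get(key) {
        return id;
    }
    // Step 2: consult the fallback with no lock held.
    let candidate = create();
    // Step 3: take the write lock and re-check; another thread may have
    // inserted the entry while we were in step 2.
    let mut map = cache.write().unwrap();
    *map.entry(key.to_string()).or_insert(candidate)
}

fn main() {
    let cache = RwLock::new(HashMap::new());
    let mut calls = 0;
    let first = get_or_create(&cache, "task", || { calls += 1; 7 });
    let second = get_or_create(&cache, "task", || { calls += 1; 8 });
    // The second call hits the cache: the fallback runs exactly once.
    assert_eq!((first, second, calls), (7, 7, 1));
}
```

The key property restored by this commit is that step 2 runs with no lock held, so a slow backing-storage read never blocks other writers on the shard.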
- Delete arc_or_owned.rs (no longer referenced after ArcOrOwned removal)
- Remove `or_insert_with`, `get_mut`, `into_mut`, `RefMut` and its `Deref`/`DerefMut` impls from dash_map_raw_entry (none used by current callers)
- `VacantEntry::insert` now returns `()` since no caller used `RefMut`
- Mark `panic_persistent_calling_transient` as `-> !` to make the divergence contract explicit

Co-Authored-By: Claude <noreply@anthropic.com>
…MagicAny

Address review comments:
- Rename `StackArg` -> `StackMagicAny`, `StackArgSlot` -> `StackMagicAnySlot`, `OwnedArg` -> `OwnedMagicAny`, `arg_ref` -> `as_ref`
- `FilterOwnedArgsFunctor` now takes `&mut dyn StackMagicAny` and returns `OwnedMagicAny`, so the caller doesn't manually `take_box` + rewrap

Co-Authored-By: Claude <noreply@anthropic.com>
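The "no manual `take_box` + rewrap" point relies on the slot handing out `&mut dyn Any`: a caller that knows the concrete argument type can downcast the erased slot and `take()` the value directly, with no intermediate `Box`. A minimal sketch with illustrative names (`downcast_take` is a hypothetical analogue of the real helper, not turbo-tasks code):

```rust
use std::any::Any;

// The slot exposes itself as `&mut dyn Any` so callers can recover the
// concrete type without boxing first.
trait StackSlot {
    fn as_any_mut(&mut self) -> &mut dyn Any;
}

struct Slot<T: 'static>(Option<T>);

impl<T: 'static> StackSlot for Slot<T> {
    fn as_any_mut(&mut self) -> &mut dyn Any {
        self
    }
}

// Hypothetical analogue of the rewrap-free path: downcast the erased slot
// to its concrete type and move the argument tuple out, no heap involved.
fn downcast_take<T: 'static>(slot: &mut dyn StackSlot) -> T {
    slot.as_any_mut()
        .downcast_mut::<Slot<T>>()
        .expect("argument type mismatch")
        .0
        .take()
        .expect("argument already taken")
}

fn main() {
    let mut slot = Slot(Some((1u8, "skipped", 3.0f32)));
    // Recover the tuple directly from the stack slot (e.g. to drop an
    // unused argument before re-wrapping the rest).
    let (a, _skipped, c): (u8, &str, f32) = downcast_take(&mut slot);
    assert_eq!((a, c), (1, 3.0));
}
```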
…eate_task

The `Backend` trait had two methods with identical signatures that only differed by transience. The caller just matched on persistence and dispatched. Merge into a single method that accepts `TaskPersistence`, eliminating the redundant trait surface.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
…inline variable

- Thread 14: Restore comment explaining why read lock is used for Step 1
- Thread 16: Restore descriptive comments on backing storage path
- Thread 19: Remove `StackMagicAny` doc comment from `TurboTasksCallApi`
- Thread 20: Inline `parent_task` local variable in `native_call`

Co-Authored-By: Claude <noreply@anthropic.com>
When restoring a task from backing storage, reuse the existing Arc<CachedTaskType> from the stored persistent_task_type rather than creating a new Arc from the caller's boxed copy. This avoids having two copies of the same task_type in memory. Co-Authored-By: Claude <noreply@anthropic.com>
…ents

Verify that `hash_from_components` produces the same hash as the `Hash` impl on a fully constructed `CachedTaskType`, and that `eq_components` correctly matches/rejects on each component (`native_fn`, `this`, `arg`).

Co-Authored-By: Claude <noreply@anthropic.com>
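The invariant being tested can be sketched with a toy key type. This is illustrative only: the real `CachedTaskType` hashes a native function id, an optional `this` value, and a `dyn MagicAny` arg, but the property is the same, hashing borrowed components must match the derived `Hash` impl on the owned struct.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for CachedTaskType.
#[derive(Hash)]
struct TaskKey {
    native_fn: u32,    // stand-in for the function id
    this: Option<u64>, // stand-in for the resolved self value
    arg: String,       // stand-in for the boxed arg
}

// Hash a *candidate* key from borrowed components, without constructing
// the owned struct. Fields must be hashed in the same order as the
// derived impl on TaskKey, or cache lookups would silently miss.
fn hash_from_components(native_fn: u32, this: Option<u64>, arg: &str) -> u64 {
    let mut h = DefaultHasher::new();
    native_fn.hash(&mut h);
    this.hash(&mut h);
    arg.hash(&mut h);
    h.finish()
}

fn hash_owned(key: &TaskKey) -> u64 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    h.finish()
}

fn main() {
    let key = TaskKey { native_fn: 3, this: Some(9), arg: "x".into() };
    // The borrowed-components hash must agree with the owned-struct hash.
    assert_eq!(hash_owned(&key), hash_from_components(3, Some(9), "x"));
}
```

This works because `String` and `str` share one `Hash` implementation, so hashing the borrowed `&str` matches hashing the owned field.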
Compute the shard index once from the hash and reuse it for both the read-only cache check and the subsequent write-lock entry lookup. Saves a few math operations and a pointer dereference on the miss path. Co-Authored-By: Claude <noreply@anthropic.com>
Guarantees same layout as Option<T>, making the type suitable for FFI-like patterns and ensuring no padding overhead. Co-Authored-By: Claude <noreply@anthropic.com>
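That layout guarantee comes from `#[repr(transparent)]`. A sketch under the assumption, consistent with this commit message, that the slot is a transparent newtype over `Option<T>`:

```rust
// A #[repr(transparent)] newtype is guaranteed the exact layout (size,
// alignment, ABI) of its single non-zero-sized field.
#[repr(transparent)]
struct Slot<T>(Option<T>);

fn main() {
    // Same size and alignment as the wrapped Option<T>.
    assert_eq!(std::mem::size_of::<Slot<u64>>(), std::mem::size_of::<Option<u64>>());
    assert_eq!(std::mem::align_of::<Slot<u64>>(), std::mem::align_of::<Option<u64>>());
    // The niche optimization carries through: Option<Box<T>> is
    // pointer-sized, so the transparent wrapper is too; no padding overhead.
    assert_eq!(std::mem::size_of::<Slot<Box<u32>>>(), std::mem::size_of::<usize>());
}
```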
Enable dashmap's raw-api feature to access shard internals directly. get_shard() returns a reference to the shard itself, which is reused across both read-only and write-lock lookups, eliminating redundant shard index computation. Co-Authored-By: Claude <noreply@anthropic.com>
Rework task_by_type and lookup_task_candidates to accept exploded components (native_fn, this, &dyn MagicAny) instead of &CachedTaskType. This allows the backing storage lookup to happen using borrowed references from the stack — the Box<dyn MagicAny> allocation for the arg is now deferred until both the in-memory cache AND backing storage have confirmed a miss. Co-Authored-By: Claude <noreply@anthropic.com>
- Restore "another thread beat us" comment in Occupied race path
- Restore "Initialize storage BEFORE making task_id visible" ordering invariant
- Restore "insert() consumes e, releasing the shard write lock"
- Fix stale connect_child.rs comments about removed task type update
- Restore "stay Meta not All" performance rationale in aggregation_update
- Improve error message in kv_backing_storage to include `this` parameter

Co-Authored-By: Claude <noreply@anthropic.com>
8b69577 to 59875f4
mmastrac left a comment:
LGTM. I suspect that we may actually be able to remove the virtual methods on StackMagicAny in a followup, assuming that Rust isn't smart enough to devirtualize them itself with some additional tricks.
This is the "explicitly capture layout information and vtables" idea?
I missed that Rust already captures layout inside the vtable, so if you 1) have a … You'd have something like this (hand-wavey):
What?
Reduce heap allocations when turbo-tasks functions get cache hits (~85% of calls).
Why?

Every turbo-tasks function call (generated by `#[turbo_tasks::function]`) was boxing its arguments into `Box<dyn MagicAny>` before looking up the task cache. This allocation is wasted on cache hits, which are the overwhelmingly common case.

How?
Deferred boxing via a `StackMagicAny` trait object:

Introduce a `StackMagicAny` trait that abstracts over a stack-resident `Option<T>`:

- `as_ref(&self) -> &dyn MagicAny` — borrow the argument for hash/equality (cache lookup)
- `take_box(&mut self) -> Box<dyn MagicAny>` — move the value to the heap (zero clones)
- `as_any_mut(&mut self) -> &mut dyn Any` — downcast to concrete type without boxing

The data flow:
- The callsite creates `StackMagicAnySlot::new((args...))` on the stack and calls `dynamic_call(..., &mut arg)`
- `dynamic_call` checks resolution via `arg.as_ref()`, routes to `native_call` (resolved) or boxes via `arg.take_box()` for `LocalTaskSpec` (unresolved)
- `get_or_create_task_inner` does a read-only `raw_get` lookup using `hash_from_components` + `eq_components` with the borrowed `&dyn MagicAny`. On cache hit (~85%), returns immediately — zero heap allocation. On cache miss, re-checks under write lock using the same borrowed reference, and only calls `arg.take_box()` in the vacant-entry case (true cache miss).

Boxing is now deferred past all of these:

- `filter_owned` is now `Option<FilterOwnedArgsFunctor>`; when `None` (the common case where all args are used), the original `&mut dyn StackMagicAny` passes straight through to `dynamic_call` without boxing

Optimized `filter_owned` for traits:

When trait methods do need argument filtering (unused `_`-prefixed parameters), the old path did `take_box()` → `downcast_args_owned()` → dereference → repack. This is an unnecessary heap round-trip. The new `downcast_stack_args_owned()` function uses `as_any_mut()` to downcast directly to `&mut StackMagicAnySlot<T>` and calls `take()` on the inner `Option`, skipping the intermediate `Box` entirely.

Additional changes:

- `Backend::get_or_create_*_task` now takes decomposed parameters (`native_fn`, `this`, `&mut dyn StackMagicAny`) instead of a pre-constructed `CachedTaskType`
- Persistent/transient task creation merged into `get_or_create_task_inner(transient: bool)`
- `connect_child` uses the eagerly-set `persistent_task_type` from `initialize_new_task`
- `OwnedMagicAny` adapter wraps already-boxed args (from async resolution tasks) to fit the `StackMagicAny` interface
- `dynamic_call` and `trait_call` take `&mut dyn StackMagicAny` (trait dispatch also benefits)
- `CachedTaskType::hash_encode` now delegates to `hash_encode_components` (deduplicated)
- Removed `try_native_call`, `native_call_if_consistent`, `try_get_or_create_*` — the deferred boxing approach subsumes these

Binary size
Binary size is neutral (linux-x86_64, `--release`, stripped + gzipped: 30.9 MB on both canary and this branch).

Overhead benchmark (turbo-tasks-backend, median, lower is better)
Measured on an isolated Firecracker microVM (linux-x86_64). Variance is nontrivial on this environment, but the direction is consistently positive across all turbo-tasks benchmarks.
Benchmarks: turbo-cached-same-keys/{1,10,100,1000}, turbo-cached-different-keys/{1,10,100,1000}, turbo-uncached/{1,10,100,1000}, turbo-uncached-parallel/{1,10,100,1000}