(fix): fallback type memory leak in checkpoint/rewind cycle #19
Merged
…ype leak

When a new fallback type (non-fixed-slot) was first created inside a `@with_pool` scope, `get_typed_pool!` auto-checkpointed it but did not set `_touched_has_others[depth] = true`. In typed-lazy mode, `_acquire_impl!` bypasses `_record_type_touch!`, so the `has_others` flag stayed `false`, causing lazy/typed-lazy rewind to skip `pool.others` entirely and leak the new type's `n_active` count.

Fix: set `_touched_has_others = true` in the slow path of `get_typed_pool!` (both CPU and CUDA extension).

Add `test/test_fallback_reclamation.jl` with 28 test sections (387 tests) covering fallback type reclamation, parametric struct (`FakeDual`) leak detection, and the exact `_acquire_impl!` bypass bug reproduction.
…eckpoints

`checkpoint!(pool, types...)` always pushed `has_others=false`, causing `_typed_lazy_rewind!` to skip `pool.others` on 2nd+ calls when the fallback type already existed.

Fix: compute `has_any_fallback` at compile time via a `_fixed_slot_bit` check. Add an idempotency guard to `_checkpoint_typed_pool!` to prevent a double push from the `get_typed_pool!` auto-checkpoint plus an explicit checkpoint call. Apply the same fixes to the CUDA extension for parity.
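The two commit messages above describe one flag being wrong in two places. The following is a toy model of that mechanism, not the package's code: `ToyPool`, `toy_scope!`, `toy_run`, and the two keyword switches are hypothetical names, and the rewind simply resets counts to 0 instead of restoring saved values. It only illustrates how a lazy rewind that trusts a per-depth `has_others` flag leaks when nobody sets the flag.

```julia
# Toy model only. None of these names exist in the package.
mutable struct ToyPool
    others_active::Dict{DataType,Int}   # n_active per fallback type
    touched_has_others::Vector{Bool}    # one flag per checkpoint depth
end

ToyPool() = ToyPool(Dict{DataType,Int}(), Bool[false])  # sentinel flag for depth 0

# One checkpoint / acquire / lazy-rewind cycle for a fallback type T.
# mark_on_create     models the get_typed_pool! slow-path fix.
# mark_on_checkpoint models the checkpoint!(pool, types...) fix.
function toy_scope!(pool::ToyPool, ::Type{T};
                    mark_on_create::Bool, mark_on_checkpoint::Bool) where {T}
    # checkpoint: push the has_others flag for the new depth
    push!(pool.touched_has_others, mark_on_checkpoint)

    # first touch of T creates its pool (the "slow path")
    if !haskey(pool.others_active, T)
        pool.others_active[T] = 0
        if mark_on_create
            pool.touched_has_others[end] = true
        end
    end

    # acquire one array; note that nothing here records the touch
    pool.others_active[T] += 1

    # lazy rewind: visit the fallback pools only if the flag says so
    if pop!(pool.touched_has_others)
        for k in collect(keys(pool.others_active))
            pool.others_active[k] = 0   # simplification: the scope started at 0
        end
    end
    return pool.others_active[T]
end

# Run n cycles on a fresh pool and report n_active after each one.
function toy_run(n::Int; kwargs...)
    pool = ToyPool()
    return [toy_scope!(pool, Rational{Int}; kwargs...) for _ in 1:n]
end
```

In this model, leaving both switches off leaks on every cycle, enabling only `mark_on_create` fixes the first cycle but leaks from the second one on (the type already exists, so the slow path never runs again), and enabling `mark_on_checkpoint` keeps the count at 0 throughout.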
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #19 +/- ##
=======================================
Coverage 96.46% 96.47%
=======================================
Files 9 9
Lines 1273 1276 +3
=======================================
+ Hits 1228 1231 +3
Misses 45 45
Pull request overview
Fixes a leak in fallback-type (pool.others) accounting during repeated checkpoint/rewind cycles—especially in typed / typed-lazy macro paths where _acquire_impl! bypasses _record_type_touch!—and adds regression tests (including a ForwardDiff.Dual-like scenario) plus CUDA extension parity.
Changes:
- Ensure fallback pools mark `has_others` during first-touch creation (`get_typed_pool!`) so lazy/typed-lazy rewind doesn't skip `pool.others`.
- Correct `checkpoint!` (single- and multi-type) to push `has_others=true` when any tracked type is a fallback; add an idempotency guard to avoid double checkpoint pushes.
- Add an extensive fallback reclamation test suite and include it in `runtests.jl`; mirror fixes in the CUDA extension.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/types.jl` | Marks `_touched_has_others` when creating a new fallback typed pool inside a checkpointed scope. |
| `src/state.jl` | Fixes `has_others` tracking for typed checkpoints; adds checkpoint idempotency guard. |
| `ext/AdaptiveArrayPoolsCUDAExt/dispatch.jl` | Mirrors the `get_typed_pool!` fallback-touch fix for CUDA pools. |
| `ext/AdaptiveArrayPoolsCUDAExt/state.jl` | Mirrors `has_others` logic for CUDA typed checkpoints (single + varargs). |
| `test/test_fallback_reclamation.jl` | Adds comprehensive regression coverage for fallback lifecycle / leak scenarios. |
| `test/runtests.jl` | Runs the new fallback reclamation test file in the suite. |
Replace the `fresh_pool` assertion (always empty, so it never tests actual rewind) with task-local pool `n_active` checks for `UInt8` and `Float16`.
The sentinel pattern (`[0]` initial value) guarantees that `_checkpoint_depths` is never empty. `_rewind_typed_pool!` already relies on this invariant with unguarded `@inbounds` access; align checkpoint to the same contract.
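As a rough illustration of the sentinel contract and the idempotency guard described above, here is a minimal sketch. `MiniTypedPool`, `mini_checkpoint!`, and `mini_rewind!` are hypothetical stand-ins, and the field names only loosely mirror `_checkpoint_depths`; the real typed pool is assumed to hold more state than this.

```julia
# Hypothetical miniature of a typed pool; not the package's struct.
mutable struct MiniTypedPool
    n_active::Int
    checkpoint_n_active::Vector{Int}   # saved n_active values
    checkpoint_depths::Vector{Int}     # depth at which each value was saved
end

# Both stacks start with a sentinel entry, so they are never empty
# and `[end]` needs no isempty check.
MiniTypedPool() = MiniTypedPool(0, [0], [0])

function mini_checkpoint!(tp::MiniTypedPool, depth::Int)
    # Idempotency guard: if this depth is already on top of the stack,
    # a second checkpoint (auto + explicit) must not push again.
    last_depth = @inbounds tp.checkpoint_depths[end]   # safe: sentinel
    last_depth == depth && return tp
    push!(tp.checkpoint_n_active, tp.n_active)
    push!(tp.checkpoint_depths, depth)
    return tp
end

function mini_rewind!(tp::MiniTypedPool, depth::Int)
    last_depth = @inbounds tp.checkpoint_depths[end]   # safe: sentinel
    if last_depth == depth
        pop!(tp.checkpoint_depths)
        tp.n_active = pop!(tp.checkpoint_n_active)     # restore saved count
    end
    return tp
end
```

Without the guard, a second `mini_checkpoint!` at the same depth would save the already-incremented `n_active`, and a single rewind would restore that value instead of the one from scope entry.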
mgyoo86 referenced this pull request Mar 4, 2026
Problem
Fallback types (e.g., `ForwardDiff.Dual`) stored in `pool.others` leaked `n_active` on repeated `@with_pool` calls. After N iterations, `n_active` grew to N instead of returning to 0.

Root cause: two independent bugs in `_touched_has_others` tracking:
- `get_typed_pool!`: When creating a new fallback type inside a checkpoint scope, `_touched_has_others` was not set to `true`. The rewind path (`_typed_lazy_rewind!`) skipped `pool.others` entirely.
- `checkpoint!(pool, types...)`: Always pushed `has_others = false`, regardless of whether any type was a fallback. On the 2nd+ call (the type already exists, so the `get_typed_pool!` closure doesn't fire), the flag stayed `false`, causing the same skip in `_typed_lazy_rewind!`.

Fix
- `get_typed_pool!`: set `_touched_has_others[depth] = true` when auto-checkpointing a new fallback type
- `checkpoint!(pool, ::Type{T})`: push `_fixed_slot_bit(T) == UInt16(0)` as `has_others` (constant-folded at compile time)
- `checkpoint!(pool, types...)`: compute `has_any_fallback` at compile time in the `@generated` body via `_fixed_slot_bit`
- `_checkpoint_typed_pool!`: idempotency guard against a double push from the `get_typed_pool!` auto-checkpoint + explicit checkpoint
- CUDA extension: same fixes, for `CuAdaptiveArrayPool` parity

Performance impact: zero. All fallback detection is resolved at compile time (`@generated` body or constant folding). The idempotency guard adds one `@inbounds` array-end check (~0.3ns).

Tests
- `test_fallback_reclamation.jl` covering all fallback lifecycle scenarios
- `ForwardDiff.gradient` leak pattern with repeated typed checkpoints
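
The compile-time detection mentioned under Fix can be sketched roughly as follows. This assumes `_fixed_slot_bit` returns a nonzero `UInt16` bit for fixed-slot types and `UInt16(0)` otherwise, which is what the comparison in the fix list implies; `toy_fixed_slot_bit`, `toy_is_fallback`, and `toy_has_any_fallback` are hypothetical stand-ins, and the choice of fixed-slot types here is arbitrary.

```julia
# Hypothetical stand-in for `_fixed_slot_bit`: one bit per fixed-slot type,
# UInt16(0) for every other (fallback) type.
toy_fixed_slot_bit(::Type{Float64}) = UInt16(1) << 0
toy_fixed_slot_bit(::Type{Float32}) = UInt16(1) << 1
toy_fixed_slot_bit(::Type{Int64})   = UInt16(1) << 2
toy_fixed_slot_bit(::Type)          = UInt16(0)

# Single-type case: T is a static parameter, so the compiler can fold the
# comparison to a constant.
toy_is_fallback(::Type{T}) where {T} = toy_fixed_slot_bit(T) == UInt16(0)

# Varargs case: the generator runs once per argument-type signature.
# Inside it, `types` is a tuple such as (Type{Float64}, Type{Int64}).
# A plain loop is used because generator bodies may not contain closures.
@generated function toy_has_any_fallback(types::Type...)
    flag = false
    for t in types
        T = t.parameters[1]          # unwrap Type{T} to T
        if toy_fixed_slot_bit(T) == UInt16(0)
            flag = true
        end
    end
    return :($flag)                  # the function body is a Bool literal
end
```

For example, `toy_has_any_fallback(Float64, Rational{Int})` evaluates to `true` in this sketch because `Rational{Int}` has no fixed slot, while `toy_has_any_fallback(Float64, Int64)` evaluates to `false`.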