Skip to content

perf: de-virtualize replace_rec_fn dispatch via templated functor#13380

Draft
Kha wants to merge 1 commit intoleanprover:masterfrom
Kha:push-ymynvpuplvlo
Draft

perf: de-virtualize replace_rec_fn dispatch via templated functor#13380
Kha wants to merge 1 commit intoleanprover:masterfrom
Kha:push-ymynvpuplvlo

Conversation

@Kha
Copy link
Copy Markdown
Member

@Kha Kha commented Apr 12, 2026

This PR turns replace_rec_fn into a class template parameterized on the functor type and exposes it from kernel/replace_fn.h, alongside templated replace<F>(...) overloads that monomorphize on the functor at the call site. Hot callers like instantiate, instantiate_rev, and the C-exported lean_expr_instantiate*_core variants in kernel/instantiate.cpp all pass their substitution closure as a stateless lambda; previously each call into the closure went through std::function's indirect dispatch on every recursion step, blocking inlining of the loose-bvar-range early return. With this change the closure body inlines into replace_rec_fn::apply and the std::function indirection is gone.

The legacy replace(expr, std::function<...>, bool) overload is preserved as a thin out-of-line wrapper that instantiates the template once with std::function as the functor type, so external callers continue to work unchanged.

On leanchecker --fresh Init.Data.List.Lemmas this shaves 17.10 G -> 16.50 G instructions (~3.5%) and 1.74s -> 1.69s wall-clock (~3%). All existing callers benefit automatically since they were already passing lambdas directly; no caller-side changes were required.

This PR turns `replace_rec_fn` into a class template parameterized on the
functor type and exposes it from `kernel/replace_fn.h`, alongside templated
`replace<F>(...)` overloads that monomorphize on the functor at the call
site. Hot callers like `instantiate`, `instantiate_rev`, and the C-exported
`lean_expr_instantiate*_core` variants in `kernel/instantiate.cpp` all pass
their substitution closure as a stateless lambda; previously each call into
the closure went through `std::function`'s indirect dispatch on every
recursion step, blocking inlining of the loose-bvar-range early return.
With this change the closure body inlines into `replace_rec_fn::apply` and
the `std::function` indirection is gone.

The legacy `replace(expr, std::function<...>, bool)` overload is preserved
as a thin out-of-line wrapper that instantiates the template once with
`std::function` as the functor type, so external callers continue to work
unchanged.

On `leanchecker --fresh Init.Data.List.Lemmas` this shaves
`17.10 G -> 16.50 G` instructions (~3.5%) and `1.74s -> 1.69s` wall-clock
(~3%). All existing callers benefit automatically since they were already
passing lambdas directly; no caller-side changes were required.
@Kha
Copy link
Copy Markdown
Member Author

Kha commented Apr 12, 2026

!bench

@leanprover-radar
Copy link
Copy Markdown

leanprover-radar commented Apr 12, 2026

Benchmark results for ea73513 against 82bb27f are in. Significant changes detected! @Kha

  • build//instructions: -84.4G (-0.70%)

Large changes (8✅)

  • elab/big_do//instructions: -218.7M (-0.92%)
  • elab/big_struct_dep//instructions: -538.9M (-3.70%)
  • elab/cbv_divisors//instructions: -955.9M (-1.79%)
  • elab/grind_bitvec2//instructions: -3.0G (-1.54%)
  • elab/grind_ring_5//instructions: -438.6M (-4.96%)
  • elab/mut_rec_wf//instructions: -461.7M (-1.70%)
  • misc/import Init.Data.BitVec.Lemmas//instructions: -2.5G (-2.05%)
  • misc/leanchecker --fresh Init//instructions: -15.1G (-4.02%)

Medium changes (28✅)

Too many entries to display here. View the full report on radar instead.

Small changes (604✅, 2🟥)

Too many entries to display here. View the full report on radar instead.

@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Apr 12, 2026
@mathlib-lean-pr-testing
Copy link
Copy Markdown

Mathlib CI status (docs):

  • ❗ Mathlib CI can not be attempted yet, as the nightly-testing-2026-04-09 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-mathlib, Mathlib CI should run now. You can force Mathlib CI using the force-mathlib-ci label. (2026-04-12 17:58:57)

@leanprover-bot
Copy link
Copy Markdown
Collaborator

Reference manual CI status:

  • ❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-04-09 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-04-12 17:58:58)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants