Skip to content

perf: pre-reserve replace_rec_fn cache to avoid early rehashing#13376

Draft
Kha wants to merge 1 commit intoleanprover:masterfrom
Kha:push-qrytwnmxnvqk
Draft

perf: pre-reserve replace_rec_fn cache to avoid early rehashing#13376
Kha wants to merge 1 commit intoleanprover:masterfrom
Kha:push-qrytwnmxnvqk

Conversation

@Kha
Copy link
Copy Markdown
Member

@Kha Kha commented Apr 12, 2026

This PR pre-reserves 128 buckets in the replace_rec_fn cache when caching is enabled. The default std::unordered_map starts very small and rehashes several times during a typical traversal, and the resulting bucket-array churn dominated cache-miss stalls in the hot kernel traversal. Reserving up front shaves another ~4% off the wall-clock of leanchecker --fresh Init.Data.List.Lemmas on top of the previous try_emplace change, for a combined ~8% wall-clock improvement (with a small ~1.5% increase in retired instructions that is more than offset by the improved IPC).

This PR pre-reserves 128 buckets in the `replace_rec_fn` cache when
caching is enabled. The default `std::unordered_map` starts very small
and rehashes several times during a typical traversal, and the
resulting bucket-array churn dominated cache-miss stalls in the hot
kernel traversal. Reserving up front shaves another ~4% off the
wall-clock of `leanchecker --fresh Init.Data.List.Lemmas` on top of
the previous `try_emplace` change, for a combined ~8% wall-clock
improvement (with a small ~1.5% increase in retired instructions
that is more than offset by the improved IPC).
@Kha
Copy link
Copy Markdown
Member Author

Kha commented Apr 12, 2026

!bench

@leanprover-radar
Copy link
Copy Markdown

leanprover-radar commented Apr 12, 2026

Benchmark results for 1f4b906 against 82bb27f are in. Significant changes detected! @Kha

  • 🟥 build//instructions: +17.5G (+0.15%)

Medium changes (3✅, 3🟥)

  • 🟥 elab/big_match//instructions: +113.5M (+0.98%)
  • 🟥 elab/cbv_divisors//instructions: +422.9M (+0.79%)
  • 🟥 elab/cbv_leroy//instructions: +409.4M (+0.87%)
  • elab/grind_ring_5//instructions: -103.9M (-1.17%)
  • misc/leanchecker --fresh Init//task-clock: -1s (-5.50%)
  • misc/leanchecker --fresh Init//wall-clock: -1s (-5.50%)

Small changes (7✅, 69🟥)

Too many entries to display here. View the full report on radar instead.

@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Apr 12, 2026
@mathlib-lean-pr-testing
Copy link
Copy Markdown

Mathlib CI status (docs):

  • ❗ Mathlib CI can not be attempted yet, as the nightly-testing-2026-04-09 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-mathlib, Mathlib CI should run now. You can force Mathlib CI using the force-mathlib-ci label. (2026-04-12 17:00:30)

@leanprover-bot
Copy link
Copy Markdown
Collaborator

Reference manual CI status:

  • ❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-04-09 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-04-12 17:00:32)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants