Fix search performance regression and TUI race condition by jesserobbins · Pull Request #256 · wesm/msgvault

jesserobbins · 2026-04-10T19:25:42Z

Tested and ran into a bunch of performance & context issues that surfaced in the TUI. Original approach using ILIKE is better it seems like for large archives.

Changes

Revert DuckDB fast search from regexp_matches back to ILIKE — on large Parquet archives the regex-per-row evaluation caused TUI fast search to hang. ILIKE is natively optimized by DuckDB for columnar Parquet scans. FTS5 deep search path is unchanged.
Fix TUI search race condition — when starting an inline search, invalidate in-flight loadMessages responses (from sort toggle or "all messages" view) so they can't overwrite search results after the search completes.
Clear m.messages immediately on search start — prevents stale pre-search messages from showing during the async search transition.
Remove dead code — escapeRegex and startsWithWordChar are no longer used.
Add regression tests — 4 new tests covering the race condition, stale-message clearing, and ILIKE path verification for both fast and aggregate search.

🤖 Generated with Claude Code

The regexp_matches approach caused significant performance regression in TUI fast search (Parquet scans). ILIKE is natively optimized by DuckDB for Parquet columnar scans. FTS5 deep search path remains unchanged for body text accuracy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When starting an inline search, increment loadRequestID to prevent in-flight loadMessages responses from overwriting search results. Previously, pressing (a) to view all messages then quickly searching could cause the stale loadMessages response to replace search results with unfiltered messages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- TestStaleLoadMessagesDoesNotOverwriteSearch: verifies that in-flight loadMessages responses (from sort toggle or "all messages" view) are ignored after a search starts, preventing stale results from overwriting search results - TestSearchClearsStaleMessages: verifies messages are cleared immediately when search starts so stale results never display - TestBuildSearchConditions_UsesILIKENotRegex: verifies the fast search path uses ILIKE (fast on Parquet) instead of regexp_matches (slow) - TestBuildAggregateSearchConditions_UsesILIKENotRegex: same check for the aggregate search path - Remove dead code: escapeRegex and startsWithWordChar (no longer used after reverting to ILIKE) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

roborev-ci · 2026-04-10T19:27:56Z

roborev: Combined Review (`d3811db`)

Medium issue found; security review and one default review found no additional issues.

Medium

internal/query/duckdb.go:447 - buildAggregateSearchConditions now emits 4 arguments per text term with no key columns, but countArgsForTextTerms still skips only 3. buildStatsSearchConditions uses that helper when a stats search has both text terms and non-text filters, so it can append leftover text-term args without matching placeholders and fail aggregate stats queries.

Suggested fix: Update countArgsForTextTerms to return n * 4 and add a regression test for a stats/aggregate query combining a text term with a non-text operator such as from:.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

jesserobbins · 2026-04-10T19:31:50Z

Missed a chance to make a joke that ILIKE Ducks or something but will keep the PR

jesserobbins · 2026-04-10T19:35:55Z

roborev: Combined Review (d3811db)

Medium issue found; security review and one default review found no additional issues.

Medium

internal/query/duckdb.go:447 - buildAggregateSearchConditions now emits 4 arguments per text term with no key columns, but countArgsForTextTerms still skips only 3. buildStatsSearchConditions uses that helper when a stats search has both text terms and non-text filters, so it can append leftover text-term args without matching placeholders and fail aggregate stats queries.

Suggested fix: Update countArgsForTextTerms to return n * 4 and add a regression test for a stats/aggregate query combining a text term with a non-text operator such as from:.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Interesting will look and fix

buildStatsSearchConditions used to call buildAggregateSearchConditions and slice off the text-term prefix using a hand-tracked countArgsForTextTerms helper. Any drift between the helper and the actual per-term arg count (subject + snippet + 2 sender = 4) would cause aggregate stats queries to fail with unmatched placeholders when a search mixed text terms with non-text operators like from:. Extract the non-text filter portion of buildAggregateSearchConditions into a new buildNonTextSearchConditions helper and call it directly from buildStatsSearchConditions, eliminating the arg-slicing workaround and the brittle countArgsForTextTerms constant entirely. Add regression tests that lock the invariant in place by counting "?" placeholders in the emitted WHERE conditions and comparing to len(args) across a matrix of query shapes (text only, text+from, text+to+subject, label view, date/size filters, etc.) for both builders.

roborev-ci · 2026-04-10T21:02:08Z

roborev: Combined Review (`bf776fb`)

Verdict: Changes are mostly sound, but there are 2 medium-severity behavioral regressions that should be addressed before merge.

Medium

Transient false "No results found" state
Location: internal/tui/model.go:1190
Clearing m.messages as soon as a debounced search starts creates a misleading empty state. The list view interprets len(m.messages) == 0 && !m.loading as "No results found", but this path only sets inlineSearchLoading, not loading, so users can briefly see a false zero-results message while the search is still in flight.
Suggested fix: Keep the previous rows visible until new results arrive, or suppress the empty-state message while inlineSearchLoading is true.
Search semantics broadened from word-start matching to arbitrary substring matching
Location: internal/query/duckdb.go:438, internal/query/duckdb.go:2373
Replacing the previous word-boundary regex search with %term% ILIKE materially changes free-text matching behavior. Queries that previously matched term starts now also match arbitrary substrings inside words, names, domains, and phone numbers, which is likely to produce noisier results. Current tests only validate SQL shape and escaping, so this behavior change is not covered as an intentional regression.
Suggested fix: Preserve the prior word-start semantics, or add explicit behavioral tests and documentation if the broader substring matching is intentional.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

jesserobbins · 2026-04-10T21:12:06Z

Hi! I disagree with these recommendations which is why I am proposing reverting to the prior search behavior in this PR.

A user initiating a search expects to see only the results of that search. Retaining and presenting cached and irrelevant search is confusing and counterproductive. Additionally the TUI is inconsistent in how it updates and displays this as results come in. The TUI should display the search response or nothing.
I agree that the regex based deep search is better. However it is unacceptably slower in my own testing.

So recommend proceeding with the current solution, and in future make TUI search flows use same path as the search function at the cli.

Thanks!

jesserobbins · 2026-04-10T21:18:04Z

(Also @wesm roborev is pretty cool!)

Temporary tracking branch for testing the deduplicate feature together with PR wesm#256 (search perf + race fix). Delete once wesm#256 is merged.

jesserobbins and others added 3 commits April 10, 2026 11:33

jesserobbins marked this pull request as ready for review April 10, 2026 21:16

jesserobbins mentioned this pull request Apr 10, 2026

Search produces false positives from substring matching and ignores sort/body context #253

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix search performance regression and TUI race condition#256

Fix search performance regression and TUI race condition#256
jesserobbins wants to merge 4 commits intowesm:mainfrom
jesserobbins:fix-search-perf-and-race

jesserobbins commented Apr 10, 2026

Uh oh!

roborev-ci bot commented Apr 10, 2026

Uh oh!

jesserobbins commented Apr 10, 2026

Uh oh!

jesserobbins commented Apr 10, 2026

roborev: Combined Review (`d3811db`)

Medium

Uh oh!

roborev-ci bot commented Apr 10, 2026

Uh oh!

jesserobbins commented Apr 10, 2026 •

edited

Loading

Uh oh!

jesserobbins commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jesserobbins commented Apr 10, 2026

Changes

Uh oh!

roborev-ci bot commented Apr 10, 2026

roborev: Combined Review (d3811db)

Medium

Uh oh!

jesserobbins commented Apr 10, 2026

Uh oh!

jesserobbins commented Apr 10, 2026

roborev: Combined Review (d3811db)

Medium

Uh oh!

roborev-ci bot commented Apr 10, 2026

roborev: Combined Review (bf776fb)

Medium

Uh oh!

jesserobbins commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jesserobbins commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

roborev: Combined Review (`d3811db`)

roborev: Combined Review (`d3811db`)

roborev: Combined Review (`bf776fb`)

jesserobbins commented Apr 10, 2026 •

edited

Loading