Fix multiple race conditions in TMModel #109 (#114)

Merged
danipen merged 6 commits into danipen:master from udlose:fix/race-conditions
Feb 17, 2026

Conversation

@udlose
Contributor

@udlose udlose commented Feb 16, 2026

Summary

Fixes #109

This PR hardens TMModel's lifecycle and threading behavior under its intended usage (UI-thread document edits plus background tokenization), addressing races, deadlock vectors, and resource-cleanup issues without changing the public API surface.

Fixes / changes included

Lifecycle + disposal correctness

  • Added atomic dispose guards using Interlocked.CompareExchange to prevent double-dispose / double-stop behavior under concurrent calls
  • Made tokenizer thread cleanup non-blocking by deferring disposal until the worker task completes (via continuation), avoiding UI-thread stalls during shutdown
  • Ensured the tokenizer worker's resources (CancellationTokenSource, AutoResetEvent, and worker Task) are cancelled/disposed safely and exactly once
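
The dispose-guard and deferred-cleanup pattern described above can be sketched roughly as follows. All names and structure here are simplified assumptions for illustration, not the actual TextMateSharp source:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class TokenizerThread : IDisposable
{
    private int _disposed; // 0 = live, 1 = disposed; mutated only via Interlocked
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly AutoResetEvent _signal = new AutoResetEvent(false);
    private Task _worker;

    public void Start() => _worker = Task.Run(WorkLoop);

    private void WorkLoop()
    {
        // Placeholder for the real tokenization loop.
        while (!_cts.IsCancellationRequested)
            _signal.WaitOne(50);
    }

    public void Dispose()
    {
        // Atomic guard: only the first concurrent caller gets past this line.
        if (Interlocked.CompareExchange(ref _disposed, 1, 0) != 0)
            return;

        _cts.Cancel();
        _signal.Set(); // wake the worker so it observes cancellation promptly

        // Non-blocking cleanup: dispose resources in a continuation once the
        // worker finishes, instead of blocking the calling (UI) thread.
        (_worker ?? Task.CompletedTask).ContinueWith(_ =>
        {
            _cts.Dispose();
            _signal.Dispose();
        }, TaskScheduler.Default);
    }
}
```

Because the continuation owns the final disposal, a second concurrent `Dispose()` call returns immediately and the resources are still released exactly once.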

Tokenizer thread state machine hardening

  • Introduced a dedicated lifecycle lock inside TokenizerThread to guard start/stop/signal/dispose transitions and eliminate check-then-act races
  • Worker loop exits promptly if the active thread instance has been replaced by SetGrammar/Stop (thread identity check under the model lock)
  • Capture the current grammar epoch and tokenizer thread when building a token-change event, and re-validate them before delivering callbacks; if SetGrammar advanced the epoch or swapped the thread, drop the event to avoid emitting token-change notifications under a different grammar context (stale publish)
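
A minimal sketch of the capture-then-revalidate guard against stale publishes; every name here is assumed for illustration:

```csharp
using System;

class Model
{
    private readonly object _lock = new object();
    private int _grammarEpoch;       // advanced by SetGrammar under _lock
    private object _tokenizerThread; // swapped by SetGrammar/Stop under _lock

    public void SetGrammar(object grammar)
    {
        lock (_lock)
        {
            _grammarEpoch++;
            _tokenizerThread = new object(); // stand-in for a fresh TokenizerThread
        }
    }

    // Worker side: capture identity before building a token-change event.
    public (int Epoch, object Thread) Capture()
    {
        lock (_lock) return (_grammarEpoch, _tokenizerThread);
    }

    // Re-validate before delivering; drop the event if the world moved on.
    public void PublishTokenChanges(int capturedEpoch, object capturedThread, Action emit)
    {
        lock (_lock)
        {
            if (capturedEpoch != _grammarEpoch ||
                !ReferenceEquals(capturedThread, _tokenizerThread))
                return; // stale publish: SetGrammar/Stop intervened
        }
        emit(); // delivered outside _lock, per the project's lock-ordering rule
    }
}
```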

Model state synchronization (race fixes)

  • Synchronized reads of ModelLine.State used for tokenization by capturing the starting state under _lock before calling into the tokenizer
  • Published tokenization results (ModelLine.Tokens, ModelLine.IsInvalid, and end-state propagation) under _lock with epoch/thread/tokenizer identity checks so stale work is discarded and re-queued safely
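
The capture-then-publish discipline might look like the following. The `ModelLine` members come from the bullets above; everything else is an assumption:

```csharp
using System;

class ModelLine
{
    public object State;     // tokenizer start state for this line
    public object[] Tokens;
    public bool IsInvalid = true;
}

class Tokenization
{
    private readonly object _lock = new object();

    public void TokenizeLine(ModelLine line,
        Func<object, (object[] Tokens, object EndState)> tokenize)
    {
        object startState;
        lock (_lock)
        {
            startState = line.State; // capture the start state under _lock
        }

        // The expensive tokenization runs outside the lock.
        var result = tokenize(startState);

        lock (_lock)
        {
            // Publish results under _lock. The real code also re-checks the
            // grammar epoch / thread identity here and discards stale work
            // instead of publishing it; in the real model the end state
            // propagates to the next line's start state.
            line.Tokens = result.Tokens;
            line.IsInvalid = false;
            line.State = result.EndState;
        }
    }
}
```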

Lock ordering + deadlock prevention

  • Standardized lock ordering as _lock → listeners and documented it in code comments
  • Avoided invoking listeners under lock(listeners) by snapshotting the listener list under the lock and invoking callbacks outside the lock
  • Adjusted listener add/remove paths to avoid lock-order inversion with Stop() and to prevent adding listeners to a disposed model
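
The snapshot-and-invoke pattern from the second bullet can be illustrated as follows (the listener type and class name are assumed):

```csharp
using System;
using System.Collections.Generic;

class TokenListenerList
{
    // The list doubles as its own lock object, i.e. lock(listeners),
    // always acquired after the model's _lock, never before it.
    private readonly List<Action<int>> listeners = new List<Action<int>>();

    public void Add(Action<int> listener)    { lock (listeners) listeners.Add(listener); }
    public void Remove(Action<int> listener) { lock (listeners) listeners.Remove(listener); }

    public void Notify(int change)
    {
        Action<int>[] snapshot;
        lock (listeners)
        {
            snapshot = listeners.ToArray(); // copy under the lock...
        }
        foreach (var listener in snapshot)
            listener?.Invoke(change); // ...invoke outside it, so a callback that
                                      // re-enters Add/Remove/Stop cannot deadlock
    }
}
```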

ForceTokenization / invalidation robustness

  • ForceTokenization(start,end) clamps indices under _lock and ensures grammar/thread checks are coherent with the captured thread instance (avoids TOCTOU races)
  • InvalidateLineRange clamps indices under _lock and signals the worker thread safely after enqueueing invalid work
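
A rough sketch of the clamping behavior under assumed names; the point is that the clamp and the subsequent enqueue/signal happen in one critical section:

```csharp
using System;

class RangeModel
{
    private readonly object _lock = new object();
    private int _lineCount = 100; // assumed document length

    public void ForceTokenization(int start, int end)
    {
        lock (_lock)
        {
            // Clamp under _lock so the bounds stay coherent with the line list
            // and the captured tokenizer thread (no check-then-act window).
            start = Math.Max(0, start);
            end = Math.Min(_lineCount - 1, end);
            if (start > end)
                return;

            // ...mark [start, end] invalid and signal the captured worker here
        }
    }
}
```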

Notes / non-goals

  • This PR intentionally avoids behavioral changes to tokenization semantics and avoids allocations on hot paths (no defensive token copies)

File highlights

  • ITMModel.cs: Updates the interface to inherit IDisposable, clarifying the resource-management contract.
  • TMModelTests.cs: Adds tests for Dispose behavior, listener removal, and grammar-flip deadlock prevention.
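
The interface change can be sketched as follows (the remaining member set is omitted; this is not the full ITMModel):

```csharp
using System;

// Dispose() is no longer declared explicitly; inheriting IDisposable makes
// the ownership contract visible to every implementer and caller.
public interface ITMModel : IDisposable
{
    // Remaining members (grammar, listeners, tokenization control) omitted.
}
```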

Commit messages

  • ITMModel now inherits from IDisposable, removing the explicit Dispose method declaration. Enhanced the XML documentation for clarity and maintainability.
  • Expanded TMModelTests with new tests for Dispose() (non-blocking, idempotent), event emission after listener removal, and grammar-flip deadlock scenarios. Added CountingListener and CountingClearListener helpers, updated usings, and refactored ModelLinesMock for clarity, improving test coverage for resource management and concurrency.
  • Refactor TMModel threading and disposal for safety: comprehensively refactored TMModel and TokenizerThread to eliminate race conditions, prevent deadlocks, and ensure safe disposal. Introduced atomic disposal flags, improved thread lifecycle management, and enforced strict lock ordering. Listener management is now deadlock-free, and all public methods are robust against concurrent access and disposal. Added extensive documentation and improved error handling.
  • Removed explanatory FIX comments related to thread safety and resource management in TMModel.cs. The code logic is unchanged; this cleanup improves readability by eliminating detailed inline justifications.
  • Added detailed XML documentation to AbstractLineList, clarifying threading, lock ordering, and safe usage patterns to prevent deadlocks with TMModel. Replaced foreach with for loops for line iteration, marked mLock as readonly and documented its role in thread safety, and enhanced the remarks on invalidation methods to guide correct lock usage.
  • Improve event emission safety in TMModel: event emission now tracks the TokenizerThread and grammar epoch, so events are published only if model state remains consistent during callbacks. Added stricter checks to prevent stale events after concurrent Stop, Dispose, or SetGrammar calls, and updated listener notification to handle nulls and re-validate model state per listener.
Owner

@danipen danipen left a comment


Bunch of changes! The PR is well-engineered and addresses real race conditions without changing tokenization semantics. LGTM

@danipen
Owner

danipen commented Feb 17, 2026

Tested with AvaloniaEdit's current master (AvaloniaUI/AvaloniaEdit@7fa5d88). No issues found.

@danipen danipen merged commit 9aabe1b into danipen:master Feb 17, 2026
5 checks passed
@danipen
Owner

danipen commented Feb 17, 2026

Merged! Thanks for your contribution.



Development

Successfully merging this pull request may close these issues.

Race Conditions in TMModel

2 participants