Fix #116 and Performance/reduce allocations#121
Conversation
Converted test class to public and added extensive unit tests for AttributedScopeStack, covering constructors, equality, hash codes, scope name retrieval, attribute merging, and push operations. Includes edge cases, reflection-based tests, and helpers for stack creation and metadata. Validates robustness and correctness of all key behaviors.
…l stacks (null Parent / null ScopePath). Add parameter validation to PushAtributed to throw ArgumetNullException instead of a NullReferenceException
…le TFM #ifdef for GetHashCode - Refactored AttributedScopeStack for performance and clarity: precomputed hash codes, optimized Equals/GetHashCode, and improved scope matching logic. - Changed handling of empty parentScopes to "always matches" instead of throwing. - Enhanced Push logic to handle multi-scope strings and trailing spaces correctly. - Expanded and clarified unit tests for scope matching and stack manipulation, including new edge cases. - Improved resilience to malformed theme data and overall test coverage.
…hrow ArgumentOutOfRangeException thru the Matches method if an empty scopesList param was passed. This was because there was no length check on the collection before the collection was indexed. I opted to change Matches to return true if parentScopes is empty. Let me know if you agree with this change. Updated the Matches method to return true when parentScopes is null or empty, ensuring correct behavior when no parent scopes are provided.
Updated .gitignore to exclude Visual Studio Live Unit Testing files, specifically those with the .lutconfig extension, to prevent committing generated test artifacts.
Simplified dictionary add/update logic by directly assigning values without checking for key existence. Replaced ContainsKey with TryGetValue in Remove(string aKey) for efficiency. Refactored Remove(JSONNode aNode) to avoid LINQ and exception handling, using a straightforward iteration to find and remove the node. These changes improve code clarity and maintainability.
Introduced StateStackTests with extensive NUnit test coverage for the StateStack.ToString() method. Tests include various stack depths, special rule IDs, boundary conditions, repeated calls, and correct rule ordering. No changes to existing code.
Refactored the StateStack class to improve code clarity and performance. Updated using directives, modernized the Equals method with pattern matching, and reimplemented ToString using StringBuilder and an array of RuleIds for efficiency. Removed the old recursive string construction logic. Reformatted GetHashCode for readability. These changes enhance maintainability and performance.
Simplified type checking and casting for IncludeOnlyRule, BeginEndRule, and BeginWhileRule by using C# pattern matching. This improves code readability and removes redundant casts and variable declarations.
Introduce ParsedThemeTests and ThemeTests to cover a wide range of scenarios for theme parsing and management. Tests include handling of null/empty/whitespace includes, color ID management for various formats, GUI color dictionary behavior, color map correctness, scope matching, and rule overwriting. Utilizes Moq and NUnit for thorough coverage and reliability.
Added extensive unit tests for ParsedTheme and Theme, covering edge cases, concurrency, color and font style parsing, rule sorting, and default rule handling. Improved test structure with region markers and helper methods. Ensured robust validation of theme logic, including thread-safety and handling of malformed or unusual input.
Refactored theme parsing and caching to minimize memory allocations and improve performance. - Replaced String.Split/LINQ with span-based parsing for scopes and font styles. - Introduced thread-safe, lock-free caching using ConcurrentDictionary and atomic operations. - Improved sorting and default rule handling to avoid unnecessary allocations. - Enhanced code clarity with comments and better variable naming. These changes make theme handling faster, more scalable, and robust for concurrent use.
…ing allocations Added a fast hash code check to AttributedScopeStack equality for O(1) rejection of non-equal stacks. Refactored GenerateScopes to use a preallocated List<string> and reverse it, improving clarity and efficiency.
Removed unnecessary result list variable in ParseTheme. Now returns a new empty list directly on early exit, improving code clarity and efficiency.
Introduced RawTests in the TextMateSharp.Tests.Internal.Grammars.Parser namespace using NUnit. The new test suite covers merging, property accessors, captures, patterns, injections, repository, boolean flags, file type handling, deep cloning, and enumeration for the Raw class. These tests ensure correct behavior, deep copy semantics, and robust handling of edge cases, significantly improving test coverage for Raw and related interfaces.
Refactored Raw.cs to use pattern matching for type checks, replaced string concatenation with ToString(), and ensured proper initialization of file type lists. Optimized list and dictionary allocations, made ConvertToDictionary static, and improved code clarity by removing redundant casts and branches.
Added MatcherBuilderTests and NameMatcherTests for thorough coverage of matcher expression parsing, priority, and edge cases using NUnit and Moq. Updated MatcherTests.cs to fix namespace and fully qualify Matcher.CreateMatchers usage. Improves reliability and test coverage for matcher logic.
Added null checks to constructors and methods for safety. Optimized list initializations with expected capacities. Refactored MatcherBuilder parsing logic for clarity and efficiency. Made utility methods static/private where appropriate. Improved tokenizer immutability and input validation. Enhanced RegExpSource and RegExpSourceList with more efficient iteration and allocation. Removed unnecessary usings and improved code readability.
Refactored identifier matching to use a foreach loop instead of LINQ's All, ensuring lastIndex is updated correctly after each match. This prevents re-scanning scopes and improves matching accuracy. Added comments to clarify the bug and the fix.
Improved list initialization with expected capacity in theme rule handling for better performance. Fixed grammar package resource name construction in ResourceLoader. Minor formatting and whitespace adjustments for consistency.
…d-safe test assertion Updated the test assertion in ParsedThemeTests to use Volatile.Read when checking totalCalls. This ensures thread-safe access to the variable in multi-threaded scenarios, improving reliability of the test.
|
Other than the two comments above the PR overall LGTM. Thanks! |
@danipen I don't see any comments from you |
Sorry I didn't submit the review. Check now please. |
Refactored ParsedTheme and Theme classes to eliminate the unused 'priority' parameter from ParseTheme, ParseInclude, and LookupThemeRules methods. Updated all method calls and related test cases accordingly. This cleanup simplifies method signatures and removes dead code.
Removed the unused second argument (0) from the ParsedTheme.ParseTheme call in ThemeParsingTest.cs, reflecting the updated method signature that now only requires the theme parameter.
|
@danipen how do you want me to handle unit tests where I'm testing for inequality but there is a super small risk of the hash of the different object being equal which could cause the test to fail? example: I'd prefer to keep them as re-running would pass them, but if you're not cool with having something that could (very rarely) fail, I can remove them. But i did document the behavior. |
|
I prefer not having non-deterministic unit tests. |
cool - I can respect that! i will remove them |
|
@danipen I just noticed that the upstream java implementation declares I don't know which is more important to you:
Here is a list of upstream differences I found - I only looked in this folder/namespace. So, there may also be others:
|
I guess nobody is using this lib to extend it so I guess the breaking change will be minor impact. Anyway as we're incrementing the version after these changes we are ok.
Yes please. |
|
since you are cool with the breaking changes I described above, do you also want me to hide the members that are used to calculate GetHashCode? Most of the classes I've seen are not immutable which breaks the "contract" with GetHashCode. BTW, if I'm suggesting too many changes, please let me know :) |
If are hidable, probably yeah, makes sense to hide.
Appreciated the work. Thanks again. |
… tests that could produce hash collisions for inequal objects - Add comprehensive IEquatable and operator ==/!= tests - Remove hash code difference tests for similar content - Use C# collection expressions for cleaner initialization - Refactor array usage in method invocations - Move reflection helper for private Equals to end of file - Improve test naming and coverage; no production code changes
|
Refactored AttributedScopeStack to implement IEquatable<T> and provide strongly-typed Equals methods. Added operator overloads for == and !=, and improved null safety using ReferenceEquals. Refactored ComputeHashCode for clarity and consistency, and added XML documentation to clarify method purposes. These changes improve value equality semantics and support for hash-based collections.
Hidable: currently internal or public but not used elsewhere.
Can we respect the upstream public definition but expose as read-only (private set) to avoid the issue? |
yes. I should have been more clear - when I said "hide" I meant to make the class immutable by making the setters private - and if the field/property was already internal or public, the |
Expanded StateStackTests to cover GetHashCode, Equals, IEquatable, operator overloads, HasSameRuleAs, WithContentNameScopesList, WithEndRule, Pop/SafePop, and Reset. Tests verify correct behavior for equality, hash code determinism, structural equivalence, null handling, dictionary usage, and edge cases including deep stack scenarios. Helper methods added for test scope stack creation.
|
All good then |
Added comprehensive tests for AttributedScopeStack, including equality (reflexivity, symmetry, deep equivalence, null scope path), hash code consistency, operator overloads (==, !=), and ToString formatting for various stack depths and null values. Improves reliability and correctness in edge scenarios.
…tream implementation Override ToString() in AttributedScopeStack to return a space-separated string of scope names from root to leaf. Includes XML documentation for the new method.
Refactored StateStack to implement IEquatable<StateStack> with efficient, null-safe equality and hash code logic. Equality now uses iterative structural comparison and precomputed hash codes for performance. Added operator overloads for == and !=, and replaced GetHashCode with a multiply-accumulate scheme. WithContentNameScopesList uses null-safe comparison. Enhanced HasSameRuleAs to traverse parent chain and check both RuleId and enter position, throwing ArgumentNullException for null input. Added XML documentation for clarity.
Initialized List<Predicate<T>> matchers with a capacity of 4 in two places to improve memory usage and performance when adding multiple matchers.
Quite decent! (you increased a lot actually) |
…) checks that would be triggered since we overload ==, !=, etc. and are unable to use 'is not' because we are restricted to C# 8.0 Replaced standard null checks with ReferenceEquals to avoid issues with overloaded equality operators and improve performance. Added explanatory comments and made minor code style adjustments for clarity and consistency.
|
all done with this one - benchmarks posted above. |

Fixes #116 (separate PR coming for #117 )
This pull request introduces comprehensive improvements to the
AttributedScopeStackandStateStackclasses, focusing on performance, correctness, and test coverage. The most significant changes include optimizing equality and hashing forAttributedScopeStack, refactoring scope parsing logic for efficiency and correctness, and adding robust unit tests forStateStack's string representation. These changes enhance the reliability and speed of grammar stack operations in TextMateSharp.Performance and Correctness Improvements
AttributedScopeStacknow precomputes hash codes for each instance, enabling fast equality checks and preventing unnecessary traversal of the parent chain. The equality logic has also been updated to use these hash codes and to compare scope paths using ordinal string comparison for correctness. [1] [2]AttributedScopeStackis refactored to avoid unnecessary string allocations and to improve correctness, especially for edge cases involving scope selectors.Refactoring and Efficiency
AttributedScopeStackis rewritten to useReadOnlySpan<char>for efficient, allocation-free parsing, and to correctly handle trailing spaces and empty segments. [1] [2]GenerateScopesmethod now precomputes the stack depth to pre-size the resulting list, reducing memory allocations and improving performance.Testing and Validation
StateStack.ToStringinStateStackTests.cs, covering single and multi-depth stacks, special rule IDs, boundary conditions, and repeated calls for consistency. This ensures the correctness and reliability of the stack's string representation.Code Quality and Maintenance
Equalsmethod inStateStackis refactored for clarity and correctness, using pattern matching and improved structure.ToStringmethod inStateStackis rewritten for efficiency, using a single pass to collect rule IDs and aStringBuilderto assemble the output, reducing allocations and improving readability.Minor Fixes
SimpleJSON.csto simplify dictionary assignment logic.Benchmarks
Before:
TextMateSharp.Benchmarks.BigFileTokenizationBenchmark-report-github.md
After:
TextMateSharp.Benchmarks.BigFileTokenizationBenchmark-report-github-ScopeStack work - final.md