-
Notifications
You must be signed in to change notification settings - Fork 5
Inline silent meta-rules #25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughBumps CI Rust toolchain to 1.81; widespread formatting macro style changes to named placeholders; generator inlines silent rules with arguments and renames generated state identifiers; meta parser adds Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Dev
participant Gen as generator::typed::generator
participant Inl as inline_silent_rules_with_args
participant Mod as GrammarModule
participant Code as Codegen
Dev->>Gen: generate_typed(module)
Gen->>Inl: inline_silent_rules_with_args(module)
Inl->>Mod: collect_silent_rules_with_args(module & imports)
Inl->>Inl: traverse parse exprs, substitute args into call sites
Inl-->>Gen: inlined_module
Gen->>Code: generate typed parser from inlined_module
Code-->>Dev: typed parser artifacts
sequenceDiagram
autonumber
actor Dev
participant P as meta::parser::parse_rule
participant T as TokenStream
Dev->>P: parse_rule(tokens)
P->>T: peek for opening_brace
alt braces present
P->>T: skip opening_brace
P->>T: parse body
P->>T: skip closing_brace
else no braces
P->>T: parse body directly
end
P-->>Dev: Rule { body }
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (4)
✅ Files skipped from review due to trivial changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #25 +/- ##
==========================================
- Coverage 87.27% 87.04% -0.23%
==========================================
Files 29 26 -3
Lines 6216 6208 -8
==========================================
- Hits 5425 5404 -21
- Misses 791 804 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
FYI @TheVeryDarkness |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
generator/src/state_builder.rs (1)
114-121: Critical: generated code uses obsolete...range patterns; breaks on Rust 2021/1.81. Replace with..=.Inside the quoted code, patterns like
Some(#start...#end)and#start...#endare 2021-edition obsolete and will fail on current toolchains. Use inclusive..=patterns instead.Rationale: The Rust Reference specifies inclusive range patterns as
..=and notes the obsolete range pattern syntax is removed in the 2021 edition. The Book also demonstrates..=for inclusive matches. (rust.googlesource.com, doc.rust-lang.org)Apply this targeted diff:
@@ - match self.input.get(self.index) { - Some(#start...#end) => { + match self.input.get(self.index) { + Some(#start..=#end) => { self.index += 1; return self.#accept(); } _ => return self.#reject() } @@ - match u32::from_le_bytes(bytes) { - #start...#end => { + match u32::from_le_bytes(bytes) { + #start..=#end => { self.index += #len; return self.#accept(); } _ => () }Follow-up: consider a quick grep for
...in codegen paths to catch any other occurrences that surface in emitted code.Also applies to: 128-144
🧹 Nitpick comments (7)
pest/benches/position.rs (1)
28-28: Prefer explicit'staticlifetime on FRAG for clarity and const-compatibilityThe literal is
'static, but annotating FRAG as&'static strencodes intent and avoids subtle lifetime inference surprises when used in otherconstcontexts (e.g.,wrapper::String::CONTENTexpects&'static str).-const FRAG: &str = "0123456789abcdef"; +const FRAG: &'static str = "0123456789abcdef";generator/src/config.rs (1)
74-77: Include the full path in the panic to aid debugging.Printing only
file_namecan be ambiguous when multiple files share the same name. Use{path:?}(orpath.display()) to include the full resolved path.Apply this focused diff:
- Err(error) => panic!("error opening {file_name:?}: {error}"), + Err(error) => panic!("error opening {path:?}: {error}"),generator/src/state_builder.rs (1)
76-79: Nit: unify state name formatting style within this file.A few places still use
format!("state{}", …). For consistency with the new style, switch to captured identifiers.Suggested diff:
- let enter = Ident::new(&format!("state{}", enter_state.id), Span::call_site()); + let enter = Ident::new(&format!("state{enter_id}"), Span::call_site()); + let enter_id = enter_state.id;- let name = Ident::new(&format!("state{}", self.id), Span::call_site()); + let name = Ident::new(&format!("state{sid}"), Span::call_site()); + let sid = self.id;Alternatively, keep the original variable ordering and just swap the format strings:
- let enter = Ident::new(&format!("state{}", enter_state.id), Span::call_site()); + let enter = Ident::new(&format!("state{eid}"), Span::call_site()); + let eid = enter_state.id;Pick whichever reads better with your local style.
Also applies to: 230-236
derive/tests/meta_rules_silent.rs (1)
28-31: Avoid repeating field() calls to reduce temporary churnMinor: you call
records[0].field()twice, constructing two temporaries. Bind once to improve clarity.Apply this diff:
- assert_eq!(records[0].field().0.span.as_str(), "123"); - assert_eq!(records[0].field().1[0].span.as_str(), "456"); + let (first, rest) = records[0].field(); + assert_eq!(first.span.as_str(), "123"); + assert_eq!(rest[0].span.as_str(), "456");core/src/formatter.rs (1)
248-249: Consistency: use the same dynamic-width style throughoutYou’ve updated one site to
{line:index_digit$}. For consistency, consider updating the remaining occurrences that still useformat!("{:w$}", line, w = index_digit)in this file (e.g., Lines 175, 211, 286, 310) to the same style.Example patches:
- let number = format!("{:w$}", line.line + 1, w = index_digit); + let number = format!("{:index_digit$}", line.line + 1);- let number = format!("{:w$}", start.line + 1, w = index_digit); + let number = format!("{:index_digit$}", start.line + 1);- let number = format!("{:w$}", end.line + 1, w = index_digit); + let number = format!("{:index_digit$}", end.line + 1);generator/src/typed/generator.rs (2)
849-870: Consider adding documentation for the silent rule collection logic.The
collect_silent_rules_with_argsfunction would benefit from a doc comment explaining:
- Its purpose in the inlining process
- The structure of the returned nested map (module path -> rule name -> rule)
- Why silent rules with arguments need special handling
+/// Collects all silent rules with arguments from a module and its imports. +/// +/// Returns a nested map where: +/// - The outer key is the optional module path (None for the root module) +/// - The inner map contains rule name to rule mappings for silent rules with arguments +/// +/// This is used to identify which rules need to be inlined during code generation. fn collect_silent_rules_with_args<'a>( path: Option<&'a str>, module: &'a GrammarModule, ) -> BTreeMap<Option<&'a str>, BTreeMap<&'a str, &'a ParseRule>> {
889-900: Consider extracting the module path lookup logic.The nested
find_mapwith multiple conditionals could be clearer if extracted into a helper function.+fn find_silent_rule<'a>( + rule_name: &str, + silent_rules: &BTreeMap<Option<&'a str>, BTreeMap<&'a str, &'a ParseRule>>, +) -> Option<&'a ParseRule> { + silent_rules + .values() + .find_map(|rules| rules.get(rule_name)) +} // In the main loop: -if let Some(silent_rule) = silent_rules - .iter() - .find_map(|(mod_path, rules)| rules.get(rule_name.as_str())) +if let Some(silent_rule) = find_silent_rule(rule_name.as_str(), &silent_rules)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (21)
.github/workflows/ci.yml(2 hunks)core/src/formatter.rs(8 hunks)core/src/token/mod.rs(1 hunks)core/src/typed/template/mod.rs(1 hunks)core/src/typed/tracker.rs(5 hunks)core/src/typed/traits.rs(1 hunks)derive/benches/json.rs(4 hunks)derive/tests/json.rs(2 hunks)derive/tests/meta_rules_silent.rs(1 hunks)generator/src/config.rs(1 hunks)generator/src/state_builder.rs(2 hunks)generator/src/typed/generator.rs(9 hunks)generator/src/typed/getter.rs(1 hunks)generator/src/typed/output.rs(1 hunks)generator/tests/generator.rs(1 hunks)meta/src/error.rs(1 hunks)meta/src/parser.rs(8 hunks)pest/benches/position.rs(1 hunks)pest/src/bin/unicode.rs(1 hunks)pest/src/lib.rs(0 hunks)pest/tests/predefined_node.rs(1 hunks)
💤 Files with no reviewable changes (1)
- pest/src/lib.rs
🧰 Additional context used
🧬 Code graph analysis (3)
derive/tests/json.rs (1)
core/src/span.rs (1)
as_str(218-221)
derive/tests/meta_rules_silent.rs (2)
core/src/typed/traits.rs (2)
try_parse(48-53)try_parse(283-285)core/src/span.rs (1)
as_str(218-221)
derive/benches/json.rs (1)
core/src/typed/traits.rs (1)
check(91-99)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Continuous Benchmarking with Bencher
🔇 Additional comments (20)
core/src/typed/traits.rs (1)
190-192: Option::for_each_child_pair simplification is correct and clearerSwitching from a
matchtoif let Some(...)preserves semantics and improves readability with no allocation or control-flow changes.meta/src/error.rs (1)
29-29: Named-arg debug formatting is a no-op behaviorally and aligns with repo style
format!("{other_rule:?}")is equivalent to the prior positional form and keeps formatting consistent with other files.pest/tests/predefined_node.rs (1)
73-74: Tests: switch to named-arg formatting is clean and idiomatic
format!("{s:?}")/format!("{trivia:?}")read better and match the repository-wide formatting updates. No impact on assertions.Also applies to: 81-82
core/src/typed/template/mod.rs (1)
469-476: Nice micro-simplification with Option::map; behavior preserved.
input.next_char()advancesinputin place; returning(input, Self { content: c })keeps the mutated position. All good.core/src/token/mod.rs (1)
47-52: LGTM: switch to captured identifiers in fmt strings.Using
{first}/{item}is clear and requires Rust ≥ 1.58; compatible with the toolchain bump.generator/src/state_builder.rs (2)
21-26: LGTM:format!("state{id}")modernizes formatting without changing semantics.
98-101: LGTM: consistent captured formatting for accept/reject state idents.pest/src/bin/unicode.rs (1)
13-15: LGTM: captured identifier formatting in writeln!.
writeln!(out, "crate::unicode!({unicode});")correctly capturesunicodeand emits the intended macro invocation.derive/benches/json.rs (2)
35-36: LGTM: remove unused binding in benchesSwitching from
let _ = rules::json::check(&s).unwrap();to callingrules::json::check(&s).unwrap();directly is tidy and avoids the unnecessary temporary. No behavior change in the benchmarked operation.Also applies to: 71-72
26-26: Review comment no longer applies:format!macro is used correctlyBoth occurrences in
derive/benches/json.rsalready use theformat!macro (with the!), and a project-wide grep found no plainformat(calls:
- derive/benches/json.rs: lines 26 and 62 both use
format!("{…}")- No other instances of
format("…")detected in the repoNo changes are needed here.
Likely an incorrect or invalid review comment.
core/src/typed/tracker.rs (1)
36-38: Named-arg formatting migration looks correctThe shift to named placeholders improves readability and avoids argument-ordering mistakes. Message content and tests remain consistent with prior behavior. No logical or control-flow changes observed.
Also applies to: 221-221, 235-235, 244-245, 259-259, 267-267
derive/tests/meta_rules_silent.rs (1)
6-12: Confirm intent of^usage and sequencing around line breaksThe inline grammar defines a rule named
^as CRLF/LF and then uses it infile = pest::soi - (record ^ "")* - pest::eoi. Please confirm:
- That
^is indeed allowed as a rule identifier in this grammar dialect.- That omitting
-around^(i.e.,record ^ ""instead ofrecord - ^ - "") is intentional and parsed as desired, rather than a sequencing oversight.If sequencing is intended, consider making the separators explicit for clarity.
If helpful, I can draft an additional test that exercises inputs with: trailing newline, missing trailing newline, and multiple consecutive newlines to pin down the intended semantics.
meta/src/parser.rs (3)
217-246: LGTM! Well-implemented AST replacement functionality.The
replacemethod provides a clean implementation for substituting rule arguments with actual expressions during inlining. The recursive traversal correctly handles all AST node types, and the boolean return value effectively tracks whether replacements occurred.
494-520: LGTM! Proper handling of optional braces in rule definitions.The implementation correctly:
- Checks for the presence of braces before parsing the rule body
- Skips opening/closing braces only when they exist
- Maintains backward compatibility with existing rule formats
106-110: LGTM! Consistent adoption of named format arguments throughout.The migration to named format arguments improves code readability and maintainability. This change aligns well with Rust's modern formatting practices and makes the code more self-documenting.
Also applies to: 152-153, 165-169, 252-271, 375-376, 771-771
generator/tests/generator.rs (1)
13-17: LGTM! Consistent formatting style update.The change to named format arguments in panic messages is consistent with the broader formatting improvements across the codebase.
generator/src/typed/getter.rs (1)
268-268: LGTM! Improved documentation formatting.The change to use a named placeholder in the generated doc string improves consistency with the rest of the codebase's formatting updates.
generator/src/typed/output.rs (1)
69-69: LGTM! Consistent formatting update.The change to use a named placeholder in the doc string generation aligns with the codebase-wide formatting improvements.
generator/src/typed/generator.rs (2)
872-943: Well-implemented inlining logic with proper argument substitution.The
inline_silent_rules_with_argsfunction correctly:
- Removes silent rules with arguments from the output
- Recursively traverses and replaces rule calls with their inlined bodies
- Validates argument counts with a clear assertion message
- Properly substitutes arguments using the new
replacemethodThe implementation handles all edge cases and provides good error messages.
1079-1105: Excellent test coverage for the inlining feature.The test properly validates:
- Silent rules with arguments are removed after inlining
- The inlined expression structure matches expectations
- The transformation preserves the semantic meaning of the rules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Updates string formatting throughout the codebase to use modern Rust interpolation syntax, adds support for silent meta-rules with arguments that are automatically inlined across modules, and includes minor improvements like removing unnecessary extern crate declarations and updating CI toolchain.
- Modern string formatting using
{var}instead of{:?}with variables - Silent rule inlining functionality for improved code generation and performance
- Updated Rust toolchain to version 1.81
Reviewed Changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pest/tests/predefined_node.rs | Updated string formatting in test assertions |
| pest/src/lib.rs | Removed unnecessary extern crate declaration |
| pest/benches/position.rs | Updated string literal lifetime annotation |
| meta/src/parser.rs | Added silent rule replacement logic and updated string formatting |
| meta/src/error.rs | Updated string formatting in error handling |
| generator/tests/generator.rs | Updated error message formatting |
| generator/src/typed/output.rs | Updated documentation string formatting |
| generator/src/typed/getter.rs | Updated documentation string formatting |
| generator/src/typed/generator.rs | Added silent rule inlining implementation with supporting functions |
| generator/src/state_builder.rs | Updated string formatting in code generation |
| generator/src/config.rs | Updated error message formatting |
| derive/tests/meta_rules_silent.rs | New test file for silent rule inlining functionality |
| derive/tests/json.rs | Simplified match expressions to if-let patterns |
| derive/benches/json.rs | Updated string formatting and removed unused assignments |
| core/src/typed/traits.rs | Simplified match expression to if-let pattern |
| core/src/typed/tracker.rs | Updated string formatting in error messages |
| core/src/typed/template/mod.rs | Simplified match expression using map |
| core/src/token/mod.rs | Updated string formatting in Display implementation |
| core/src/formatter.rs | Updated string formatting throughout formatter |
| .github/workflows/ci.yml | Updated Rust toolchain from 1.74 to 1.81 |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
meta/src/parser.rs (1)
494-519: parse_rule: optional braces and silent modifier — verify edge cases; avoid unwrapsLogic for optional braces and silent modifier ordering looks correct. Two follow-ups:
- Add tests for: "a = _ b" and "a = _ { b }" to cover both forms with the silent modifier.
- Consider avoiding .peek().unwrap() here to prevent panics on malformed input and surface a structured parse error instead.
Suggested refactor (keeps behavior, removes unwraps on peeks):
-// Then check for silent modifier -let silent = matches!(pairs.peek().unwrap().as_rule(), Rule::silent_modifier); +// Then check for silent modifier +let silent = matches!(pairs.peek().map(|p| p.as_rule()), Some(Rule::silent_modifier)); if silent { pairs.next().unwrap(); // modifier } -// Check if we have braces or direct expression -let has_braces = matches!(pairs.peek().unwrap().as_rule(), Rule::opening_brace); +// Check if we have braces or direct expression +let has_braces = matches!(pairs.peek().map(|p| p.as_rule()), Some(Rule::opening_brace));Please extend tests to cover the two silent forms above.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
meta/src/parser.rs(8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
meta/src/parser.rs (2)
core/src/token/mod.rs (1)
fmt(44-55)generator/src/typed/generator.rs (4)
fmt(80-89)fmt(141-147)fmt(290-292)path(136-136)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test Coverage
- GitHub Check: Continuous Benchmarking with Bencher
🔇 Additional comments (7)
meta/src/parser.rs (7)
103-114: Display: switch to named args is correctFormatting for Range using {start} and {end} looks good.
144-155: Display: PathArgs::Slice -> "[{range}]"Bracketed slice rendering is consistent with parser-side syntax. Good change.
157-172: fmt_sep: cleaner named-arg formattingRefactor is fine and keeps behavior unchanged.
249-273: Display(ParseExpr): named placeholders and composite renders LGTMAll variant printers look correct and align with new PathArgs/Range displays.
275-281: skip(): improved panic message"expected {rule:?}" is clearer. Looks good.
366-386: Import I/O error message improvedThe new message with {path:?} and {err} is clearer; good change.
762-775: Bounded repeat error message: updated wordingNew "{start} > {end}" message matches updated test expectations.
| impl ParseExpr { | ||
| pub fn replace(&mut self, arg_name: &str, new_expr: &ParseExpr) -> bool { | ||
| match self { | ||
| Self::Path(p, None) if p.len() == 1 && p[0] == arg_name => { | ||
| *self = new_expr.clone(); | ||
| true | ||
| } | ||
| Self::PosPred(inner) | ||
| | Self::NegPred(inner) | ||
| | Self::Opt(inner) | ||
| | Self::Rep(inner) | ||
| | Self::RepOnce(inner) | ||
| | Self::RepRange(inner, _) | ||
| | Self::Separated(inner, _) => inner.expr.replace(arg_name, new_expr), | ||
| Self::Seq(lhs, rhs, _) => { | ||
| let replaced_lhs = lhs.expr.replace(arg_name, new_expr); | ||
| let replaced_rhs = rhs.expr.replace(arg_name, new_expr); | ||
| replaced_lhs || replaced_rhs | ||
| } | ||
| Self::Choice(lhs, rhs) => { | ||
| let replaced_lhs = lhs.expr.replace(arg_name, new_expr); | ||
| let replaced_rhs = rhs.expr.replace(arg_name, new_expr); | ||
| replaced_lhs || replaced_rhs | ||
| } | ||
| Self::Path(_, Some(PathArgs::Call(args))) => args | ||
| .iter_mut() | ||
| .any(|arg| arg.expr.replace(arg_name, new_expr)), | ||
| _ => false, | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
ParseExpr::replace: solid tree rewrite; add coverage
The traversal covers all relevant variants (incl. call-arg recursion) and returns a useful boolean. Please add unit tests for:
- Simple identifier replacement
- Nested in Seq/Choice
- In call arguments
- No-op when the path has multiple segments or a slice
Optionally, add a doc comment clarifying it only replaces single-segment identifiers (Path len == 1, no args).
To confirm usage and identify where to add tests, run:
🏁 Script executed:
#!/bin/bash
# Find replace() definition and call sites
rg -n -C3 -g '!**/target/**' -P '\bimpl\s+ParseExpr\b|\breplace\s*\('
# Grep for tests referencing silent rules or inlining to guide new test placement
rg -n -C2 -g '!**/target/**' -P 'silent_modifier|inline|inlin(e|ing)\s+silent'Length of output: 297
🏁 Script executed:
#!/bin/bash
# Locate ParseExpr::replace definition and its call sites
rg -n --no-ignore -C3 -P '\bimpl\s+ParseExpr\b|\breplace\s*\(' .
# Find existing Rust test modules to determine placement for new tests
rg -n --no-ignore 'mod\s+tests' -g '*.rs' .Length of output: 2983
Add unit tests for ParseExpr::replace
Cover the following cases in the meta/src/parser.rs tests module (around line 956):
- simple single-segment identifier replacement
- nested replacement in
Seq/Choice - replacement within call-argument lists
- no-op when
Pathhas multiple segments or non-Noneargs
Optionally, add a doc comment noting it only replaces bare, single-segment identifiers.
🤖 Prompt for AI Agents
In meta/src/parser.rs around lines 217 to 247, the ParseExpr::replace method
lacks unit test coverage; add tests in the tests module near line 956 that (1)
assert a simple single-segment identifier Path is replaced with the new
expression, (2) assert nested replacements occur in Seq and Choice nodes (both
sides), (3) assert replacement happens for arguments inside PathArgs::Call
argument lists, and (4) assert no replacement occurs when Path has multiple
segments or non-None args; also optionally add a short doc comment above replace
stating it only replaces bare, single-segment identifiers.
Summary by CodeRabbit
New Features
Refactor
Tests
Chores