-
Notifications
You must be signed in to change notification settings - Fork 182
Bugfix: slang now performs a second textual pass after an include file is pasted into the token stream. #1557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| filtered.reserve(results.size()); | ||
| for (auto& entry : results) { | ||
| auto name = entry.filename().string(); | ||
| if (name.rfind("._", 0) == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary on MacOS, which generates a number of hidden files like .DS_Store
|
Thanks for the PR. I agree that other tools allow the use of includes inside of macro expansions. I need to think about the solution more carefully; it's not clear to me that this is the right way to do it, though it may very well be. |
Great; let me know what changes, if any, you prefer. |
…e is pasted into the token stream.
Invoking slang on this small design, which mirrors the original `LEGACY_HEADER` pattern, fails with `expected expression`, `expected identifier`, and related syntax errors:
```sh
slang -I import_include_error import_include_error/top.sv import_include_error/compute_unit.sv
```
The failure happens because the macro `MAKE_UNIT` defined in `top.sv` expands to
```systemverilog
compute_unit #(`include "shared_assign.sv", .INDEX(0))
```
and slang keeps the `` `include `` directive intact while it is still inside the macro replacement list. Once preprocessing finishes, the text fed to the parser becomes
```systemverilog
compute_unit #(, .INDEX(0)).WIDTH (WIDTH),
.ENABLE(ENABLE)
```
so the payload from `shared_assign.sv` is no longer inside the parameter list. That leaves a stray comma right after the `#(`, causes the `.WIDTH` and `.ENABLE` lines to look like module instantiations, and triggers slang’s "expected expression" and "expected identifier" diagnostics.
The offending construct exists in two places, which are reproduced in anonymized form here:
- `import_include_error/top.sv` contains
```systemverilog
`define MAKE_UNIT(idx_expr) compute_unit #(`include "shared_assign.sv", .INDEX(idx_expr))
```
and then instantiates `` `MAKE_UNIT(0) ``.
- `import_include_error/shared_assign.sv` is just the named parameter overrides:
```systemverilog
.WIDTH (WIDTH),
.ENABLE(ENABLE)
```
Slang’s preprocessor is stricter than other front-ends: it does not perform a second textual pass after the include file is pasted into the token stream. That behavior aligns with the SystemVerilog rules—once macro replacement completes, the directive has already injected its file contents, so there is no opportunity to slide the pasted text back into the surrounding `#(...)`. Other tools appear to "work" because they do an extra textual glue step and effectively treat the include contents as if they were written directly inside the parameter list.
The exact same idea works fine on the module declaration side:
```systemverilog
module compute_unit
#(`include "shared_decl.sv",
parameter int INDEX = 0)
(...);
```
because there is no macro indirection—the include statement is already present in the source file when the preprocessor runs, so the inserted `parameter` declarations remain inside the parentheses.
Solution: Suspend Macro Tokens Around `include`
- **Status (October 30, 2025):** Implemented by suspending macro buffers in `Preprocessor::handleIncludeDirective`, resuming them in `nextRaw`, and extending pragma handling. Regression coverage added via `DriverTests.cpp`.
- Introduce a tiny helper struct (e.g. `MacroBufferFrame`) that holds the owning `SmallVector<Token>` plus the iterator index; add a `SmallVector<MacroBufferFrame, N> pendingMacroFrames` member.
- In `Preprocessor::handleIncludeDirective`, if `currentMacroToken` is non-null, emplace the current tokens / iterator into `pendingMacroFrames`, then clear `currentMacroToken` so the include can push its own lexer.
- Teach `Preprocessor::nextRaw()` to resume the last stored frame whenever the active lexer stack empties (after `popSource()` runs) by restoring `expandedTokens` and `currentMacroToken` from the top frame.
- Audit call sites that currently assume `expandedTokens` is empty (`Preprocessor_pragmas.cpp:360-380`, error-reporting helpers) and update them to walk `pendingMacroFrames` when present.
- Add regression coverage:
- Unit: expand the existing preprocessor test harness to feed the `MAKE_UNIT` snippet and assert the post-include text matches expectations.
- Data-driven: wire the `import_include_error` sample into a CLI test to ensure no parser diagnostics are emitted.
- Document the fix in the changelog once merged, highlighting that `include` inside macro actual arguments now behaves per the SV spec.
ebaf8e7 to
ba68dfa
Compare
Preamble
Please let me know if you prefer this to be off by default, and behind a feature flag.
Problem
Invoking slang on this small design, which mirrors the original
LEGACY_HEADERpattern, fails withexpected expression,expected identifier, and related syntax errors:The failure happens because the macro
MAKE_UNITdefined intop.svexpands toand slang keeps the
`includedirective intact while it is still inside the macro replacement list. Once preprocessing finishes, the text fed to the parser becomesso the payload from
shared_assign.svis no longer inside the parameter list. That leaves a stray comma right after the#(, causes the.WIDTHand.ENABLElines to look like module instantiations, and triggers slang’s "expected expression" and "expected identifier" diagnostics.The offending construct exists in two places, which are reproduced in anonymized form here:
import_include_error/top.svcontainssystemverilog `define MAKE_UNIT(idx_expr) compute_unit #(`include "shared_assign.sv", .INDEX(idx_expr))and then instantiates`MAKE_UNIT(0).import_include_error/shared_assign.svis just the named parameter overrides:systemverilog .WIDTH (WIDTH), .ENABLE(ENABLE)Slang’s preprocessor is stricter than other front-ends: it does not perform a second textual pass after the include file is pasted into the token stream. That behavior aligns with the SystemVerilog rules—once macro replacement completes, the directive has already injected its file contents, so there is no opportunity to slide the pasted text back into the surrounding
#(...). Other tools appear to "work" because they do an extra textual glue step and effectively treat the include contents as if they were written directly inside the parameter list.The exact same idea works fine on the module declaration side:
because there is no macro indirection—the include statement is already present in the source file when the preprocessor runs, so the inserted
parameterdeclarations remain inside the parentheses.Solution
Suspend Macro Tokens Around
include.Preprocessor::handleIncludeDirective, resuming them innextRaw, and extending pragma handling. Regression coverage added viaDriverTests.cpp.MacroBufferFrame) that holds the owningSmallVector<Token>plus the iterator index; add aSmallVector<MacroBufferFrame, N> pendingMacroFramesmember.Preprocessor::handleIncludeDirective, ifcurrentMacroTokenis non-null, emplace the current tokens / iterator intopendingMacroFrames, then clearcurrentMacroTokenso the include can push its own lexer.Preprocessor::nextRaw()to resume the last stored frame whenever the active lexer stack empties (afterpopSource()runs) by restoringexpandedTokensandcurrentMacroTokenfrom the top frame.expandedTokensis empty (Preprocessor_pragmas.cpp:360-380, error-reporting helpers) and update them to walkpendingMacroFrameswhen present.MAKE_UNITsnippet and assert the post-include text matches expectations.import_include_errorsample into a CLI test to ensure no parser diagnostics are emitted.