Skip to content

Conversation

@abmcar
Copy link
Contributor

@abmcar abmcar commented Jan 28, 2026

1. Does this PR affect any open issues?(Y/N) and add issue references (e.g. "fix #123", "re #123".):

  • N
  • Y

2. What is the scope of this PR (e.g. component or file name):

3. Provide a description of the PR(e.g. more details, effects, motivations or doc link):

  • Affects user behaviors
  • Contains CI/CD configuration changes
  • Contains documentation changes
  • Contains experimental features
  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Other

4. Are there any breaking changes?(Y/N) and describe the breaking changes(e.g. more details, motivations or doc link):

  • N
  • Y

5. Are there test cases for these changes?(Y/N) select and add more details, references or doc links:

  • Unit test
  • Integration test
  • Benchmark (add benchmark stats below)
  • Manual test (add detailed scripts or steps below)
  • Other

6. Release note

None

Copilot AI review requested due to automatic review settings January 28, 2026 08:33
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces cycle detection using Tarjan's strongly connected components (SCC) algorithm to prevent gas cost shifting into or out of cyclic blocks during the SPP (Safety, Precision, Polynomial-time) optimization phase. The change aims to preserve gas costs within cyclic chunks to maintain correct gas accounting in loops.

Changes:

  • Added computeInCycle function implementing Tarjan's SCC algorithm to identify blocks in cycles
  • Created NonCycleMask bitset to mark non-cyclic blocks
  • Modified gas chunk building logic to skip cyclic blocks during lemma614Update calls

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


std::vector<uint64_t> NonCycleMask(bitsetWordCount(Blocks.size()), 0);
for (size_t Id = 0; Id < Blocks.size(); ++Id) {
if (Id < InCycle.size() && InCycle[Id] == 0) {
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The redundant bounds check Id < InCycle.size() adds unnecessary complexity. Since InCycle is always sized to Blocks.size() (from line 370, 375), and the loop iterates from 0 to Blocks.size(), this check will always be true. This check is redundant and should be removed for clarity.

Copilot uses AI. Check for mistakes.
Comment on lines +378 to +424
std::function<void(uint32_t)> StrongConnect = [&](uint32_t Node) {
Index[Node] = CurIndex;
Lowlink[Node] = CurIndex;
++CurIndex;
Stack.push_back(Node);
OnStack[Node] = 1;

for (uint32_t Succ : Blocks[Node].Succs) {
if (Succ >= NumBlocks) {
continue;
}
if (Index[Succ] < 0) {
StrongConnect(Succ);
Lowlink[Node] = std::min(Lowlink[Node], Lowlink[Succ]);
} else if (OnStack[Succ] != 0) {
Lowlink[Node] = std::min(Lowlink[Node], Index[Succ]);
}
}

if (Lowlink[Node] != Index[Node]) {
return;
}

std::vector<uint32_t> Scc;
while (!Stack.empty()) {
const uint32_t Top = Stack.back();
Stack.pop_back();
OnStack[Top] = 0;
Scc.push_back(Top);
if (Top == Node) {
break;
}
}

if (Scc.size() > 1) {
for (uint32_t Id : Scc) {
InCycle[Id] = 1;
}
return;
}

const uint32_t Only = Scc.front();
for (uint32_t Succ : Blocks[Only].Succs) {
if (Succ == Only) {
InCycle[Only] = 1;
break;
}
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The recursive lambda StrongConnect can cause stack overflow for deeply nested or large graphs due to unbounded recursion depth. For EVM bytecode with complex control flow graphs containing thousands of blocks, this could be problematic.

Consider converting this to an iterative implementation using an explicit stack to avoid potential stack overflow issues, or document the assumption about maximum CFG depth if that's acceptable for this use case.

Suggested change
std::function<void(uint32_t)> StrongConnect = [&](uint32_t Node) {
Index[Node] = CurIndex;
Lowlink[Node] = CurIndex;
++CurIndex;
Stack.push_back(Node);
OnStack[Node] = 1;
for (uint32_t Succ : Blocks[Node].Succs) {
if (Succ >= NumBlocks) {
continue;
}
if (Index[Succ] < 0) {
StrongConnect(Succ);
Lowlink[Node] = std::min(Lowlink[Node], Lowlink[Succ]);
} else if (OnStack[Succ] != 0) {
Lowlink[Node] = std::min(Lowlink[Node], Index[Succ]);
}
}
if (Lowlink[Node] != Index[Node]) {
return;
}
std::vector<uint32_t> Scc;
while (!Stack.empty()) {
const uint32_t Top = Stack.back();
Stack.pop_back();
OnStack[Top] = 0;
Scc.push_back(Top);
if (Top == Node) {
break;
}
}
if (Scc.size() > 1) {
for (uint32_t Id : Scc) {
InCycle[Id] = 1;
}
return;
}
const uint32_t Only = Scc.front();
for (uint32_t Succ : Blocks[Only].Succs) {
if (Succ == Only) {
InCycle[Only] = 1;
break;
}
std::function<void(uint32_t)> StrongConnect = [&](uint32_t StartNode) {
// Iterative version of Tarjan's algorithm to avoid recursive calls.
struct Frame {
uint32_t Node;
size_t SuccIndex;
};
std::vector<Frame> DfsStack;
DfsStack.push_back(Frame{StartNode, 0});
while (!DfsStack.empty()) {
Frame &CurFrame = DfsStack.back();
const uint32_t Node = CurFrame.Node;
if (Index[Node] < 0) {
Index[Node] = CurIndex;
Lowlink[Node] = CurIndex;
++CurIndex;
Stack.push_back(Node);
OnStack[Node] = 1;
}
const auto &Succs = Blocks[Node].Succs;
bool Descended = false;
while (CurFrame.SuccIndex < Succs.size()) {
const uint32_t Succ = Succs[CurFrame.SuccIndex];
++CurFrame.SuccIndex;
if (Succ >= NumBlocks) {
continue;
}
if (Index[Succ] < 0) {
// Tree edge: descend into successor.
DfsStack.push_back(Frame{Succ, 0});
Descended = true;
break;
}
if (OnStack[Succ] != 0) {
// Successor in current SCC stack: update lowlink with its index.
Lowlink[Node] = std::min(Lowlink[Node], Index[Succ]);
}
}
if (Descended) {
continue;
}
// All successors processed; check if Node is a root of an SCC.
if (Lowlink[Node] == Index[Node]) {
std::vector<uint32_t> Scc;
while (!Stack.empty()) {
const uint32_t Top = Stack.back();
Stack.pop_back();
OnStack[Top] = 0;
Scc.push_back(Top);
if (Top == Node) {
break;
}
}
if (Scc.size() > 1) {
for (uint32_t Id : Scc) {
InCycle[Id] = 1;
}
} else if (!Scc.empty()) {
const uint32_t Only = Scc.front();
for (uint32_t Succ : Blocks[Only].Succs) {
if (Succ == Only) {
InCycle[Only] = 1;
break;
}
}
}
}
// Propagate lowlink information to the parent in the DFS stack.
DfsStack.pop_back();
if (!DfsStack.empty()) {
const uint32_t ParentNode = DfsStack.back().Node;
Lowlink[ParentNode] = std::min(Lowlink[ParentNode], Lowlink[Node]);
}

Copilot uses AI. Check for mistakes.
Comment on lines 965 to +970
if (!UseLinearSPP) {
for (uint32_t NodeId : RevTopo) {
lemma614Update(NodeId, Blocks, &BackEdges, nullptr, Metering);
if (NodeId < InCycle.size() && InCycle[NodeId] != 0) {
continue;
}
lemma614Update(NodeId, Blocks, &BackEdges, &NonCycleMask, Metering);
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code now uses two different mechanisms to identify loops/cycles: (1) dominator-based back edges via BackEdges, and (2) SCC-based cycle detection via InCycle. These serve different purposes but may produce conflicting results.

Specifically, BackEdges identifies loop back edges (edges where target dominates source), while InCycle identifies all nodes in any strongly connected component. A node can be in a cycle without having any back edges (e.g., in an irreducible loop), and a node with back edges might not be in the SCC identified as a cycle.

This dual approach could lead to inconsistent treatment of loop nodes. Consider documenting why both mechanisms are needed and how they interact, or simplifying to use a single, consistent approach to loop identification.

Copilot uses AI. Check for mistakes.
if (NodeId < InCycle.size() && InCycle[NodeId] != 0) {
continue;
}
lemma614Update(NodeId, Blocks, nullptr, &NonCycleMask, Metering);
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AllowedMask parameter has been changed from &Loops[LoopId].NodeMask to &NonCycleMask, which fundamentally alters the algorithm's behavior.

The original code restricted updates to successors within the specific loop being processed (via Loops[LoopId].NodeMask). The new code only restricts to non-cyclic successors globally (via NonCycleMask), which means it will incorrectly consider successors outside the current loop during loop-specific processing.

For example, if a non-cyclic node N in loop L1 has a successor S that is non-cyclic and in loop L2 (or not in any loop), the update will now incorrectly include S in the minimum cost calculation. This breaks the loop-local invariant that the SPP algorithm depends on.

The correct approach would be to compute an intersection of both masks, or to restructure the logic to maintain the loop-specific constraint while also excluding cyclic nodes.

Copilot uses AI. Check for mistakes.
if (!UseLinearSPP) {
for (uint32_t NodeId : RevTopo) {
lemma614Update(NodeId, Blocks, &BackEdges, nullptr, Metering);
if (NodeId < InCycle.size() && InCycle[NodeId] != 0) {
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bounds check NodeId < InCycle.size() is redundant because InCycle is created with size NumBlocks (line 375), and NodeId comes from iterating over RevTopo which contains node indices less than NumBlocks. The same InCycle vector is used throughout without any size changes.

This check pattern appears consistently but unnecessarily at lines 967, 999, and 1010. While not incorrect, it adds unnecessary overhead and code complexity. Consider removing these redundant checks or add a comment explaining why they are needed if there's a specific edge case they guard against.

Suggested change
if (NodeId < InCycle.size() && InCycle[NodeId] != 0) {
if (InCycle[NodeId] != 0) {

Copilot uses AI. Check for mistakes.
@abmcar abmcar marked this pull request as draft January 28, 2026 10:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant