Skip to content

l2: add advanced switching features#515

Closed
fabiendupont wants to merge 26 commits intoDPDK:mainfrom
fabiendupont:l2-switching-support
Closed

l2: add advanced switching features#515
fabiendupont wants to merge 26 commits intoDPDK:mainfrom
fabiendupont:l2-switching-support

Conversation

@fabiendupont
Copy link

@fabiendupont fabiendupont commented Feb 24, 2026

Summary

Add advanced L2 switching capabilities to the bridge module.

Depends on #504 (VXLAN/EVPN support).

Features

  • LAG bond bridge integration: bridge domain support for LAG bond interfaces
  • Extension hooks: per-bridge hook infrastructure for L2 feature subsystems
  • Bridge and FDB statistics: per-bridge and per-port packet/byte counters
  • Port security: per-interface MAC address limits with configurable violation actions
  • RSTP: Rapid Spanning Tree Protocol bridge and port management
  • VLAN filtering: access, trunk and hybrid port modes with per-VLAN FDB
  • Storm control: rate limiting for broadcast, multicast and unknown unicast
  • Port mirroring: RSPAN support with source/destination mirror sessions
  • IGMP/MLD snooping: multicast group tracking with MDB management
  • QoS: 802.1p CoS classification with priority queue scheduling
  • DHCP snooping: binding table management with trusted/untrusted ports
  • Dynamic ARP Inspection: ARP validation against DHCP snooping bindings
  • IP Source Guard: IP+MAC validation using DHCP snooping bindings
  • LLDP: foundational Link Layer Discovery Protocol infrastructure

Each feature follows the standard grout module structure with
separate api/, control/, datapath/, and cli/ components.

Test plan

  • Unit tests pass (meson test)
  • Existing smoke tests still pass (smoke/run.sh)
  • New L2 features verified via CLI (grcli)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added VXLAN tunneling with flood VTEP management
    • Added Dynamic ARP Inspection (DAI)
    • Added DHCP Snooping with trusted port management
    • Added IP Source Guard (IPSG)
    • Added LLDP neighbor discovery and tracking
    • Added multicast snooping (IGMP/MLD)
    • Added port mirroring and RSPAN sessions
    • Added per-port QoS with scheduling and metering
    • Added Rapid Spanning Tree Protocol (RSTP)
    • Added storm control for traffic suppression
    • Added VLAN filtering with access/trunk/hybrid modes
    • Added L2 port security with MAC limiting
    • Enhanced FDB with external entries and VTEP support

rjarry and others added 26 commits February 20, 2026 21:40
Introduce the VXLAN interface type for the L2 module. A VXLAN
interface carries a VNI (VXLAN Network Identifier), a local VTEP
address used as the outer IP source, an encapsulation VRF for
underlay routing, and a configurable UDP destination port (default
4789).

VXLAN interfaces are keyed by (VNI, encap_vrf_id) in a lockfree
RCU-protected hash table so that the datapath can resolve incoming
tunneled packets to the correct interface without locks.

VXLAN interfaces are intended to be attached to a bridge domain.
All L2 traffic entering the bridge is forwarded transparently over
the VXLAN tunnel. The local VTEP address must already be configured
in the encapsulation VRF.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
VXLAN uses UDP port 4789 by default but allows configuring a custom
destination port per interface. Allow the control plane to register
additional UDP ports at runtime as aliases for an already registered
port, reusing the same datapath edge.

Use reference counting so that multiple interfaces sharing the same
non-default port do not interfere with each other during teardown.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Wire up the VXLAN interface's configurable destination port to the
L4 input node. When a non-default port is configured, register it
as an alias for the standard VXLAN port (4789) so that the datapath
delivers matching UDP packets to the vxlan_input node.

Unregister the alias when the port changes or the interface is
destroyed.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Introduce a transport-agnostic flood list framework for BUM traffic
(Broadcast, Unknown unicast, Multicast). In EVPN, each PE maintains
a flooding list built from IMET routes (RFC 8365, RFC 9572). The
entries in this list differ depending on the overlay encapsulation:
VXLAN uses a remote VTEP IPv4 address and a VNI, while SRv6 would
use a 128-bit SID.

The API defines a gr_flood_entry structure with a type discriminant
and a union, allowing future encapsulation types (e.g. SRv6 SIDs)
to be added without changing the API request types. A dispatch
layer in control/flood.c routes add/del/list operations to
type-specific callbacks registered at init time.

Implement the VXLAN VTEP flood type (GR_FLOOD_T_VTEP). Each VXLAN
interface maintains a per-VNI array of remote VTEP addresses used
by the vxlan_flood datapath node for ingress replication. The array
is replaced atomically with an RCU synchronization barrier so that
the datapath never sees a partially updated list.

CLI commands are exposed under "flood vtep add/del/show". Add new
generated grcli-flood(1) man page.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
In a VXLAN overlay, the bridge needs to know which remote VTEP to use
when sending unicast frames to a learned MAC address. Add a VTEP IPv4
address field to FDB entries so that known unicast traffic can be sent
directly to the correct tunnel endpoint instead of being flooded to all
VTEPs.

When bridge_input learns a MAC address from a VXLAN member interface, it
records the source VTEP from the decapsulated packet's outer IP header.
When forwarding to a known destination, the stored VTEP address is
passed to the output path via the mbuf private data so that vxlan_output
can build the correct outer header.

Only set the VTEP field when the source interface is actually a VXLAN
type to avoid storing uninitialized data from other packet paths
(control plane, local bridge traffic).

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Add three datapath nodes for VXLAN packet processing.

vxlan_input decapsulates incoming UDP/4789 packets. It strips the
outer UDP and VXLAN headers, resolves the inner VNI to a VXLAN
interface via the RCU-protected hash table, records the source VTEP
from the outer IP header into the mbuf private data, and forwards
the inner Ethernet frame to iface_input for bridge processing.

vxlan_output encapsulates outgoing frames for a known destination
VTEP. It prepends a pre-built IP/UDP/VXLAN header template
initialized by the control plane, fills in the per-packet fields
(destination VTEP, UDP length, IP length, checksum), and hashes the
inner flow to select an ephemeral source port for underlay ECMP
(RFC 7348 Section 5). The FIB lookup for the outer IP uses the
encapsulation VRF, not the bridge domain.

vxlan_flood handles BUM traffic by replicating the frame to every
VTEP in the flood list via ingress replication. The original mbuf
is sent to the first VTEP and clones are created for the rest.

The bridge_flood node is updated to steer VXLAN member traffic
through vxlan_flood instead of direct iface_output.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Set up a VXLAN overlay between grout and a Linux netns peer. Grout
runs a bridge with a VXLAN member (VNI 100) and the Linux side
mirrors the topology with a kernel VXLAN device enslaved to a Linux
bridge. Both sides have flood lists configured with each other's
VTEP address for BUM traffic replication.

The test verifies L3 connectivity over the tunnel by having the
Linux side ping the bridge address. This exercises the full path:
ARP resolution over VXLAN, FDB learning from decapsulated traffic,
and ICMP echo reply via the VXLAN output encapsulation.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Report bridge interfaces to FRR as ZEBRA_IF_BRIDGE with their MAC
address. Tag members with ZEBRA_IF_SLAVE_BRIDGE and propagate the
bridge ifindex so that FRR can associate them with the correct master.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Report VXLAN interfaces to FRR's zebra as ZEBRA_IF_VXLAN with the
associated L2 VNI information. This allows FRR's EVPN control
plane to discover which VNIs are locally configured and advertise
them via BGP IMET routes to remote PEs.

The VXLAN L2 info includes the VNI, the local VTEP address, and
the underlay interface index so that zebra can correlate the tunnel
with the correct underlay routing context.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Synchronize bridge FDB entries bidirectionally between grout and FRR.
This is required for EVPN to advertise locally learned MAC addresses
via BGP type-2 routes and to install remotely learned MACs into the
bridge forwarding table.

Zebra's dplane API is asymmetric for MAC/FDB entries. In the downward
direction (zebra to dplane provider), zebra uses DPLANE_OP_MAC_INSTALL
and DPLANE_OP_MAC_DELETE to push MACs into the dataplane. In the
upward direction (dplane provider notifying zebra of learned MACs),
DPLANE_OP_NEIGH_INSTALL and DPLANE_OP_NEIGH_DELETE must be used
instead. These go through zebra_neigh_macfdb_update() which calls
zebra_vxlan_local_mac_add_update() and ultimately triggers BGP EVPN
type-2 route advertisement. By contrast, the DPLANE_OP_MAC_* result
handler (zebra_vxlan_handle_result) is a no-op. Despite the NEIGH op
name, the context payload uses the macinfo union member and is
populated with dplane_ctx_mac_set_*() accessors, exactly like zebra's
own netlink provider does in netlink_macfdb_change().

Unlike routes and nexthops which use higher-level zebra APIs that
resolve the namespace from the VRF ID, the FDB notification path
looks up interfaces via if_lookup_by_index_per_ns(ns_id, ifindex).
GROUT_NS must therefore be set on the dplane context for the
interface lookup to succeed.

Function names follow zebra's rt_netlink.c naming conventions:
grout_macfdb_change() for the upward notification path (like
netlink_macfdb_change) and grout_macfdb_update_ctx() for the
downward install path (like netlink_macfdb_update_ctx).

Self-event suppression is enabled on the FDB event subscriptions
to prevent feedback loops when FRR installs a MAC that was originally
learned by grout.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Handle DPLANE_OP_VTEP_ADD and DPLANE_OP_VTEP_DELETE operations from
FRR's EVPN control plane. When BGP learns a remote VTEP via an IMET
route (EVPN type-3), zebra pushes the VTEP to the dataplane provider.

The grout_vxlan_flood_update_ctx() function (named after zebra's
netlink_vxlan_flood_update_ctx() in rt_netlink.c) translates these
operations into GR_FLOOD_ADD/DEL requests with GR_FLOOD_T_VTEP type.
This is a downward-only path: zebra pushes flood list entries to the
dplane provider. There is no upward notification for VTEP flood list
changes since grout does not learn VTEPs on its own, they are always
provided by FRR's BGP EVPN control plane.

This allows BGP EVPN to dynamically manage the per-VNI flood lists
used for BUM traffic ingress replication, replacing the need for
static flood list configuration via the CLI.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Set up a full EVPN/VXLAN topology between FRR+grout and a
standalone FRR+Linux peer. Each side runs a bridge with a VXLAN
member (VNI 100) and a host namespace. Both peers run iBGP with
the l2vpn evpn address-family and advertise-all-vni.

The test verifies that EVPN type-3 (IMET) routes are exchanged so
that both sides install each other's VTEP in their flood lists.
It then verifies end-to-end L2 connectivity by pinging between the
two host namespaces through the VXLAN overlay, which exercises
type-2 (MAC/IP) route advertisement and FDB synchronization.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
Add bridge-aware behavior to bond interfaces for proper interaction
with L2 bridge domains. When a bond is added to a bridge, it tracks
the minimum number of active member links required to be considered
operationally up. This prevents forwarding loops and packet loss when
insufficient links are available.

The bond_bridge_integration structure tracks bridge membership and
the minimum active links threshold. When active members drop below
the threshold, the bond transitions to DOWN state and triggers events
so that protocols like RSTP can react and prevent forwarding through
unstable bonds.

Also trigger gratuitous ARP via GR_EVENT_IFACE_STATUS_UP when the
active member changes in active-backup mode.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Extend upstream's L2 bridge infrastructure with hook points for
advanced features that will be added in subsequent commits: RSTP,
port security, storm control, VLAN filtering, multicast snooping,
port mirroring, QoS, DHCP snooping, DAI, IPSG, and LLDP.

Add optional feature pointers (rstp, mcast_snoop, vlan_filter, lldp)
to iface_info_bridge so that features can be enabled per-bridge. Add
per-core bridge and FDB statistics arrays, and per-interface security
state (MAC limits, shutdown-on-violation).

In the bridge_input datapath, check RSTP port learning state and port
security shutdown/MAC limits before allowing source MAC learning. In
bridge_flood, skip ports not in RSTP forwarding state.

Define API request types for all planned features using request IDs
in the 0x0020-0x00B5 range, which do not conflict with upstream's
FDB (0x0001-0x0006) or flood list (0x0011-0x0013) request IDs.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement API handlers and CLI for per-bridge statistics. The stats
aggregate per-core counters for forwarding (unicast, broadcast,
multicast, flood), drops (no FDB, hairpin, interface down), learning
(learned, updated, failed, skipped, MAC limits), RSTP (blocking
drop, learn skip), and FDB (lookup hit/miss, aged entries).

The CLI uses bridge interface names consistent with upstream's
convention: "bridge stats <name>" to display and "bridge stats reset
<name>" to clear counters.

Extend bridge_test.sh smoke test with stats verification.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement port security for bridge member interfaces. Each interface
can be configured with a maximum number of learned MAC addresses.
When the limit is reached, new MACs are rejected. Optionally, the
interface can be shut down on violation and later re-enabled.

The datapath hooks in bridge_input already enforce these limits
(added in the extension hooks commit). This commit adds the control
plane API handlers for configuring and querying security state, and
the CLI under "port-security" with set, show, and reenable commands.

Extend bridge_test.sh smoke test with port security verification.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement Rapid Spanning Tree Protocol (IEEE 802.1D-2004) control
plane with bridge-level and per-port configuration. RSTP state is
stored as an optional pointer on the bridge info structure, allocated
on demand when RSTP is enabled via the API.

The port role selection algorithm compares priority vectors received
in BPDUs to elect a root bridge, assign root/designated/alternate
roles, and drive the port state machine through discarding, learning
and forwarding states. Edge ports (PortFast) skip the forward delay.
BPDU Guard and Root Guard protection features are supported per port.

Bridge control plane hooks in bridge_input and bridge_flood gate
learning and forwarding based on RSTP port state. When RSTP is not
configured, ports default to forwarding.

API handlers expose bridge-level RSTP configuration (priority, hello
time, forward delay, max age) and per-port parameters (priority,
path cost, edge/guard flags). Topology change detection triggers TC
propagation timers on non-edge port state transitions.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge VLAN filtering with three port modes: access
(single untagged VLAN), trunk (multiple tagged VLANs with optional
native VLAN), and hybrid (mix of tagged and untagged). The filtering
context is allocated on demand when VLAN filtering is enabled on a
bridge and freed when disabled or when the bridge is destroyed.

VLAN membership is tracked per port using a 4096-bit bitmap for O(1)
ingress and egress checks. In access mode, only the configured VLAN
is allowed and frames are always untagged on egress. In trunk and
hybrid modes, a configurable set of VLANs is allowed, with the
native VLAN untagged on egress.

Ingress and egress VLAN translation (1:1 mapping) and Q-in-Q
(802.1ad) outer S-VLAN tagging are supported per port. Per-core
statistics track filtered frames, PVID additions, tag removals,
translations and Q-in-Q operations.

API handlers cover enabling/disabling filtering per bridge, port
mode configuration, VLAN translation, Q-in-Q and statistics
retrieval.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-interface rate limiting for broadcast, multicast and
unknown unicast traffic using DPDK two-rate three-color markers
(RFC 2698). Each traffic type has an independent meter with
configurable rate in kbps or packets per second.

When a meter exceeds its committed information rate, packets are
dropped and a per-type violation counter increments. If shutdown on
violation is enabled, the interface is disabled after a configurable
number of consecutive violations. A reenable command allows recovery.

Per-core statistics track passed and dropped packets for each traffic
type, plus shutdown events. Configuration and metering state are
stored in per-interface arrays indexed by interface ID and lcore ID
for lock-free datapath access.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement session-based port mirroring with up to 8 concurrent
sessions per bridge. Each session maps a set of source ports to a
single destination port with configurable direction (ingress, egress
or both).

Source ports are stored in a gr_vec per session and deduplicated on
insertion. RSPAN (Remote Switched Port Analyzer) is supported by
tagging mirrored frames with a configurable VLAN ID for transport
across trunk links.

Per-session packet filters can match on ethertype, source MAC,
destination MAC and a set of VLAN IDs. The filter is evaluated in
the datapath before cloning to avoid unnecessary packet copies.

Per-core statistics track mirrored packets, drops, filter matches
and clone failures. Session state and statistics are stored in
per-bridge arrays for lock-free datapath access.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge multicast snooping with IGMP (IPv4) and MLD
(IPv6) support. The Multicast Database stores group memberships
as MAC-keyed entries in an rte_hash table, with each entry tracking
a vector of member port IDs and a last-seen timestamp.

IGMP membership reports add the reporting port to the corresponding
multicast group. Leave messages remove the port, and the entry is
deleted when no ports remain. Static MDB entries bypass aging and
persist until explicitly removed.

Periodic aging removes learned entries older than the configurable
aging time (default 260 seconds). The querier mode, when enabled,
sends general queries at configurable intervals to solicit fresh
membership reports from hosts.

IP-to-MAC conversion follows RFC 1112 for IPv4 (01:00:5E prefix
with low 23 bits of group address) and RFC 2464 for IPv6 (33:33
prefix with low 32 bits).

API handlers cover snooping configuration, MDB entry management
(add/del/list) and multicast group membership queries.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-interface Quality of Service with 8 priority queues
mapped to IEEE 802.1p Class of Service values. Three scheduling
modes are supported: strict priority, weighted round-robin (WRR)
and deficit weighted round-robin (DWRR).

Packet classification extracts priority from 802.1p CoS in VLAN
tags or from IP DSCP fields using a configurable 64-entry mapping
table. A configurable default priority applies when neither CoS
nor DSCP trust is enabled.

Per-queue rate limiting uses DPDK two-rate three-color meters with
configurable committed information rate. A port-level shaper caps
aggregate throughput across all queues. CoS remarking applies an
output CoS mapping before egress.

Per-core statistics track classification, remarking, per-queue
drops and transmits, and port shaper drops.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge DHCP snooping with a binding table that maps
client MAC addresses to assigned IPs, ports and VLANs. Bindings are
stored in an rte_hash table keyed by MAC for O(1) datapath lookups.

Trusted ports accept DHCP server responses (OFFER, ACK, NAK) while
untrusted ports only forward client messages. Source MAC verification
rejects packets where the Ethernet source MAC does not match the
DHCP client hardware address.

Bindings can be learned dynamically from DHCP ACK messages or
configured statically. Dynamic bindings expire when their lease
runs out or after a configurable aging time. A per-bridge maximum
binding limit prevents exhaustion.

The binding table is shared with DAI (Dynamic ARP Inspection) and
IPSG (IP Source Guard) for source address validation.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge Dynamic ARP Inspection that validates ARP
packets against DHCP snooping bindings. Configurable validation
checks include sender MAC versus Ethernet source MAC, target MAC
versus Ethernet destination MAC, and sender IP presence in the
DHCP binding table.

Per-port trust state allows DHCP servers and uplinks to bypass
inspection. Optional per-port ARP rate limiting uses DPDK token
bucket meters to prevent ARP storms on untrusted ports.

Per-core statistics track ARP requests, replies, valid packets and
drops categorized by violation type (MAC mismatch, IP not bound,
rate limit exceeded, trusted port bypass).

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge IP Source Guard that validates IPv4 source
addresses against DHCP snooping bindings. When enabled, packets
with source IPs not found in the binding table are dropped.

Configuration supports per-bridge enable/disable, source IP
verification toggle, and violation logging. IPSG works in
conjunction with DHCP snooping which maintains the binding table
of authorized MAC-to-IP mappings.

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge Link Layer Discovery Protocol configuration
with neighbor table management. LLDP state is stored as an optional
pointer on the bridge info structure, allocated when LLDP is enabled
and freed on disable or bridge destruction.

The neighbor table holds up to 128 entries per bridge, indexed by
chassis ID and interface. Entries are added or updated when LLDP
frames are received and aged out when the advertised TTL expires.
Each entry stores the mandatory TLVs (chassis ID, port ID, TTL)
and optional descriptive TLVs (port description, system name,
system description).

Configuration supports per-bridge TX interval and TTL with IEEE
802.1AB defaults (30 seconds and 120 seconds respectively).

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
@coderabbitai
Copy link

coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

Walkthrough

This pull request introduces comprehensive L2 bridging features including DHCP snooping, Dynamic ARP Inspection, IP Source Guard, RSTP, QoS, port mirroring, VLAN filtering, LLDP, multicast snooping, and storm control. It adds VXLAN interface support with flooding and flood VTEP management, integrates FRR MAC FDB and flood updates into the datapath, and extends bond-to-bridge integration with active link thresholds. The changes include CLI command handlers, control-plane API endpoints, datapath processing nodes, configuration data structures, statistics tracking, and comprehensive unit/smoke tests across the system.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 20

🧹 Nitpick comments (2)
modules/l2/control/lldp_priv.h (1)

31-53: struct lldp_config embeds ~166 KB of neighbor storage inline.

Each lldp_neighbor is ~1.3 KB, so neighbors[128] inside lldp_config makes each allocation ~166 KB. This works since lldp_config_alloc uses rte_zmalloc, but it's worth being aware of if multiple bridges enable LLDP.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/control/lldp_priv.h` around lines 31 - 53, The lldp_config
currently embeds a large array neighbors[LLDP_MAX_NEIGHBORS] causing ~166KB per
lldp_config; change struct lldp_config to hold a pointer (struct lldp_neighbor
*neighbors) and a count (num_neighbors) instead of the fixed array, allocate the
neighbors buffer separately (e.g., in lldp_config_alloc using rte_zmalloc for
neighbors) and free it in the corresponding free function; update any code
referencing lldp_config.neighbors to handle the pointer (and ensure bounds
checks against num_neighbors/LLDP_MAX_NEIGHBORS) and keep existing symbols
lldp_config, lldp_neighbor, LLDP_MAX_NEIGHBORS, lldp_config_alloc, and
rte_zmalloc as reference points.
modules/l2/control/rstp.c (1)

617-619: BPDUs are built but never transmitted — RSTP is effectively non-functional on the wire.

The TODO at line 617 means no BPDU will ever leave the bridge, so peers won't participate in spanning-tree negotiation. This is fine for an in-progress feature, but should be tracked to avoid shipping a silently broken RSTP.

Would you like me to open an issue to track the BPDU transmission implementation?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/control/rstp.c` around lines 617 - 619, BPDU packets are created
but never transmitted because the code only logs and does "(void)bpdu"; replace
that TODO by actually sending the built BPDU via the existing control path: call
the bridge's transmit API (control_output or the module's equivalent transmit
function) with the bpdu buffer/mbuf, handle and log any errors, and only
increment the BPDU transmit counter on successful send; remove the "(void)bpdu"
noop and ensure proper ownership/freeing of the bpdu buffer after send or on
error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/l2/cli/port_mirror.c`:
- Around line 56-67: The code silently skips invalid interface names when
parsing src_list: when iface_from_name(c, tok) returns NULL the user is not
informed and req.num_sources is quietly unchanged; update the parsing in the
port mirror setup (the block using strdup(src_list), strtok, iface_from_name,
req.source_ports and req.num_sources) to detect a NULL return and either log a
clear error/warning (including the bad token) or return an error status to the
caller instead of silently skipping; ensure the behavior is consistent (choose
logging vs returning) and that you free copy before returning on error and do
not overflow req.source_ports (keep the existing limit check).

In `@modules/l2/cli/vlan_filtering.c`:
- Around line 93-103: The code calls strdup(vlan_list) without checking its
return; if strdup returns NULL, subsequent strtok(copy, ",") will crash—add a
NULL check after the strdup call (e.g., if (copy != NULL) { ... }) so parsing
into tok and pushing into req.allowed_vlans (and calling free(copy)) only
happens when copy is non-NULL; apply the same NULL-checked strdup pattern in the
other branch that handles trunk/hybrid VLAN parsing so both branches (the
vlan_list strdup usage) are guarded.

In `@modules/l2/control/bridge.c`:
- Around line 50-60: Both iface_increment_mac_count and
iface_decrement_mac_count validate iface_id but not lcore_id, allowing
out-of-bounds accesses into l2_iface_mac_counts; add a guard checking lcore_id
against the second-dimension size before touching l2_iface_mac_counts (e.g.,
ensure lcore_id < the configured lcore count constant or
ARRAY_SIZE(l2_iface_mac_counts[0]) / L2_MAX_LCORES) in both functions
(iface_increment_mac_count and iface_decrement_mac_count) and return early if
the check fails so all writes/reads use valid lcore indices.

In `@modules/l2/control/dhcp_snooping.c`:
- Around line 353-362: dhcp_trusted_port_set_cb currently ignores return values
from dhcp_snooping_add_trusted_port and dhcp_snooping_del_trusted_port and
always returns success; change it to capture the int rc from those calls, and if
rc < 0 return a failure api_out reflecting the error (e.g., api_out(-1, rc,
NULL) or the project's existing pattern) so callers see -EINVAL/-EEXIST/-ENOENT;
update dhcp_trusted_port_set_cb to call the appropriate function, check rc, and
only return api_out(0,0,NULL) on success.
- Around line 272-286: The handler dhcp_snooping_set_cb currently ignores the
return value of dhcp_snooping_enable which can fail (e.g., -ENOMEM); modify
dhcp_snooping_set_cb to capture the return value of
dhcp_snooping_enable(req->bridge_id, req->enabled), and if it returns a negative
error, immediately return api_out(err, 0, NULL) (do not continue to call
dhcp_snooping_set_verify_mac, dhcp_snooping_set_max_bindings, or
dhcp_snooping_set_aging_time); ensure the error from dhcp_snooping_enable is
propagated to the client so the enabled state isn’t reported as successful when
initialization failed.

In `@modules/l2/control/flood.c`:
- Around line 30-42: The code indexes flood_types with unvalidated user input
req->entry.type in flood_add (and similarly in the flood_delete/flood_get
paths), which can cause out-of-bounds access; fix by validating the type before
using it: check req and req->entry.type against the valid range (e.g., 0 <= type
< NUM_FLOOD_TYPES or whatever the flood_types size symbol is) and return
api_out(EINVAL or EAFNOSUPPORT, ...) if invalid, then proceed to load ops =
flood_types[type] and check ops->add as currently done; apply the same
validation pattern to the other functions that access flood_types (the functions
handling lines ~50-62).

In `@modules/l2/control/gr_l2_control.h`:
- Around line 60-70: Both bridge_get_stats and fdb_get_stats are missing a
bounds check for lcore_id and may index l2_bridge_stats or l2_fdb_stats out of
bounds; add a check in each function to ensure lcore_id < RTE_MAX_LCORE (similar
to storm_control_get_stats/qos_get_stats) and return NULL if the check fails so
the functions validate both bridge_id and lcore_id before returning
&l2_bridge_stats[bridge_id][lcore_id] or &l2_fdb_stats[bridge_id][lcore_id].

In `@modules/l2/control/iface_security.c`:
- Around line 59-76: The reenable handler iface_security_reenable can silently
return success when req->iface_id is >= L2_MAX_IFACES because the
shutdown-clearing branch is skipped; add an explicit range check for
req->iface_id (compare against L2_MAX_IFACES) and return api_out(ERANGE, 0,
NULL) to match the behavior of the corresponding set handler, before accessing
l2_iface_security[req->iface_id] or calling iface_from_id, so that
iface_security_reenable and the set handler are consistent; update references to
l2_iface_security and iface_security_reenable accordingly.

In `@modules/l2/control/ipsg.c`:
- Around line 28-38: The code indexes ipsg_configs with req->bridge_id without
verifying it is within L2_MAX_BRIDGES; add a bounds check (e.g., if
(req->bridge_id >= L2_MAX_BRIDGES) return api_out(ENOENT, 0, NULL);) before any
access to ipsg_configs in the allocation/lookup block and also at the start of
ipsg_config_get_cb so all uses of ipsg_configs[req->bridge_id] are protected;
reference the ipsg_configs array, iface_from_id call and the ipsg_config_get_cb
function when applying this change.

In `@modules/l2/control/mcast_snooping_test.c`:
- Around line 26-34: The test constructs ip4_addr_t in host byte order which
masks an endianness bug in mcast_ip_to_mac; update the test so ip4 is created in
network byte order (big-endian) using rte_cpu_to_be_32 (or equivalent) when
building the 224.1.2.3 value, keep the rest of the assertions the same to
validate mcast_ip_to_mac correctly handles network-order ip4_addr_t.

In `@modules/l2/control/mcast_snooping.c`:
- Around line 78-96: The IPv4-to-MAC conversion in mcast_ip_to_mac is
endian-unsafe because it shifts a network-order ip4_addr_t on the host integer
representation; fix mcast_ip_to_mac by treating the IPv4 address as a byte array
in network (big-endian) order and map bytes directly into struct rte_ether_addr:
use a const uint8_t *ip4bytes = (const uint8_t *)ip and set mac->addr_bytes[3] =
ip4bytes[1] & 0x7F, mac->addr_bytes[4] = ip4bytes[2], mac->addr_bytes[5] =
ip4bytes[3] (keep IPv6 branch unchanged), ensuring correct behavior on
little-endian and big-endian hosts and aligning with the multicast mapping
rules.

In `@modules/l2/control/port_mirror.c`:
- Around line 279-303: mirror_session_get_cb sets resp->num_sources to the full
vector length but only copies up to 16 entries, causing clients to believe more
ports were returned than exist; fix by capping resp->num_sources to the response
array capacity (e.g., uint16_t cap = min(gr_vec_len(session.source_ports),
(size_t)16)) and use that capped value for the copy loop and resp->num_sources,
or alternatively return an error when the source_ports length exceeds the
response capacity; update the logic around gr_vec_len(session.source_ports),
resp->num_sources, and the for loop that copies session.source_ports so they
consistently use the same capped count.

In `@modules/l2/control/qos.c`:
- Around line 55-61: qos_port_set currently treats array entries of 0 as "unset"
and overwrites user mappings; instead add explicit boolean flags (e.g., in the
qos_port_config struct: bool dscp_map_set and bool cos_map_set) to record when a
user has configured maps, set those flags in qos_dscp_map_set and
qos_cos_remap_set, and change qos_port_set to skip replacing cfg->dscp_to_cos
and cfg->cos_to_cos when the corresponding flag is true; use the flags rather
than sentinel-value checks on cfg->dscp_to_cos and cfg->cos_to_cos to avoid
overwriting legitimate mappings to 0.

In `@modules/l2/control/rstp_priv.h`:
- Around line 224-228: The macros for constructing and extracting RSTP port IDs
use an 8+8 layout but IEEE 802.1D-2004 requires a 4-bit priority + 12-bit port
number. Update RSTP_PORT_ID to shift the priority left 12 bits and mask the port
number with 0x0FFF, and update RSTP_PORT_ID_PRIORITY and RSTP_PORT_ID_NUM to
extract using >>12 and &0x0FFF respectively (use appropriate uint8_t/uint16_t
casts as in the originals), referencing the macros RSTP_PORT_ID,
RSTP_PORT_ID_PRIORITY, and RSTP_PORT_ID_NUM.
- Line 182: The ports field in the rstp_priv struct is missing the gr_vec
annotation; update the declaration of the ports field (the "ports" member in
struct rstp_priv) to include the gr_vec annotation used by other vector fields
(as seen in port_mirror_priv.h, mcast_snooping_priv.h, dhcp_snooping_priv.h) so
it correctly reflects that it is managed by gr_vec_add(), gr_vec_free(),
gr_vec_del(), and gr_vec_len(); ensure the annotation placement matches the
project convention for vector fields.

In `@modules/l2/control/vxlan.c`:
- Around line 52-84: The current VXLAN update mutates cur->encap_vrf_id and
adjusts VRF refcounts before validating the new vxlan_key and before
successfully adding it to vxlan_hash, which can orphan the interface on failure;
change the flow in the VXLAN update path so you first validate next->vni and
build the new vxlan_key (using rte_cpu_to_be_32(next->vni) and next encap VRF),
then try rte_hash_lookup/rte_hash_add_key_data for the new key before calling
rte_hash_del_key for the old cur_key; only after rte_hash_add_key_data succeeds
update cur->vni and cur->encap_vrf_id, and if you must increment the new VRF
(vrf_incref) do so tentatively and on any later failure call vrf_decref to roll
back so vrf refcounts remain consistent (use vrf_incref/vrf_decref and handle
errors from rte_hash_add_key_data to restore state).

In `@modules/l2/datapath/bridge_input.c`:
- Around line 83-103: The code enforces iface MAC limits before checking if the
source MAC is already learned, causing already-learned MACs to be dropped and
their last_seen not refreshed; fix by first checking the FDB for an existing
entry for bridge->id / d->iface->id / &eth->src_addr / d->vlan_id (use the
existing FDB lookup function, e.g. fdb_lookup or fdb_find in this module), and
only if the lookup returns “not found” apply iface_get_max_macs() /
iface_get_total_macs() enforcement and the shutdown/stats logic; if the entry
exists, skip the limit check and call fdb_learn(bridge->id, d->iface->id,
&eth->src_addr, d->vlan_id, vtep) to refresh last_seen.

In `@modules/l2/datapath/l2_datapath.h`:
- Line 15: trace_vxlan_format in modules/l2/datapath/vxlan_input.c currently
chains snprintf calls unsafely: it uses the first snprintf return (int n)
directly to compute buf + n and len - n, but snprintf returns the would-be
length and can be >= len causing out-of-bounds pointer arithmetic and size_t
underflow. Fix by checking snprintf's return: if n < 0 treat as 0/error; if
(size_t)n >= len clamp n to (int)len (or set remaining = 0) and only compute buf
+ n and remaining = len - (size_t)n when remaining > 0; only call the second
snprintf when remaining > 0 and use size_t for remaining to avoid
signed/unsigned mix. Ensure these checks are applied in trace_vxlan_format
around the first snprintf and before the conditional snprintf for t->vtep.

In `@modules/l2/datapath/vxlan_input.c`:
- Around line 61-83: The VXLAN path reads the VXLAN header via
rte_pktmbuf_mtod_offset and then calls rte_pktmbuf_adj without validating mbuf
lengths; add explicit length checks before accessing struct rte_vxlan_hdr and
before trimming the UDP+VXLAN headers: verify packet length (using
rte_pktmbuf_pkt_len and/or rte_pktmbuf_data_len) is at least sizeof(struct
rte_udp_hdr) + sizeof(struct rte_vxlan_hdr) and return/skip the packet (set edge
to BAD_FLAGS or NO_TUNNEL as appropriate) if not. After calling rte_pktmbuf_adj
ensure the adjustment succeeded (check the returned pointer or remaining data
length) and bail out safely if it failed before calling iface_mbuf_data,
vxlan_decode_vni, or vxlan_get_iface so you never read past the mbuf bounds.

In `@smoke/port_mirror_test.sh`:
- Around line 27-28: The test assumes `grcli port-mirror show br0 2` will return
a disabled record, but deletion may remove the session entirely; update the
assertion around the `grcli port-mirror show br0 2` call so it accepts either a
successful show reporting 'enabled: false' or a non-zero/empty output indicating
the session no longer exists. Modify the check that currently pipes to `grep -q
"enabled: false"` and `fail "session 2 not deleted"` to first capture the `grcli
port-mirror show br0 2` exit status/output (the `grcli port-mirror show`
invocation and the `grep -q "enabled: false"` check) and only fail if the
command succeeded but the output does not contain 'enabled: false'; treat a
non-zero exit or empty output as a valid deletion.

---

Nitpick comments:
In `@modules/l2/control/lldp_priv.h`:
- Around line 31-53: The lldp_config currently embeds a large array
neighbors[LLDP_MAX_NEIGHBORS] causing ~166KB per lldp_config; change struct
lldp_config to hold a pointer (struct lldp_neighbor *neighbors) and a count
(num_neighbors) instead of the fixed array, allocate the neighbors buffer
separately (e.g., in lldp_config_alloc using rte_zmalloc for neighbors) and free
it in the corresponding free function; update any code referencing
lldp_config.neighbors to handle the pointer (and ensure bounds checks against
num_neighbors/LLDP_MAX_NEIGHBORS) and keep existing symbols lldp_config,
lldp_neighbor, LLDP_MAX_NEIGHBORS, lldp_config_alloc, and rte_zmalloc as
reference points.

In `@modules/l2/control/rstp.c`:
- Around line 617-619: BPDU packets are created but never transmitted because
the code only logs and does "(void)bpdu"; replace that TODO by actually sending
the built BPDU via the existing control path: call the bridge's transmit API
(control_output or the module's equivalent transmit function) with the bpdu
buffer/mbuf, handle and log any errors, and only increment the BPDU transmit
counter on successful send; remove the "(void)bpdu" noop and ensure proper
ownership/freeing of the bpdu buffer after send or on error.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 16060d5 and 093f793.

⛔ Files ignored due to path filters (1)
  • docs/graph.svg is excluded by !**/*.svg
📒 Files selected for processing (88)
  • docs/meson.build
  • frr/if_grout.c
  • frr/rt_grout.c
  • frr/rt_grout.h
  • frr/zebra_dplane_grout.c
  • frr/zebra_dplane_grout.h
  • modules/infra/api/gr_infra.h
  • modules/infra/control/bond.c
  • modules/infra/control/bond_bridge_test.c
  • modules/infra/control/ctlplane.c
  • modules/infra/control/gr_bond.h
  • modules/infra/control/iface.c
  • modules/infra/control/meson.build
  • modules/infra/datapath/gr_rxtx.h
  • modules/l2/api/gr_l2.h
  • modules/l2/cli/dai.c
  • modules/l2/cli/dhcp_snooping.c
  • modules/l2/cli/fdb.c
  • modules/l2/cli/flood.c
  • modules/l2/cli/iface_security.c
  • modules/l2/cli/ipsg.c
  • modules/l2/cli/l2_stats.c
  • modules/l2/cli/lldp.c
  • modules/l2/cli/mcast_snooping.c
  • modules/l2/cli/meson.build
  • modules/l2/cli/port_mirror.c
  • modules/l2/cli/qos.c
  • modules/l2/cli/rstp.c
  • modules/l2/cli/storm_control.c
  • modules/l2/cli/vlan_filtering.c
  • modules/l2/cli/vxlan.c
  • modules/l2/control/bridge.c
  • modules/l2/control/dai.c
  • modules/l2/control/dai_priv.h
  • modules/l2/control/dhcp_snooping.c
  • modules/l2/control/dhcp_snooping_priv.h
  • modules/l2/control/fdb.c
  • modules/l2/control/flood.c
  • modules/l2/control/gr_l2_control.h
  • modules/l2/control/iface_security.c
  • modules/l2/control/ipsg.c
  • modules/l2/control/ipsg_priv.h
  • modules/l2/control/l2_hooks_test.c
  • modules/l2/control/l2_stats.c
  • modules/l2/control/lldp.c
  • modules/l2/control/lldp_priv.h
  • modules/l2/control/mcast_snooping.c
  • modules/l2/control/mcast_snooping_priv.h
  • modules/l2/control/mcast_snooping_test.c
  • modules/l2/control/meson.build
  • modules/l2/control/port_mirror.c
  • modules/l2/control/port_mirror_priv.h
  • modules/l2/control/port_mirror_test.c
  • modules/l2/control/qos.c
  • modules/l2/control/qos_priv.h
  • modules/l2/control/qos_test.c
  • modules/l2/control/rstp.c
  • modules/l2/control/rstp_priv.h
  • modules/l2/control/rstp_test.c
  • modules/l2/control/storm_control.c
  • modules/l2/control/storm_control_priv.h
  • modules/l2/control/storm_control_test.c
  • modules/l2/control/vlan_filtering.c
  • modules/l2/control/vlan_filtering_priv.h
  • modules/l2/control/vlan_filtering_test.c
  • modules/l2/control/vxlan.c
  • modules/l2/datapath/bridge_flood.c
  • modules/l2/datapath/bridge_input.c
  • modules/l2/datapath/l2_datapath.h
  • modules/l2/datapath/meson.build
  • modules/l2/datapath/vxlan_flood.c
  • modules/l2/datapath/vxlan_input.c
  • modules/l2/datapath/vxlan_output.c
  • modules/l4/gr_l4.h
  • modules/l4/l4_input_local.c
  • smoke/bridge_test.sh
  • smoke/dai_test.sh
  • smoke/dhcp_snooping_test.sh
  • smoke/evpn_vxlan_frr_test.sh
  • smoke/ipsg_test.sh
  • smoke/lldp_test.sh
  • smoke/mcast_snooping_test.sh
  • smoke/port_mirror_test.sh
  • smoke/qos_test.sh
  • smoke/rstp_test.sh
  • smoke/storm_control_test.sh
  • smoke/vlan_filtering_test.sh
  • smoke/vxlan_test.sh

Comment on lines +56 to +67
char *copy = strdup(src_list);
char *tok = strtok(copy, ",");
while (tok != NULL && req.num_sources < 16) {
src = iface_from_name(c, tok);
if (src != NULL) {
req.source_ports[req.num_sources++] = src->id;
free(src);
}
tok = strtok(NULL, ",");
}
free(copy);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Silent skip on invalid source interface name.

If a source interface name in the comma-separated list is invalid (typo, non-existent), iface_from_name returns NULL and the entry is silently skipped with no warning. The user gets a successful response with fewer sources than intended. Consider logging or returning an error.

Proposed fix
 		while (tok != NULL && req.num_sources < 16) {
 			src = iface_from_name(c, tok);
-			if (src != NULL) {
-				req.source_ports[req.num_sources++] = src->id;
-				free(src);
+			if (src == NULL) {
+				fprintf(stderr, "unknown source interface: %s\n", tok);
+				free(copy);
+				return CMD_ERROR;
 			}
+			req.source_ports[req.num_sources++] = src->id;
+			free(src);
 			tok = strtok(NULL, ",");
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
char *copy = strdup(src_list);
char *tok = strtok(copy, ",");
while (tok != NULL && req.num_sources < 16) {
src = iface_from_name(c, tok);
if (src != NULL) {
req.source_ports[req.num_sources++] = src->id;
free(src);
}
tok = strtok(NULL, ",");
}
free(copy);
}
char *copy = strdup(src_list);
char *tok = strtok(copy, ",");
while (tok != NULL && req.num_sources < 16) {
src = iface_from_name(c, tok);
if (src == NULL) {
fprintf(stderr, "unknown source interface: %s\n", tok);
free(copy);
return CMD_ERROR;
}
req.source_ports[req.num_sources++] = src->id;
free(src);
tok = strtok(NULL, ",");
}
free(copy);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/cli/port_mirror.c` around lines 56 - 67, The code silently skips
invalid interface names when parsing src_list: when iface_from_name(c, tok)
returns NULL the user is not informed and req.num_sources is quietly unchanged;
update the parsing in the port mirror setup (the block using strdup(src_list),
strtok, iface_from_name, req.source_ports and req.num_sources) to detect a NULL
return and either log a clear error/warning (including the bad token) or return
an error status to the caller instead of silently skipping; ensure the behavior
is consistent (choose logging vs returning) and that you free copy before
returning on error and do not overflow req.source_ports (keep the existing limit
check).

Comment on lines +93 to +103
if (vlan_list != NULL) {
char *copy = strdup(vlan_list);
char *tok = strtok(copy, ",");
while (tok != NULL && req.num_allowed_vlans < 256) {
unsigned long v = strtoul(tok, NULL, 10);
if (v > 0 && v <= 4094)
req.allowed_vlans[req.num_allowed_vlans++] = v;
tok = strtok(NULL, ",");
}
free(copy);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, locate the file and understand its structure
find . -name "vlan_filtering.c" -type f

Repository: DPDK/grout

Length of output: 125


🏁 Script executed:

# Read the relevant section of the file (lines 80-120)
wc -l modules/l2/cli/vlan_filtering.c

Repository: DPDK/grout

Length of output: 89


🏁 Script executed:

# Get context around the flagged lines
sed -n '80,120p' modules/l2/cli/vlan_filtering.c | cat -n

Repository: DPDK/grout

Length of output: 1645


🏁 Script executed:

# Find the function definition that contains these lines
sed -n '1,100p' modules/l2/cli/vlan_filtering.c | cat -n

Repository: DPDK/grout

Length of output: 3501


🏁 Script executed:

# Verify both occurrences and see more context
sed -n '88,125p' modules/l2/cli/vlan_filtering.c | cat -n

Repository: DPDK/grout

Length of output: 1542


Add NULL check after strdup to prevent crash on allocation failure

strdup can fail and return NULL on out-of-memory, causing strtok to crash with undefined behavior. Guard both the trunk and hybrid branches with NULL checks.

🔧 Proposed fix
char *copy = strdup(vlan_list);
+if (copy == NULL)
+	return CMD_ERROR;
char *tok = strtok(copy, ",");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (vlan_list != NULL) {
char *copy = strdup(vlan_list);
char *tok = strtok(copy, ",");
while (tok != NULL && req.num_allowed_vlans < 256) {
unsigned long v = strtoul(tok, NULL, 10);
if (v > 0 && v <= 4094)
req.allowed_vlans[req.num_allowed_vlans++] = v;
tok = strtok(NULL, ",");
}
free(copy);
}
if (vlan_list != NULL) {
char *copy = strdup(vlan_list);
if (copy == NULL)
return CMD_ERROR;
char *tok = strtok(copy, ",");
while (tok != NULL && req.num_allowed_vlans < 256) {
unsigned long v = strtoul(tok, NULL, 10);
if (v > 0 && v <= 4094)
req.allowed_vlans[req.num_allowed_vlans++] = v;
tok = strtok(NULL, ",");
}
free(copy);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/cli/vlan_filtering.c` around lines 93 - 103, The code calls
strdup(vlan_list) without checking its return; if strdup returns NULL,
subsequent strtok(copy, ",") will crash—add a NULL check after the strdup call
(e.g., if (copy != NULL) { ... }) so parsing into tok and pushing into
req.allowed_vlans (and calling free(copy)) only happens when copy is non-NULL;
apply the same NULL-checked strdup pattern in the other branch that handles
trunk/hybrid VLAN parsing so both branches (the vlan_list strdup usage) are
guarded.

Comment on lines +50 to +60
void iface_increment_mac_count(uint16_t iface_id, uint16_t lcore_id) {
if (iface_id >= L2_MAX_IFACES)
return;
l2_iface_mac_counts[iface_id][lcore_id].dynamic_macs++;
}

void iface_decrement_mac_count(uint16_t iface_id, uint16_t lcore_id) {
if (iface_id >= L2_MAX_IFACES)
return;
if (l2_iface_mac_counts[iface_id][lcore_id].dynamic_macs > 0)
l2_iface_mac_counts[iface_id][lcore_id].dynamic_macs--;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bounds-check lcore_id to avoid OOB writes

iface_increment_mac_count and iface_decrement_mac_count only check iface_id. If lcore_id is out of range, this writes past l2_iface_mac_counts, causing memory corruption.

🔧 Proposed fix
-	if (iface_id >= L2_MAX_IFACES)
+	if (iface_id >= L2_MAX_IFACES || lcore_id >= RTE_MAX_LCORE)
 		return;
@@
-	if (iface_id >= L2_MAX_IFACES)
+	if (iface_id >= L2_MAX_IFACES || lcore_id >= RTE_MAX_LCORE)
 		return;
#!/bin/bash
# Inspect call sites to confirm lcore_id is always a valid RTE lcore.
rg -n "iface_(increment|decrement)_mac_count" -C3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/control/bridge.c` around lines 50 - 60, Both
iface_increment_mac_count and iface_decrement_mac_count validate iface_id but
not lcore_id, allowing out-of-bounds accesses into l2_iface_mac_counts; add a
guard checking lcore_id against the second-dimension size before touching
l2_iface_mac_counts (e.g., ensure lcore_id < the configured lcore count constant
or ARRAY_SIZE(l2_iface_mac_counts[0]) / L2_MAX_LCORES) in both functions
(iface_increment_mac_count and iface_decrement_mac_count) and return early if
the check fails so all writes/reads use valid lcore indices.

Comment on lines +272 to +286
static struct api_out dhcp_snooping_set_cb(const void *request, struct api_ctx *) {
const struct gr_l2_dhcp_snooping_req *req = request;
const struct iface *bridge;

bridge = iface_from_id(req->bridge_id);
if (bridge == NULL || bridge->type != GR_IFACE_TYPE_BRIDGE)
return api_out(ENOENT, 0, NULL);

dhcp_snooping_enable(req->bridge_id, req->enabled);
dhcp_snooping_set_verify_mac(req->bridge_id, req->verify_mac);
dhcp_snooping_set_max_bindings(req->bridge_id, req->max_bindings);
dhcp_snooping_set_aging_time(req->bridge_id, req->aging_time);

return api_out(0, 0, NULL);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

dhcp_snooping_set_cb ignores the return value of dhcp_snooping_enable, which can fail with -ENOMEM.

If rte_hash_create fails inside dhcp_snooping_enable, the handler reports success to the client while DHCP snooping is left in a broken state (enabled flag set but no bindings hash).

Proposed fix
 static struct api_out dhcp_snooping_set_cb(const void *request, struct api_ctx *) {
 	const struct gr_l2_dhcp_snooping_req *req = request;
 	const struct iface *bridge;
+	int ret;
 
 	bridge = iface_from_id(req->bridge_id);
 	if (bridge == NULL || bridge->type != GR_IFACE_TYPE_BRIDGE)
 		return api_out(ENOENT, 0, NULL);
 
-	dhcp_snooping_enable(req->bridge_id, req->enabled);
+	ret = dhcp_snooping_enable(req->bridge_id, req->enabled);
+	if (ret < 0)
+		return api_out(-ret, 0, NULL);
 	dhcp_snooping_set_verify_mac(req->bridge_id, req->verify_mac);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/control/dhcp_snooping.c` around lines 272 - 286, The handler
dhcp_snooping_set_cb currently ignores the return value of dhcp_snooping_enable
which can fail (e.g., -ENOMEM); modify dhcp_snooping_set_cb to capture the
return value of dhcp_snooping_enable(req->bridge_id, req->enabled), and if it
returns a negative error, immediately return api_out(err, 0, NULL) (do not
continue to call dhcp_snooping_set_verify_mac, dhcp_snooping_set_max_bindings,
or dhcp_snooping_set_aging_time); ensure the error from dhcp_snooping_enable is
propagated to the client so the enabled state isn’t reported as successful when
initialization failed.

Comment on lines +353 to +362
static struct api_out dhcp_trusted_port_set_cb(const void *request, struct api_ctx *) {
const struct gr_l2_dhcp_trusted_port_req *req = request;

if (req->trusted)
dhcp_snooping_add_trusted_port(req->bridge_id, req->iface_id);
else
dhcp_snooping_del_trusted_port(req->bridge_id, req->iface_id);

return api_out(0, 0, NULL);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

dhcp_trusted_port_set_cb silently discards errors from add/del trusted port operations.

Both dhcp_snooping_add_trusted_port and dhcp_snooping_del_trusted_port return error codes (-EINVAL, -EEXIST, -ENOENT), but this handler always returns success. An invalid bridge_id or duplicate add would go unnoticed by the client.

Proposed fix
 static struct api_out dhcp_trusted_port_set_cb(const void *request, struct api_ctx *) {
 	const struct gr_l2_dhcp_trusted_port_req *req = request;
+	int ret;
 
 	if (req->trusted)
-		dhcp_snooping_add_trusted_port(req->bridge_id, req->iface_id);
+		ret = dhcp_snooping_add_trusted_port(req->bridge_id, req->iface_id);
 	else
-		dhcp_snooping_del_trusted_port(req->bridge_id, req->iface_id);
+		ret = dhcp_snooping_del_trusted_port(req->bridge_id, req->iface_id);
 
-	return api_out(0, 0, NULL);
+	return api_out(ret < 0 ? -ret : 0, 0, NULL);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/control/dhcp_snooping.c` around lines 353 - 362,
dhcp_trusted_port_set_cb currently ignores return values from
dhcp_snooping_add_trusted_port and dhcp_snooping_del_trusted_port and always
returns success; change it to capture the int rc from those calls, and if rc < 0
return a failure api_out reflecting the error (e.g., api_out(-1, rc, NULL) or
the project's existing pattern) so callers see -EINVAL/-EEXIST/-ENOENT; update
dhcp_trusted_port_set_cb to call the appropriate function, check rc, and only
return api_out(0,0,NULL) on success.

Comment on lines +52 to +84
if (set_attrs & GR_VXLAN_SET_ENCAP_VRF) {
uint16_t vrf = next->encap_vrf_id;
uint16_t old = cur->encap_vrf_id;

if (vrf == GR_VRF_ID_UNDEF)
vrf = vrf_default_get_or_create();

if (vrf != old && vrf_incref(vrf) < 0)
return -errno;

if (old != GR_VRF_ID_UNDEF)
vrf_decref(old);

cur->encap_vrf_id = vrf;
}

if (set_attrs & (GR_VXLAN_SET_VNI | GR_VXLAN_SET_ENCAP_VRF)) {
const struct vxlan_key next_key = {rte_cpu_to_be_32(next->vni), cur->encap_vrf_id};

if (rte_hash_lookup(vxlan_hash, &next_key) >= 0)
return errno_set(EADDRINUSE);

if (next->vni == 0 || next->vni > 0xffffff)
return errno_set(ERANGE);

rte_hash_del_key(vxlan_hash, &cur_key);

ret = rte_hash_add_key_data(vxlan_hash, &next_key, iface);
if (ret < 0)
return errno_log(-ret, "rte_hash_add_key_data");

cur->vni = next->vni;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, examine the full vxlan_reconfig function to understand the context
rg -B20 "GR_VXLAN_SET_ENCAP_VRF" modules/l2/control/vxlan.c | head -100

Repository: DPDK/grout

Length of output: 1517


🏁 Script executed:

# Find where vxlan_reconfig is called and how the flags are set
rg -n "vxlan_reconfig|GR_VXLAN_SET" modules/l2/control/vxlan.c

Repository: DPDK/grout

Length of output: 555


🏁 Script executed:

# Look for call sites that invoke vxlan_reconfig
rg -n "\.reconfig\s*=" modules/l2/control/vxlan.c -A5 -B5

Repository: DPDK/grout

Length of output: 476


🏁 Script executed:

# Check the entire vxlan.c file structure to understand the function
wc -l modules/l2/control/vxlan.c

Repository: DPDK/grout

Length of output: 84


🏁 Script executed:

# Check if vrf_incref can fail and what errors are possible
rg -n "vrf_incref" modules/infra/control/vrf.c -A3 -B1

Repository: DPDK/grout

Length of output: 191


🏁 Script executed:

# Verify the vrf_default_get_or_create behavior
rg -n "vrf_default_get_or_create" modules/infra/control/vrf.c -A10

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Check if there's any error handling for hash operations elsewhere
rg -n "rte_hash_add_key_data|rte_hash_del_key" --type c -A2 -B1 | head -80

Repository: DPDK/grout

Length of output: 4014


🏁 Script executed:

# Look at the full context of iface_vxlan_reconfig to see if there are other state mutations
sed -n '41,130p' modules/l2/control/vxlan.c

Repository: DPDK/grout

Length of output: 2709


🏁 Script executed:

# Check vrf_default_get_or_create to understand if it can fail
rg -n "vrf_default_get_or_create" modules/infra/control/vrf.c -A5

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Verify vrf_incref failure condition more carefully
rg -n "vrf_incref|get_vrf_iface" modules/infra/control/vrf.c -B2 -A5

Repository: DPDK/grout

Length of output: 940


🏁 Script executed:

# Look at how this reconfig function is called and what set_attrs combinations are realistic
rg -n "iface_vxlan_reconfig\|GR_VXLAN_SET" modules/l2/control/vxlan.c -B3 -A3 | grep -E "iface_vxlan_reconfig|IFACE_SET_ALL"

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Search for vrf_default_get_or_create to find where it's defined
rg -n "vrf_default_get_or_create" --type c

Repository: DPDK/grout

Length of output: 1033


🏁 Script executed:

# Check if there's any cleanup at the call site of iface_vxlan_reconfig
rg -n "iface_vxlan_reconfig\|\.reconfig" modules/infra/control/iface.c -A10 -B2 | head -60

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Look for the full function signature and error handling context
rg -n "iface_reconfig\|iface_type_dispatch" modules/infra/control/iface.c -B3 -A8 | head -100

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Get the implementation of vrf_default_get_or_create
sed -n '134,150p' modules/infra/control/iface.c

Repository: DPDK/grout

Length of output: 491


🏁 Script executed:

# Look at the iface_reconfig dispatch pattern
rg -n "\.reconfig" modules/infra/control/iface.c -B5 -A15 | head -80

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Check for error recovery patterns in interface reconfig
rg -n "reconfig.*errno|reconfig.*ret" modules/infra/control/iface.c -B2 -A2

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Confirm the cur_key is computed before ENCAP_VRF updates
sed -n '41,84p' modules/l2/control/vxlan.c | head -50

Repository: DPDK/grout

Length of output: 1223


🏁 Script executed:

# Check what happens if both flags are set and second fails
cat -n modules/l2/control/vxlan.c | sed -n '48,84p'

Repository: DPDK/grout

Length of output: 1297


🏁 Script executed:

# Check how the reconfig is called and if there's rollback/cleanup
rg -n "iface_reconfig\|\.reconfig" modules/infra/control/iface.c -C10 | head -150

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Verify what happens when reconfig returns error to the caller
rg -n "iface.*reconfig" modules/infra/control/iface.c | head -20

Repository: DPDK/grout

Length of output: 77


🏁 Script executed:

# Search for the dispatcher that calls .reconfig callbacks
rg -n "type->reconfig\|iface->type->reconfig" modules/infra/control/iface.c -B5 -A10

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Get the full iface_reconfig function
sed -n '290,350p' modules/infra/control/iface.c

Repository: DPDK/grout

Length of output: 1528


🏁 Script executed:

# Search more broadly for reconfig invocation
rg -n "reconfig" modules/infra/control/iface.c -B3 -A8

Repository: DPDK/grout

Length of output: 664


🏁 Script executed:

# Look for how vxlan_reconfig is called - check the API
rg -n "iface_reconfig" modules/l2/control/vxlan.c -B5 -A5

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Check the error handling path in iface_reconfig
sed -n '324,380p' modules/infra/control/iface.c

Repository: DPDK/grout

Length of output: 1403


🏁 Script executed:

# Look for the err label to see what cleanup happens
sed -n '360,400p' modules/infra/control/iface.c

Repository: DPDK/grout

Length of output: 1047


🏁 Script executed:

# Verify if vrf_decref is idempotent or what happens on double-call
sed -n '151,160p' modules/infra/control/vrf.c

Repository: DPDK/grout

Length of output: 266


🏁 Script executed:

# Look at the complete error path in iface_reconfig and see if there's any vxlan-specific cleanup
sed -n '290,375p' modules/infra/control/iface.c

Repository: DPDK/grout

Length of output: 1971


🏁 Script executed:

# Check if vrf_decref on a VRF that was just incremented in the same function would work
# and verify the actual failure case scenario
rg -n "rte_hash_add_key_data" modules/l2/control/ -B3 -A3

Repository: DPDK/grout

Length of output: 2256


Fix VXLAN hash/VRF state corruption on reconfig failures

The ENCAP_VRF/VNI update path mutates cur->encap_vrf_id and VRF refcounts before validating the new hash key. If rte_hash_add_key_data fails after the old key is deleted, the interface is orphaned from the hash with stale VRF refcounts, and no rollback occurs. Reorder to validate all inputs first, attempt the hash add before deleting the old key, and explicitly roll back VRF refs on failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/control/vxlan.c` around lines 52 - 84, The current VXLAN update
mutates cur->encap_vrf_id and adjusts VRF refcounts before validating the new
vxlan_key and before successfully adding it to vxlan_hash, which can orphan the
interface on failure; change the flow in the VXLAN update path so you first
validate next->vni and build the new vxlan_key (using
rte_cpu_to_be_32(next->vni) and next encap VRF), then try
rte_hash_lookup/rte_hash_add_key_data for the new key before calling
rte_hash_del_key for the old cur_key; only after rte_hash_add_key_data succeeds
update cur->vni and cur->encap_vrf_id, and if you must increment the new VRF
(vrf_incref) do so tentatively and on any later failure call vrf_decref to roll
back so vrf refcounts remain consistent (use vrf_incref/vrf_decref and handle
errors from rte_hash_add_key_data to restore state).

Comment on lines 83 to +103
if (rte_is_unicast_ether_addr(&eth->src_addr)
&& !(br->flags & GR_BRIDGE_F_NO_LEARN))
fdb_learn(bridge->id, d->iface->id, &eth->src_addr, d->vlan_id);
&& !(br->flags & GR_BRIDGE_F_NO_LEARN)) {
// Port security: check MAC limits before learning.
uint32_t max_macs = iface_get_max_macs(d->iface->id);
if (max_macs > 0) {
uint32_t cur = iface_get_total_macs(d->iface->id);
if (cur >= max_macs) {
struct bridge_stats *st = bridge_get_stats(
bridge->id, rte_lcore_id()
);
if (st != NULL)
st->learn_limit_iface++;
if (iface_get_shutdown_on_violation(d->iface->id))
iface_shutdown_violation(d->iface->id);
edge = BRIDGE_INVAL;
goto next;
}
}
vtep = (d->iface->type == GR_IFACE_TYPE_VXLAN) ? d->vtep : 0;
fdb_learn(bridge->id, d->iface->id, &eth->src_addr, d->vlan_id, vtep);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

MAC limit enforcement drops traffic for already‑learned MACs

When cur >= max_macs, packets are dropped before checking whether the source MAC is already learned on this interface. That blocks legitimate traffic and prevents last_seen refresh once the limit is reached. Enforce the limit only for new MACs.

🔧 Suggested fix
-			uint32_t max_macs = iface_get_max_macs(d->iface->id);
-			if (max_macs > 0) {
-				uint32_t cur = iface_get_total_macs(d->iface->id);
-				if (cur >= max_macs) {
+			uint32_t max_macs = iface_get_max_macs(d->iface->id);
+			if (max_macs > 0) {
+				const struct gr_fdb_entry *src_fdb =
+					fdb_lookup(bridge->id, &eth->src_addr, d->vlan_id);
+				if (src_fdb == NULL || src_fdb->iface_id != d->iface->id) {
+					uint32_t cur = iface_get_total_macs(d->iface->id);
+					if (cur >= max_macs) {
 						struct bridge_stats *st = bridge_get_stats(
 							bridge->id, rte_lcore_id()
 						);
 						if (st != NULL)
 							st->learn_limit_iface++;
 						if (iface_get_shutdown_on_violation(d->iface->id))
 							iface_shutdown_violation(d->iface->id);
 						edge = BRIDGE_INVAL;
 						goto next;
-				}
-			}
+					}
+				}
+			}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (rte_is_unicast_ether_addr(&eth->src_addr)
&& !(br->flags & GR_BRIDGE_F_NO_LEARN))
fdb_learn(bridge->id, d->iface->id, &eth->src_addr, d->vlan_id);
&& !(br->flags & GR_BRIDGE_F_NO_LEARN)) {
// Port security: check MAC limits before learning.
uint32_t max_macs = iface_get_max_macs(d->iface->id);
if (max_macs > 0) {
uint32_t cur = iface_get_total_macs(d->iface->id);
if (cur >= max_macs) {
struct bridge_stats *st = bridge_get_stats(
bridge->id, rte_lcore_id()
);
if (st != NULL)
st->learn_limit_iface++;
if (iface_get_shutdown_on_violation(d->iface->id))
iface_shutdown_violation(d->iface->id);
edge = BRIDGE_INVAL;
goto next;
}
}
vtep = (d->iface->type == GR_IFACE_TYPE_VXLAN) ? d->vtep : 0;
fdb_learn(bridge->id, d->iface->id, &eth->src_addr, d->vlan_id, vtep);
}
if (rte_is_unicast_ether_addr(&eth->src_addr)
&& !(br->flags & GR_BRIDGE_F_NO_LEARN)) {
// Port security: check MAC limits before learning.
uint32_t max_macs = iface_get_max_macs(d->iface->id);
if (max_macs > 0) {
const struct gr_fdb_entry *src_fdb =
fdb_lookup(bridge->id, &eth->src_addr, d->vlan_id);
if (src_fdb == NULL || src_fdb->iface_id != d->iface->id) {
uint32_t cur = iface_get_total_macs(d->iface->id);
if (cur >= max_macs) {
struct bridge_stats *st = bridge_get_stats(
bridge->id, rte_lcore_id()
);
if (st != NULL)
st->learn_limit_iface++;
if (iface_get_shutdown_on_violation(d->iface->id))
iface_shutdown_violation(d->iface->id);
edge = BRIDGE_INVAL;
goto next;
}
}
}
vtep = (d->iface->type == GR_IFACE_TYPE_VXLAN) ? d->vtep : 0;
fdb_learn(bridge->id, d->iface->id, &eth->src_addr, d->vlan_id, vtep);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/datapath/bridge_input.c` around lines 83 - 103, The code enforces
iface MAC limits before checking if the source MAC is already learned, causing
already-learned MACs to be dropped and their last_seen not refreshed; fix by
first checking the FDB for an existing entry for bridge->id / d->iface->id /
&eth->src_addr / d->vlan_id (use the existing FDB lookup function, e.g.
fdb_lookup or fdb_find in this module), and only if the lookup returns “not
found” apply iface_get_max_macs() / iface_get_total_macs() enforcement and the
shutdown/stats logic; if the entry exists, skip the limit check and call
fdb_learn(bridge->id, d->iface->id, &eth->src_addr, d->vlan_id, vtep) to refresh
last_seen.

ip4_addr_t vtep;
};

int trace_vxlan_format(char *buf, size_t len, const void *data, size_t data_len);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

trace_vxlan_format implementation has an unsafe snprintf-chaining pattern.

The implementation in modules/l2/datapath/vxlan_input.c (lines 29–35):

int n = snprintf(buf, len, "vni=%u", rte_be_to_cpu_32(t->vni));
if (t->vtep != 0)
    n += snprintf(buf + n, len - n, " vtep=" IP4_F, &t->vtep);

snprintf returns the number of chars that would have been written, which may be ≥ len on truncation. If that happens:

  • buf + n is past the end of the buffer → out-of-bounds write.
  • len - n: len is size_t, n is int; the subtraction wraps to ~UINT64_MAX → the second snprintf is handed a near-infinite size limit, potentially writing far beyond the buffer.

Real-world risk is low (the formatted string is at most ~35 bytes), but the arithmetic is wrong and will fire if the function is ever called with a small buffer. The idiomatic fix is to guard against truncation:

🐛 Proposed fix
 int trace_vxlan_format(char *buf, size_t len, const void *data, size_t /*data_len*/) {
 	const struct trace_vxlan_data *t = data;
-	int n = snprintf(buf, len, "vni=%u", rte_be_to_cpu_32(t->vni));
-	if (t->vtep != 0)
-		n += snprintf(buf + n, len - n, " vtep=" IP4_F, &t->vtep);
+	int n = snprintf(buf, len, "vni=%u", rte_be_to_cpu_32(t->vni));
+	if (t->vtep != 0 && n >= 0 && (size_t)n < len)
+		n += snprintf(buf + n, len - (size_t)n, " vtep=" IP4_F, &t->vtep);
 	return n;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/datapath/l2_datapath.h` at line 15, trace_vxlan_format in
modules/l2/datapath/vxlan_input.c currently chains snprintf calls unsafely: it
uses the first snprintf return (int n) directly to compute buf + n and len - n,
but snprintf returns the would-be length and can be >= len causing out-of-bounds
pointer arithmetic and size_t underflow. Fix by checking snprintf's return: if n
< 0 treat as 0/error; if (size_t)n >= len clamp n to (int)len (or set remaining
= 0) and only compute buf + n and remaining = len - (size_t)n when remaining >
0; only call the second snprintf when remaining > 0 and use size_t for remaining
to avoid signed/unsigned mix. Ensure these checks are applied in
trace_vxlan_format around the first snprintf and before the conditional snprintf
for t->vtep.

Comment on lines +61 to +83
vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr));
if (!(vh->vx_flags & VXLAN_FLAGS_VNI)) {
edge = BAD_FLAGS;
goto next;
}

vni = vxlan_decode_vni(vh->vx_vni);
if (vni != last_vni || vrf_id != last_vrf_id) {
iface = vxlan_get_iface(vni, vrf_id);
last_vrf_id = vrf_id;
last_vni = vni;
}
if (iface == NULL) {
edge = NO_TUNNEL;
goto next;
}

rte_pktmbuf_adj(m, sizeof(struct rte_udp_hdr) + sizeof(*vh));

d = iface_mbuf_data(m);
d->iface = iface;
d->vlan_id = 0;
d->vtep = src_vtep;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, examine the full function containing lines 61-83
wc -l modules/l2/datapath/vxlan_input.c

Repository: DPDK/grout

Length of output: 91


🏁 Script executed:

# Get context around the target lines to understand the function
sed -n '1,100p' modules/l2/datapath/vxlan_input.c | cat -n

Repository: DPDK/grout

Length of output: 3087


🏁 Script executed:

# Check if there are any existing length checks in the function
rg -n "data_len|pkt_len|rte_pktmbuf_data_len" modules/l2/datapath/vxlan_input.c -B2 -A2

Repository: DPDK/grout

Length of output: 266


🏁 Script executed:

# Search for how rte_pktmbuf_adj is used elsewhere in the codebase
rg -n "rte_pktmbuf_adj" --type c -B1 -A1 | head -50

Repository: DPDK/grout

Length of output: 2174


🏁 Script executed:

# Check what ip_local_mbuf_data does and how packets are validated
rg -n "ip_local_mbuf_data" modules/l2/datapath/ -B5 -A5

Repository: DPDK/grout

Length of output: 1464


🏁 Script executed:

# Look at the L4 input registration to understand packet preconditions
rg -n "l4_input_register_port" --type c -B3 -A3 | head -40

Repository: DPDK/grout

Length of output: 1978


🏁 Script executed:

# Check DPDK documentation or implementation notes for rte_pktmbuf_adj
rg -n "rte_pktmbuf_adj" modules/ --type c -B3 -A3 | grep -A3 -B3 "return\|NULL"

Repository: DPDK/grout

Length of output: 2014


🏁 Script executed:

# Search for similar VXLAN or UDP header parsing patterns in the codebase
rg -n "rte_vxlan_hdr|VXLAN" --type c -B3 -A3

Repository: DPDK/grout

Length of output: 21101


🏁 Script executed:

# Check if there are any packet length validations in earlier stages (IP/UDP processing)
fd "ip_local\.c" --type f | xargs cat -n | head -80

Repository: DPDK/grout

Length of output: 2513


🏁 Script executed:

# Check if there's any UDP payload length validation
rg -n "rte_be_to_cpu_16.*length|payload.*len|udp" modules/l4/ --type c -B2 -A2 | head -60

Repository: DPDK/grout

Length of output: 2735


🏁 Script executed:

# Check the actual rte_pktmbuf_adj signature and behavior in DPDK headers
fd "rte.*" modules/infra/api --type f | xargs rg -l "pktmbuf_adj" 2>/dev/null || echo "Not found in headers"

Repository: DPDK/grout

Length of output: 356


🏁 Script executed:

# Look for any validation of packet length before VXLAN header access in similar datapath files
rg -n "data_len|pkt_len.*<" modules/l2/datapath/ modules/infra/datapath/ --type c -B2 -A2

Repository: DPDK/grout

Length of output: 10891


🏁 Script executed:

# Check if ip_local_mbuf_data.len is used for validation before accessing offset payloads
rg -n "l->len|data->len" modules/ --type c -B1 -A1 | grep -A1 -B1 "sizeof\|offset"

Repository: DPDK/grout

Length of output: 463


Add length validation before VXLAN header access

The code accesses the VXLAN header without verifying the mbuf contains sufficient data. A truncated packet can cause out-of-bounds reads. Validate the packet size before rte_pktmbuf_mtod_offset and check the return value of rte_pktmbuf_adj.

🔧 Proposed fix
+		const uint16_t vxlan_min_len =
+			sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_hdr);
+		if (rte_pktmbuf_data_len(m) < vxlan_min_len) {
+			edge = BAD_FLAGS;
+			goto next;
+		}
 		vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr));
if (!(vh->vx_flags & VXLAN_FLAGS_VNI)) {
edge = BAD_FLAGS;
goto next;
}
vni = vxlan_decode_vni(vh->vx_vni);
if (vni != last_vni || vrf_id != last_vrf_id) {
iface = vxlan_get_iface(vni, vrf_id);
last_vrf_id = vrf_id;
last_vni = vni;
}
if (iface == NULL) {
edge = NO_TUNNEL;
goto next;
}
rte_pktmbuf_adj(m, sizeof(struct rte_udp_hdr) + sizeof(*vh));
d = iface_mbuf_data(m);
d->iface = iface;
d->vlan_id = 0;
d->vtep = src_vtep;
const uint16_t vxlan_min_len =
sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_hdr);
if (rte_pktmbuf_data_len(m) < vxlan_min_len) {
edge = BAD_FLAGS;
goto next;
}
vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr));
if (!(vh->vx_flags & VXLAN_FLAGS_VNI)) {
edge = BAD_FLAGS;
goto next;
}
vni = vxlan_decode_vni(vh->vx_vni);
if (vni != last_vni || vrf_id != last_vrf_id) {
iface = vxlan_get_iface(vni, vrf_id);
last_vrf_id = vrf_id;
last_vni = vni;
}
if (iface == NULL) {
edge = NO_TUNNEL;
goto next;
}
rte_pktmbuf_adj(m, sizeof(struct rte_udp_hdr) + sizeof(*vh));
d = iface_mbuf_data(m);
d->iface = iface;
d->vlan_id = 0;
d->vtep = src_vtep;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/l2/datapath/vxlan_input.c` around lines 61 - 83, The VXLAN path reads
the VXLAN header via rte_pktmbuf_mtod_offset and then calls rte_pktmbuf_adj
without validating mbuf lengths; add explicit length checks before accessing
struct rte_vxlan_hdr and before trimming the UDP+VXLAN headers: verify packet
length (using rte_pktmbuf_pkt_len and/or rte_pktmbuf_data_len) is at least
sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_hdr) and return/skip the
packet (set edge to BAD_FLAGS or NO_TUNNEL as appropriate) if not. After calling
rte_pktmbuf_adj ensure the adjustment succeeded (check the returned pointer or
remaining data length) and bail out safely if it failed before calling
iface_mbuf_data, vxlan_decode_vni, or vxlan_get_iface so you never read past the
mbuf bounds.

Comment on lines +27 to +28
grcli port-mirror del br0 2
grcli port-mirror show br0 2 | grep -q "enabled: false" || fail "session 2 not deleted"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Verify post-deletion show behavior assumption.

If grcli port-mirror show br0 2 exits non-zero (e.g. ENOENT) with no output after deletion, grep -q "enabled: false" fails on empty input and fail "session 2 not deleted" fires — masking a correct deletion with a misleading error message. This assertion holds only if the CLI keeps the session record in a disabled state rather than removing it entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smoke/port_mirror_test.sh` around lines 27 - 28, The test assumes `grcli
port-mirror show br0 2` will return a disabled record, but deletion may remove
the session entirely; update the assertion around the `grcli port-mirror show
br0 2` call so it accepts either a successful show reporting 'enabled: false' or
a non-zero/empty output indicating the session no longer exists. Modify the
check that currently pipes to `grep -q "enabled: false"` and `fail "session 2
not deleted"` to first capture the `grcli port-mirror show br0 2` exit
status/output (the `grcli port-mirror show` invocation and the `grep -q
"enabled: false"` check) and only fail if the command succeeded but the output
does not contain 'enabled: false'; treat a non-zero exit or empty output as a
valid deletion.

@christophefontaine
Copy link
Collaborator

Lot of good stuff !
We may want to create individual PRs for each feature though :)
This will ease the review. Thanks a lot !

@fabiendupont
Copy link
Author

Wrong branch — this included too many features. Will submit individual feature branches instead.

@fabiendupont fabiendupont deleted the l2-switching-support branch February 27, 2026 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants