l2: add advanced switching features#515
Conversation
Introduce the VXLAN interface type for the L2 module. A VXLAN interface carries a VNI (VXLAN Network Identifier), a local VTEP address used as the outer IP source, an encapsulation VRF for underlay routing, and a configurable UDP destination port (default 4789). VXLAN interfaces are keyed by (VNI, encap_vrf_id) in a lockfree RCU-protected hash table so that the datapath can resolve incoming tunneled packets to the correct interface without locks. VXLAN interfaces are intended to be attached to a bridge domain. All L2 traffic entering the bridge is forwarded transparently over the VXLAN tunnel. The local VTEP address must already be configured in the encapsulation VRF. Signed-off-by: Robin Jarry <rjarry@redhat.com>
VXLAN uses UDP port 4789 by default but allows configuring a custom destination port per interface. Allow the control plane to register additional UDP ports at runtime as aliases for an already registered port, reusing the same datapath edge. Use reference counting so that multiple interfaces sharing the same non-default port do not interfere with each other during teardown. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Wire up the VXLAN interface's configurable destination port to the L4 input node. When a non-default port is configured, register it as an alias for the standard VXLAN port (4789) so that the datapath delivers matching UDP packets to the vxlan_input node. Unregister the alias when the port changes or the interface is destroyed. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Introduce a transport-agnostic flood list framework for BUM traffic (Broadcast, Unknown unicast, Multicast). In EVPN, each PE maintains a flooding list built from IMET routes (RFC 8365, RFC 9572). The entries in this list differ depending on the overlay encapsulation: VXLAN uses a remote VTEP IPv4 address and a VNI, while SRv6 would use a 128-bit SID. The API defines a gr_flood_entry structure with a type discriminant and a union, allowing future encapsulation types (e.g. SRv6 SIDs) to be added without changing the API request types. A dispatch layer in control/flood.c routes add/del/list operations to type-specific callbacks registered at init time. Implement the VXLAN VTEP flood type (GR_FLOOD_T_VTEP). Each VXLAN interface maintains a per-VNI array of remote VTEP addresses used by the vxlan_flood datapath node for ingress replication. The array is replaced atomically with an RCU synchronization barrier so that the datapath never sees a partially updated list. CLI commands are exposed under "flood vtep add/del/show". Add new generated grcli-flood(1) man page. Signed-off-by: Robin Jarry <rjarry@redhat.com>
In a VXLAN overlay, the bridge needs to know which remote VTEP to use when sending unicast frames to a learned MAC address. Add a VTEP IPv4 address field to FDB entries so that known unicast traffic can be sent directly to the correct tunnel endpoint instead of being flooded to all VTEPs. When bridge_input learns a MAC address from a VXLAN member interface, it records the source VTEP from the decapsulated packet's outer IP header. When forwarding to a known destination, the stored VTEP address is passed to the output path via the mbuf private data so that vxlan_output can build the correct outer header. Only set the VTEP field when the source interface is actually a VXLAN type to avoid storing uninitialized data from other packet paths (control plane, local bridge traffic). Signed-off-by: Robin Jarry <rjarry@redhat.com>
Add three datapath nodes for VXLAN packet processing. vxlan_input decapsulates incoming UDP/4789 packets. It strips the outer UDP and VXLAN headers, resolves the inner VNI to a VXLAN interface via the RCU-protected hash table, records the source VTEP from the outer IP header into the mbuf private data, and forwards the inner Ethernet frame to iface_input for bridge processing. vxlan_output encapsulates outgoing frames for a known destination VTEP. It prepends a pre-built IP/UDP/VXLAN header template initialized by the control plane, fills in the per-packet fields (destination VTEP, UDP length, IP length, checksum), and hashes the inner flow to select an ephemeral source port for underlay ECMP (RFC 7348 Section 5). The FIB lookup for the outer IP uses the encapsulation VRF, not the bridge domain. vxlan_flood handles BUM traffic by replicating the frame to every VTEP in the flood list via ingress replication. The original mbuf is sent to the first VTEP and clones are created for the rest. The bridge_flood node is updated to steer VXLAN member traffic through vxlan_flood instead of direct iface_output. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Set up a VXLAN overlay between grout and a Linux netns peer. Grout runs a bridge with a VXLAN member (VNI 100) and the Linux side mirrors the topology with a kernel VXLAN device enslaved to a Linux bridge. Both sides have flood lists configured with each other's VTEP address for BUM traffic replication. The test verifies L3 connectivity over the tunnel by having the Linux side ping the bridge address. This exercises the full path: ARP resolution over VXLAN, FDB learning from decapsulated traffic, and ICMP echo reply via the VXLAN output encapsulation. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Report bridge interfaces to FRR as ZEBRA_IF_BRIDGE with their MAC address. Tag members with ZEBRA_IF_SLAVE_BRIDGE and propagate the bridge ifindex so that FRR can associate them with the correct master. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Report VXLAN interfaces to FRR's zebra as ZEBRA_IF_VXLAN with the associated L2 VNI information. This allows FRR's EVPN control plane to discover which VNIs are locally configured and advertise them via BGP IMET routes to remote PEs. The VXLAN L2 info includes the VNI, the local VTEP address, and the underlay interface index so that zebra can correlate the tunnel with the correct underlay routing context. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Synchronize bridge FDB entries bidirectionally between grout and FRR. This is required for EVPN to advertise locally learned MAC addresses via BGP type-2 routes and to install remotely learned MACs into the bridge forwarding table. Zebra's dplane API is asymmetric for MAC/FDB entries. In the downward direction (zebra to dplane provider), zebra uses DPLANE_OP_MAC_INSTALL and DPLANE_OP_MAC_DELETE to push MACs into the dataplane. In the upward direction (dplane provider notifying zebra of learned MACs), DPLANE_OP_NEIGH_INSTALL and DPLANE_OP_NEIGH_DELETE must be used instead. These go through zebra_neigh_macfdb_update() which calls zebra_vxlan_local_mac_add_update() and ultimately triggers BGP EVPN type-2 route advertisement. By contrast, the DPLANE_OP_MAC_* result handler (zebra_vxlan_handle_result) is a no-op. Despite the NEIGH op name, the context payload uses the macinfo union member and is populated with dplane_ctx_mac_set_*() accessors, exactly like zebra's own netlink provider does in netlink_macfdb_change(). Unlike routes and nexthops which use higher-level zebra APIs that resolve the namespace from the VRF ID, the FDB notification path looks up interfaces via if_lookup_by_index_per_ns(ns_id, ifindex). GROUT_NS must therefore be set on the dplane context for the interface lookup to succeed. Function names follow zebra's rt_netlink.c naming conventions: grout_macfdb_change() for the upward notification path (like netlink_macfdb_change) and grout_macfdb_update_ctx() for the downward install path (like netlink_macfdb_update_ctx). Self-event suppression is enabled on the FDB event subscriptions to prevent feedback loops when FRR installs a MAC that was originally learned by grout. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Handle DPLANE_OP_VTEP_ADD and DPLANE_OP_VTEP_DELETE operations from FRR's EVPN control plane. When BGP learns a remote VTEP via an IMET route (EVPN type-3), zebra pushes the VTEP to the dataplane provider. The grout_vxlan_flood_update_ctx() function (named after zebra's netlink_vxlan_flood_update_ctx() in rt_netlink.c) translates these operations into GR_FLOOD_ADD/DEL requests with GR_FLOOD_T_VTEP type. This is a downward-only path: zebra pushes flood list entries to the dplane provider. There is no upward notification for VTEP flood list changes since grout does not learn VTEPs on its own, they are always provided by FRR's BGP EVPN control plane. This allows BGP EVPN to dynamically manage the per-VNI flood lists used for BUM traffic ingress replication, replacing the need for static flood list configuration via the CLI. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Set up a full EVPN/VXLAN topology between FRR+grout and a standalone FRR+Linux peer. Each side runs a bridge with a VXLAN member (VNI 100) and a host namespace. Both peers run iBGP with the l2vpn evpn address-family and advertise-all-vni. The test verifies that EVPN type-3 (IMET) routes are exchanged so that both sides install each other's VTEP in their flood lists. It then verifies end-to-end L2 connectivity by pinging between the two host namespaces through the VXLAN overlay, which exercises type-2 (MAC/IP) route advertisement and FDB synchronization. Signed-off-by: Robin Jarry <rjarry@redhat.com>
Add bridge-aware behavior to bond interfaces for proper interaction with L2 bridge domains. When a bond is added to a bridge, it tracks the minimum number of active member links required to be considered operationally up. This prevents forwarding loops and packet loss when insufficient links are available. The bond_bridge_integration structure tracks bridge membership and the minimum active links threshold. When active members drop below the threshold, the bond transitions to DOWN state and triggers events so that protocols like RSTP can react and prevent forwarding through unstable bonds. Also trigger gratuitous ARP via GR_EVENT_IFACE_STATUS_UP when the active member changes in active-backup mode. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Extend upstream's L2 bridge infrastructure with hook points for advanced features that will be added in subsequent commits: RSTP, port security, storm control, VLAN filtering, multicast snooping, port mirroring, QoS, DHCP snooping, DAI, IPSG, and LLDP. Add optional feature pointers (rstp, mcast_snoop, vlan_filter, lldp) to iface_info_bridge so that features can be enabled per-bridge. Add per-core bridge and FDB statistics arrays, and per-interface security state (MAC limits, shutdown-on-violation). In the bridge_input datapath, check RSTP port learning state and port security shutdown/MAC limits before allowing source MAC learning. In bridge_flood, skip ports not in RSTP forwarding state. Define API request types for all planned features using request IDs in the 0x0020-0x00B5 range, which do not conflict with upstream's FDB (0x0001-0x0006) or flood list (0x0011-0x0013) request IDs. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement API handlers and CLI for per-bridge statistics. The stats aggregate per-core counters for forwarding (unicast, broadcast, multicast, flood), drops (no FDB, hairpin, interface down), learning (learned, updated, failed, skipped, MAC limits), RSTP (blocking drop, learn skip), and FDB (lookup hit/miss, aged entries). The CLI uses bridge interface names consistent with upstream's convention: "bridge stats <name>" to display and "bridge stats reset <name>" to clear counters. Extend bridge_test.sh smoke test with stats verification. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement port security for bridge member interfaces. Each interface can be configured with a maximum number of learned MAC addresses. When the limit is reached, new MACs are rejected. Optionally, the interface can be shut down on violation and later re-enabled. The datapath hooks in bridge_input already enforce these limits (added in the extension hooks commit). This commit adds the control plane API handlers for configuring and querying security state, and the CLI under "port-security" with set, show, and reenable commands. Extend bridge_test.sh smoke test with port security verification. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement Rapid Spanning Tree Protocol (IEEE 802.1D-2004) control plane with bridge-level and per-port configuration. RSTP state is stored as an optional pointer on the bridge info structure, allocated on demand when RSTP is enabled via the API. The port role selection algorithm compares priority vectors received in BPDUs to elect a root bridge, assign root/designated/alternate roles, and drive the port state machine through discarding, learning and forwarding states. Edge ports (PortFast) skip the forward delay. BPDU Guard and Root Guard protection features are supported per port. Bridge control plane hooks in bridge_input and bridge_flood gate learning and forwarding based on RSTP port state. When RSTP is not configured, ports default to forwarding. API handlers expose bridge-level RSTP configuration (priority, hello time, forward delay, max age) and per-port parameters (priority, path cost, edge/guard flags). Topology change detection triggers TC propagation timers on non-edge port state transitions. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge VLAN filtering with three port modes: access (single untagged VLAN), trunk (multiple tagged VLANs with optional native VLAN), and hybrid (mix of tagged and untagged). The filtering context is allocated on demand when VLAN filtering is enabled on a bridge and freed when disabled or when the bridge is destroyed. VLAN membership is tracked per port using a 4096-bit bitmap for O(1) ingress and egress checks. In access mode, only the configured VLAN is allowed and frames are always untagged on egress. In trunk and hybrid modes, a configurable set of VLANs is allowed, with the native VLAN untagged on egress. Ingress and egress VLAN translation (1:1 mapping) and Q-in-Q (802.1ad) outer S-VLAN tagging are supported per port. Per-core statistics track filtered frames, PVID additions, tag removals, translations and Q-in-Q operations. API handlers cover enabling/disabling filtering per bridge, port mode configuration, VLAN translation, Q-in-Q and statistics retrieval. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-interface rate limiting for broadcast, multicast and unknown unicast traffic using DPDK two-rate three-color markers (RFC 2698). Each traffic type has an independent meter with configurable rate in kbps or packets per second. When a meter exceeds its committed information rate, packets are dropped and a per-type violation counter increments. If shutdown on violation is enabled, the interface is disabled after a configurable number of consecutive violations. A reenable command allows recovery. Per-core statistics track passed and dropped packets for each traffic type, plus shutdown events. Configuration and metering state are stored in per-interface arrays indexed by interface ID and lcore ID for lock-free datapath access. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement session-based port mirroring with up to 8 concurrent sessions per bridge. Each session maps a set of source ports to a single destination port with configurable direction (ingress, egress or both). Source ports are stored in a gr_vec per session and deduplicated on insertion. RSPAN (Remote Switched Port Analyzer) is supported by tagging mirrored frames with a configurable VLAN ID for transport across trunk links. Per-session packet filters can match on ethertype, source MAC, destination MAC and a set of VLAN IDs. The filter is evaluated in the datapath before cloning to avoid unnecessary packet copies. Per-core statistics track mirrored packets, drops, filter matches and clone failures. Session state and statistics are stored in per-bridge arrays for lock-free datapath access. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge multicast snooping with IGMP (IPv4) and MLD (IPv6) support. The Multicast Database stores group memberships as MAC-keyed entries in an rte_hash table, with each entry tracking a vector of member port IDs and a last-seen timestamp. IGMP membership reports add the reporting port to the corresponding multicast group. Leave messages remove the port, and the entry is deleted when no ports remain. Static MDB entries bypass aging and persist until explicitly removed. Periodic aging removes learned entries older than the configurable aging time (default 260 seconds). The querier mode, when enabled, sends general queries at configurable intervals to solicit fresh membership reports from hosts. IP-to-MAC conversion follows RFC 1112 for IPv4 (01:00:5E prefix with low 23 bits of group address) and RFC 2464 for IPv6 (33:33 prefix with low 32 bits). API handlers cover snooping configuration, MDB entry management (add/del/list) and multicast group membership queries. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-interface Quality of Service with 8 priority queues mapped to IEEE 802.1p Class of Service values. Three scheduling modes are supported: strict priority, weighted round-robin (WRR) and deficit weighted round-robin (DWRR). Packet classification extracts priority from 802.1p CoS in VLAN tags or from IP DSCP fields using a configurable 64-entry mapping table. A configurable default priority applies when neither CoS nor DSCP trust is enabled. Per-queue rate limiting uses DPDK two-rate three-color meters with configurable committed information rate. A port-level shaper caps aggregate throughput across all queues. CoS remarking applies an output CoS mapping before egress. Per-core statistics track classification, remarking, per-queue drops and transmits, and port shaper drops. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge DHCP snooping with a binding table that maps client MAC addresses to assigned IPs, ports and VLANs. Bindings are stored in an rte_hash table keyed by MAC for O(1) datapath lookups. Trusted ports accept DHCP server responses (OFFER, ACK, NAK) while untrusted ports only forward client messages. Source MAC verification rejects packets where the Ethernet source MAC does not match the DHCP client hardware address. Bindings can be learned dynamically from DHCP ACK messages or configured statically. Dynamic bindings expire when their lease runs out or after a configurable aging time. A per-bridge maximum binding limit prevents exhaustion. The binding table is shared with DAI (Dynamic ARP Inspection) and IPSG (IP Source Guard) for source address validation. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge Dynamic ARP Inspection that validates ARP packets against DHCP snooping bindings. Configurable validation checks include sender MAC versus Ethernet source MAC, target MAC versus Ethernet destination MAC, and sender IP presence in the DHCP binding table. Per-port trust state allows DHCP servers and uplinks to bypass inspection. Optional per-port ARP rate limiting uses DPDK token bucket meters to prevent ARP storms on untrusted ports. Per-core statistics track ARP requests, replies, valid packets and drops categorized by violation type (MAC mismatch, IP not bound, rate limit exceeded, trusted port bypass). Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge IP Source Guard that validates IPv4 source addresses against DHCP snooping bindings. When enabled, packets with source IPs not found in the binding table are dropped. Configuration supports per-bridge enable/disable, source IP verification toggle, and violation logging. IPSG works in conjunction with DHCP snooping which maintains the binding table of authorized MAC-to-IP mappings. Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implement per-bridge Link Layer Discovery Protocol configuration with neighbor table management. LLDP state is stored as an optional pointer on the bridge info structure, allocated when LLDP is enabled and freed on disable or bridge destruction. The neighbor table holds up to 128 entries per bridge, indexed by chassis ID and interface. Entries are added or updated when LLDP frames are received and aged out when the advertised TTL expires. Each entry stores the mandatory TLVs (chassis ID, port ID, TTL) and optional descriptive TLVs (port description, system name, system description). Configuration supports per-bridge TX interval and TTL with IEEE 802.1AB defaults (30 seconds and 120 seconds respectively). Signed-off-by: Fabien Dupont <fdupont@redhat.com>
📝 WalkthroughWalkthroughThis pull request introduces comprehensive L2 bridging features including DHCP snooping, Dynamic ARP Inspection, IP Source Guard, RSTP, QoS, port mirroring, VLAN filtering, LLDP, multicast snooping, and storm control. It adds VXLAN interface support with flooding and flood VTEP management, integrates FRR MAC FDB and flood updates into the datapath, and extends bond-to-bridge integration with active link thresholds. The changes include CLI command handlers, control-plane API endpoints, datapath processing nodes, configuration data structures, statistics tracking, and comprehensive unit/smoke tests across the system. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 20
🧹 Nitpick comments (2)
modules/l2/control/lldp_priv.h (1)
31-53:struct lldp_configembeds ~166 KB of neighbor storage inline.Each
lldp_neighboris ~1.3 KB, soneighbors[128]insidelldp_configmakes each allocation ~166 KB. This works sincelldp_config_allocusesrte_zmalloc, but it's worth being aware of if multiple bridges enable LLDP.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/l2/control/lldp_priv.h` around lines 31 - 53, The lldp_config currently embeds a large array neighbors[LLDP_MAX_NEIGHBORS] causing ~166KB per lldp_config; change struct lldp_config to hold a pointer (struct lldp_neighbor *neighbors) and a count (num_neighbors) instead of the fixed array, allocate the neighbors buffer separately (e.g., in lldp_config_alloc using rte_zmalloc for neighbors) and free it in the corresponding free function; update any code referencing lldp_config.neighbors to handle the pointer (and ensure bounds checks against num_neighbors/LLDP_MAX_NEIGHBORS) and keep existing symbols lldp_config, lldp_neighbor, LLDP_MAX_NEIGHBORS, lldp_config_alloc, and rte_zmalloc as reference points.modules/l2/control/rstp.c (1)
617-619: BPDUs are built but never transmitted — RSTP is effectively non-functional on the wire.The TODO at line 617 means no BPDU will ever leave the bridge, so peers won't participate in spanning-tree negotiation. This is fine for an in-progress feature, but should be tracked to avoid shipping a silently broken RSTP.
Would you like me to open an issue to track the BPDU transmission implementation?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/l2/control/rstp.c` around lines 617 - 619, BPDU packets are created but never transmitted because the code only logs and does "(void)bpdu"; replace that TODO by actually sending the built BPDU via the existing control path: call the bridge's transmit API (control_output or the module's equivalent transmit function) with the bpdu buffer/mbuf, handle and log any errors, and only increment the BPDU transmit counter on successful send; remove the "(void)bpdu" noop and ensure proper ownership/freeing of the bpdu buffer after send or on error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modules/l2/cli/port_mirror.c`:
- Around line 56-67: The code silently skips invalid interface names when
parsing src_list: when iface_from_name(c, tok) returns NULL the user is not
informed and req.num_sources is quietly unchanged; update the parsing in the
port mirror setup (the block using strdup(src_list), strtok, iface_from_name,
req.source_ports and req.num_sources) to detect a NULL return and either log a
clear error/warning (including the bad token) or return an error status to the
caller instead of silently skipping; ensure the behavior is consistent (choose
logging vs returning) and that you free copy before returning on error and do
not overflow req.source_ports (keep the existing limit check).
In `@modules/l2/cli/vlan_filtering.c`:
- Around line 93-103: The code calls strdup(vlan_list) without checking its
return; if strdup returns NULL, subsequent strtok(copy, ",") will crash—add a
NULL check after the strdup call (e.g., if (copy != NULL) { ... }) so parsing
into tok and pushing into req.allowed_vlans (and calling free(copy)) only
happens when copy is non-NULL; apply the same NULL-checked strdup pattern in the
other branch that handles trunk/hybrid VLAN parsing so both branches (the
vlan_list strdup usage) are guarded.
In `@modules/l2/control/bridge.c`:
- Around line 50-60: Both iface_increment_mac_count and
iface_decrement_mac_count validate iface_id but not lcore_id, allowing
out-of-bounds accesses into l2_iface_mac_counts; add a guard checking lcore_id
against the second-dimension size before touching l2_iface_mac_counts (e.g.,
ensure lcore_id < the configured lcore count constant or
ARRAY_SIZE(l2_iface_mac_counts[0]) / L2_MAX_LCORES) in both functions
(iface_increment_mac_count and iface_decrement_mac_count) and return early if
the check fails so all writes/reads use valid lcore indices.
In `@modules/l2/control/dhcp_snooping.c`:
- Around line 353-362: dhcp_trusted_port_set_cb currently ignores return values
from dhcp_snooping_add_trusted_port and dhcp_snooping_del_trusted_port and
always returns success; change it to capture the int rc from those calls, and if
rc < 0 return a failure api_out reflecting the error (e.g., api_out(-1, rc,
NULL) or the project's existing pattern) so callers see -EINVAL/-EEXIST/-ENOENT;
update dhcp_trusted_port_set_cb to call the appropriate function, check rc, and
only return api_out(0,0,NULL) on success.
- Around line 272-286: The handler dhcp_snooping_set_cb currently ignores the
return value of dhcp_snooping_enable which can fail (e.g., -ENOMEM); modify
dhcp_snooping_set_cb to capture the return value of
dhcp_snooping_enable(req->bridge_id, req->enabled), and if it returns a negative
error, immediately return api_out(err, 0, NULL) (do not continue to call
dhcp_snooping_set_verify_mac, dhcp_snooping_set_max_bindings, or
dhcp_snooping_set_aging_time); ensure the error from dhcp_snooping_enable is
propagated to the client so the enabled state isn’t reported as successful when
initialization failed.
In `@modules/l2/control/flood.c`:
- Around line 30-42: The code indexes flood_types with unvalidated user input
req->entry.type in flood_add (and similarly in the flood_delete/flood_get
paths), which can cause out-of-bounds access; fix by validating the type before
using it: check req and req->entry.type against the valid range (e.g., 0 <= type
< NUM_FLOOD_TYPES or whatever the flood_types size symbol is) and return
api_out(EINVAL or EAFNOSUPPORT, ...) if invalid, then proceed to load ops =
flood_types[type] and check ops->add as currently done; apply the same
validation pattern to the other functions that access flood_types (the functions
handling lines ~50-62).
In `@modules/l2/control/gr_l2_control.h`:
- Around line 60-70: Both bridge_get_stats and fdb_get_stats are missing a
bounds check for lcore_id and may index l2_bridge_stats or l2_fdb_stats out of
bounds; add a check in each function to ensure lcore_id < RTE_MAX_LCORE (similar
to storm_control_get_stats/qos_get_stats) and return NULL if the check fails so
the functions validate both bridge_id and lcore_id before returning
&l2_bridge_stats[bridge_id][lcore_id] or &l2_fdb_stats[bridge_id][lcore_id].
In `@modules/l2/control/iface_security.c`:
- Around line 59-76: The reenable handler iface_security_reenable can silently
return success when req->iface_id is >= L2_MAX_IFACES because the
shutdown-clearing branch is skipped; add an explicit range check for
req->iface_id (compare against L2_MAX_IFACES) and return api_out(ERANGE, 0,
NULL) to match the behavior of the corresponding set handler, before accessing
l2_iface_security[req->iface_id] or calling iface_from_id, so that
iface_security_reenable and the set handler are consistent; update references to
l2_iface_security and iface_security_reenable accordingly.
In `@modules/l2/control/ipsg.c`:
- Around line 28-38: The code indexes ipsg_configs with req->bridge_id without
verifying it is within L2_MAX_BRIDGES; add a bounds check (e.g., if
(req->bridge_id >= L2_MAX_BRIDGES) return api_out(ENOENT, 0, NULL);) before any
access to ipsg_configs in the allocation/lookup block and also at the start of
ipsg_config_get_cb so all uses of ipsg_configs[req->bridge_id] are protected;
reference the ipsg_configs array, iface_from_id call and the ipsg_config_get_cb
function when applying this change.
In `@modules/l2/control/mcast_snooping_test.c`:
- Around line 26-34: The test constructs ip4_addr_t in host byte order which
masks an endianness bug in mcast_ip_to_mac; update the test so ip4 is created in
network byte order (big-endian) using rte_cpu_to_be_32 (or equivalent) when
building the 224.1.2.3 value, keep the rest of the assertions the same to
validate mcast_ip_to_mac correctly handles network-order ip4_addr_t.
In `@modules/l2/control/mcast_snooping.c`:
- Around line 78-96: The IPv4-to-MAC conversion in mcast_ip_to_mac is
endian-unsafe because it shifts a network-order ip4_addr_t on the host integer
representation; fix mcast_ip_to_mac by treating the IPv4 address as a byte array
in network (big-endian) order and map bytes directly into struct rte_ether_addr:
use a const uint8_t *ip4bytes = (const uint8_t *)ip and set mac->addr_bytes[3] =
ip4bytes[1] & 0x7F, mac->addr_bytes[4] = ip4bytes[2], mac->addr_bytes[5] =
ip4bytes[3] (keep IPv6 branch unchanged), ensuring correct behavior on
little-endian and big-endian hosts and aligning with the multicast mapping
rules.
In `@modules/l2/control/port_mirror.c`:
- Around line 279-303: mirror_session_get_cb sets resp->num_sources to the full
vector length but only copies up to 16 entries, causing clients to believe more
ports were returned than exist; fix by capping resp->num_sources to the response
array capacity (e.g., uint16_t cap = min(gr_vec_len(session.source_ports),
(size_t)16)) and use that capped value for the copy loop and resp->num_sources,
or alternatively return an error when the source_ports length exceeds the
response capacity; update the logic around gr_vec_len(session.source_ports),
resp->num_sources, and the for loop that copies session.source_ports so they
consistently use the same capped count.
In `@modules/l2/control/qos.c`:
- Around line 55-61: qos_port_set currently treats array entries of 0 as "unset"
and overwrites user mappings; instead add explicit boolean flags (e.g., in the
qos_port_config struct: bool dscp_map_set and bool cos_map_set) to record when a
user has configured maps, set those flags in qos_dscp_map_set and
qos_cos_remap_set, and change qos_port_set to skip replacing cfg->dscp_to_cos
and cfg->cos_to_cos when the corresponding flag is true; use the flags rather
than sentinel-value checks on cfg->dscp_to_cos and cfg->cos_to_cos to avoid
overwriting legitimate mappings to 0.
In `@modules/l2/control/rstp_priv.h`:
- Around line 224-228: The macros for constructing and extracting RSTP port IDs
use an 8+8 layout but IEEE 802.1D-2004 requires a 4-bit priority + 12-bit port
number. Update RSTP_PORT_ID to shift the priority left 12 bits and mask the port
number with 0x0FFF, and update RSTP_PORT_ID_PRIORITY and RSTP_PORT_ID_NUM to
extract using >>12 and &0x0FFF respectively (use appropriate uint8_t/uint16_t
casts as in the originals), referencing the macros RSTP_PORT_ID,
RSTP_PORT_ID_PRIORITY, and RSTP_PORT_ID_NUM.
- Line 182: The ports field in the rstp_priv struct is missing the gr_vec
annotation; update the declaration of the ports field (the "ports" member in
struct rstp_priv) to include the gr_vec annotation used by other vector fields
(as seen in port_mirror_priv.h, mcast_snooping_priv.h, dhcp_snooping_priv.h) so
it correctly reflects that it is managed by gr_vec_add(), gr_vec_free(),
gr_vec_del(), and gr_vec_len(); ensure the annotation placement matches the
project convention for vector fields.
In `@modules/l2/control/vxlan.c`:
- Around line 52-84: The current VXLAN update mutates cur->encap_vrf_id and
adjusts VRF refcounts before validating the new vxlan_key and before
successfully adding it to vxlan_hash, which can orphan the interface on failure;
change the flow in the VXLAN update path so you first validate next->vni and
build the new vxlan_key (using rte_cpu_to_be_32(next->vni) and next encap VRF),
then try rte_hash_lookup/rte_hash_add_key_data for the new key before calling
rte_hash_del_key for the old cur_key; only after rte_hash_add_key_data succeeds
update cur->vni and cur->encap_vrf_id, and if you must increment the new VRF
(vrf_incref) do so tentatively and on any later failure call vrf_decref to roll
back so vrf refcounts remain consistent (use vrf_incref/vrf_decref and handle
errors from rte_hash_add_key_data to restore state).
In `@modules/l2/datapath/bridge_input.c`:
- Around line 83-103: The code enforces iface MAC limits before checking if the
source MAC is already learned, causing already-learned MACs to be dropped and
their last_seen not refreshed; fix by first checking the FDB for an existing
entry for bridge->id / d->iface->id / ð->src_addr / d->vlan_id (use the
existing FDB lookup function, e.g. fdb_lookup or fdb_find in this module), and
only if the lookup returns “not found” apply iface_get_max_macs() /
iface_get_total_macs() enforcement and the shutdown/stats logic; if the entry
exists, skip the limit check and call fdb_learn(bridge->id, d->iface->id,
ð->src_addr, d->vlan_id, vtep) to refresh last_seen.
In `@modules/l2/datapath/l2_datapath.h`:
- Line 15: trace_vxlan_format in modules/l2/datapath/vxlan_input.c currently
chains snprintf calls unsafely: it uses the first snprintf return (int n)
directly to compute buf + n and len - n, but snprintf returns the would-be
length and can be >= len causing out-of-bounds pointer arithmetic and size_t
underflow. Fix by checking snprintf's return: if n < 0 treat as 0/error; if
(size_t)n >= len clamp n to (int)len (or set remaining = 0) and only compute buf
+ n and remaining = len - (size_t)n when remaining > 0; only call the second
snprintf when remaining > 0 and use size_t for remaining to avoid
signed/unsigned mix. Ensure these checks are applied in trace_vxlan_format
around the first snprintf and before the conditional snprintf for t->vtep.
In `@modules/l2/datapath/vxlan_input.c`:
- Around line 61-83: The VXLAN path reads the VXLAN header via
rte_pktmbuf_mtod_offset and then calls rte_pktmbuf_adj without validating mbuf
lengths; add explicit length checks before accessing struct rte_vxlan_hdr and
before trimming the UDP+VXLAN headers: verify packet length (using
rte_pktmbuf_pkt_len and/or rte_pktmbuf_data_len) is at least sizeof(struct
rte_udp_hdr) + sizeof(struct rte_vxlan_hdr) and return/skip the packet (set edge
to BAD_FLAGS or NO_TUNNEL as appropriate) if not. After calling rte_pktmbuf_adj
ensure the adjustment succeeded (check the returned pointer or remaining data
length) and bail out safely if it failed before calling iface_mbuf_data,
vxlan_decode_vni, or vxlan_get_iface so you never read past the mbuf bounds.
In `@smoke/port_mirror_test.sh`:
- Around line 27-28: The test assumes `grcli port-mirror show br0 2` will return
a disabled record, but deletion may remove the session entirely; update the
assertion around the `grcli port-mirror show br0 2` call so it accepts either a
successful show reporting 'enabled: false' or a non-zero/empty output indicating
the session no longer exists. Modify the check that currently pipes to `grep -q
"enabled: false"` and `fail "session 2 not deleted"` to first capture the `grcli
port-mirror show br0 2` exit status/output (the `grcli port-mirror show`
invocation and the `grep -q "enabled: false"` check) and only fail if the
command succeeded but the output does not contain 'enabled: false'; treat a
non-zero exit or empty output as a valid deletion.
---
Nitpick comments:
In `@modules/l2/control/lldp_priv.h`:
- Around line 31-53: The lldp_config currently embeds a large array
neighbors[LLDP_MAX_NEIGHBORS] causing ~166KB per lldp_config; change struct
lldp_config to hold a pointer (struct lldp_neighbor *neighbors) and a count
(num_neighbors) instead of the fixed array, allocate the neighbors buffer
separately (e.g., in lldp_config_alloc using rte_zmalloc for neighbors) and free
it in the corresponding free function; update any code referencing
lldp_config.neighbors to handle the pointer (and ensure bounds checks against
num_neighbors/LLDP_MAX_NEIGHBORS) and keep existing symbols lldp_config,
lldp_neighbor, LLDP_MAX_NEIGHBORS, lldp_config_alloc, and rte_zmalloc as
reference points.
In `@modules/l2/control/rstp.c`:
- Around line 617-619: BPDU packets are created but never transmitted because
the code only logs and does "(void)bpdu"; replace that TODO by actually sending
the built BPDU via the existing control path: call the bridge's transmit API
(control_output or the module's equivalent transmit function) with the bpdu
buffer/mbuf, handle and log any errors, and only increment the BPDU transmit
counter on successful send; remove the "(void)bpdu" noop and ensure proper
ownership/freeing of the bpdu buffer after send or on error.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
docs/graph.svgis excluded by!**/*.svg
📒 Files selected for processing (88)
docs/meson.buildfrr/if_grout.cfrr/rt_grout.cfrr/rt_grout.hfrr/zebra_dplane_grout.cfrr/zebra_dplane_grout.hmodules/infra/api/gr_infra.hmodules/infra/control/bond.cmodules/infra/control/bond_bridge_test.cmodules/infra/control/ctlplane.cmodules/infra/control/gr_bond.hmodules/infra/control/iface.cmodules/infra/control/meson.buildmodules/infra/datapath/gr_rxtx.hmodules/l2/api/gr_l2.hmodules/l2/cli/dai.cmodules/l2/cli/dhcp_snooping.cmodules/l2/cli/fdb.cmodules/l2/cli/flood.cmodules/l2/cli/iface_security.cmodules/l2/cli/ipsg.cmodules/l2/cli/l2_stats.cmodules/l2/cli/lldp.cmodules/l2/cli/mcast_snooping.cmodules/l2/cli/meson.buildmodules/l2/cli/port_mirror.cmodules/l2/cli/qos.cmodules/l2/cli/rstp.cmodules/l2/cli/storm_control.cmodules/l2/cli/vlan_filtering.cmodules/l2/cli/vxlan.cmodules/l2/control/bridge.cmodules/l2/control/dai.cmodules/l2/control/dai_priv.hmodules/l2/control/dhcp_snooping.cmodules/l2/control/dhcp_snooping_priv.hmodules/l2/control/fdb.cmodules/l2/control/flood.cmodules/l2/control/gr_l2_control.hmodules/l2/control/iface_security.cmodules/l2/control/ipsg.cmodules/l2/control/ipsg_priv.hmodules/l2/control/l2_hooks_test.cmodules/l2/control/l2_stats.cmodules/l2/control/lldp.cmodules/l2/control/lldp_priv.hmodules/l2/control/mcast_snooping.cmodules/l2/control/mcast_snooping_priv.hmodules/l2/control/mcast_snooping_test.cmodules/l2/control/meson.buildmodules/l2/control/port_mirror.cmodules/l2/control/port_mirror_priv.hmodules/l2/control/port_mirror_test.cmodules/l2/control/qos.cmodules/l2/control/qos_priv.hmodules/l2/control/qos_test.cmodules/l2/control/rstp.cmodules/l2/control/rstp_priv.hmodules/l2/control/rstp_test.cmodules/l2/control/storm_control.cmodules/l2/control/storm_control_priv.hmodules/l2/control/storm_control_test.cmodules/l2/control/vlan_filtering.cmodules/l2/control/vlan_filtering_priv.hmodules/l2/control/vlan_filtering_test.cmodules/l2/control/vxlan.cmodules/l2/datapath/bridge_flood.cmodules/l2/datapath/bridge_input.cmodules/l2/datapath/l2_datapath.hmodules/l2/datapath/meson.buildmodules/l2/datapath/vxlan_flood.cmodules/l2/datapath/vxlan_input.cmodules/l2/datapath/vxlan_output.cmodules/l4/gr_l4.hmodules/l4/l4_input_local.csmoke/bridge_test.shsmoke/dai_test.shsmoke/dhcp_snooping_test.shsmoke/evpn_vxlan_frr_test.shsmoke/ipsg_test.shsmoke/lldp_test.shsmoke/mcast_snooping_test.shsmoke/port_mirror_test.shsmoke/qos_test.shsmoke/rstp_test.shsmoke/storm_control_test.shsmoke/vlan_filtering_test.shsmoke/vxlan_test.sh
| char *copy = strdup(src_list); | ||
| char *tok = strtok(copy, ","); | ||
| while (tok != NULL && req.num_sources < 16) { | ||
| src = iface_from_name(c, tok); | ||
| if (src != NULL) { | ||
| req.source_ports[req.num_sources++] = src->id; | ||
| free(src); | ||
| } | ||
| tok = strtok(NULL, ","); | ||
| } | ||
| free(copy); | ||
| } |
There was a problem hiding this comment.
Silent skip on invalid source interface name.
If a source interface name in the comma-separated list is invalid (typo, non-existent), iface_from_name returns NULL and the entry is silently skipped with no warning. The user gets a successful response with fewer sources than intended. Consider logging or returning an error.
Proposed fix
while (tok != NULL && req.num_sources < 16) {
src = iface_from_name(c, tok);
- if (src != NULL) {
- req.source_ports[req.num_sources++] = src->id;
- free(src);
+ if (src == NULL) {
+ fprintf(stderr, "unknown source interface: %s\n", tok);
+ free(copy);
+ return CMD_ERROR;
}
+ req.source_ports[req.num_sources++] = src->id;
+ free(src);
tok = strtok(NULL, ",");
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| char *copy = strdup(src_list); | |
| char *tok = strtok(copy, ","); | |
| while (tok != NULL && req.num_sources < 16) { | |
| src = iface_from_name(c, tok); | |
| if (src != NULL) { | |
| req.source_ports[req.num_sources++] = src->id; | |
| free(src); | |
| } | |
| tok = strtok(NULL, ","); | |
| } | |
| free(copy); | |
| } | |
| char *copy = strdup(src_list); | |
| char *tok = strtok(copy, ","); | |
| while (tok != NULL && req.num_sources < 16) { | |
| src = iface_from_name(c, tok); | |
| if (src == NULL) { | |
| fprintf(stderr, "unknown source interface: %s\n", tok); | |
| free(copy); | |
| return CMD_ERROR; | |
| } | |
| req.source_ports[req.num_sources++] = src->id; | |
| free(src); | |
| tok = strtok(NULL, ","); | |
| } | |
| free(copy); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/cli/port_mirror.c` around lines 56 - 67, The code silently skips
invalid interface names when parsing src_list: when iface_from_name(c, tok)
returns NULL the user is not informed and req.num_sources is quietly unchanged;
update the parsing in the port mirror setup (the block using strdup(src_list),
strtok, iface_from_name, req.source_ports and req.num_sources) to detect a NULL
return and either log a clear error/warning (including the bad token) or return
an error status to the caller instead of silently skipping; ensure the behavior
is consistent (choose logging vs returning) and that you free copy before
returning on error and do not overflow req.source_ports (keep the existing limit
check).
| if (vlan_list != NULL) { | ||
| char *copy = strdup(vlan_list); | ||
| char *tok = strtok(copy, ","); | ||
| while (tok != NULL && req.num_allowed_vlans < 256) { | ||
| unsigned long v = strtoul(tok, NULL, 10); | ||
| if (v > 0 && v <= 4094) | ||
| req.allowed_vlans[req.num_allowed_vlans++] = v; | ||
| tok = strtok(NULL, ","); | ||
| } | ||
| free(copy); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate the file and understand its structure
find . -name "vlan_filtering.c" -type fRepository: DPDK/grout
Length of output: 125
🏁 Script executed:
# Read the relevant section of the file (lines 80-120)
wc -l modules/l2/cli/vlan_filtering.cRepository: DPDK/grout
Length of output: 89
🏁 Script executed:
# Get context around the flagged lines
sed -n '80,120p' modules/l2/cli/vlan_filtering.c | cat -nRepository: DPDK/grout
Length of output: 1645
🏁 Script executed:
# Find the function definition that contains these lines
sed -n '1,100p' modules/l2/cli/vlan_filtering.c | cat -nRepository: DPDK/grout
Length of output: 3501
🏁 Script executed:
# Verify both occurrences and see more context
sed -n '88,125p' modules/l2/cli/vlan_filtering.c | cat -nRepository: DPDK/grout
Length of output: 1542
Add NULL check after strdup to prevent crash on allocation failure
strdup can fail and return NULL on out-of-memory, causing strtok to crash with undefined behavior. Guard both the trunk and hybrid branches with NULL checks.
🔧 Proposed fix
char *copy = strdup(vlan_list);
+if (copy == NULL)
+ return CMD_ERROR;
char *tok = strtok(copy, ",");📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (vlan_list != NULL) { | |
| char *copy = strdup(vlan_list); | |
| char *tok = strtok(copy, ","); | |
| while (tok != NULL && req.num_allowed_vlans < 256) { | |
| unsigned long v = strtoul(tok, NULL, 10); | |
| if (v > 0 && v <= 4094) | |
| req.allowed_vlans[req.num_allowed_vlans++] = v; | |
| tok = strtok(NULL, ","); | |
| } | |
| free(copy); | |
| } | |
| if (vlan_list != NULL) { | |
| char *copy = strdup(vlan_list); | |
| if (copy == NULL) | |
| return CMD_ERROR; | |
| char *tok = strtok(copy, ","); | |
| while (tok != NULL && req.num_allowed_vlans < 256) { | |
| unsigned long v = strtoul(tok, NULL, 10); | |
| if (v > 0 && v <= 4094) | |
| req.allowed_vlans[req.num_allowed_vlans++] = v; | |
| tok = strtok(NULL, ","); | |
| } | |
| free(copy); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/cli/vlan_filtering.c` around lines 93 - 103, The code calls
strdup(vlan_list) without checking its return; if strdup returns NULL,
subsequent strtok(copy, ",") will crash—add a NULL check after the strdup call
(e.g., if (copy != NULL) { ... }) so parsing into tok and pushing into
req.allowed_vlans (and calling free(copy)) only happens when copy is non-NULL;
apply the same NULL-checked strdup pattern in the other branch that handles
trunk/hybrid VLAN parsing so both branches (the vlan_list strdup usage) are
guarded.
| void iface_increment_mac_count(uint16_t iface_id, uint16_t lcore_id) { | ||
| if (iface_id >= L2_MAX_IFACES) | ||
| return; | ||
| l2_iface_mac_counts[iface_id][lcore_id].dynamic_macs++; | ||
| } | ||
|
|
||
| void iface_decrement_mac_count(uint16_t iface_id, uint16_t lcore_id) { | ||
| if (iface_id >= L2_MAX_IFACES) | ||
| return; | ||
| if (l2_iface_mac_counts[iface_id][lcore_id].dynamic_macs > 0) | ||
| l2_iface_mac_counts[iface_id][lcore_id].dynamic_macs--; |
There was a problem hiding this comment.
Bounds-check lcore_id to avoid OOB writes
iface_increment_mac_count and iface_decrement_mac_count only check iface_id. If lcore_id is out of range, this writes past l2_iface_mac_counts, causing memory corruption.
🔧 Proposed fix
- if (iface_id >= L2_MAX_IFACES)
+ if (iface_id >= L2_MAX_IFACES || lcore_id >= RTE_MAX_LCORE)
return;
@@
- if (iface_id >= L2_MAX_IFACES)
+ if (iface_id >= L2_MAX_IFACES || lcore_id >= RTE_MAX_LCORE)
return;#!/bin/bash
# Inspect call sites to confirm lcore_id is always a valid RTE lcore.
rg -n "iface_(increment|decrement)_mac_count" -C3🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/control/bridge.c` around lines 50 - 60, Both
iface_increment_mac_count and iface_decrement_mac_count validate iface_id but
not lcore_id, allowing out-of-bounds accesses into l2_iface_mac_counts; add a
guard checking lcore_id against the second-dimension size before touching
l2_iface_mac_counts (e.g., ensure lcore_id < the configured lcore count constant
or ARRAY_SIZE(l2_iface_mac_counts[0]) / L2_MAX_LCORES) in both functions
(iface_increment_mac_count and iface_decrement_mac_count) and return early if
the check fails so all writes/reads use valid lcore indices.
| static struct api_out dhcp_snooping_set_cb(const void *request, struct api_ctx *) { | ||
| const struct gr_l2_dhcp_snooping_req *req = request; | ||
| const struct iface *bridge; | ||
|
|
||
| bridge = iface_from_id(req->bridge_id); | ||
| if (bridge == NULL || bridge->type != GR_IFACE_TYPE_BRIDGE) | ||
| return api_out(ENOENT, 0, NULL); | ||
|
|
||
| dhcp_snooping_enable(req->bridge_id, req->enabled); | ||
| dhcp_snooping_set_verify_mac(req->bridge_id, req->verify_mac); | ||
| dhcp_snooping_set_max_bindings(req->bridge_id, req->max_bindings); | ||
| dhcp_snooping_set_aging_time(req->bridge_id, req->aging_time); | ||
|
|
||
| return api_out(0, 0, NULL); | ||
| } |
There was a problem hiding this comment.
dhcp_snooping_set_cb ignores the return value of dhcp_snooping_enable, which can fail with -ENOMEM.
If rte_hash_create fails inside dhcp_snooping_enable, the handler reports success to the client while DHCP snooping is left in a broken state (enabled flag set but no bindings hash).
Proposed fix
static struct api_out dhcp_snooping_set_cb(const void *request, struct api_ctx *) {
const struct gr_l2_dhcp_snooping_req *req = request;
const struct iface *bridge;
+ int ret;
bridge = iface_from_id(req->bridge_id);
if (bridge == NULL || bridge->type != GR_IFACE_TYPE_BRIDGE)
return api_out(ENOENT, 0, NULL);
- dhcp_snooping_enable(req->bridge_id, req->enabled);
+ ret = dhcp_snooping_enable(req->bridge_id, req->enabled);
+ if (ret < 0)
+ return api_out(-ret, 0, NULL);
dhcp_snooping_set_verify_mac(req->bridge_id, req->verify_mac);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/control/dhcp_snooping.c` around lines 272 - 286, The handler
dhcp_snooping_set_cb currently ignores the return value of dhcp_snooping_enable
which can fail (e.g., -ENOMEM); modify dhcp_snooping_set_cb to capture the
return value of dhcp_snooping_enable(req->bridge_id, req->enabled), and if it
returns a negative error, immediately return api_out(err, 0, NULL) (do not
continue to call dhcp_snooping_set_verify_mac, dhcp_snooping_set_max_bindings,
or dhcp_snooping_set_aging_time); ensure the error from dhcp_snooping_enable is
propagated to the client so the enabled state isn’t reported as successful when
initialization failed.
| static struct api_out dhcp_trusted_port_set_cb(const void *request, struct api_ctx *) { | ||
| const struct gr_l2_dhcp_trusted_port_req *req = request; | ||
|
|
||
| if (req->trusted) | ||
| dhcp_snooping_add_trusted_port(req->bridge_id, req->iface_id); | ||
| else | ||
| dhcp_snooping_del_trusted_port(req->bridge_id, req->iface_id); | ||
|
|
||
| return api_out(0, 0, NULL); | ||
| } |
There was a problem hiding this comment.
dhcp_trusted_port_set_cb silently discards errors from add/del trusted port operations.
Both dhcp_snooping_add_trusted_port and dhcp_snooping_del_trusted_port return error codes (-EINVAL, -EEXIST, -ENOENT), but this handler always returns success. An invalid bridge_id or duplicate add would go unnoticed by the client.
Proposed fix
static struct api_out dhcp_trusted_port_set_cb(const void *request, struct api_ctx *) {
const struct gr_l2_dhcp_trusted_port_req *req = request;
+ int ret;
if (req->trusted)
- dhcp_snooping_add_trusted_port(req->bridge_id, req->iface_id);
+ ret = dhcp_snooping_add_trusted_port(req->bridge_id, req->iface_id);
else
- dhcp_snooping_del_trusted_port(req->bridge_id, req->iface_id);
+ ret = dhcp_snooping_del_trusted_port(req->bridge_id, req->iface_id);
- return api_out(0, 0, NULL);
+ return api_out(ret < 0 ? -ret : 0, 0, NULL);
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/control/dhcp_snooping.c` around lines 353 - 362,
dhcp_trusted_port_set_cb currently ignores return values from
dhcp_snooping_add_trusted_port and dhcp_snooping_del_trusted_port and always
returns success; change it to capture the int rc from those calls, and if rc < 0
return a failure api_out reflecting the error (e.g., api_out(-1, rc, NULL) or
the project's existing pattern) so callers see -EINVAL/-EEXIST/-ENOENT; update
dhcp_trusted_port_set_cb to call the appropriate function, check rc, and only
return api_out(0,0,NULL) on success.
| if (set_attrs & GR_VXLAN_SET_ENCAP_VRF) { | ||
| uint16_t vrf = next->encap_vrf_id; | ||
| uint16_t old = cur->encap_vrf_id; | ||
|
|
||
| if (vrf == GR_VRF_ID_UNDEF) | ||
| vrf = vrf_default_get_or_create(); | ||
|
|
||
| if (vrf != old && vrf_incref(vrf) < 0) | ||
| return -errno; | ||
|
|
||
| if (old != GR_VRF_ID_UNDEF) | ||
| vrf_decref(old); | ||
|
|
||
| cur->encap_vrf_id = vrf; | ||
| } | ||
|
|
||
| if (set_attrs & (GR_VXLAN_SET_VNI | GR_VXLAN_SET_ENCAP_VRF)) { | ||
| const struct vxlan_key next_key = {rte_cpu_to_be_32(next->vni), cur->encap_vrf_id}; | ||
|
|
||
| if (rte_hash_lookup(vxlan_hash, &next_key) >= 0) | ||
| return errno_set(EADDRINUSE); | ||
|
|
||
| if (next->vni == 0 || next->vni > 0xffffff) | ||
| return errno_set(ERANGE); | ||
|
|
||
| rte_hash_del_key(vxlan_hash, &cur_key); | ||
|
|
||
| ret = rte_hash_add_key_data(vxlan_hash, &next_key, iface); | ||
| if (ret < 0) | ||
| return errno_log(-ret, "rte_hash_add_key_data"); | ||
|
|
||
| cur->vni = next->vni; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, examine the full vxlan_reconfig function to understand the context
rg -B20 "GR_VXLAN_SET_ENCAP_VRF" modules/l2/control/vxlan.c | head -100Repository: DPDK/grout
Length of output: 1517
🏁 Script executed:
# Find where vxlan_reconfig is called and how the flags are set
rg -n "vxlan_reconfig|GR_VXLAN_SET" modules/l2/control/vxlan.cRepository: DPDK/grout
Length of output: 555
🏁 Script executed:
# Look for call sites that invoke vxlan_reconfig
rg -n "\.reconfig\s*=" modules/l2/control/vxlan.c -A5 -B5Repository: DPDK/grout
Length of output: 476
🏁 Script executed:
# Check the entire vxlan.c file structure to understand the function
wc -l modules/l2/control/vxlan.cRepository: DPDK/grout
Length of output: 84
🏁 Script executed:
# Check if vrf_incref can fail and what errors are possible
rg -n "vrf_incref" modules/infra/control/vrf.c -A3 -B1Repository: DPDK/grout
Length of output: 191
🏁 Script executed:
# Verify the vrf_default_get_or_create behavior
rg -n "vrf_default_get_or_create" modules/infra/control/vrf.c -A10Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Check if there's any error handling for hash operations elsewhere
rg -n "rte_hash_add_key_data|rte_hash_del_key" --type c -A2 -B1 | head -80Repository: DPDK/grout
Length of output: 4014
🏁 Script executed:
# Look at the full context of iface_vxlan_reconfig to see if there are other state mutations
sed -n '41,130p' modules/l2/control/vxlan.cRepository: DPDK/grout
Length of output: 2709
🏁 Script executed:
# Check vrf_default_get_or_create to understand if it can fail
rg -n "vrf_default_get_or_create" modules/infra/control/vrf.c -A5Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Verify vrf_incref failure condition more carefully
rg -n "vrf_incref|get_vrf_iface" modules/infra/control/vrf.c -B2 -A5Repository: DPDK/grout
Length of output: 940
🏁 Script executed:
# Look at how this reconfig function is called and what set_attrs combinations are realistic
rg -n "iface_vxlan_reconfig\|GR_VXLAN_SET" modules/l2/control/vxlan.c -B3 -A3 | grep -E "iface_vxlan_reconfig|IFACE_SET_ALL"Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Search for vrf_default_get_or_create to find where it's defined
rg -n "vrf_default_get_or_create" --type cRepository: DPDK/grout
Length of output: 1033
🏁 Script executed:
# Check if there's any cleanup at the call site of iface_vxlan_reconfig
rg -n "iface_vxlan_reconfig\|\.reconfig" modules/infra/control/iface.c -A10 -B2 | head -60Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Look for the full function signature and error handling context
rg -n "iface_reconfig\|iface_type_dispatch" modules/infra/control/iface.c -B3 -A8 | head -100Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Get the implementation of vrf_default_get_or_create
sed -n '134,150p' modules/infra/control/iface.cRepository: DPDK/grout
Length of output: 491
🏁 Script executed:
# Look at the iface_reconfig dispatch pattern
rg -n "\.reconfig" modules/infra/control/iface.c -B5 -A15 | head -80Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Check for error recovery patterns in interface reconfig
rg -n "reconfig.*errno|reconfig.*ret" modules/infra/control/iface.c -B2 -A2Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Confirm the cur_key is computed before ENCAP_VRF updates
sed -n '41,84p' modules/l2/control/vxlan.c | head -50Repository: DPDK/grout
Length of output: 1223
🏁 Script executed:
# Check what happens if both flags are set and second fails
cat -n modules/l2/control/vxlan.c | sed -n '48,84p'Repository: DPDK/grout
Length of output: 1297
🏁 Script executed:
# Check how the reconfig is called and if there's rollback/cleanup
rg -n "iface_reconfig\|\.reconfig" modules/infra/control/iface.c -C10 | head -150Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Verify what happens when reconfig returns error to the caller
rg -n "iface.*reconfig" modules/infra/control/iface.c | head -20Repository: DPDK/grout
Length of output: 77
🏁 Script executed:
# Search for the dispatcher that calls .reconfig callbacks
rg -n "type->reconfig\|iface->type->reconfig" modules/infra/control/iface.c -B5 -A10Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Get the full iface_reconfig function
sed -n '290,350p' modules/infra/control/iface.cRepository: DPDK/grout
Length of output: 1528
🏁 Script executed:
# Search more broadly for reconfig invocation
rg -n "reconfig" modules/infra/control/iface.c -B3 -A8Repository: DPDK/grout
Length of output: 664
🏁 Script executed:
# Look for how vxlan_reconfig is called - check the API
rg -n "iface_reconfig" modules/l2/control/vxlan.c -B5 -A5Repository: DPDK/grout
Length of output: 36
🏁 Script executed:
# Check the error handling path in iface_reconfig
sed -n '324,380p' modules/infra/control/iface.cRepository: DPDK/grout
Length of output: 1403
🏁 Script executed:
# Look for the err label to see what cleanup happens
sed -n '360,400p' modules/infra/control/iface.cRepository: DPDK/grout
Length of output: 1047
🏁 Script executed:
# Verify if vrf_decref is idempotent or what happens on double-call
sed -n '151,160p' modules/infra/control/vrf.cRepository: DPDK/grout
Length of output: 266
🏁 Script executed:
# Look at the complete error path in iface_reconfig and see if there's any vxlan-specific cleanup
sed -n '290,375p' modules/infra/control/iface.cRepository: DPDK/grout
Length of output: 1971
🏁 Script executed:
# Check if vrf_decref on a VRF that was just incremented in the same function would work
# and verify the actual failure case scenario
rg -n "rte_hash_add_key_data" modules/l2/control/ -B3 -A3Repository: DPDK/grout
Length of output: 2256
Fix VXLAN hash/VRF state corruption on reconfig failures
The ENCAP_VRF/VNI update path mutates cur->encap_vrf_id and VRF refcounts before validating the new hash key. If rte_hash_add_key_data fails after the old key is deleted, the interface is orphaned from the hash with stale VRF refcounts, and no rollback occurs. Reorder to validate all inputs first, attempt the hash add before deleting the old key, and explicitly roll back VRF refs on failure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/control/vxlan.c` around lines 52 - 84, The current VXLAN update
mutates cur->encap_vrf_id and adjusts VRF refcounts before validating the new
vxlan_key and before successfully adding it to vxlan_hash, which can orphan the
interface on failure; change the flow in the VXLAN update path so you first
validate next->vni and build the new vxlan_key (using
rte_cpu_to_be_32(next->vni) and next encap VRF), then try
rte_hash_lookup/rte_hash_add_key_data for the new key before calling
rte_hash_del_key for the old cur_key; only after rte_hash_add_key_data succeeds
update cur->vni and cur->encap_vrf_id, and if you must increment the new VRF
(vrf_incref) do so tentatively and on any later failure call vrf_decref to roll
back so vrf refcounts remain consistent (use vrf_incref/vrf_decref and handle
errors from rte_hash_add_key_data to restore state).
| if (rte_is_unicast_ether_addr(ð->src_addr) | ||
| && !(br->flags & GR_BRIDGE_F_NO_LEARN)) | ||
| fdb_learn(bridge->id, d->iface->id, ð->src_addr, d->vlan_id); | ||
| && !(br->flags & GR_BRIDGE_F_NO_LEARN)) { | ||
| // Port security: check MAC limits before learning. | ||
| uint32_t max_macs = iface_get_max_macs(d->iface->id); | ||
| if (max_macs > 0) { | ||
| uint32_t cur = iface_get_total_macs(d->iface->id); | ||
| if (cur >= max_macs) { | ||
| struct bridge_stats *st = bridge_get_stats( | ||
| bridge->id, rte_lcore_id() | ||
| ); | ||
| if (st != NULL) | ||
| st->learn_limit_iface++; | ||
| if (iface_get_shutdown_on_violation(d->iface->id)) | ||
| iface_shutdown_violation(d->iface->id); | ||
| edge = BRIDGE_INVAL; | ||
| goto next; | ||
| } | ||
| } | ||
| vtep = (d->iface->type == GR_IFACE_TYPE_VXLAN) ? d->vtep : 0; | ||
| fdb_learn(bridge->id, d->iface->id, ð->src_addr, d->vlan_id, vtep); | ||
| } |
There was a problem hiding this comment.
MAC limit enforcement drops traffic for already‑learned MACs
When cur >= max_macs, packets are dropped before checking whether the source MAC is already learned on this interface. That blocks legitimate traffic and prevents last_seen refresh once the limit is reached. Enforce the limit only for new MACs.
🔧 Suggested fix
- uint32_t max_macs = iface_get_max_macs(d->iface->id);
- if (max_macs > 0) {
- uint32_t cur = iface_get_total_macs(d->iface->id);
- if (cur >= max_macs) {
+ uint32_t max_macs = iface_get_max_macs(d->iface->id);
+ if (max_macs > 0) {
+ const struct gr_fdb_entry *src_fdb =
+ fdb_lookup(bridge->id, ð->src_addr, d->vlan_id);
+ if (src_fdb == NULL || src_fdb->iface_id != d->iface->id) {
+ uint32_t cur = iface_get_total_macs(d->iface->id);
+ if (cur >= max_macs) {
struct bridge_stats *st = bridge_get_stats(
bridge->id, rte_lcore_id()
);
if (st != NULL)
st->learn_limit_iface++;
if (iface_get_shutdown_on_violation(d->iface->id))
iface_shutdown_violation(d->iface->id);
edge = BRIDGE_INVAL;
goto next;
- }
- }
+ }
+ }
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (rte_is_unicast_ether_addr(ð->src_addr) | |
| && !(br->flags & GR_BRIDGE_F_NO_LEARN)) | |
| fdb_learn(bridge->id, d->iface->id, ð->src_addr, d->vlan_id); | |
| && !(br->flags & GR_BRIDGE_F_NO_LEARN)) { | |
| // Port security: check MAC limits before learning. | |
| uint32_t max_macs = iface_get_max_macs(d->iface->id); | |
| if (max_macs > 0) { | |
| uint32_t cur = iface_get_total_macs(d->iface->id); | |
| if (cur >= max_macs) { | |
| struct bridge_stats *st = bridge_get_stats( | |
| bridge->id, rte_lcore_id() | |
| ); | |
| if (st != NULL) | |
| st->learn_limit_iface++; | |
| if (iface_get_shutdown_on_violation(d->iface->id)) | |
| iface_shutdown_violation(d->iface->id); | |
| edge = BRIDGE_INVAL; | |
| goto next; | |
| } | |
| } | |
| vtep = (d->iface->type == GR_IFACE_TYPE_VXLAN) ? d->vtep : 0; | |
| fdb_learn(bridge->id, d->iface->id, ð->src_addr, d->vlan_id, vtep); | |
| } | |
| if (rte_is_unicast_ether_addr(ð->src_addr) | |
| && !(br->flags & GR_BRIDGE_F_NO_LEARN)) { | |
| // Port security: check MAC limits before learning. | |
| uint32_t max_macs = iface_get_max_macs(d->iface->id); | |
| if (max_macs > 0) { | |
| const struct gr_fdb_entry *src_fdb = | |
| fdb_lookup(bridge->id, ð->src_addr, d->vlan_id); | |
| if (src_fdb == NULL || src_fdb->iface_id != d->iface->id) { | |
| uint32_t cur = iface_get_total_macs(d->iface->id); | |
| if (cur >= max_macs) { | |
| struct bridge_stats *st = bridge_get_stats( | |
| bridge->id, rte_lcore_id() | |
| ); | |
| if (st != NULL) | |
| st->learn_limit_iface++; | |
| if (iface_get_shutdown_on_violation(d->iface->id)) | |
| iface_shutdown_violation(d->iface->id); | |
| edge = BRIDGE_INVAL; | |
| goto next; | |
| } | |
| } | |
| } | |
| vtep = (d->iface->type == GR_IFACE_TYPE_VXLAN) ? d->vtep : 0; | |
| fdb_learn(bridge->id, d->iface->id, ð->src_addr, d->vlan_id, vtep); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/datapath/bridge_input.c` around lines 83 - 103, The code enforces
iface MAC limits before checking if the source MAC is already learned, causing
already-learned MACs to be dropped and their last_seen not refreshed; fix by
first checking the FDB for an existing entry for bridge->id / d->iface->id /
ð->src_addr / d->vlan_id (use the existing FDB lookup function, e.g.
fdb_lookup or fdb_find in this module), and only if the lookup returns “not
found” apply iface_get_max_macs() / iface_get_total_macs() enforcement and the
shutdown/stats logic; if the entry exists, skip the limit check and call
fdb_learn(bridge->id, d->iface->id, ð->src_addr, d->vlan_id, vtep) to refresh
last_seen.
| ip4_addr_t vtep; | ||
| }; | ||
|
|
||
| int trace_vxlan_format(char *buf, size_t len, const void *data, size_t data_len); |
There was a problem hiding this comment.
trace_vxlan_format implementation has an unsafe snprintf-chaining pattern.
The implementation in modules/l2/datapath/vxlan_input.c (lines 29–35):
int n = snprintf(buf, len, "vni=%u", rte_be_to_cpu_32(t->vni));
if (t->vtep != 0)
n += snprintf(buf + n, len - n, " vtep=" IP4_F, &t->vtep);snprintf returns the number of chars that would have been written, which may be ≥ len on truncation. If that happens:
buf + nis past the end of the buffer → out-of-bounds write.len - n:lenissize_t,nisint; the subtraction wraps to ~UINT64_MAX→ the secondsnprintfis handed a near-infinite size limit, potentially writing far beyond the buffer.
Real-world risk is low (the formatted string is at most ~35 bytes), but the arithmetic is wrong and will fire if the function is ever called with a small buffer. The idiomatic fix is to guard against truncation:
🐛 Proposed fix
int trace_vxlan_format(char *buf, size_t len, const void *data, size_t /*data_len*/) {
const struct trace_vxlan_data *t = data;
- int n = snprintf(buf, len, "vni=%u", rte_be_to_cpu_32(t->vni));
- if (t->vtep != 0)
- n += snprintf(buf + n, len - n, " vtep=" IP4_F, &t->vtep);
+ int n = snprintf(buf, len, "vni=%u", rte_be_to_cpu_32(t->vni));
+ if (t->vtep != 0 && n >= 0 && (size_t)n < len)
+ n += snprintf(buf + n, len - (size_t)n, " vtep=" IP4_F, &t->vtep);
return n;
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/datapath/l2_datapath.h` at line 15, trace_vxlan_format in
modules/l2/datapath/vxlan_input.c currently chains snprintf calls unsafely: it
uses the first snprintf return (int n) directly to compute buf + n and len - n,
but snprintf returns the would-be length and can be >= len causing out-of-bounds
pointer arithmetic and size_t underflow. Fix by checking snprintf's return: if n
< 0 treat as 0/error; if (size_t)n >= len clamp n to (int)len (or set remaining
= 0) and only compute buf + n and remaining = len - (size_t)n when remaining >
0; only call the second snprintf when remaining > 0 and use size_t for remaining
to avoid signed/unsigned mix. Ensure these checks are applied in
trace_vxlan_format around the first snprintf and before the conditional snprintf
for t->vtep.
| vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr)); | ||
| if (!(vh->vx_flags & VXLAN_FLAGS_VNI)) { | ||
| edge = BAD_FLAGS; | ||
| goto next; | ||
| } | ||
|
|
||
| vni = vxlan_decode_vni(vh->vx_vni); | ||
| if (vni != last_vni || vrf_id != last_vrf_id) { | ||
| iface = vxlan_get_iface(vni, vrf_id); | ||
| last_vrf_id = vrf_id; | ||
| last_vni = vni; | ||
| } | ||
| if (iface == NULL) { | ||
| edge = NO_TUNNEL; | ||
| goto next; | ||
| } | ||
|
|
||
| rte_pktmbuf_adj(m, sizeof(struct rte_udp_hdr) + sizeof(*vh)); | ||
|
|
||
| d = iface_mbuf_data(m); | ||
| d->iface = iface; | ||
| d->vlan_id = 0; | ||
| d->vtep = src_vtep; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, examine the full function containing lines 61-83
wc -l modules/l2/datapath/vxlan_input.cRepository: DPDK/grout
Length of output: 91
🏁 Script executed:
# Get context around the target lines to understand the function
sed -n '1,100p' modules/l2/datapath/vxlan_input.c | cat -nRepository: DPDK/grout
Length of output: 3087
🏁 Script executed:
# Check if there are any existing length checks in the function
rg -n "data_len|pkt_len|rte_pktmbuf_data_len" modules/l2/datapath/vxlan_input.c -B2 -A2Repository: DPDK/grout
Length of output: 266
🏁 Script executed:
# Search for how rte_pktmbuf_adj is used elsewhere in the codebase
rg -n "rte_pktmbuf_adj" --type c -B1 -A1 | head -50Repository: DPDK/grout
Length of output: 2174
🏁 Script executed:
# Check what ip_local_mbuf_data does and how packets are validated
rg -n "ip_local_mbuf_data" modules/l2/datapath/ -B5 -A5Repository: DPDK/grout
Length of output: 1464
🏁 Script executed:
# Look at the L4 input registration to understand packet preconditions
rg -n "l4_input_register_port" --type c -B3 -A3 | head -40Repository: DPDK/grout
Length of output: 1978
🏁 Script executed:
# Check DPDK documentation or implementation notes for rte_pktmbuf_adj
rg -n "rte_pktmbuf_adj" modules/ --type c -B3 -A3 | grep -A3 -B3 "return\|NULL"Repository: DPDK/grout
Length of output: 2014
🏁 Script executed:
# Search for similar VXLAN or UDP header parsing patterns in the codebase
rg -n "rte_vxlan_hdr|VXLAN" --type c -B3 -A3Repository: DPDK/grout
Length of output: 21101
🏁 Script executed:
# Check if there are any packet length validations in earlier stages (IP/UDP processing)
fd "ip_local\.c" --type f | xargs cat -n | head -80Repository: DPDK/grout
Length of output: 2513
🏁 Script executed:
# Check if there's any UDP payload length validation
rg -n "rte_be_to_cpu_16.*length|payload.*len|udp" modules/l4/ --type c -B2 -A2 | head -60Repository: DPDK/grout
Length of output: 2735
🏁 Script executed:
# Check the actual rte_pktmbuf_adj signature and behavior in DPDK headers
fd "rte.*" modules/infra/api --type f | xargs rg -l "pktmbuf_adj" 2>/dev/null || echo "Not found in headers"Repository: DPDK/grout
Length of output: 356
🏁 Script executed:
# Look for any validation of packet length before VXLAN header access in similar datapath files
rg -n "data_len|pkt_len.*<" modules/l2/datapath/ modules/infra/datapath/ --type c -B2 -A2Repository: DPDK/grout
Length of output: 10891
🏁 Script executed:
# Check if ip_local_mbuf_data.len is used for validation before accessing offset payloads
rg -n "l->len|data->len" modules/ --type c -B1 -A1 | grep -A1 -B1 "sizeof\|offset"Repository: DPDK/grout
Length of output: 463
Add length validation before VXLAN header access
The code accesses the VXLAN header without verifying the mbuf contains sufficient data. A truncated packet can cause out-of-bounds reads. Validate the packet size before rte_pktmbuf_mtod_offset and check the return value of rte_pktmbuf_adj.
🔧 Proposed fix
+ const uint16_t vxlan_min_len =
+ sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_hdr);
+ if (rte_pktmbuf_data_len(m) < vxlan_min_len) {
+ edge = BAD_FLAGS;
+ goto next;
+ }
vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr));📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr)); | |
| if (!(vh->vx_flags & VXLAN_FLAGS_VNI)) { | |
| edge = BAD_FLAGS; | |
| goto next; | |
| } | |
| vni = vxlan_decode_vni(vh->vx_vni); | |
| if (vni != last_vni || vrf_id != last_vrf_id) { | |
| iface = vxlan_get_iface(vni, vrf_id); | |
| last_vrf_id = vrf_id; | |
| last_vni = vni; | |
| } | |
| if (iface == NULL) { | |
| edge = NO_TUNNEL; | |
| goto next; | |
| } | |
| rte_pktmbuf_adj(m, sizeof(struct rte_udp_hdr) + sizeof(*vh)); | |
| d = iface_mbuf_data(m); | |
| d->iface = iface; | |
| d->vlan_id = 0; | |
| d->vtep = src_vtep; | |
| const uint16_t vxlan_min_len = | |
| sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_hdr); | |
| if (rte_pktmbuf_data_len(m) < vxlan_min_len) { | |
| edge = BAD_FLAGS; | |
| goto next; | |
| } | |
| vh = rte_pktmbuf_mtod_offset(m, struct rte_vxlan_hdr *, sizeof(struct rte_udp_hdr)); | |
| if (!(vh->vx_flags & VXLAN_FLAGS_VNI)) { | |
| edge = BAD_FLAGS; | |
| goto next; | |
| } | |
| vni = vxlan_decode_vni(vh->vx_vni); | |
| if (vni != last_vni || vrf_id != last_vrf_id) { | |
| iface = vxlan_get_iface(vni, vrf_id); | |
| last_vrf_id = vrf_id; | |
| last_vni = vni; | |
| } | |
| if (iface == NULL) { | |
| edge = NO_TUNNEL; | |
| goto next; | |
| } | |
| rte_pktmbuf_adj(m, sizeof(struct rte_udp_hdr) + sizeof(*vh)); | |
| d = iface_mbuf_data(m); | |
| d->iface = iface; | |
| d->vlan_id = 0; | |
| d->vtep = src_vtep; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/l2/datapath/vxlan_input.c` around lines 61 - 83, The VXLAN path reads
the VXLAN header via rte_pktmbuf_mtod_offset and then calls rte_pktmbuf_adj
without validating mbuf lengths; add explicit length checks before accessing
struct rte_vxlan_hdr and before trimming the UDP+VXLAN headers: verify packet
length (using rte_pktmbuf_pkt_len and/or rte_pktmbuf_data_len) is at least
sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_hdr) and return/skip the
packet (set edge to BAD_FLAGS or NO_TUNNEL as appropriate) if not. After calling
rte_pktmbuf_adj ensure the adjustment succeeded (check the returned pointer or
remaining data length) and bail out safely if it failed before calling
iface_mbuf_data, vxlan_decode_vni, or vxlan_get_iface so you never read past the
mbuf bounds.
| grcli port-mirror del br0 2 | ||
| grcli port-mirror show br0 2 | grep -q "enabled: false" || fail "session 2 not deleted" |
There was a problem hiding this comment.
Verify post-deletion show behavior assumption.
If grcli port-mirror show br0 2 exits non-zero (e.g. ENOENT) with no output after deletion, grep -q "enabled: false" fails on empty input and fail "session 2 not deleted" fires — masking a correct deletion with a misleading error message. This assertion holds only if the CLI keeps the session record in a disabled state rather than removing it entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@smoke/port_mirror_test.sh` around lines 27 - 28, The test assumes `grcli
port-mirror show br0 2` will return a disabled record, but deletion may remove
the session entirely; update the assertion around the `grcli port-mirror show
br0 2` call so it accepts either a successful show reporting 'enabled: false' or
a non-zero/empty output indicating the session no longer exists. Modify the
check that currently pipes to `grep -q "enabled: false"` and `fail "session 2
not deleted"` to first capture the `grcli port-mirror show br0 2` exit
status/output (the `grcli port-mirror show` invocation and the `grep -q
"enabled: false"` check) and only fail if the command succeeded but the output
does not contain 'enabled: false'; treat a non-zero exit or empty output as a
valid deletion.
|
Lot of good stuff ! |
|
Wrong branch — this included too many features. Will submit individual feature branches instead. |
Summary
Add advanced L2 switching capabilities to the bridge module.
Depends on #504 (VXLAN/EVPN support).
Features
Each feature follows the standard grout module structure with
separate api/, control/, datapath/, and cli/ components.
Test plan
meson test)smoke/run.sh)grcli)🤖 Generated with Claude Code
Summary by CodeRabbit