
pf route-to silently drops DF packets exceeding target interface MTU without generating ICMP Fragmentation Needed #282

@brendanbank

Description


Important notices

Note: This is closely related to #4094 ("ICMP fragmentation required not returned to NAT origin"), filed in 2020 and closed without resolution. That issue described the same root cause (ICMP Fragmentation Needed not reaching LAN clients when traffic exits via WireGuard) but focused on NAT interaction. This report provides additional evidence that the issue is specifically in pf's route-to code path, includes reproducible tcpdump captures, and documents the MSS clamping workaround. Also related: #9225 (PMTUD broken with PPPoE, same pf mechanism), #9398 (MSS clamping not auto-derived from interface MTU).

Describe the bug

When using pf route-to to policy-route traffic from a VLAN interface through a WireGuard tunnel interface (MTU 1420), packets that exceed the tunnel MTU with the DF (Don't Fragment) bit set are silently dropped. OPNsense/pf does not generate ICMP Type 3 Code 4 (Destination Unreachable, Fragmentation Needed) back to the source host. This breaks Path MTU Discovery (RFC 1191) for all source-routed traffic through tunnel interfaces.

In normal IP forwarding (without route-to), the kernel correctly generates ICMP Fragmentation Needed when a DF packet exceeds the outgoing interface MTU. The bug is specific to the route-to code path.
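The difference between the two paths can be checked from a client with a DF-flagged ping. A minimal sketch, assuming a Linux client (flag names differ on other OSes; the target address is hypothetical):

```shell
# A 1472-byte ICMP payload plus 8 bytes of ICMP header and 20 bytes of
# IPv4 header produces a full 1500-byte packet:
echo $((1472 + 8 + 20))   # prints 1500

# -M do sets DF and forbids local fragmentation. Through normal
# forwarding, the firewall should answer with
# "Frag needed and DF set (mtu 1420)":
#   ping -M do -s 1472 198.51.100.22
# Through the route-to path, the same ping simply times out.
```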

To Reproduce

Setup:

  • OPNsense firewall with a WireGuard interface (e.g. wg0, MTU 1420)
  • A LAN/VLAN interface (e.g. vlan0.100, MTU 1500)
  • A pf pass rule with route-to directing traffic from the VLAN through the WireGuard interface:
    pass in quick on vlan0.100 route-to (wg0 10.0.0.1) inet from any to ! <LOCAL_NET> flags S/SA keep state
    
  • NAT rule on the WireGuard interface
  • A remote WireGuard peer that provides internet access (e.g. WAN breakout)

Steps to reproduce:

  1. Connect a client to the VLAN (client gets MTU 1500 from the network)
  2. Client establishes TCP connections to internet hosts
  3. Client SYN advertises MSS 1460 (standard for MTU 1500)
  4. Remote servers send TCP data packets of 1500 bytes with DF bit set
  5. These packets arrive at OPNsense on the WireGuard interface (inbound return traffic) — this direction works fine via normal forwarding
  6. For the outbound direction: client sends TCP data packets > 1420 bytes with DF set on the VLAN interface
  7. pf matches the route-to rule and attempts to forward via the WireGuard interface (MTU 1420)
  8. Bug: The packet is silently dropped. No ICMP Fragmentation Needed is sent back to the client.
  9. Client retransmits the same oversized packet repeatedly (observed via tcpdump as identical seq retransmissions every 300-500ms)
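The failure threshold in the steps above can be bracketed with two pings from the client; a sketch assuming Linux ping flags and a hypothetical target address:

```shell
# Largest ICMP payload that fits the wg0 MTU of 1420
# (1420 minus 20-byte IP header minus 8-byte ICMP header):
echo $((1420 - 20 - 8))   # prints 1392

#   ping -M do -s 1392 198.51.100.22   -> replies arrive
#   ping -M do -s 1393 198.51.100.22   -> silent timeout; no
#                                         "Frag needed" error is reported
```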

Expected behavior

When a packet with DF set exceeds the MTU of the route-to target interface, OPNsense should generate an ICMP Type 3 Code 4 (Destination Unreachable, Fragmentation Needed, Next-Hop MTU = 1420) message back to the source host, just as it does for normal IP forwarding without route-to.

This is required by RFC 1191 (Path MTU Discovery) and RFC 791 (IP specification, DF bit handling).

Describe alternatives you considered

Workaround: TCP MSS Clamping

Adding a normalization rule on the WireGuard interface (Firewall → Settings → Normalization) with max-mss 1368 clamps the TCP MSS in SYN/SYN-ACK packets. This prevents TCP sessions from ever sending packets larger than the tunnel MTU, avoiding the need for PMTUD.

MSS 1368 = 1420 (tunnel MTU) − 20 (IP header) − 20 (TCP header) − 12 (TCP timestamps option)
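The arithmetic, spelled out:

```shell
mtu=1420     # WireGuard tunnel MTU
ip=20        # IPv4 header
tcp=20       # base TCP header
tsopt=12     # TCP timestamps option (sent by most modern stacks)
echo $((mtu - ip - tcp - tsopt))   # prints 1368
```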

This workaround is effective for TCP but does not fix the underlying issue for:

  • UDP traffic that exceeds the tunnel MTU with DF set
  • Any protocol relying on PMTUD through a route-to path
  • Users who are unaware of this interaction and expect standard PMTUD behavior
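For reference, the GUI normalization setting corresponds roughly to a pf.conf scrub rule like the following (a sketch; the exact rule text OPNsense generates may differ):

```
# Traffic normalization on the tunnel interface: clamp TCP MSS in
# SYN/SYN-ACK so sessions never produce packets larger than the tunnel MTU
scrub on wg0 all max-mss 1368
```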

Screenshots

See Mermaid diagram below showing the traffic flow and where the bug occurs.

Relevant log files

Evidence captured via tcpdump on the OPNsense firewall:

1. Client sends oversized DF packets on the VLAN interface (inbound to pf):

tcpdump -i vlan0.100 "src 10.1.80.50 and ip[2:2] > 1420 and ip[6] & 0x40 != 0"

10.1.80.50.60238 > 198.51.100.22.443: Flags [.], flags [DF], length 1388  (IP total ~1440)
10.1.80.50.60238 > 198.51.100.22.443: Flags [.], flags [DF], seq 0:1388   (retransmit)
10.1.80.50.60238 > 198.51.100.22.443: Flags [.], flags [DF], seq 0:1388   (retransmit)
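For readers decoding the capture filter above: `ip[2:2]` is the IPv4 total-length field, and bit `0x40` of `ip[6]` (the flags byte) is DF. A minimal sketch of the same bit test in shell:

```shell
# The flags byte of an IPv4 header with DF set and MF clear is 0x40.
flags=$((0x40))
if [ $((flags & 0x40)) -ne 0 ]; then
    echo "DF set"   # matches the "ip[6] & 0x40 != 0" part of the filter
fi
```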

2. These packets never appear on the WireGuard interface:

tcpdump -i wg0 "host 198.51.100.22"
# 0 packets captured (10 seconds, while client actively retransmitting)

3. No ICMP Fragmentation Needed sent to the client:

tcpdump -i vlan0.100 "icmp[icmptype] == 3 and icmp[icmpcode] == 4 and dst 10.1.80.50"
# 0 packets captured (12 seconds, 29781 packets observed on interface)

4. Contrast: the remote WireGuard peer (Linux) correctly generates ICMP for inbound traffic hitting the same MTU constraint:

tcpdump -i eth0 "icmp[icmptype] == 3 and icmp[icmpcode] == 4"

192.168.1.120 > 203.0.113.89: ICMP 192.168.1.120 unreachable - need to frag (mtu 1420)
192.168.1.120 > 203.0.113.50: ICMP 192.168.1.120 unreachable - need to frag (mtu 1420)
# Hundreds of these generated — Linux kernel correctly handles PMTUD

This confirms the issue is specific to OPNsense/pf's route-to code path, not a general WireGuard or MTU issue.

Additional context

flowchart LR
    subgraph LAN["LAN / VLAN (MTU 1500)"]
        Client["Client<br/>10.1.80.50<br/>MSS 1460"]
    end

    subgraph FW["OPNsense Firewall"]
        PF["pf input<br/>(vlan0.100)"]
        RT{{"route\-to<br/>(wg0 10.0.0.1)"}}
        WG["WireGuard wg0<br/>MTU 1420"]
    end

    DROP((("⛔ DROP")))

    subgraph Remote["Remote WireGuard Peer"]
        PEER["WireGuard<br/>10.0.0.1"]
        NAT["NAT/Masquerade"]
    end

    Internet["Internet"]

    Client -- "1440\-byte packet<br/>DF bit set" --> PF --> RT

    RT -- "1440 > MTU 1420, DF set" --> DROP
    DROP -. "Expected: ICMP Frag Needed<br/>(Type 3 Code 4, MTU=1420)<br/>❌ NEVER GENERATED" .-> Client

    RT -. "packets ≤ 1420<br/>(these work)" .-> WG
    WG -- "WireGuard tunnel" --> PEER --> NAT --> Internet

    style DROP fill:#dc3545,stroke:#a71d2a,color:#fff
    style RT fill:#fd7e14,stroke:#e8590c,color:#fff

The diagram shows the outbound path from client through pf policy routing. At the route-to decision point (orange hexagon), oversized packets (1440 > MTU 1420) with DF set are silently dropped (red) — pf never generates the ICMP Fragmentation Needed that RFC 1191 requires. Small packets (≤ 1420) pass through normally (dotted path).

For comparison, normal IP forwarding (without route-to) through the same WireGuard interface correctly triggers ICMP generation by the FreeBSD kernel.

This issue was debugged with the help of Claude Code (Anthropic's AI coding assistant), which performed the systematic tcpdump analysis across both ends of the tunnel to isolate the route-to code path as the root cause. If further debugging or packet captures are needed to help resolve this, I'm happy to assist.

Environment

OPNsense 26.1.1 (amd64)
FreeBSD 14.3-RELEASE-p8
Protectli VP2420 (Intel Celeron J6412 @ 2.00GHz)
Network: Intel I225-V (igc driver)
WireGuard via os-wireguard plugin
