High Process fork()/exec() rates cause drops and missing process exits #324

@gmarler

Description

Here are details on one of many hosts that exhibit this issue, and what results from it:

  • In every 15-second window, one process (calico-felix) fork()s ~6000 times over a 6-second span, then immediately exec()s ipset on top of those fork()ed copies, another ~6000 times. lightswitch is not running during this time range, and nothing appears in the BPF kernel logs:
13:33:13

13:33:14

13:33:15

13:33:16

13:33:17

13:33:18

13:33:19
@[tracepoint:sched:sched_process_exec, ipset]: 425
@[tracepoint:sched:sched_process_fork, calico-felix]: 426
13:33:20
@[tracepoint:sched:sched_process_fork, calico-felix]: 1269
@[tracepoint:sched:sched_process_exec, ipset]: 1269
13:33:21
@[tracepoint:sched:sched_process_fork, calico-felix]: 1261
@[tracepoint:sched:sched_process_exec, ipset]: 1261
13:33:22
@[tracepoint:sched:sched_process_fork, calico-felix]: 1209
@[tracepoint:sched:sched_process_exec, ipset]: 1210
13:33:23
@[tracepoint:sched:sched_process_fork, calico-felix]: 1172
@[tracepoint:sched:sched_process_exec, ipset]: 1172
13:33:24
@[tracepoint:sched:sched_process_fork, calico-felix]: 480
@[tracepoint:sched:sched_process_exec, ipset]: 481
13:33:25

13:33:26

  • Now lightswitch is started. During the same fork()/exec() burst, the BPF kernel log reports the following data drops; these appear only while lightswitch is running:
CPU:70 [LOST 160 EVENTS]
CPU:10 [LOST 82 EVENTS]
CPU:64 [LOST 80 EVENTS]
CPU:64 [LOST 81 EVENTS]
CPU:64 [LOST 82 EVENTS]
CPU:52 [LOST 81 EVENTS]
CPU:52 [LOST 81 EVENTS]
CPU:26 [LOST 84 EVENTS]
  • Confirmed with bpftrace that the number of exit() tracepoints firing matches the number of fork() tracepoints firing whether lightswitch is running or not, so eBPF itself isn't missing any exit events.
  • Since these processes are so short lived, lightswitch is only capturing stacks for a small fraction of them. But it's enough to overload things.
  • The resource consumption logs indicate that:
    • The number of new short-lived processes we've detected via incoming samples is fairly high, as one would expect
    • But we're only detecting the exit() of a fraction of the processes that we know fired the exit() tracepoint, per a separate, simultaneous bpftrace session
      Example:
      2025-10-11T13:53:25.645911Z DEBUG lightswitch::profiler: 55 new processes this session, 1493 new processes since profiler startup
      2025-10-11T13:53:25.645912Z DEBUG lightswitch::profiler: 0 process exits this session; 80 process exits since profiler startup
      
      After adding debug messages to both the new-process detection code and the process-exit code, we see messages like the following. They indicate that if the process is short-lived enough, the process_exit tracepoint fires before we've had a chance to add the new process to the exec_mappings map. As a result, the BPF process_is_known() (distinct from the Rust/userspace process_is_known() function) cannot find the process before its exit() has already occurred:
            <...>-188302  [042] d..h1.. 2149412.246371: bpf_trace_printk: [debug] new PID 188302 DETECTION queued
            ipset-188302  [042] ....2.1 2149412.246681: bpf_trace_printk: [debug] exit tracer PID 188302 is NOT KNOWN
      
      
  • It's not yet clear exactly where the events are being lost; the investigation is ongoing.
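To make the suspected race above concrete, here is a minimal userspace sketch of it. Everything here is a hypothetical stand-in, not lightswitch's actual implementation: a HashSet behind a Mutex models the BPF exec_mappings map, timed sleeps model the userspace insertion latency and the process lifetime, and the function name simulate_short_lived_exit is invented for illustration.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

const PID: u32 = 188302;

/// Simulate the race: userspace inserts a newly detected PID into a
/// stand-in for the exec_mappings map after `insert_delay`, while the
/// exit tracer checks membership after `exit_delay`. Returns whether
/// the exit tracer found the PID (the process_is_known() outcome).
fn simulate_short_lived_exit(insert_delay: Duration, exit_delay: Duration) -> bool {
    let exec_mappings: Arc<Mutex<HashSet<u32>>> = Arc::new(Mutex::new(HashSet::new()));

    // Userspace side: the new-process event is queued immediately, but
    // the actual map insert only lands after some processing latency.
    let map = Arc::clone(&exec_mappings);
    let detector = thread::spawn(move || {
        thread::sleep(insert_delay);
        map.lock().unwrap().insert(PID);
    });

    // Kernel side: the sched_process_exit tracepoint fires after
    // `exit_delay` and checks whether the PID is already tracked.
    let map = Arc::clone(&exec_mappings);
    let exit_tracer = thread::spawn(move || {
        thread::sleep(exit_delay);
        map.lock().unwrap().contains(&PID)
    });

    let known = exit_tracer.join().unwrap();
    detector.join().unwrap();
    known
}

fn main() {
    // An ipset-like child execs and exits within ~0.3 ms, while the
    // userspace insert takes ~5 ms to land: the exit wins the race.
    let known = simulate_short_lived_exit(
        Duration::from_millis(5),
        Duration::from_micros(300),
    );
    println!(
        "[debug] exit tracer PID {PID} is {}",
        if known { "KNOWN" } else { "NOT KNOWN" }
    );
}
```

With a long-lived process (exit check well after the insert) the same function returns true, which matches the observation that only sufficiently short-lived processes are missed.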
