
processor_tda: Implement Topological Data Analysis (TDA) plugin for metrics #11250

Merged: edsiper merged 9 commits into master from cosmo0920-ripser-for-analysis on Feb 14, 2026

cosmo0920 commented Dec 3, 2025

This PR introduces a new processor plugin, tda, which performs Topological Data Analysis (TDA) on stream metrics using persistent homology.

The plugin aggregates incoming counters, gauges, and untyped metrics into a unified n-dimensional feature vector, maintains a sliding window, and utilizes a C-wrapped version of Ripser to compute Betti numbers.

Implementation Details:

  1. Metric Aggregation & Normalization:
    Multiple metric streams are mapped to a fixed feature dimension. To handle varying magnitudes and bursty traffic:
    • Counters are converted to rates (differentiated against the previous snapshot).
    • Values are normalized using log1p (natural logarithm of 1 + magnitude) to dampen dynamic range before distance calculation.
  2. Sliding Window & Phase Space Reconstruction:
    The plugin keeps a ring buffer of these vectors. Before processing, it optionally applies Delay Embedding (see below) to reconstruct the phase space geometry.
  3. Persistent Homology via Ripser:
    A dense Euclidean distance matrix is computed from the window. Ripser determines the persistence intervals, which are summarized into Betti numbers exported as new gauges:
    • fluentbit.tda.betti0: Connected components (clusters).
    • fluentbit.tda.betti1: Loops/Cycles (recurrence).
    • fluentbit.tda.betti2: Voids (higher-order structures).
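
The normalization and distance steps above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's C code: the function names (`counter_rates`, `normalize`, `distance_matrix`), the handling of counter resets, and the sign-preserving form of log1p are all assumptions made for the example.

```python
import math

def counter_rates(prev, curr, dt):
    """Turn two cumulative counter snapshots into per-second rates."""
    # Assumption: a counter reset (curr < prev) is clamped to a zero rate.
    return [max(c - p, 0.0) / dt for p, c in zip(prev, curr)]

def normalize(values):
    """Apply log1p to the magnitude of each value to dampen dynamic range."""
    # Assumption: the sign is preserved; the PR only states "1 + magnitude".
    return [math.copysign(math.log1p(abs(v)), v) for v in values]

def distance_matrix(points):
    """Build a dense, symmetric Euclidean distance matrix."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    return dist

# Example: two counters sampled 2 seconds apart become one feature vector.
feature = normalize(counter_rates([100.0, 50.0], [160.0, 50.0], 2.0))
```

A matrix built this way is the kind of dense input that the description says is handed to Ripser; the Betti computation itself is not reproduced here.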

Delay Embedding (Takens' Theorem):

This plugin supports an optional delay embedding [2] of the aggregated metric vectors. When embed_dim > 1, we reconstruct the state space vectors $x_t$ as:

$$x_t \to (x_t, x_{t-\tau}, \dots, x_{t-(m-1)\tau})$$

Where:

  • $m =$ embed_dim
  • $\tau =$ embed_delay

This transformation allows the processor to detect cyclic or quasi-periodic regimes (loops in the trajectory) even from limited metric dimensions. These loops translate into $H_1$ features in the persistent homology. If embed_dim = 1, the behavior falls back to the original "no embedding" mode.
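
As a rough illustration of the mapping $x_t \to (x_t, x_{t-\tau}, \dots, x_{t-(m-1)\tau})$, the sketch below embeds a window of feature vectors in plain Python. The function name `delay_embed` and the oldest-first window layout are assumptions for this example; the plugin's own C implementation may organize the data differently.

```python
def delay_embed(window, m, tau):
    """Delay-embed a window of feature vectors (oldest sample first).

    Each output vector concatenates x_t, x_{t-tau}, ..., x_{t-(m-1)tau}.
    Samples that lack enough history are skipped.
    """
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be >= 1")
    start = (m - 1) * tau  # first index with a full history
    embedded = []
    for t in range(start, len(window)):
        vec = []
        for k in range(m):
            vec.extend(window[t - k * tau])
        embedded.append(vec)
    return embedded

# With m = 1 every sample maps to itself, i.e. the "no embedding" mode.
points = delay_embed([[0.0], [1.0], [2.0], [3.0], [4.0]], m=2, tau=1)
```

Note that embedding shortens the point cloud by $(m-1)\tau$ samples, so a window needs more than $(m-1)\tau$ samples before any embedded point exists.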

Motivation:

TDA and persistent homology can help reveal hidden order, phase transitions, or subtle cyclic behaviors in complex systems that are not easily visible from raw time series or standard statistical aggregates. Similar approaches have been explored in condensed matter physics [1] for detecting phase transitions.

Configuration Options:

  • window_size (int, default: 60): Number of samples to keep in the TDA sliding window.
  • min_points (int, default: 10): Minimum number of samples required before running Ripser.
  • embed_dim (int, default: 3): Delay embedding dimension ($m$). Set to 1 to disable.
  • embed_delay (int, default: 1): Lag ($\tau$) in samples between successive delays.
  • threshold (double, default: 0): Distance scale selector. 0 enables auto multi-quantile scan; (0,1) uses the specific quantile.
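
To make the idea of a quantile-based distance scale concrete, here is a hedged Python sketch. It picks a scale as a quantile of the pairwise distances and counts connected components (Betti 0) of the graph that links points no farther apart than that scale. The names `quantile_scale` and `betti0_at`, the nearest-rank quantile rule, and the direct use of the quantile as a cutoff are assumptions; the PR does not spell out how the plugin maps threshold onto Ripser, nor how the automatic multi-quantile scan works, and higher Betti numbers need Ripser itself.

```python
def quantile_scale(dist, q):
    """Return the q-quantile (0 < q < 1) of the pairwise distances."""
    n = len(dist)
    pairs = sorted(dist[i][j] for i in range(n) for j in range(i))
    idx = min(int(q * len(pairs)), len(pairs) - 1)  # nearest-rank style pick
    return pairs[idx]

def betti0_at(dist, eps):
    """Count connected components when points within eps are linked."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for i in range(n):
        for j in range(i):
            if dist[i][j] <= eps:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    components -= 1
    return components
```

A small quantile keeps only the shortest links and so reports many components, while a large one merges everything into a single cluster; this is why the choice of scale matters for the exported gauges.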

References:

  1. Donato, I., Gori, M., Pettini, M., Petri, G., De Nigris, S., Franzosi, R., & Vaccarino, F. (2016). Persistent homology analysis of phase transitions. Physical Review E, 93, 052138.
  2. F. Takens, "Detecting strange attractors in turbulence", in D. Rand and L.-S. Young (eds.), Dynamical Systems and Turbulence, Lecture Notes in Mathematics, vol. 898, Springer, 1981, pp. 366-381.

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
service:
  http_server: On
  http_port: 2021
pipeline:
    inputs:
      - name: dummy
        tag: log.raw
        samples: 10000
      - name: fluentbit_metrics
        tag: metrics.raw

        processors:
          metrics:
            - name: metrics_selector
              metric_name: /process_start_time_seconds/
              action: exclude
            - name: metrics_selector
              metric_name: /build_info/
              action: exclude
            - name: tda

    outputs:
      - name: stdout
        match: '*'

Additional Log:

2025-12-03T07:27:16.013990065Z fluentbit_tda_betti0 = 39
2025-12-03T07:27:16.013990065Z fluentbit_tda_betti1 = 7
2025-12-03T07:27:16.013990065Z fluentbit_tda_betti2 = 0
[2025/12/03 16:27:16.930210000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:16.930442000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:16.930461000] [error] [output:http:http.0] no upstream connections available to localhost:8443
[2025/12/03 16:27:16.930554000] [ warn] [engine] failed to flush chunk '30288-1764746835.908400000.flb', retry in 9 seconds: task_id=5, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:18.017259794Z fluentbit_tda_betti0 = 40
2025-12-03T07:27:18.017259794Z fluentbit_tda_betti1 = 7
2025-12-03T07:27:18.017259794Z fluentbit_tda_betti2 = 0
2025-12-03T07:27:20.024738944Z fluentbit_tda_betti0 = 41
2025-12-03T07:27:20.024738944Z fluentbit_tda_betti1 = 7
2025-12-03T07:27:20.024738944Z fluentbit_tda_betti2 = 1
[2025/12/03 16:27:21.995837000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
2025-12-03T07:27:22.033923596Z fluentbit_tda_betti0 = 42
2025-12-03T07:27:22.033923596Z fluentbit_tda_betti1 = 7
2025-12-03T07:27:22.033923596Z fluentbit_tda_betti2 = 0
[2025/12/03 16:27:23.605981000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606029000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606022000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606048000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606089000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606101000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606148000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606162000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606243000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606255000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606328000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606340000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606400000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606412000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606437000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606458000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606472000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606471000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606560000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:23.606578000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:23.606593000] [error] [engine] chunk '30288-1764746830.908242000.flb' cannot be retried: task_id=0, input=dummy.0 > output=http.0
[2025/12/03 16:27:23.606725000] [ warn] [engine] failed to flush chunk '30288-1764746841.908663000.flb', retry in 10 seconds: task_id=8, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:23.606780000] [ warn] [engine] failed to flush chunk '30288-1764746840.908811000.flb', retry in 6 seconds: task_id=13, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:23.606825000] [ warn] [engine] failed to flush chunk '30288-1764746838.907344000.flb', retry in 10 seconds: task_id=11, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:23.606845000] [error] [engine] chunk '30288-1764746827.908136000.flb' cannot be retried: task_id=2, input=dummy.0 > output=http.0
[2025/12/03 16:27:23.606905000] [error] [engine] chunk '30288-1764746831.908588000.flb' cannot be retried: task_id=6, input=dummy.0 > output=http.0
[2025/12/03 16:27:23.606941000] [error] [engine] chunk '30288-1764746828.909077000.flb' cannot be retried: task_id=3, input=dummy.0 > output=http.0
[2025/12/03 16:27:23.606991000] [error] [engine] chunk '30288-1764746832.908679000.flb' cannot be retried: task_id=7, input=dummy.0 > output=http.0
[2025/12/03 16:27:23.607060000] [ warn] [engine] failed to flush chunk '30288-1764746839.908556000.flb', retry in 8 seconds: task_id=12, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:23.607112000] [ warn] [engine] failed to flush chunk '30288-1764746837.908369000.flb', retry in 6 seconds: task_id=10, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:23.932627000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:23.932773000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:23.932794000] [error] [output:http:http.0] no upstream connections available to localhost:8443
[2025/12/03 16:27:24.30227000] [ warn] [engine] failed to flush chunk '30288-1764746842.908959000.flb', retry in 10 seconds: task_id=0, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:24.030021859Z fluentbit_tda_betti0 = 43
2025-12-03T07:27:24.030021859Z fluentbit_tda_betti1 = 9
2025-12-03T07:27:24.030021859Z fluentbit_tda_betti2 = 0
[2025/12/03 16:27:25.777729000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:25.777762000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:25.777780000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:25.777804000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:25.777824000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:25.777842000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:25.777864000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:25.777883000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:25.777916000] [error] [engine] chunk '30288-1764746826.908579000.flb' cannot be retried: task_id=1, input=dummy.0 > output=http.0
[2025/12/03 16:27:25.777955000] [error] [engine] chunk '30288-1764746833.907394000.flb' cannot be retried: task_id=9, input=dummy.0 > output=http.0
[2025/12/03 16:27:25.777981000] [error] [engine] chunk '30288-1764746829.908135000.flb' cannot be retried: task_id=4, input=dummy.0 > output=http.0
[2025/12/03 16:27:25.778051000] [ warn] [engine] failed to flush chunk '30288-1764746843.907694000.flb', retry in 11 seconds: task_id=2, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:25.918758000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:25.918857000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:25.918872000] [error] [output:http:http.0] no upstream connections available to localhost:8443
[2025/12/03 16:27:26.13749000] [ warn] [engine] failed to flush chunk '30288-1764746844.908008000.flb', retry in 8 seconds: task_id=1, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:26.25748000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:26.25885000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:26.25899000] [error] [output:http:http.0] no upstream connections available to localhost:8443
[2025/12/03 16:27:26.25923000] [error] [engine] chunk '30288-1764746835.908400000.flb' cannot be retried: task_id=5, input=dummy.0 > output=http.0
2025-12-03T07:27:26.013531523Z fluentbit_tda_betti0 = 44
2025-12-03T07:27:26.013531523Z fluentbit_tda_betti1 = 9
2025-12-03T07:27:26.013531523Z fluentbit_tda_betti2 = 0
[2025/12/03 16:27:27.572929000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:27.572971000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:27.572986000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:27.573010000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:27.573137000] [ warn] [engine] failed to flush chunk '30288-1764746845.906586000.flb', retry in 9 seconds: task_id=3, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:27.573176000] [error] [engine] chunk '30288-1764746834.908597000.flb' cannot be retried: task_id=14, input=dummy.0 > output=http.0
[2025/12/03 16:27:27.935499000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:27.935638000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:27.935659000] [error] [output:http:http.0] no upstream connections available to localhost:8443
[2025/12/03 16:27:28.50004000] [ warn] [engine] failed to flush chunk '30288-1764746846.908355000.flb', retry in 10 seconds: task_id=4, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:28.049800623Z fluentbit_tda_betti0 = 45
2025-12-03T07:27:28.049800623Z fluentbit_tda_betti1 = 10
2025-12-03T07:27:28.049800623Z fluentbit_tda_betti2 = 1
[2025/12/03 16:27:29.881491000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:29.881542000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:29.881535000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:29.881565000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:29.881633000] [error] [http_client] broken connection to localhost:8443 ?
[2025/12/03 16:27:29.881648000] [error] [output:http:http.0] could not flush records to localhost:8443 (http_do=-1)
[2025/12/03 16:27:29.881736000] [error] [engine] chunk '30288-1764746840.908811000.flb' cannot be retried: task_id=13, input=dummy.0 > output=http.0
[2025/12/03 16:27:29.881782000] [error] [engine] chunk '30288-1764746837.908369000.flb' cannot be retried: task_id=10, input=dummy.0 > output=http.0
[2025/12/03 16:27:29.881873000] [ warn] [engine] failed to flush chunk '30288-1764746847.906725000.flb', retry in 9 seconds: task_id=5, input=dummy.0 > output=http.0 (out_id=0)
[2025/12/03 16:27:29.926923000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:29.927037000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:29.927068000] [error] [output:http:http.0] no upstream connections available to localhost:8443
[2025/12/03 16:27:30.37614000] [ warn] [engine] failed to flush chunk '30288-1764746848.907077000.flb', retry in 11 seconds: task_id=6, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:30.037452292Z fluentbit_tda_betti0 = 46
2025-12-03T07:27:30.037452292Z fluentbit_tda_betti1 = 12
2025-12-03T07:27:30.037452292Z fluentbit_tda_betti2 = 1
[2025/12/03 16:27:30.937891000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:30.938089000] [error] [net] TCP connection failed: localhost:8443 (Connection refused)
[2025/12/03 16:27:30.938120000] [error] [output:http:http.0] no upstream connections available to localhost:8443
[2025/12/03 16:27:30.938185000] [ warn] [engine] failed to flush chunk '30288-1764746849.906988000.flb', retry in 11 seconds: task_id=7, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:32.060665835Z fluentbit_tda_betti0 = 47
2025-12-03T07:27:32.060665835Z fluentbit_tda_betti1 = 12
2025-12-03T07:27:32.060665835Z fluentbit_tda_betti2 = 2
2025-12-03T07:27:34.073613867Z fluentbit_tda_betti0 = 48
2025-12-03T07:27:34.073613867Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:34.073613867Z fluentbit_tda_betti2 = 2
[2025/12/03 16:27:36.264575000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
[2025/12/03 16:27:36.264791000] [ info] [engine] flush chunk '30288-1764746839.908556000.flb' succeeded at retry 1: task_id=12, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:36.082592852Z fluentbit_tda_betti0 = 49
2025-12-03T07:27:36.082592852Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:36.082592852Z fluentbit_tda_betti2 = 2
2025-12-03T07:27:38.070223396Z fluentbit_tda_betti0 = 50
2025-12-03T07:27:38.070223396Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:38.070223396Z fluentbit_tda_betti2 = 1
2025-12-03T07:27:40.066529659Z fluentbit_tda_betti0 = 51
2025-12-03T07:27:40.066529659Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:40.066529659Z fluentbit_tda_betti2 = 1
[2025/12/03 16:27:41.265885000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
2025-12-03T07:27:42.072408891Z fluentbit_tda_betti0 = 52
2025-12-03T07:27:42.072408891Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:42.072408891Z fluentbit_tda_betti2 = 2
2025-12-03T07:27:44.090558970Z fluentbit_tda_betti0 = 53
2025-12-03T07:27:44.090558970Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:44.090558970Z fluentbit_tda_betti2 = 2
[2025/12/03 16:27:46.264449000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
2025-12-03T07:27:46.100740206Z fluentbit_tda_betti0 = 54
2025-12-03T07:27:46.100740206Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:46.100740206Z fluentbit_tda_betti2 = 2
2025-12-03T07:27:48.094721175Z fluentbit_tda_betti0 = 55
2025-12-03T07:27:48.094721175Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:48.094721175Z fluentbit_tda_betti2 = 2
2025-12-03T07:27:50.083926971Z fluentbit_tda_betti0 = 56
2025-12-03T07:27:50.083926971Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:50.083926971Z fluentbit_tda_betti2 = 2
[2025/12/03 16:27:51.265835000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
[2025/12/03 16:27:51.266086000] [ info] [engine] flush chunk '30288-1764746841.908663000.flb' succeeded at retry 1: task_id=8, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:52.104521303Z fluentbit_tda_betti0 = 56
2025-12-03T07:27:52.104521303Z fluentbit_tda_betti1 = 13
2025-12-03T07:27:52.104521303Z fluentbit_tda_betti2 = 2
2025-12-03T07:27:54.104147236Z fluentbit_tda_betti0 = 56
2025-12-03T07:27:54.104147236Z fluentbit_tda_betti1 = 12
2025-12-03T07:27:54.104147236Z fluentbit_tda_betti2 = 2
[2025/12/03 16:27:56.264185000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
[2025/12/03 16:27:56.264291000] [ info] [engine] flush chunk '30288-1764746838.907344000.flb' succeeded at retry 1: task_id=11, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:27:56.107265131Z fluentbit_tda_betti0 = 56
2025-12-03T07:27:56.107265131Z fluentbit_tda_betti1 = 12
2025-12-03T07:27:56.107265131Z fluentbit_tda_betti2 = 2
2025-12-03T07:27:58.093616840Z fluentbit_tda_betti0 = 56
2025-12-03T07:27:58.093616840Z fluentbit_tda_betti1 = 12
2025-12-03T07:27:58.093616840Z fluentbit_tda_betti2 = 2
2025-12-03T07:28:00.088709102Z fluentbit_tda_betti0 = 56
2025-12-03T07:28:00.088709102Z fluentbit_tda_betti1 = 12
2025-12-03T07:28:00.088709102Z fluentbit_tda_betti2 = 2
[2025/12/03 16:28:01.264444000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
2025-12-03T07:28:02.097946671Z fluentbit_tda_betti0 = 56
2025-12-03T07:28:02.097946671Z fluentbit_tda_betti1 = 11
2025-12-03T07:28:02.097946671Z fluentbit_tda_betti2 = 2
2025-12-03T07:28:04.093339307Z fluentbit_tda_betti0 = 56
2025-12-03T07:28:04.093339307Z fluentbit_tda_betti1 = 12
2025-12-03T07:28:04.093339307Z fluentbit_tda_betti2 = 2
[2025/12/03 16:28:06.264442000] [ info] [output:http:http.0] localhost:8443, HTTP status=200
{"status":"ok","errors":false}
[2025/12/03 16:28:06.264618000] [ info] [engine] flush chunk '30288-1764746844.908008000.flb' succeeded at retry 1: task_id=1, input=dummy.0 > output=http.0 (out_id=0)
2025-12-03T07:28:06.093379532Z fluentbit_tda_betti0 = 56
2025-12-03T07:28:06.093379532Z fluentbit_tda_betti1 = 12
2025-12-03T07:28:06.093379532Z fluentbit_tda_betti2 = 2

In a one-time failure case, the betti1 and betti2 metrics do not increase.
In intermittent failure cases like the one above, however, these higher-order metrics rise, which can be read as detecting "phase transitions": the system has no stable phase.

  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

The following log is from the macOS memory leak detector (leaks):

Process 30709 is not debuggable. Due to security restrictions, leaks can only show or save contents of readonly memory of restricted processes.

Process:         fluent-bit [30709]
Path:            /Users/USER/*/fluent-bit
Load Address:    0x104aa4000
Identifier:      fluent-bit
Version:         0
Code Type:       ARM64
Platform:        macOS
Parent Process:  leaks [30708]
Target Type:     live task

Date/Time:       2025-12-03 16:33:19.616 +0900
Launch Time:     2025-12-03 16:33:06.144 +0900
OS Version:      macOS 26.0.1 (25A362)
Report Version:  7
Analysis Tool:   /usr/bin/leaks

Physical footprint:         13.0M
Physical footprint (peak):  13.1M
Idle exit:                  untracked
----

leaks Report Version: 4.0, multi-line stacks
Process 30709: 2752 nodes malloced for 419 KB
Process 30709: 0 leaks for 0 total leaked bytes.

[2025/12/03 16:33:20] [engine] caught signal (SIGCONT)
[2025/12/03 16:33:20] Fluent Bit Dump

No leaks were found in this plugin.

In addition, even without any predefined rules, the TDA metrics indicate that something is happening: betti1 and betti2 take non-zero values:

image

This metrics-based detector takes a different direction from rule-based approaches and may shed light on deeper aspects of anomaly detection.

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

fluent/fluent-bit-docs#2277

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Optional Ripser integration for persistent homology and a new TDA processor that emits Betti metrics (betti0/betti1/betti2).
  • API

    • Public Ripser C API and wrapper to compute Betti numbers and persistent intervals from dense distance matrices; public headers and installable ripser headers added.
  • Documentation

    • Added Ripser README, license, and contributing guidance.
  • Tests

    • New unit test validating Ripser Betti computations.
  • Chores

    • Build and Docker flags to enable/disable Ripser and wire install/build steps.


coderabbitai bot commented Dec 3, 2025

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds Ripser v1.2.1 as an optional bundled library, exposes a C wrapper and C++ integration, introduces a new TDA processor plugin that computes Betti numbers from time-series via delay embedding, and wires build, packaging, tests, and header installation to conditionally include Ripser support.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Top-level CMake & options | `CMakeLists.txt`, `cmake/libraries.cmake`, `cmake/plugins_options.cmake`, `src/CMakeLists.txt`, `include/CMakeLists.txt` | Add FLB_USE_RIPSER/FLB_RIPSER detection and public option, define FLB_PATH_LIB_RIPSER, add FLB_PROCESSOR_TDA option, set FLB_HAVE_RIPSER when enabled, conditional add_subdirectory(ripser), and enforce C++11 for Ripser paths. |
| Bundled Ripser library | `lib/ripser-1.2.1/*` | Add Ripser 1.2.1 sources, build files (CMakeLists/Makefile), license/CONTRIBUTING, README, .gitignore/.gitmodules, examples, and define ripser-static target. |
| Ripser wrapper & integration | `include/fluent-bit/ripser/flb_ripser_wrapper.h`, `src/ripser/flb_ripser_wrapper.cpp`, `src/ripser/CMakeLists.txt`, `lib/ripser-1.2.1/ripser_internal.hpp` | Add public C API header for Betti/intervals, implement C++ wrapper converting dense matrices to Ripser formats, interval filtering/bridging, two exported functions for Betti and intervals, and add flb-ripser-wrapper-static target linking to ripser-static. |
| Processor plugin (TDA) | `plugins/processor_tda/tda.h`, `plugins/processor_tda/tda.c`, `plugins/processor_tda/CMakeLists.txt`, `plugins/CMakeLists.txt` | Add tda processor plugin gated by FLB_RIPSER implementing sliding windows, grouping, delay embedding, dense distance construction, Ripser-driven Betti computation, and register processor_tda_plugin. |
| Tests | `tests/internal/ripser.c`, `tests/internal/CMakeLists.txt` | Add unit test test_ripser_betti_circle() and conditionally include ripser test source when FLB_RIPSER is enabled. |
| Packaging & Docker | `dockerfiles/Dockerfile.centos7`, `packaging/distros/centos/Dockerfile` | Add FLB_RIPSER build ARG/ENV in CentOS Dockerfiles, propagate through stages, pass `-DFLB_RIPSER="${FLB_RIPSER}"` to CMake; centos7 Dockerfile sets FLB_RIPSER=Off in build. |
| Install & headers | `include/CMakeLists.txt`, `include/fluent-bit/ripser/*.h` | Install Ripser wrapper headers to `${FLB_INSTALL_INCLUDEDIR}/fluent-bit/ripser/` when FLB_RIPSER is enabled. |
| Misc / examples | `lib/ripser-1.2.1/examples/*` | Add example distance-matrix data files and Ripser README documentation. |

Sequence Diagram(s)

sequenceDiagram
    participant Metrics as Metrics Stream
    participant Processor as TDA Processor
    participant Window as Sliding Window
    participant Embed as Delay Embedding
    participant DistMat as Dense→Compressed Builder
    participant Ripser as Ripser Engine
    participant Export as Metrics Export

    Metrics->>Processor: incoming metric points
    Processor->>Window: append / rotate samples
    Window->>Processor: snapshot when window ready
    Processor->>Embed: build embedded vectors (m, τ)
    Embed->>DistMat: compute dense pairwise distances
    DistMat->>Ripser: convert to compressed & run
    Ripser-->>Processor: emit intervals / betti counts (via bridge)
    Processor->>Export: emit betti gauges (betti0, betti1, betti2)
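
The window stage of the diagram above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class name `SlidingWindow` and its methods are hypothetical, and only the parameter names window_size and min_points (with their defaults of 60 and 10) come from the PR description.

```python
from collections import deque

class SlidingWindow:
    """Fixed-size buffer of feature vectors; the oldest sample is evicted first."""

    def __init__(self, window_size=60, min_points=10):
        # deque(maxlen=...) drops the oldest entry once the buffer is full.
        self.samples = deque(maxlen=window_size)
        self.min_points = min_points

    def push(self, vector):
        """Append one aggregated feature vector."""
        self.samples.append(list(vector))

    def ready(self):
        """True once enough samples exist to run the analysis."""
        return len(self.samples) >= self.min_points

    def snapshot(self):
        """Return the current window, oldest sample first."""
        return list(self.samples)
```

A caller would push one vector per metrics update and run the embedding and distance steps only when `ready()` is true, which mirrors the "snapshot when window ready" arrow.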

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested reviewers

  • koleini
  • patrick-stephens
  • fujimotos

Poem

🐰
I hopped through windows, stitched time's thread,
Counted loops where metrics tread,
Holes and cycles, small and grand,
Betti bells ring across the land —
A rabbit's joy in data spread.

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 8.27% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Merge Conflict Detection | ⚠️ Warning | ❌ Merge conflicts detected (23 files), listed below. These conflicts must be resolved before merging into master. | Resolve conflicts locally and push changes to this branch. |

⚔️ CMakeLists.txt (content)
⚔️ cmake/libbacktrace.cmake (content)
⚔️ cmake/libraries.cmake (content)
⚔️ cmake/plugins_options.cmake (content)
⚔️ dockerfiles/Dockerfile.centos7 (content)
⚔️ include/CMakeLists.txt (content)
⚔️ packaging/distros/centos/Dockerfile (content)
⚔️ plugins/CMakeLists.txt (content)
⚔️ plugins/in_tail/tail_config.c (content)
⚔️ plugins/in_tail/tail_config.h (content)
⚔️ plugins/in_tail/tail_db.c (content)
⚔️ plugins/in_tail/tail_file.c (content)
⚔️ plugins/in_winevtlog/in_winevtlog.c (content)
⚔️ plugins/in_winevtlog/pack.c (content)
⚔️ plugins/in_winevtlog/winevtlog.c (content)
⚔️ plugins/in_winevtlog/winevtlog.h (content)
⚔️ src/CMakeLists.txt (content)
⚔️ src/flb_lock.c (content)
⚔️ src/flb_processor.c (content)
⚔️ src/tls/flb_tls.c (content)
⚔️ src/tls/openssl.c (content)
⚔️ tests/internal/CMakeLists.txt (content)
⚔️ tests/runtime_shell/processor_conditional.sh (content)
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: implementing a Topological Data Analysis (TDA) processor plugin for metrics, which aligns with the primary objective and major changes throughout the pull request. |


cosmo0920 force-pushed the cosmo0920-ripser-for-analysis branch from 47dccf4 to 327ad4a (December 3, 2025 07:24)
cosmo0920 force-pushed the cosmo0920-ripser-for-analysis branch from d7c8e49 to 162f01e (December 3, 2025 08:23)
cosmo0920 force-pushed the cosmo0920-ripser-for-analysis branch from 7c7cad7 to 9f9d30b (December 3, 2025 11:47)

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (10)
lib/ripser-1.2.1/.gitignore (1)

1-4: Looks good! The patterns correctly ignore Ripser binaries and macOS build artifacts.

This covers the essential outputs (ripser, ripser-coeff binaries) and platform-specific artifacts (dSYM debug symbols, DerivedData Xcode cache).

If you anticipate building with different configurations or toolchains, consider expanding this to include other common build artifacts (e.g., *.o, *.a, *.so, *.dylib, CMakeFiles/, Makefile, cmake_install.cmake) for robustness. However, if your build system places these in a separate build/ directory (which is typical with CMake), the current minimal approach is fine.

lib/ripser-1.2.1/examples/random20.lower_distance_matrix (1)

1-1: Example data file looks appropriate.

This file correctly represents a flattened lower-triangular distance matrix for 20 points (190 values = 1+2+...+19). However, it is missing a trailing newline, which POSIX requires at the end of every line of a text file and which many linters check for.

Consider adding a trailing newline for better POSIX compliance:

-68,6,155,10,173,171,168,52,32,63,136,16,78,163,8,175,28,70,107,165,18,97,99,118,49,76,48,133,58,92,44,57,190,101,25,94,148,37,2,146,56,95,125,121,142,31,21,152,154,124,75,120,112,45,39,115,170,179,157,183,15,14,38,182,151,164,185,96,127,41,53,180,166,122,134,100,20,169,36,40,87,46,140,82,187,71,12,178,159,184,189,3,86,61,137,93,116,167,150,186,126,172,29,34,135,27,144,177,73,19,147,17,26,1,139,22,69,9,138,160,181,105,24,129,91,11,64,103,132,130,4,47,66,85,65,149,109,161,81,128,7,113,67,77,54,35,30,74,98,114,5,176,158,62,89,79,156,102,119,141,51,143,108,131,106,33,123,43,174,55,83,80,145,104,153,13,188,90,117,111,60,84,23,59,88,110,42,50,72,162
\ No newline at end of file
+68,6,155,10,173,171,168,52,32,63,136,16,78,163,8,175,28,70,107,165,18,97,99,118,49,76,48,133,58,92,44,57,190,101,25,94,148,37,2,146,56,95,125,121,142,31,21,152,154,124,75,120,112,45,39,115,170,179,157,183,15,14,38,182,151,164,185,96,127,41,53,180,166,122,134,100,20,169,36,40,87,46,140,82,187,71,12,178,159,184,189,3,86,61,137,93,116,167,150,186,126,172,29,34,135,27,144,177,73,19,147,17,26,1,139,22,69,9,138,160,181,105,24,129,91,11,64,103,132,130,4,47,66,85,65,149,109,161,81,128,7,113,67,77,54,35,30,74,98,114,5,176,158,62,89,79,156,102,119,141,51,143,108,131,106,33,123,43,174,55,83,80,145,104,153,13,188,90,117,111,60,84,23,59,88,110,42,50,72,162
lib/ripser-1.2.1/examples/random16.lower_distance_matrix (1)

1-15: Example data file looks appropriate.

This file correctly represents a lower-triangular distance matrix for 16 points in readable row-by-row format. However, the file is missing a trailing newline.

Consider adding a trailing newline for better POSIX compliance:

- 18, 54, 67, 79, 26, 96,  6, 20, 63,  1, 33,110,113,106,119
\ No newline at end of file
+ 18, 54, 67, 79, 26, 96,  6, 20, 63,  1, 33,110,113,106,119
lib/ripser-1.2.1/examples/projective_plane.lower_distance_matrix (1)

2-13: Data format looks correct, but missing trailing newline.

The lower-triangular matrix data (lines 2-13) appears correctly formatted for a 13-point dataset. However, the file is missing a trailing newline.

Consider adding a trailing newline for better POSIX compliance:

-2,2,2,1,1,1,1,1,1,2,1,1
\ No newline at end of file
+2,2,2,1,1,1,1,1,1,2,1,1
dockerfiles/Dockerfile.centos7 (1)

31-40: Explicitly disabling Ripser in the CentOS 7 compile-check image makes sense, but ensure it’s covered elsewhere.

-DFLB_RIPSER=Off here is reasonable given the older CentOS 7 toolchain; just confirm that at least one other CI or packaging path builds with FLB_RIPSER=On so the TDA+Ripser integration is regularly compile-tested.

CMakeLists.txt (1)

145-155: Redundant check_language(CXX) call.

The check_language(CXX) was already called at line 132. This second call is unnecessary since CMAKE_CXX_COMPILER would already be set from the first check.

 # another try for CXX
 if (NOT DEFINED FLB_USE_RIPSER)
-  check_language(CXX)
   if(CMAKE_CXX_COMPILER)
     message(STATUS "CXX compiler found, enable ripser.")
     set(FLB_USE_RIPSER Yes)
plugins/processor_tda/tda.h (1)

23-29: Missing include for lwrb_t type used in tda_window.

The tda_window struct uses lwrb_t at line 44, but the lwrb/lwrb.h header is not included in this header file. Also, struct flb_hash_table and struct flb_processor_instance are used but not declared. The forward declarations at lines 28-29 are also redundant since the structs are defined immediately below.

 #include <fluent-bit/ripser/flb_ripser_wrapper.h>
 #include <cmetrics/cmetrics.h>
 #include <cmetrics/cmt_map.h>
 #include <cfl/cfl_sds.h>
+#include <lwrb/lwrb.h>

-struct tda_window;
-struct tda_proc_ctx;
+/* forward declarations for external types */
+struct flb_hash_table;
+struct flb_processor_instance;
src/ripser/flb_ripser_wrapper.cpp (1)

119-171: Unused acc.max_dim field.

At line 136, acc.max_dim = max_dim + 1 is set but never read. The betti_accumulator struct has a max_dim field that isn't used in the callback logic.

Consider removing the unused field:

 struct betti_accumulator {
-    int max_dim;
     int num_dims;
     int betti[8];
 };

And remove line 136.

include/fluent-bit/ripser/flb_ripser_wrapper.h (1)

29-46: Mismatch between FLB_RIPSER_MAX_BETTI_DIM and betti array size.

FLB_RIPSER_MAX_BETTI_DIM is defined as 3, but betti[8] can hold 8 dimensions. The wrapper implementation also caps at 8 but filters using FLB_RIPSER_MAX_BETTI_DIM. Consider either:

  1. Making the macro 8 to match the array, or
  2. Reducing the array to betti[FLB_RIPSER_MAX_BETTI_DIM]

This would make the intent clearer and prevent silent dimension truncation.

-#define FLB_RIPSER_MAX_BETTI_DIM 3
+#define FLB_RIPSER_MAX_BETTI_DIM 8

Or if 3 is the intentional limit:

 typedef struct flb_ripser_betti {
     int max_dim;       /* maximum computed dimension */
     int num_dims;      /* number of valid dimensions (0..num_dims-1) */
-    int betti[8];      /* Betti numbers for each dimension */
+    int betti[FLB_RIPSER_MAX_BETTI_DIM];  /* Betti numbers for each dimension */
 } flb_ripser_betti;
lib/ripser-1.2.1/ripser_internal.hpp (1)

89-89: Include guard comment mismatch.

The include guard opens with RIPSER_INTERNAL_HPP but the closing comment references RIPSER_WRAPPER_H. This inconsistency could cause confusion during maintenance.

-#endif /* RIPSER_WRAPPER_H */
+#endif /* RIPSER_INTERNAL_HPP */
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 200ad59 and 87638a0.

⛔ Files ignored due to path filters (3)
  • lib/ripser-1.2.1/examples/pointsCycloOctane.csv is excluded by !**/*.csv
  • lib/ripser-1.2.1/examples/projective_plane.csv is excluded by !**/*.csv
  • lib/ripser-1.2.1/examples/rp2_600.lower_distance_matrix.csv is excluded by !**/*.csv
📒 Files selected for processing (28)
  • CMakeLists.txt (3 hunks)
  • cmake/libraries.cmake (1 hunks)
  • cmake/plugins_options.cmake (1 hunks)
  • dockerfiles/Dockerfile.centos7 (1 hunks)
  • include/CMakeLists.txt (1 hunks)
  • include/fluent-bit/ripser/flb_ripser_wrapper.h (1 hunks)
  • lib/ripser-1.2.1/.gitignore (1 hunks)
  • lib/ripser-1.2.1/.gitmodules (1 hunks)
  • lib/ripser-1.2.1/CMakeLists.txt (1 hunks)
  • lib/ripser-1.2.1/CONTRIBUTING.txt (1 hunks)
  • lib/ripser-1.2.1/COPYING.txt (1 hunks)
  • lib/ripser-1.2.1/Makefile (1 hunks)
  • lib/ripser-1.2.1/README.md (1 hunks)
  • lib/ripser-1.2.1/examples/projective_plane.lower_distance_matrix (1 hunks)
  • lib/ripser-1.2.1/examples/random16.lower_distance_matrix (1 hunks)
  • lib/ripser-1.2.1/examples/random20.lower_distance_matrix (1 hunks)
  • lib/ripser-1.2.1/ripser.cpp (1 hunks)
  • lib/ripser-1.2.1/ripser_internal.hpp (1 hunks)
  • packaging/distros/centos/Dockerfile (7 hunks)
  • plugins/CMakeLists.txt (1 hunks)
  • plugins/processor_tda/CMakeLists.txt (1 hunks)
  • plugins/processor_tda/tda.c (1 hunks)
  • plugins/processor_tda/tda.h (1 hunks)
  • src/CMakeLists.txt (2 hunks)
  • src/ripser/CMakeLists.txt (1 hunks)
  • src/ripser/flb_ripser_wrapper.cpp (1 hunks)
  • tests/internal/CMakeLists.txt (1 hunks)
  • tests/internal/ripser.c (1 hunks)
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • src/CMakeLists.txt
  • cmake/plugins_options.cmake
  • packaging/distros/centos/Dockerfile
  • dockerfiles/Dockerfile.centos7
  • include/CMakeLists.txt
  • CMakeLists.txt
  • cmake/libraries.cmake
  • src/ripser/CMakeLists.txt
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

  • cmake/plugins_options.cmake
  • CMakeLists.txt
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • cmake/plugins_options.cmake
  • packaging/distros/centos/Dockerfile
  • CMakeLists.txt
  • cmake/libraries.cmake
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

  • cmake/plugins_options.cmake
  • CMakeLists.txt
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • include/CMakeLists.txt
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

  • CMakeLists.txt
📚 Learning: 2025-11-21T06:23:29.770Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

  • include/fluent-bit/ripser/flb_ripser_wrapper.h
📚 Learning: 2025-08-29T06:25:02.561Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.

Applied to files:

  • cmake/libraries.cmake
🧬 Code graph analysis (3)
src/ripser/flb_ripser_wrapper.cpp (2)
lib/ripser-1.2.1/ripser.cpp (13)
  • i (236-238)
  • i (236-236)
  • i (241-243)
  • i (241-241)
  • i (269-275)
  • i (269-269)
  • i (288-294)
  • i (288-288)
  • i (409-411)
  • i (409-409)
  • j (470-486)
  • ripser_run_from_compressed_lower (968-986)
  • ripser_run_from_compressed_lower (968-973)
lib/ripser-1.2.1/ripser_internal.hpp (4)
  • i (58-58)
  • dim (72-76)
  • dim (72-72)
  • ripser_run_from_compressed_lower (82-87)
lib/ripser-1.2.1/ripser_internal.hpp (1)
lib/ripser-1.2.1/ripser.cpp (15)
  • init_rows (219-225)
  • init_rows (219-219)
  • init_rows (227-233)
  • init_rows (227-227)
  • i (236-238)
  • i (236-236)
  • i (241-243)
  • i (241-241)
  • i (269-275)
  • i (269-269)
  • i (288-294)
  • i (288-288)
  • i (409-411)
  • i (409-409)
  • j (470-486)
lib/ripser-1.2.1/ripser.cpp (1)
lib/ripser-1.2.1/ripser_internal.hpp (3)
  • i (58-58)
  • dim (72-76)
  • dim (72-72)
🪛 checkmake (0.2.2)
lib/ripser-1.2.1/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)


[warning] 1-1: Target "build" should be declared PHONY.

(phonydeclared)


[warning] 4-4: Target "all" should be declared PHONY.

(phonydeclared)

🪛 LanguageTool
lib/ripser-1.2.1/README.md

[grammar] ~77-~77: Use a hyphen to join words.
Context: ...space, or other non-numerical character) separated list of the distance matrix en...

(QB_NEW_EN_HYPHEN)


[grammar] ~82-~82: Use a hyphen to join words.
Context: ... matrix entries below the diagonal in 32 bit float format (IEEE 754, single, litt...

(QB_NEW_EN_HYPHEN)


[grammar] ~83-~83: Use a hyphen to join words.
Context: ...se`: sparse triplet format; a whitespace separated list of entries of a sparse di...

(QB_NEW_EN_HYPHEN)

🪛 markdownlint-cli2 (0.18.1)
lib/ripser-1.2.1/README.md

6-6: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3

(MD001, heading-increment)


14-14: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


15-15: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


16-16: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


17-17: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


23-23: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


24-24: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


26-26: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


27-27: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


28-28: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


29-29: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


33-33: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


34-34: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


36-36: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


37-37: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


38-38: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


61-61: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


62-62: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


63-63: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


64-64: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


69-69: Dollar signs used before commands without showing output

(MD014, commands-show-output)


76-76: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


77-77: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


78-78: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


79-79: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


80-80: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


81-81: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


82-82: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


83-83: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


84-84: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


85-85: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


86-86: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


87-87: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


104-104: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


106-106: Hard tabs
Column: 1

(MD010, no-hard-tabs)


107-107: Hard tabs
Column: 1

(MD010, no-hard-tabs)


108-108: Hard tabs
Column: 1

(MD010, no-hard-tabs)


109-109: Hard tabs
Column: 1

(MD010, no-hard-tabs)


110-110: Hard tabs
Column: 1

(MD010, no-hard-tabs)


111-111: Hard tabs
Column: 1

(MD010, no-hard-tabs)

🔇 Additional comments (42)
lib/ripser-1.2.1/examples/projective_plane.lower_distance_matrix (1)

1-1: The leading blank line is consistent with other example files.

projective_plane.lower_distance_matrix is not the only example that starts with a blank line: sphere_3_192.lower_distance_matrix does too, which suggests the format is intentional for some datasets rather than an error. The earlier claim that this file is inconsistent with the other example distance matrices is therefore incorrect.

Likely an incorrect or invalid review comment.

lib/ripser-1.2.1/COPYING.txt (1)

1-21: MIT license text looks intact and appropriate for vendored Ripser.

Content matches the standard MIT template and correctly attributes Ulrich Bauer; good to keep this verbatim for compliance and ease of future Ripser upgrades.

lib/ripser-1.2.1/.gitmodules (1)

1-3: Nested robin-hood-hashing submodule registration is reasonable but implies recursive submodule updates.

This mirrors upstream Ripser and is fine; just ensure any build or packaging scripts that rely on this tree use git submodule update --init --recursive so the nested hashmap dependency is present when needed.

cmake/libraries.cmake (1)

32-32: Ripser library path wiring is consistent with other bundled libs.

FLB_PATH_LIB_RIPSER follows the existing naming and path conventions, so downstream add_subdirectory(${FLB_PATH_LIB_RIPSER}) usage should integrate cleanly.

lib/ripser-1.2.1/Makefile (1)

1-18: Keep upstream Ripser Makefile as-is; lint warnings can be ignored.

The Makefile matches typical upstream Ripser style and is not used by the Fluent Bit CMake build. checkmake’s .PHONY and test-target warnings are purely stylistic; I’d avoid modifying this vendored file to stay aligned with the official Ripser 1.2.1 release.

lib/ripser-1.2.1/README.md (1)

1-131: Vendored Ripser README looks like an unmodified upstream copy and is fine to include.

Content, options, and license references align with Ripser 1.2.1; I’d keep this file unchanged from upstream despite markdown/style lints so future upgrades remain straightforward.

lib/ripser-1.2.1/CONTRIBUTING.txt (1)

1-9: Additional contribution licensing clause is clear and consistent with vendored usage.

The text mirrors common upstream contribution terms and works alongside the MIT license in COPYING.txt; good to keep this verbatim to accurately reflect Ripser’s original contribution policy.

plugins/CMakeLists.txt (1)

344-354: Gating processor_tda registration on FLB_RIPSER is a good integration point.

This mirrors patterns like the Kafka plugins: the macro still honors FLB_PROCESSOR_TDA, and the extra if (FLB_RIPSER) avoids exposing the TDA processor when the Ripser dependency is not built.

plugins/processor_tda/CMakeLists.txt (1)

1-5: LGTM!

The CMake configuration correctly registers the TDA processor plugin and links the required Ripser dependencies.

tests/internal/CMakeLists.txt (1)

169-174: LGTM!

The conditional inclusion of ripser.c follows the established pattern for optional test components.

cmake/plugins_options.cmake (1)

77-77: LGTM!

The processor option definition follows the established pattern with a clear description and appropriate default value.

packaging/distros/centos/Dockerfile (2)

44-45: LGTM!

The FLB_RIPSER build flag configuration follows the established pattern for optional features. The defaults (Off for CentOS 7, On for CentOS 8+) are appropriate given potential C++ compiler version requirements.

Also applies to: 82-83, 125-126, 166-167, 231-232, 296-297


343-343: LGTM!

The CMake flag propagation correctly passes the FLB_RIPSER environment variable to the build configuration.

src/CMakeLists.txt (2)

366-369: LGTM!

The Ripser subdirectory inclusion follows the established pattern for optional components.


458-467: LGTM!

The Ripser dependency configuration correctly adds the required static libraries when FLB_RIPSER is enabled.

tests/internal/ripser.c (4)

1-13: LGTM!

The includes and preprocessor definitions are appropriate for the test, including the M_PI fallback for systems that don't define it.


15-65: LGTM!

The test setup correctly generates a point cloud on the unit circle and computes the pairwise Euclidean distance matrix with proper memory management.


67-107: LGTM!

The Ripser computation and validation logic correctly handles errors, manages memory properly (freed on both error and success paths), and validates the expected topological properties of a circle.


111-114: LGTM!

The test registration follows the standard pattern for internal tests.

src/ripser/CMakeLists.txt (1)

1-12: LGTM!

The Ripser wrapper library configuration correctly sets up include paths, source files, and links the required dependencies.

lib/ripser-1.2.1/CMakeLists.txt (1)

1-10: LGTM!

Clean CMake configuration for the Ripser static library. The PUBLIC visibility for include directories and C++11 compile features is appropriate since dependent targets (like the wrapper) need access to ripser_internal.hpp and the C++11 standard.

plugins/processor_tda/tda.h (1)

50-75: LGTM!

The tda_proc_ctx structure is well-organized with clear field groupings and helpful comments documenting window management, grouping, embedding parameters, output gauges, and rate conversion state.

src/ripser/flb_ripser_wrapper.cpp (2)

39-54: LGTM!

The conversion from dense matrix to compressed lower-triangular format is correct. The reserve() call pre-allocates the exact needed size for efficiency.
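
As a reference for the layout, here is a minimal sketch of a dense-to-lower-triangular conversion. It assumes a row-major n x n input and the row-by-row ordering used by the lower_distance_matrix example files; `dense_to_lower` is a hypothetical name and the wrapper's real code may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy the strictly lower triangle of a row-major
 * n x n matrix into out in the order (1,0), (2,0), (2,1), (3,0), ...
 * out must have room for n * (n - 1) / 2 doubles.
 * Returns the number of values written.
 */
static size_t dense_to_lower(const double *dense, size_t n, double *out)
{
    size_t i, j, k = 0;

    assert(dense != NULL && out != NULL);
    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            out[k++] = dense[i * n + j];
        }
    }
    return k;
}
```

For a 3 x 3 matrix this writes three values, which matches the n * (n - 1) / 2 size that a reserve() call would pre-allocate.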


200-235: LGTM!

The flb_ripser_compute_intervals_from_dense_distance function correctly bridges the Ripser callback mechanism to the public C API, with proper null checks and threshold handling consistent with the Betti computation function.

include/fluent-bit/ripser/flb_ripser_wrapper.h (1)

62-93: LGTM!

Well-documented public API with clear parameter descriptions and return value semantics. The dual API design (direct Betti computation vs callback-based intervals) provides good flexibility for different use cases.

plugins/processor_tda/tda.c (6)

39-140: LGTM!

The threshold selection function correctly extracts off-diagonal distances, sorts them, and selects the appropriate quantile. Memory management and edge cases are properly handled.
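
The extract, sort, and pick-a-quantile flow described above can be sketched as below. This is an illustration under assumptions: the function name, the -1.0 error value, and the truncating index rule `q * (m - 1)` are choices made for the sketch and are not confirmed to match tda.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *) a;
    double y = *(const double *) b;

    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: return the q-quantile (0 <= q <= 1) of the
 * off-diagonal distances of a symmetric row-major n x n matrix,
 * or -1.0 on invalid input or allocation failure.
 */
static double quantile_threshold(const double *dist, size_t n, double q)
{
    size_t i, j, k = 0, m, idx;
    double *vals;
    double result;

    if (dist == NULL || n < 2 || q < 0.0 || q > 1.0) {
        return -1.0;
    }
    m = n * (n - 1) / 2;
    vals = malloc(m * sizeof(double));
    if (vals == NULL) {
        return -1.0;
    }
    /* the lower triangle holds every off-diagonal distance exactly once */
    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            vals[k++] = dist[i * n + j];
        }
    }
    assert(k == m);
    qsort(vals, m, sizeof(double), cmp_double);
    idx = (size_t) (q * (double) (m - 1));
    result = vals[idx];
    free(vals);
    return result;
}
```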


142-173: LGTM!

Window creation correctly calculates sample size for the flexible array member pattern and initializes the ring buffer with appropriate error handling.


573-634: LGTM!

The ingest function properly handles ring buffer overflow by dropping oldest samples, with appropriate memory management and error handling.
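
The drop-oldest policy can be illustrated with a small self-contained ring. Note that the plugin uses lwrb for its ring buffer; the sketch below deliberately does not use the lwrb API, and `RING_CAP`, `DIM`, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_CAP 4 /* hypothetical window capacity */
#define DIM 2      /* hypothetical feature dimension */

struct ring {
    double data[RING_CAP][DIM];
    size_t head;  /* slot of the oldest sample */
    size_t count; /* number of stored samples */
};

/* Append a sample; when the ring is full, the oldest sample is dropped. */
static void ring_push(struct ring *r, const double *sample)
{
    size_t slot;

    assert(r != NULL && sample != NULL);
    if (r->count == RING_CAP) {
        r->head = (r->head + 1) % RING_CAP;
        r->count--;
    }
    slot = (r->head + r->count) % RING_CAP;
    memcpy(r->data[slot], sample, sizeof(r->data[slot]));
    r->count++;
}

/* Return the i-th stored sample (0 is the oldest), or NULL if out of range. */
static const double *ring_at(const struct ring *r, size_t i)
{
    assert(r != NULL);
    if (i >= r->count) {
        return NULL;
    }
    return r->data[(r->head + i) % RING_CAP];
}
```

Pushing six samples into this four-slot ring leaves samples 2 through 5 in place, with sample 2 reported as the oldest.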


1147-1176: Gauge pointers reset on every call may cause gauge recreation.

Lines 1152-1154 reset g_betti0/1/2 to NULL on every tda_proc_process_metrics call. This causes ensure_betti_gauges to create new gauges on the incoming metrics_context each time. While this may be intentional (to add gauges to each incoming cmt context), it's worth verifying this is the desired behavior rather than maintaining persistent gauges across calls.

Is the intent to add Betti gauges to each incoming metrics context? If so, this is correct. If the gauges should persist and be updated, consider not resetting the pointers, and ensuring they're only created once on a persistent context.


884-917: Distance matrix construction with delay embedding looks correct.

The embedded point indexing correctly handles the delay embedding transformation. For each pair of embedded points (i, j), the algorithm computes the Euclidean distance across all lag dimensions, which is the standard approach for time-delay embedding.


1184-1226: LGTM!

Configuration options are well-documented with sensible defaults. The plugin correctly only registers the metrics processing callback, appropriate for topological analysis on time-series metrics data.

CMakeLists.txt (1)

802-808: FLB_PATH_LIB_RIPSER is properly defined. The variable is set in cmake/libraries.cmake at line 32 as "lib/ripser-1.2.1", confirming the integration is complete. The code correctly follows the established pattern for C++ library integration.

lib/ripser-1.2.1/ripser_internal.hpp (2)

36-61: LGTM!

The compressed_distance_matrix template struct correctly implements the triangular storage pattern. The quadratic formula at line 43 correctly derives matrix dimension from the flattened distance count, and the template constructor properly populates the lower triangle from a full distance matrix.


66-77: LGTM!

The interval_recorder struct provides a clean, null-safe callback mechanism for emitting persistence intervals.
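
The null-safe callback idea translates to C roughly as follows. This is a sketch with hypothetical names (`recorder`, `recorder_emit`, `count_alive`, `MAX_DIMS`), not the interface from ripser_internal.hpp. It also shows one standard way to turn intervals into Betti numbers: at a given scale r, the Betti number in dimension d is the number of intervals with birth <= r < death. Whether the wrapper summarizes intervals this way is not stated in the review.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIMS 3 /* hypothetical cap on the dimensions that are counted */

typedef void (*interval_cb)(int dim, double birth, double death, void *user);

struct recorder {
    interval_cb cb; /* may be NULL: emitting then becomes a no-op */
    void *user;
};

/* Forward one persistence interval to the callback, if there is one. */
static void recorder_emit(const struct recorder *r, int dim,
                          double birth, double death)
{
    if (r == NULL || r->cb == NULL) {
        return;
    }
    r->cb(dim, birth, death, r->user);
}

struct betti_at_scale {
    double scale;
    int betti[MAX_DIMS];
};

/* Callback: count intervals alive at acc->scale; ignore out-of-range dims. */
static void count_alive(int dim, double birth, double death, void *user)
{
    struct betti_at_scale *acc = user;

    assert(acc != NULL);
    if (dim < 0 || dim >= MAX_DIMS) {
        return;
    }
    if (birth <= acc->scale && acc->scale < death) {
        acc->betti[dim]++;
    }
}
```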

lib/ripser-1.2.1/ripser.cpp (8)

1-38: LGTM!

The license header properly acknowledges the original MIT-licensed Ripser code and notes the Fluent Bit modifications. MIT is compatible with Apache 2.0.


91-101: LGTM!

The overflow check is part of the original Ripser implementation. The check for negative values when USE_COEFFICIENTS is undefined leverages signed integer overflow detection.


227-233: Pointer arithmetic before array start.

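Line 228 computes &distances[0] - 1, a pointer one position before the start of the array. The indexing pattern ensures that only in-range elements are ever dereferenced, and this works with mainstream compilers. Strictly speaking, however, the C++ standard only permits pointers into an array or one past its end, so forming this pointer is undefined behavior even without a dereference. It is a subtle pattern inherited from the original Ripser code and worth noting for maintainers.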


245-278: LGTM!

The sparse_distance_matrix class correctly implements sparse storage with lower_bound lookups. The template constructor populates neighbors in sorted order by iterating j from 0 to size.


299-327: LGTM!

Classic union-find implementation with path compression and union by rank optimizations.
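
For maintainers less familiar with the structure, a minimal C sketch of union-find with both optimizations follows. It is an illustration only: the fixed `UF_MAX` capacity and all names are hypothetical, and Ripser's C++ class is organized differently.

```c
#include <assert.h>
#include <stddef.h>

#define UF_MAX 64 /* hypothetical fixed capacity for the sketch */

struct union_find {
    size_t parent[UF_MAX];
    unsigned char rank[UF_MAX];
    size_t n;
};

static void uf_init(struct union_find *uf, size_t n)
{
    size_t i;

    assert(uf != NULL && n <= UF_MAX);
    uf->n = n;
    for (i = 0; i < n; i++) {
        uf->parent[i] = i;
        uf->rank[i] = 0;
    }
}

/* Find the representative of x and compress the path behind it. */
static size_t uf_find(struct union_find *uf, size_t x)
{
    size_t root = x;
    size_t next;

    assert(x < uf->n);
    while (uf->parent[root] != root) {
        root = uf->parent[root];
    }
    while (uf->parent[x] != root) {
        next = uf->parent[x];
        uf->parent[x] = root;
        x = next;
    }
    return root;
}

/* Merge the sets of a and b by rank. Returns 1 if merged, 0 if already joined. */
static int uf_union(struct union_find *uf, size_t a, size_t b)
{
    size_t ra = uf_find(uf, a);
    size_t rb = uf_find(uf, b);

    if (ra == rb) {
        return 0;
    }
    if (uf->rank[ra] < uf->rank[rb]) {
        size_t tmp = ra;
        ra = rb;
        rb = tmp;
    }
    uf->parent[rb] = ra;
    if (uf->rank[ra] == uf->rank[rb]) {
        uf->rank[ra]++;
    }
    return 1;
}

/* Number of disjoint sets, i.e. the number of roots. */
static size_t uf_components(const struct union_find *uf)
{
    size_t i, count = 0;

    for (i = 0; i < uf->n; i++) {
        if (uf->parent[i] == i) {
            count++;
        }
    }
    return count;
}
```

The component count is the quantity that dimension-0 homology tracks: each successful union corresponds to two connected components merging.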


372-403: LGTM!

The ripser class is well-structured with the interval_recorder properly integrated as a member. The constructor accepts an optional recorder, defaulting to an empty one.


968-986: LGTM!

The ripser_run_from_compressed_lower entry point provides a clean interface for Fluent Bit integration. It correctly uses move semantics for the distance matrix and hardcodes Z/2Z coefficients, which is the standard efficient choice for TDA computations.


489-507: Static local variables in member functions are shared across all ripser instances and not thread-safe.

The static simplex_boundary_enumerator (line 490) and simplex_coboundary_enumerator (line 500) objects are initialized on first invocation and persist across all subsequent ripser instances. When ripser_run_from_compressed_lower() creates a new ripser instance in one thread while another thread has an active instance, the static variables will reference the first instance's this pointer, causing data races and correctness bugs. This inherited pattern from the original Ripser code manifests as a practical issue in Fluent Bit because multiple pipelines can invoke the TDA processor concurrently via tda_proc_process_metrics(), each creating independent ripser instances that share these static enumerators.
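
The stale-binding half of this problem can be shown with a small C analogy. This is not the Ripser code: the C++ original uses function-local static objects constructed from the first instance, while the sketch below uses a static pointer, and all names are hypothetical. It demonstrates how later instances silently see the first instance's state; it does not reproduce the data race itself.

```c
#include <assert.h>
#include <stddef.h>

struct engine {
    int id;
};

/*
 * Analogy of the problematic pattern: a function-local static is bound
 * to whichever instance calls first and is then shared by every caller.
 */
static int engine_id_buggy(const struct engine *self)
{
    static const struct engine *cached = NULL; /* one copy for all instances */

    assert(self != NULL);
    if (cached == NULL) {
        cached = self;
    }
    return cached->id;
}

/* Per-instance state only: nothing is shared between instances. */
static int engine_id_fixed(const struct engine *self)
{
    assert(self != NULL);
    return self->id;
}
```

Calling `engine_id_buggy` on a second instance still returns the first instance's id, which is why keeping such helpers as per-instance members (or locals) is the safer shape when several instances can exist.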


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
plugins/processor_tda/tda.c (1)

1155-1157: Gauge pointers reset on every call causes unbounded allocations.

Setting ctx->g_betti* to NULL on every invocation means ensure_betti_gauges() will call cmt_gauge_create() each time, creating new gauge objects. The previous gauges remain in the cmt context but become orphaned, leading to memory growth and duplicate metric series.

The gauge pointers should be initialized once (e.g., in tda_proc_init or on first successful creation) and reused across calls:

-    ctx->g_betti0 = NULL;
-    ctx->g_betti1 = NULL;
-    ctx->g_betti2 = NULL;
+    /* Remove these lines - gauge pointers should persist across calls */

Alternatively, store gauge pointers only after first successful creation and skip recreation if already set.
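
The create-once pattern suggested here looks roughly like the sketch below. It uses stand-in types (`fake_gauge`, `fake_ctx`) instead of the cmetrics API so that it stays self-contained, and it assumes the gauges live on a context that persists across calls. If the intent is to attach gauges to each incoming metrics context, as raised in the earlier comment, this pattern does not apply as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the real gauge and processor context types. */
struct fake_gauge {
    double value;
};

struct fake_ctx {
    struct fake_gauge *g_betti0;
    int created; /* how many gauges were ever allocated */
};

/* Return the existing gauge, creating it only on first use. */
static struct fake_gauge *ensure_gauge(struct fake_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->g_betti0 != NULL) {
        return ctx->g_betti0;
    }
    ctx->g_betti0 = calloc(1, sizeof(*ctx->g_betti0));
    if (ctx->g_betti0 != NULL) {
        ctx->created++;
    }
    return ctx->g_betti0;
}
```

Because the pointer is never reset between calls, repeated invocations update the same gauge instead of allocating a new one each time.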

🧹 Nitpick comments (1)
plugins/processor_tda/tda.c (1)

155-158: Consider overflow checks for buffer size calculations.

With default configurations, overflow is unlikely. However, if capacity or feature_dim are set to large values, the multiplications could overflow silently, leading to undersized allocations.

     w->feature_dim = feature_dim;
     /* struct tda_sample { uint64_t ts; double values[]; } */
     sample_size = sizeof(uint64_t) + (size_t) feature_dim * sizeof(double);
     w->sample_size = sample_size;

     buf_size = capacity * sample_size;
+    /* Basic overflow check */
+    if (capacity > 0 && buf_size / capacity != sample_size) {
+        flb_free(w);
+        return NULL;
+    }

     w->buf = flb_malloc(buf_size);
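
An alternative to the divide-back check above is to test before multiplying, which also lets the same helper guard the sample_size computation. A minimal sketch, with a hypothetical helper name that is not part of Fluent Bit:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store a * b in *out and return 0, or return -1
 * without touching *out if the product would not fit in size_t.
 */
static int checked_mul_size(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    if (a != 0 && b > SIZE_MAX / a) {
        return -1;
    }
    *out = a * b;
    return 0;
}
```

In the window constructor this would be called once for feature_dim * sizeof(double) and once for capacity * sample_size, returning NULL from the constructor if either call fails.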
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 87638a0 and 800849e.

📒 Files selected for processing (2)
  • include/CMakeLists.txt (1 hunks)
  • plugins/processor_tda/tda.c (1 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11250
File: src/ripser/flb_ripser_wrapper.cpp:73-106
Timestamp: 2025-12-08T05:21:45.003Z
Learning: In the TDA processor (processor_tda) for Fluent Bit, the dimension limit FLB_RIPSER_MAX_BETTI_DIM is intentionally capped at 3 because the plugin uses embed_dim=3 and delay=1 in practice, and computing higher dimensions (>4) would be computationally prohibitive. The 8-slot betti array allocation is conservative headroom.
📚 Learning: 2025-11-21T06:23:29.770Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

  • include/CMakeLists.txt
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • include/CMakeLists.txt
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • include/CMakeLists.txt
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • include/CMakeLists.txt
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

  • include/CMakeLists.txt
📚 Learning: 2025-12-08T05:21:45.003Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11250
File: src/ripser/flb_ripser_wrapper.cpp:73-106
Timestamp: 2025-12-08T05:21:45.003Z
Learning: In the TDA processor (processor_tda) for Fluent Bit, the dimension limit FLB_RIPSER_MAX_BETTI_DIM is intentionally capped at 3 because the plugin uses embed_dim=3 and delay=1 in practice, and computing higher dimensions (>4) would be computationally prohibitive. The 8-slot betti array allocation is conservative headroom.

Applied to files:

  • plugins/processor_tda/tda.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: PR - fuzzing test
🔇 Additional comments (1)
include/CMakeLists.txt (1)

46-53: ✅ Ripser header installation looks good.

The conditional block correctly globs and installs Ripser headers when FLB_RIPSER is enabled, following the same pattern as other optional header components (simdutf, config_format, tls, wasm). The typo from the previous review (sipser → ripser) has been fixed. Permissions and destination path are consistent with the codebase convention.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
plugins/processor_tda/tda.c (1)

1159-1161: Gauge pointer reset is intentional (per previous discussion).

These pointers are reset on each batch because the gauges are created in the incoming metrics_context, which may differ between calls. The cmetrics library manages the lifecycle of gauge objects internally via cmt_map, preventing leaks even when new gauges are created for each batch.
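
A minimal sketch of the reset-per-batch pattern described above. It uses stand-in types rather than the real cmetrics API: `batch_ctx`, `tda_state`, `get_or_create_gauge`, and `export_betti` are hypothetical names chosen for illustration. The assumption is only that the incoming context owns its gauges and that a pointer cached from one batch may not be valid for the next.

```c
#include <stddef.h>
#include <string.h>

#define MAX_GAUGES 8

/* Stand-in for a gauge owned by a metrics context (hypothetical type). */
struct gauge {
    const char *name;
    double value;
};

/* Stand-in for the incoming metrics context; it owns its gauges. */
struct batch_ctx {
    struct gauge gauges[MAX_GAUGES];
    size_t count;
};

/* Processor state caching gauge pointers for the current batch only. */
struct tda_state {
    struct gauge *betti0;
    struct gauge *betti1;
};

/* Return the gauge with this name, creating it in ctx if absent. */
static struct gauge *get_or_create_gauge(struct batch_ctx *ctx,
                                         const char *name)
{
    size_t i;

    for (i = 0; i < ctx->count; i++) {
        if (strcmp(ctx->gauges[i].name, name) == 0) {
            return &ctx->gauges[i];
        }
    }
    if (ctx->count == MAX_GAUGES) {
        return NULL;
    }
    ctx->gauges[ctx->count].name = name;
    ctx->gauges[ctx->count].value = 0.0;
    return &ctx->gauges[ctx->count++];
}

/* Export Betti numbers into the context of the batch being processed. */
static int export_betti(struct tda_state *st, struct batch_ctx *ctx,
                        double b0, double b1)
{
    /* Pointers from an earlier batch may refer to a different context. */
    st->betti0 = NULL;
    st->betti1 = NULL;

    st->betti0 = get_or_create_gauge(ctx, "betti0");
    st->betti1 = get_or_create_gauge(ctx, "betti1");
    if (st->betti0 == NULL || st->betti1 == NULL) {
        return -1;
    }
    st->betti0->value = b0;
    st->betti1->value = b1;
    return 0;
}
```

Calling `export_betti` twice with the same context reuses the two gauges; calling it with a different context creates them there and leaves the first context untouched.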

🧹 Nitpick comments (1)
lib/ripser-1.2.1/README.md (1)

1-131: Consider preserving upstream documentation as-is.

Static analysis identified several Markdown style issues (heading levels, list indentation, hard tabs in code blocks), but this appears to be upstream documentation from the Ripser project. Unless these issues significantly impact readability or documentation rendering, it's generally preferable to keep upstream documentation unmodified to simplify future updates.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 800849e and 657ca61.

⛔ Files ignored due to path filters (3)
  • lib/ripser-1.2.1/examples/pointsCycloOctane.csv is excluded by !**/*.csv
  • lib/ripser-1.2.1/examples/projective_plane.csv is excluded by !**/*.csv
  • lib/ripser-1.2.1/examples/rp2_600.lower_distance_matrix.csv is excluded by !**/*.csv
📒 Files selected for processing (34)
  • CMakeLists.txt
  • cmake/libraries.cmake
  • cmake/plugins_options.cmake
  • dockerfiles/Dockerfile.centos7
  • include/CMakeLists.txt
  • include/fluent-bit/ripser/flb_ripser_wrapper.h
  • lib/ripser-1.2.1/.gitignore
  • lib/ripser-1.2.1/.gitmodules
  • lib/ripser-1.2.1/CMakeLists.txt
  • lib/ripser-1.2.1/CONTRIBUTING.txt
  • lib/ripser-1.2.1/COPYING.txt
  • lib/ripser-1.2.1/Makefile
  • lib/ripser-1.2.1/README.md
  • lib/ripser-1.2.1/examples/o3_1024.txt
  • lib/ripser-1.2.1/examples/o3_2048.txt
  • lib/ripser-1.2.1/examples/o3_4096.txt
  • lib/ripser-1.2.1/examples/o3_8192.txt
  • lib/ripser-1.2.1/examples/projective_plane.dipha
  • lib/ripser-1.2.1/examples/projective_plane.lower_distance_matrix
  • lib/ripser-1.2.1/examples/random16.lower_distance_matrix
  • lib/ripser-1.2.1/examples/random20.lower_distance_matrix
  • lib/ripser-1.2.1/examples/sphere_3_192.lower_distance_matrix
  • lib/ripser-1.2.1/ripser.cpp
  • lib/ripser-1.2.1/ripser_internal.hpp
  • packaging/distros/centos/Dockerfile
  • plugins/CMakeLists.txt
  • plugins/processor_tda/CMakeLists.txt
  • plugins/processor_tda/tda.c
  • plugins/processor_tda/tda.h
  • src/CMakeLists.txt
  • src/ripser/CMakeLists.txt
  • src/ripser/flb_ripser_wrapper.cpp
  • tests/internal/CMakeLists.txt
  • tests/internal/ripser.c
✅ Files skipped from review due to trivial changes (2)
  • lib/ripser-1.2.1/COPYING.txt
  • lib/ripser-1.2.1/examples/projective_plane.lower_distance_matrix
🚧 Files skipped from review as they are similar to previous changes (10)
  • cmake/libraries.cmake
  • cmake/plugins_options.cmake
  • lib/ripser-1.2.1/.gitmodules
  • tests/internal/ripser.c
  • lib/ripser-1.2.1/examples/random16.lower_distance_matrix
  • lib/ripser-1.2.1/examples/random20.lower_distance_matrix
  • lib/ripser-1.2.1/CONTRIBUTING.txt
  • src/ripser/flb_ripser_wrapper.cpp
  • lib/ripser-1.2.1/.gitignore
  • CMakeLists.txt
🧰 Additional context used
🧠 Learnings (13)
📚 Learning: 2025-12-08T05:21:45.014Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11250
File: src/ripser/flb_ripser_wrapper.cpp:73-106
Timestamp: 2025-12-08T05:21:45.014Z
Learning: In the TDA processor (processor_tda) for Fluent Bit, the dimension limit FLB_RIPSER_MAX_BETTI_DIM is intentionally capped at 3 because the plugin uses embed_dim=3 and delay=1 in practice, and computing higher dimensions (>4) would be computationally prohibitive. The 8-slot betti array allocation is conservative headroom.

Applied to files:

  • plugins/CMakeLists.txt
  • plugins/processor_tda/tda.h
  • include/fluent-bit/ripser/flb_ripser_wrapper.h
  • lib/ripser-1.2.1/ripser.cpp
  • plugins/processor_tda/tda.c
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • dockerfiles/Dockerfile.centos7
  • src/ripser/CMakeLists.txt
  • src/CMakeLists.txt
  • packaging/distros/centos/Dockerfile
  • include/CMakeLists.txt
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • packaging/distros/centos/Dockerfile
  • include/CMakeLists.txt
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

  • packaging/distros/centos/Dockerfile
  • include/CMakeLists.txt
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

  • packaging/distros/centos/Dockerfile
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

  • packaging/distros/centos/Dockerfile
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

  • packaging/distros/centos/Dockerfile
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:24:26.170Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.

Applied to files:

  • packaging/distros/centos/Dockerfile
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:25:02.561Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.

Applied to files:

  • packaging/distros/centos/Dockerfile
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-11-21T06:23:29.770Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

  • include/fluent-bit/ripser/flb_ripser_wrapper.h
  • include/CMakeLists.txt
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • include/CMakeLists.txt
📚 Learning: 2025-09-08T11:21:33.975Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.

Applied to files:

  • lib/ripser-1.2.1/ripser.cpp
🧬 Code graph analysis (4)
include/fluent-bit/ripser/flb_ripser_wrapper.h (1)
src/ripser/flb_ripser_wrapper.cpp (4)
  • flb_ripser_compute_betti_from_dense_distance (119-171)
  • flb_ripser_compute_betti_from_dense_distance (119-124)
  • flb_ripser_compute_intervals_from_dense_distance (200-235)
  • flb_ripser_compute_intervals_from_dense_distance (200-206)
lib/ripser-1.2.1/ripser.cpp (1)
lib/ripser-1.2.1/ripser_internal.hpp (4)
  • i (58-58)
  • rows (59-59)
  • dim (72-76)
  • dim (72-72)
lib/ripser-1.2.1/ripser_internal.hpp (1)
lib/ripser-1.2.1/ripser.cpp (15)
  • init_rows (219-225)
  • init_rows (219-219)
  • init_rows (227-233)
  • init_rows (227-227)
  • i (236-238)
  • i (236-236)
  • i (241-243)
  • i (241-241)
  • i (269-275)
  • i (269-269)
  • i (288-294)
  • i (288-288)
  • i (409-411)
  • i (409-409)
  • j (470-486)
plugins/processor_tda/tda.c (2)
src/flb_hash_table.c (4)
  • flb_hash_table_get_ptr (591-603)
  • flb_hash_table_add (401-494)
  • flb_hash_table_create (99-137)
  • flb_hash_table_destroy (197-215)
lib/cmetrics/src/cmt_gauge.c (2)
  • cmt_gauge_create (27-81)
  • cmt_gauge_set (94-109)
🪛 checkmake (0.2.2)
lib/ripser-1.2.1/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)


[warning] 1-1: Target "build" should be declared PHONY.

(phonydeclared)


[warning] 4-4: Target "all" should be declared PHONY.

(phonydeclared)

🪛 LanguageTool
lib/ripser-1.2.1/README.md

[grammar] ~81-~81: Use a hyphen to join words.
Context: ...space, or other non-numerical character) separated list of coordinates of the poi...

(QB_NEW_EN_HYPHEN)


[grammar] ~82-~82: Use a hyphen to join words.
Context: ... matrix entries below the diagonal in 32 bit float format (IEEE 754, single, litt...

(QB_NEW_EN_HYPHEN)


[grammar] ~83-~83: Use a hyphen to join words.
Context: ...se`: sparse triplet format; a whitespace separated list of entries of a sparse di...

(QB_NEW_EN_HYPHEN)

🪛 markdownlint-cli2 (0.18.1)
lib/ripser-1.2.1/README.md

6-6: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3

(MD001, heading-increment)


14-14: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


15-15: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


16-16: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


17-17: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


23-23: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


24-24: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


26-26: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


27-27: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


28-28: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


29-29: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


33-33: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


34-34: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


36-36: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


37-37: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


38-38: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


61-61: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


62-62: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


63-63: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


64-64: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


69-69: Dollar signs used before commands without showing output

(MD014, commands-show-output)


76-76: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


77-77: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


78-78: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


79-79: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


80-80: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


81-81: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


82-82: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


83-83: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


84-84: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


85-85: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


86-86: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


87-87: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


104-104: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


106-106: Hard tabs
Column: 1

(MD010, no-hard-tabs)


107-107: Hard tabs
Column: 1

(MD010, no-hard-tabs)


108-108: Hard tabs
Column: 1

(MD010, no-hard-tabs)


109-109: Hard tabs
Column: 1

(MD010, no-hard-tabs)


110-110: Hard tabs
Column: 1

(MD010, no-hard-tabs)


111-111: Hard tabs
Column: 1

(MD010, no-hard-tabs)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (81)
  • GitHub Check: PR - container builds / Windows container images (2022)
  • GitHub Check: PR - container builds / Windows container images (2025)
  • GitHub Check: PR - packages build Linux / ubuntu/24.04.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / ubuntu/24.04 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/trixie.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / raspbian/bookworm package build and stage to S3
  • GitHub Check: PR - packages build Linux / ubuntu/22.04.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/bullseye.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / ubuntu/22.04 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/bookworm package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/9.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/bookworm.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/10.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/buster package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/7 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/9 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/buster.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/9 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/10 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/bullseye package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/9 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/trixie package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/8.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/10.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/8.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/10.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/10 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/8.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/10 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/9.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/9.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/7.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2023.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2023 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2 package build and stage to S3
  • GitHub Check: PR - container builds / amd64/debug container image build
  • GitHub Check: PR - container builds / arm/v7/production container image build
  • GitHub Check: PR - container builds / amd64/production container image build
  • GitHub Check: PR - container builds / arm64/production container image build
  • GitHub Check: PR - container builds / arm64/debug container image build
  • GitHub Check: PR - container builds / arm/v7/debug container image build
  • GitHub Check: PR - packages build MacOS / call-build-macos-package (Intel macOS runner, macos-14-large, 3.31.6)
  • GitHub Check: PR - packages build MacOS / call-build-macos-package (Apple Silicon macOS runner, macos-14, 3.31.6)
  • GitHub Check: PR - packages build Windows / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: PR - packages build Windows / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: PR - packages build Windows / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-centos-7
🔇 Additional comments (15)
dockerfiles/Dockerfile.centos7 (1)

39-40: LGTM!

Disabling RIPSER for the CentOS 7 compilation test is appropriate, as this container is primarily used to verify that the build system works on older platforms.

lib/ripser-1.2.1/CMakeLists.txt (1)

1-10: LGTM!

The Ripser static library configuration follows CMake best practices. The C++11 requirement is correctly specified, and the public include directory is appropriately exposed.

plugins/CMakeLists.txt (1)

352-354: LGTM!

The conditional registration of processor_tda follows the established pattern for optional plugins and is correctly placed in the PROCESSORS section.

tests/internal/CMakeLists.txt (1)

169-174: LGTM!

The conditional inclusion of ripser.c in the test suite follows the established pattern for optional component tests.

include/CMakeLists.txt (1)

46-53: LGTM!

The Ripser header installation block is correctly configured and follows the established pattern for conditional header installation. The previously identified typo has been addressed.

plugins/processor_tda/CMakeLists.txt (1)

1-5: LGTM!

The TDA processor plugin configuration correctly uses the FLB_PLUGIN macro and appropriately links against the required Ripser dependencies.

src/CMakeLists.txt (1)

366-369: LGTM - Ripser integration follows established patterns.

The conditional Ripser integration is properly gated by FLB_RIPSER and follows the same pattern as other optional components (WASM, AWS, etc.). The subdirectory inclusion and dependency linking are correctly structured.

Also applies to: 458-467

packaging/distros/centos/Dockerfile (1)

46-47: LGTM - Consistent FLB_RIPSER flag propagation.

The FLB_RIPSER flag is consistently defined and propagated across all CentOS variants, with sensible defaults (enabled for newer distributions, disabled for CentOS 7). The cmake invocation correctly passes the flag through.

Also applies to: 86-87, 131-132, 174-175, 243-244, 314-315, 362-362

src/ripser/CMakeLists.txt (1)

1-12: LGTM - Clean CMake configuration for Ripser wrapper.

The CMakeLists properly sets up the static wrapper library and links against ripser-static. The conditional jemalloc linkage follows established patterns in the codebase.

plugins/processor_tda/tda.h (1)

1-77: LGTM - Well-structured header for TDA processor.

The header provides clear structure definitions with helpful inline documentation. The data model for managing time-series windows, feature grouping, and embedding parameters is well-organized.

plugins/processor_tda/tda.c (1)

1-1233: LGTM - TDA processor implementation is well-structured.

The implementation demonstrates solid engineering:

  • Window management: Proper ring buffer usage with lwrb for sliding window
  • Delay embedding: Correct Takens' reconstruction with configurable m and tau
  • Multi-scale analysis: Smart multi-quantile scan to find meaningful H1 structures
  • Memory safety: Proper allocation/deallocation patterns with error handling
  • Integration: Clean integration with Ripser wrapper and cmetrics

Previous review concerns (gauge recreation, snapshot data loss, error paths) have been addressed or documented as intentional design choices.
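
For readers unfamiliar with the delay embedding mentioned in the list above, here is a minimal sketch of the reconstruction $x_t \to (x_t, x_{t-\tau}, \dots, x_{t-(m-1)\tau})$ from the PR description. The function name `delay_embed` and the row-major layout are assumptions for illustration and are not taken from tda.c.

```c
#include <stddef.h>

/*
 * Delay-embed a window of n row-major vectors of dimension d.
 * Embedded point k (k = 0 .. n_embed - 1) corresponds to time
 * t = k + (m - 1) * tau and is laid out as
 * (x_t, x_{t-tau}, ..., x_{t-(m-1)tau}), each block holding d values.
 * Returns n_embed, or 0 if the arguments are invalid or the window
 * is too short. out must hold n_embed * m * d doubles.
 */
static size_t delay_embed(const double *win, size_t n, size_t d,
                          size_t m, size_t tau, double *out)
{
    size_t span;
    size_t n_embed;
    size_t k, j, c;

    if (win == NULL || out == NULL || d == 0 || m == 0 || tau == 0) {
        return 0;
    }
    span = (m - 1) * tau;
    if (n <= span) {
        return 0;
    }
    n_embed = n - span;

    for (k = 0; k < n_embed; k++) {
        size_t t = k + span;            /* newest sample of this point */

        for (j = 0; j < m; j++) {
            const double *src = win + (t - j * tau) * d;
            double *dst = out + (k * m + j) * d;

            for (c = 0; c < d; c++) {
                dst[c] = src[c];
            }
        }
    }
    return n_embed;
}
```

With a scalar window `0, 1, 2, 3, 4`, `m = 3`, and `tau = 1`, this yields three embedded points: `(2, 1, 0)`, `(3, 2, 1)`, and `(4, 3, 2)`. The window loses `(m - 1) * tau` points, which is why the number of embedded points rather than the raw window size matters for later steps.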

include/fluent-bit/ripser/flb_ripser_wrapper.h (1)

1-99: LGTM - Clean and well-documented public API.

The wrapper header provides a clear C API for Ripser integration with:

  • Well-defined data structures for intervals and Betti numbers
  • Comprehensive parameter documentation
  • Flexible callback pattern for interval processing
  • Proper extern "C" guards for C++ compatibility

The FLB_RIPSER_MAX_BETTI_DIM limit of 3 is intentional per the learned context, as higher dimensions would be computationally prohibitive for the plugin's use case.
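
As an illustration of the callback pattern and the `extern "C"` guards praised above, the sketch below shows one plausible shape for such an interface. All `demo_` names are hypothetical and do not reproduce the declarations in flb_ripser_wrapper.h. Counting intervals whose persistence exceeds a cutoff is likewise an assumed way of summarizing intervals, not necessarily what the plugin does.

```c
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

#define DEMO_MAX_DIM 3   /* hypothetical cap on tracked dimensions */

/* One persistence interval [birth, death) in homology dimension dim. */
struct demo_interval {
    int dim;
    double birth;
    double death;
};

/* Callback invoked once per interval; user_data is passed through. */
typedef void (*demo_interval_cb)(const struct demo_interval *iv,
                                 void *user_data);

/* Per-dimension counts plus the persistence cutoff used for counting. */
struct demo_betti {
    int counts[DEMO_MAX_DIM];
    double min_persistence;
};

#ifdef __cplusplus
}
#endif

/* Callback: count intervals that persist for at least min_persistence. */
static void demo_count_cb(const struct demo_interval *iv, void *user_data)
{
    struct demo_betti *b = (struct demo_betti *) user_data;

    if (iv->dim < 0 || iv->dim >= DEMO_MAX_DIM) {
        return; /* ignore dimensions outside the tracked range */
    }
    if (iv->death - iv->birth >= b->min_persistence) {
        b->counts[iv->dim]++;
    }
}

/* Stand-in producer: feeds a fixed list of intervals to the callback. */
static void demo_for_each_interval(const struct demo_interval *ivs,
                                   size_t n, demo_interval_cb cb,
                                   void *user_data)
{
    size_t i;

    for (i = 0; i < n; i++) {
        cb(&ivs[i], user_data);
    }
}
```

Passing state through `user_data` keeps the interface usable from plain C, since the caller decides what to accumulate without the producer knowing the concrete type.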

lib/ripser-1.2.1/ripser_internal.hpp (1)

1-88: LGTM - Well-designed internal Ripser integration.

The internal header provides clean abstractions for compressed distance matrices and interval recording with proper templating for different matrix layouts (lower/upper triangular).
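
To make the compressed lower-triangular layout concrete, the following sketch packs a dense symmetric distance matrix into the strict lower triangle, stored row by row. The helper names (`lower_size`, `lower_index`, `dense_to_lower`, `lower_get`) are hypothetical, and the row-by-row ordering is an assumption about the layout rather than a quotation of ripser_internal.hpp.

```c
#include <stddef.h>

/* Number of stored entries for n points: the strict lower triangle. */
static size_t lower_size(size_t n)
{
    return n * (n - 1) / 2;
}

/* Flat index of entry (i, j) for i != j; argument order is irrelevant. */
static size_t lower_index(size_t i, size_t j)
{
    if (i < j) {
        size_t tmp = i;
        i = j;
        j = tmp;
    }
    /* Rows 1 .. i-1 contribute 1 + 2 + ... + (i-1) entries before row i. */
    return i * (i - 1) / 2 + j;
}

/* Copy the entries below the diagonal of a dense n x n row-major matrix
 * into out, which must hold lower_size(n) doubles. */
static void dense_to_lower(const double *dense, size_t n, double *out)
{
    size_t i, j;

    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            out[lower_index(i, j)] = dense[i * n + j];
        }
    }
}

/* Symmetric lookup with an implicit zero diagonal. */
static double lower_get(const double *m, size_t i, size_t j)
{
    return (i == j) ? 0.0 : m[lower_index(i, j)];
}
```

For a window of n points this stores n(n-1)/2 values instead of n², and the diagonal and upper triangle are recovered by symmetry.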

lib/ripser-1.2.1/ripser.cpp (2)

968-986: LGTM - Clean bridge function for Fluent Bit integration.

The ripser_run_from_compressed_lower function provides a clean entry point for the Fluent Bit wrapper, setting up the Ripser instance with appropriate parameters and computing the barcodes. The use of Z/2Z coefficients (modulus = 2) is standard for persistent homology.


990-1305: RIPSEREXE sections not compiled for Fluent Bit (per previous discussion).

The code within #ifdef RIPSEREXE blocks is only used for the standalone Ripser executable and is not compiled when building Fluent Bit. As discussed in previous reviews, these sections are left as-is to maintain compatibility with upstream Ripser and simplify future updates.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
lib/ripser-1.2.1/ripser_internal.hpp (1)

89-89: Fix include guard comment mismatch.

The closing comment references RIPSER_INTERNAL_H but the opening guard at line 20 uses RIPSER_INTERNAL_HPP. Update for consistency.

🔎 Suggested fix
-#endif /* RIPSER_INTERNAL_H */
+#endif /* RIPSER_INTERNAL_HPP */
🧹 Nitpick comments (1)
plugins/processor_tda/tda.c (1)

926-940: Unused threshold computation.

The threshold variable computed at line 936 is never used. The subsequent multi-quantile scan (lines 945-993) computes a fresh thr for each quantile candidate, making this computation dead code.

Consider removing these lines or using threshold as a fallback/default if the multi-quantile scan produces no valid results.

🔎 Option 1: Remove unused code
-    if (m == 1) {
-        q = 0.5;      /* No delay embedding: use something like the median. */
-    }
-    else {
-        q = 0.2;      /* With delay embedding: look at a smaller scale. */
-    }
-
-    /* --- choose a scale for TDA ---
-     * Use the number of embedded points n_embed to determine the threshold.
-     */
-    threshold = tda_choose_threshold_from_dist(ctx, dist, n_embed, q);
-    if (threshold <= 0.0f) {
-        threshold = 0.0f;
-    }
-
     memset(&betti, 0, sizeof(betti));
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 657ca61 and 86b83e5.

📒 Files selected for processing (17)
  • cmake/plugins_options.cmake
  • dockerfiles/Dockerfile.centos7
  • include/CMakeLists.txt
  • include/fluent-bit/ripser/flb_ripser_wrapper.h
  • lib/ripser-1.2.1/CMakeLists.txt
  • lib/ripser-1.2.1/ripser.cpp
  • lib/ripser-1.2.1/ripser_internal.hpp
  • packaging/distros/centos/Dockerfile
  • plugins/CMakeLists.txt
  • plugins/processor_tda/CMakeLists.txt
  • plugins/processor_tda/tda.c
  • plugins/processor_tda/tda.h
  • src/CMakeLists.txt
  • src/ripser/CMakeLists.txt
  • src/ripser/flb_ripser_wrapper.cpp
  • tests/internal/CMakeLists.txt
  • tests/internal/ripser.c
🚧 Files skipped from review as they are similar to previous changes (7)
  • packaging/distros/centos/Dockerfile
  • plugins/processor_tda/tda.h
  • plugins/processor_tda/CMakeLists.txt
  • plugins/CMakeLists.txt
  • src/ripser/CMakeLists.txt
  • dockerfiles/Dockerfile.centos7
  • tests/internal/ripser.c
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11250
File: src/ripser/flb_ripser_wrapper.cpp:73-106
Timestamp: 2025-12-08T05:21:45.014Z
Learning: In the TDA processor (processor_tda) for Fluent Bit, the dimension limit FLB_RIPSER_MAX_BETTI_DIM is intentionally capped at 3 because the plugin uses embed_dim=3 and delay=1 in practice, and computing higher dimensions (>4) would be computationally prohibitive. The 8-slot betti array allocation is conservative headroom.
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11250
File: lib/ripser-1.2.1/Makefile:1-18
Timestamp: 2025-12-22T05:38:56.340Z
Learning: In fluent/fluent-bit, the Makefile at lib/ripser-1.2.1/Makefile is imported from upstream Ripser and is not used in the actual build process. The project uses CMake for building (lib/ripser-1.2.1/CMakeLists.txt), so changes to the imported Makefile are not necessary.
📚 Learning: 2025-12-22T05:38:56.340Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11250
File: lib/ripser-1.2.1/Makefile:1-18
Timestamp: 2025-12-22T05:38:56.340Z
Learning: In fluent/fluent-bit, the Makefile at lib/ripser-1.2.1/Makefile is imported from upstream Ripser and is not used in the actual build process. The project uses CMake for building (lib/ripser-1.2.1/CMakeLists.txt), so changes to the imported Makefile are not necessary.

Applied to files:

  • lib/ripser-1.2.1/CMakeLists.txt
  • lib/ripser-1.2.1/ripser.cpp
  • tests/internal/CMakeLists.txt
  • include/CMakeLists.txt
  • include/fluent-bit/ripser/flb_ripser_wrapper.h
  • src/CMakeLists.txt
📚 Learning: 2025-12-08T05:21:45.014Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11250
File: src/ripser/flb_ripser_wrapper.cpp:73-106
Timestamp: 2025-12-08T05:21:45.014Z
Learning: In the TDA processor (processor_tda) for Fluent Bit, the dimension limit FLB_RIPSER_MAX_BETTI_DIM is intentionally capped at 3 because the plugin uses embed_dim=3 and delay=1 in practice, and computing higher dimensions (>4) would be computationally prohibitive. The 8-slot betti array allocation is conservative headroom.

Applied to files:

  • cmake/plugins_options.cmake
  • lib/ripser-1.2.1/ripser.cpp
  • plugins/processor_tda/tda.c
  • include/fluent-bit/ripser/flb_ripser_wrapper.h
  • src/ripser/flb_ripser_wrapper.cpp
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • cmake/plugins_options.cmake
  • include/CMakeLists.txt
  • src/CMakeLists.txt
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

  • cmake/plugins_options.cmake
  • lib/ripser-1.2.1/ripser.cpp
  • include/CMakeLists.txt
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • cmake/plugins_options.cmake
  • include/CMakeLists.txt
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

  • cmake/plugins_options.cmake
  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:24:26.170Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.

Applied to files:

  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:25:02.561Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.

Applied to files:

  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-09-08T11:21:33.975Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.

Applied to files:

  • lib/ripser-1.2.1/ripser.cpp
📚 Learning: 2025-11-21T06:23:29.770Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

  • lib/ripser-1.2.1/ripser.cpp
  • include/CMakeLists.txt
  • include/fluent-bit/ripser/flb_ripser_wrapper.h
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • include/CMakeLists.txt
🧬 Code graph analysis (3)
lib/ripser-1.2.1/ripser.cpp (1)
lib/ripser-1.2.1/ripser_internal.hpp (4)
  • i (58-58)
  • rows (59-59)
  • dim (72-76)
  • dim (72-72)
include/fluent-bit/ripser/flb_ripser_wrapper.h (1)
src/ripser/flb_ripser_wrapper.cpp (4)
  • flb_ripser_compute_betti_from_dense_distance (119-171)
  • flb_ripser_compute_betti_from_dense_distance (119-124)
  • flb_ripser_compute_intervals_from_dense_distance (200-235)
  • flb_ripser_compute_intervals_from_dense_distance (200-206)
src/ripser/flb_ripser_wrapper.cpp (2)
lib/ripser-1.2.1/ripser.cpp (12)
  • i (236-238)
  • i (236-236)
  • i (241-243)
  • i (241-241)
  • i (269-275)
  • i (269-269)
  • i (288-294)
  • i (288-288)
  • i (409-411)
  • i (409-409)
  • ripser_run_from_compressed_lower (968-986)
  • ripser_run_from_compressed_lower (968-973)
lib/ripser-1.2.1/ripser_internal.hpp (4)
  • i (58-58)
  • dim (72-76)
  • dim (72-72)
  • ripser_run_from_compressed_lower (82-87)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (81)
  • GitHub Check: PR - container builds / Windows container images (2025)
  • GitHub Check: PR - container builds / Windows container images (2022)
  • GitHub Check: PR - packages build Linux / debian/bullseye package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/10 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/bullseye.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / raspbian/bookworm package build and stage to S3
  • GitHub Check: PR - packages build Linux / ubuntu/24.04.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / ubuntu/22.04.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / ubuntu/22.04 package build and stage to S3
  • GitHub Check: PR - packages build Linux / ubuntu/24.04 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/buster.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/10 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/buster package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/trixie package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/9.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/9.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/bookworm.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/trixie.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/10 package build and stage to S3
  • GitHub Check: PR - packages build Linux / debian/bookworm package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/9 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/10.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/10.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / almalinux/8.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/9 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/7.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/9 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/10.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/8.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / rockylinux/8.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/9.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2023.arm64v8 package build and stage to S3
  • GitHub Check: PR - packages build Linux / amazonlinux/2023 package build and stage to S3
  • GitHub Check: PR - packages build Linux / centos/7 package build and stage to S3
  • GitHub Check: PR - container builds / arm/v7/production container image build
  • GitHub Check: PR - container builds / amd64/production container image build
  • GitHub Check: PR - container builds / amd64/debug container image build
  • GitHub Check: PR - container builds / arm64/debug container image build
  • GitHub Check: PR - container builds / arm/v7/debug container image build
  • GitHub Check: PR - container builds / arm64/production container image build
  • GitHub Check: PR - packages build Windows / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: PR - packages build Windows / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: PR - packages build Windows / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: PR - packages build MacOS / call-build-macos-package (Intel macOS runner, macos-14-large, 3.31.6)
  • GitHub Check: PR - packages build MacOS / call-build-macos-package (Apple Silicon macOS runner, macos-14, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: PR - fuzzing test
🔇 Additional comments (30)
cmake/plugins_options.cmake (1)

77-77: LGTM!

The TDA processor option follows the established pattern for processor plugins, with a clear description and default enabled state consistent with other processors.

tests/internal/CMakeLists.txt (1)

169-174: LGTM!

The conditional test inclusion for Ripser follows the established pattern used by other optional features like LuaJIT, ensuring test coverage is properly gated by the feature flag.

include/CMakeLists.txt (1)

46-53: LGTM!

The Ripser header installation block follows the same pattern as other optional components (e.g., simdutf), with proper feature flag gating and standard permissions.

src/CMakeLists.txt (2)

366-369: LGTM!

The Ripser subdirectory inclusion follows the established pattern for optional components.


458-466: LGTM!

The Ripser dependency linking follows the established pattern, ensuring both the core Ripser library and the Fluent Bit wrapper are linked when the feature is enabled.

lib/ripser-1.2.1/CMakeLists.txt (1)

1-10: LGTM!

The CMake configuration for the Ripser static library is straightforward and correct, with appropriate include directory setup and C++11 requirement.

src/ripser/flb_ripser_wrapper.cpp (4)

39-54: LGTM!

The dense-to-compressed matrix conversion correctly extracts the lower triangular portion with proper indexing (i > j) and reserves the exact size needed.
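The conversion described above can be sketched as follows. This is a minimal illustration, not the wrapper's actual code: the function name `dense_to_compressed_lower` and the row-by-row output order are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the PR's code): copy the strict lower triangle
 * (i > j) of a row-major n x n matrix into a flat array of n*(n-1)/2
 * floats, ordered row by row: d(1,0), d(2,0), d(2,1), d(3,0), ...
 * Returns NULL on bad input or allocation failure; the caller frees it.
 */
float *dense_to_compressed_lower(const float *dense, size_t n, size_t *out_len)
{
    size_t i;
    size_t j;
    size_t k = 0;
    size_t len;
    float *out;

    if (dense == NULL || out_len == NULL || n < 2) {
        return NULL;
    }

    /* exact number of entries below the diagonal */
    len = n * (n - 1) / 2;
    out = malloc(len * sizeof(float));
    if (out == NULL) {
        return NULL;
    }

    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            out[k++] = dense[i * n + j];
        }
    }

    *out_len = len;
    return out;
}
```

Allocating exactly `n*(n-1)/2` entries up front mirrors the "reserves the exact size needed" remark; the diagonal and upper triangle are never read.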


73-106: LGTM!

The Betti interval callback correctly filters out invalid intervals (negative dimensions, non-finite values, death ≤ birth, and low persistence < 1e-3). The dimension cap at FLB_RIPSER_MAX_BETTI_DIM (3) is intentional per the design constraints.

Based on learnings, the dimension limit of 3 is intentional because the plugin uses embed_dim=3 and delay=1, and higher dimensions would be computationally prohibitive.
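The filtering rules listed in this comment could look roughly like the sketch below. The function name, the two macros, and the choice to reject `dim >= 3` are assumptions for illustration; the real callback in `flb_ripser_wrapper.cpp` may draw the boundary differently.

```c
#include <math.h>

#define MAX_BETTI_DIM   3      /* assumed to mirror FLB_RIPSER_MAX_BETTI_DIM */
#define MIN_PERSISTENCE 1e-3   /* noise floor quoted in the review comment */

/*
 * Hypothetical filter: returns 1 when an interval should be counted
 * toward betti[dim], 0 when it should be skipped.
 */
int interval_is_countable(int dim, double birth, double death)
{
    /* negative or out-of-range dimension */
    if (dim < 0 || dim >= MAX_BETTI_DIM) {
        return 0;
    }

    /* NaN or infinite endpoints */
    if (!isfinite(birth) || !isfinite(death)) {
        return 0;
    }

    /* empty or inverted interval */
    if (death <= birth) {
        return 0;
    }

    /* short-lived features are treated as noise */
    if ((death - birth) < MIN_PERSISTENCE) {
        return 0;
    }

    return 1;
}
```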


119-171: LGTM!

The public API correctly validates inputs, caps max_dim to 8, converts the dense matrix, runs Ripser with appropriate threshold handling (enclosing radius mode when threshold ≤ 0), and properly fills the output structure.
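For readers unfamiliar with the enclosing radius mentioned here: upstream Ripser defines it as the smallest, over all points, of each point's largest distance to any other point. A minimal sketch on a dense row-major matrix follows; the function name is hypothetical and this is not the wrapper's code.

```c
#include <stddef.h>

/*
 * Hypothetical helper: enclosing radius of a row-major n x n distance
 * matrix, i.e. min over i of (max over j of d(i, j)).
 * Returns 0.0f for an empty matrix.
 */
float enclosing_radius(const float *dist, size_t n)
{
    size_t i;
    size_t j;
    float best = 0.0f;

    for (i = 0; i < n; i++) {
        float row_max = 0.0f;

        /* farthest point from point i */
        for (j = 0; j < n; j++) {
            if (dist[i * n + j] > row_max) {
                row_max = dist[i * n + j];
            }
        }

        /* keep the smallest such radius */
        if (i == 0 || row_max < best) {
            best = row_max;
        }
    }

    return best;
}
```

Beyond this radius the Vietoris-Rips complex is a cone, so no further homology changes occur, which is why it serves as a safe default threshold.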


200-234: LGTM!

The interval computation API correctly validates inputs, sets up the callback bridge, and runs Ripser with the same threshold semantics as the Betti computation API.

lib/ripser-1.2.1/ripser_internal.hpp (2)

30-64: LGTM!

The type definitions, compressed distance matrix template, and layout enum are well-structured. The quadratic formula in the constructor (line 43) correctly computes the matrix size from the compressed vector length.
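The quadratic formula referred to here follows from the size of a strict triangle. Writing $L$ for the length of the compressed vector and $n$ for the number of points (symbols chosen for this note, not taken from the source):

```latex
L = \frac{n(n-1)}{2}
\;\Longrightarrow\;
n^2 - n - 2L = 0
\;\Longrightarrow\;
n = \frac{1 + \sqrt{1 + 8L}}{2}
```

For example, $L = 3$ gives $n = 3$ and $L = 6$ gives $n = 4$.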


66-77: LGTM!

The interval recorder struct provides a clean callback interface with default initialization and safe null-checking in the emit method.

include/fluent-bit/ripser/flb_ripser_wrapper.h (4)

29-29: LGTM!

The dimension limit of 3 is intentional and appropriate for the TDA processor's use case with embed_dim=3 and delay=1.

Based on learnings, this cap prevents computationally prohibitive calculations for higher dimensions.


33-46: LGTM!

The data structures are well-designed with clear documentation. The 8-slot betti array provides conservative headroom while the practical limit remains at dimension 3.


62-67: LGTM!

The function signature is well-documented with clear parameter descriptions and return value semantics. The threshold behavior (≤ 0 uses enclosing radius) is properly documented.


87-93: LGTM!

The interval computation API provides a flexible callback-based interface for users who need access to individual persistence intervals rather than just the Betti number summary.

plugins/processor_tda/tda.c (9)

39-140: LGTM!

The comparison function and threshold selection logic are correctly implemented with proper null checks, memory allocation error handling, and boundary conditions for quantile calculation.
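A quantile-based threshold of the kind described could be sketched as below. This is an illustration under assumptions: the names `cmp_float` and `choose_threshold_quantile` are hypothetical, and the PR's `tda_choose_threshold_from_dist` takes a context argument and may index the quantile differently.

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for floats, written without subtraction */
static int cmp_float(const void *a, const void *b)
{
    float fa = *(const float *) a;
    float fb = *(const float *) b;

    return (fa > fb) - (fa < fb);
}

/*
 * Hypothetical sketch: return the q-quantile (q clamped to [0, 1]) of the
 * pairwise distances in a row-major n x n matrix, or 0.0f on error.
 */
float choose_threshold_quantile(const float *dist, size_t n, double q)
{
    size_t i;
    size_t j;
    size_t k = 0;
    size_t len;
    size_t idx;
    float *vals;
    float thr;

    if (dist == NULL || n < 2) {
        return 0.0f;
    }
    if (q < 0.0) {
        q = 0.0;
    }
    if (q > 1.0) {
        q = 1.0;
    }

    /* collect each unordered pair once (strict lower triangle) */
    len = n * (n - 1) / 2;
    vals = malloc(len * sizeof(float));
    if (vals == NULL) {
        return 0.0f;
    }
    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            vals[k++] = dist[i * n + j];
        }
    }

    qsort(vals, len, sizeof(float), cmp_float);

    /* lower-index quantile; q = 1.0 maps to the largest distance */
    idx = (size_t) (q * (double) (len - 1));
    thr = vals[idx];

    free(vals);
    return thr;
}
```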


142-173: LGTM!

The window creation function properly handles allocation failures and cleans up resources on error paths.


179-295: LGTM!

The group registration helpers properly handle memory allocation failures and roll back partial allocations when hash table insertion fails.


350-462: LGTM!

The group building logic correctly handles error paths, including the fix for the potential use-after-free when last_vec allocation fails.


476-575: LGTM!

The vector construction properly handles the first sample case, computes rates with time delta safeguards, and applies log1p normalization while preserving sign.


577-638: LGTM!

The ingest function correctly handles ring buffer overflow by dropping oldest samples, and properly frees all temporary allocations.
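The drop-oldest behaviour on overflow can be shown with a toy ring of doubles. This sketch deliberately does not use lwrb, which the plugin relies on; the struct, the capacity, and both function names are hypothetical.

```c
#include <stddef.h>

#define WINDOW_CAP 4   /* illustrative capacity, not the plugin's default */

struct sample_ring {
    double data[WINDOW_CAP];
    size_t head;    /* index of the oldest sample */
    size_t count;   /* number of stored samples */
};

/* Append a sample; when the ring is full, discard the oldest one first. */
void ring_push(struct sample_ring *r, double v)
{
    size_t tail;

    if (r->count == WINDOW_CAP) {
        r->head = (r->head + 1) % WINDOW_CAP;
        r->count--;
    }

    tail = (r->head + r->count) % WINDOW_CAP;
    r->data[tail] = v;
    r->count++;
}

/* Return the i-th oldest sample; the caller ensures i < r->count. */
double ring_at(const struct sample_ring *r, size_t i)
{
    return r->data[(r->head + i) % WINDOW_CAP];
}
```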


1038-1136: LGTM!

The processor lifecycle functions properly initialize, clean up, and handle all allocated resources with appropriate null checks.


1138-1188: LGTM!

The process metrics function correctly initializes groups and window on first call, and the gauge pointer reset is intentional since each metrics_context manages its own gauge objects through the cmetrics lifecycle.


1191-1233: LGTM!

The configuration map and plugin definition are properly structured with sensible defaults and correctly wired callbacks.

lib/ripser-1.2.1/ripser.cpp (5)

1-78: LGTM!

License headers properly attribute both the original MIT-licensed Ripser code and the Fluent Bit modifications.


219-297: LGTM!

The distance matrix implementations correctly handle triangular matrix access patterns and diagonal elements.


372-817: LGTM!

The core Ripser persistence algorithm implementation is correctly integrated with the interval_recorder callback mechanism for emitting persistence intervals.


947-986: LGTM!

The edge extraction specializations and the ripser_run_from_compressed_lower entry point correctly integrate Ripser with the Fluent Bit wrapper, using Z/2Z coefficients for the homology computation.


988-1305: Standalone executable code disabled for Fluent Bit build.

The #ifdef RIPSEREXE section contains the CLI frontend and is not compiled when building for Fluent Bit. Per previous discussion, this vendored code is preserved as-is to simplify future upstream updates.

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
This processor plugin performs Topological Data Analysis (TDA) on metrics
using ripser, which computes persistent homology.

The plugin aggregates incoming counters, gauges and untyped metrics into a
1-D time series, keeps a sliding window, builds a dense distance matrix and
runs ripser through the new flb_ripser_* wrapper helpers. The resulting
Betti numbers (currently betti0 and betti1) are exported as additional gauge metrics.

TDA and persistent homology can help reveal hidden order or phase transitions
in complex systems that are not easily visible from raw time series. Similar
approaches have already been explored in condensed matter physics, for example:

  Donato, I., Gori, M., & Sarti, A. (2016).
  Persistent homology analysis of phase transitions.
  Physical Review E, 93, 052138. https://doi.org/10.1103/PhysRevE.93.052138

The TDA metrics processor now supports an optional delay embedding of the
aggregated metric vectors before building the dense distance matrix
used by Ripser.

When `embed_dim > 1`, we reconstruct a Takens-style delay embedding

    x_t -> (x_t, x_{t-τ}, ..., x_{t-(m-1)τ})

over the sliding window, where `m = embed_dim` and `τ = embed_delay`.
Each embedded point is a flattened vector of size

    feature_dim × m

and we keep using a Euclidean distance on this reconstructed phase
space.

This makes the processor more sensitive to occasional cyclic / quasi-
periodic regimes in the metric time series: loops in the reconstructed
trajectory translate into H1 features in the persistent homology.  When
`embed_dim = 1`, the behaviour is unchanged and we fall back to the
original "no embedding" mode.
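
As an illustration of the embedding above, the following sketch builds the flattened delay vectors from a row-major window. It is written for this note under assumptions: the function name `delay_embed`, its parameters, and the use of `double` are hypothetical and not taken from tda.c.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical sketch of a Takens-style delay embedding.
 *
 * window : row-major n x d samples, oldest first (d = feature_dim)
 * m, tau : embedding dimension and lag in samples
 *
 * Returns a row-major n_embed x (d * m) array with
 * n_embed = n - (m - 1) * tau. Row k corresponds to t = k + (m - 1) * tau
 * and holds (x_t, x_{t-tau}, ..., x_{t-(m-1)tau}).
 * Returns NULL on bad input, a too-short window, or allocation failure.
 */
double *delay_embed(const double *window, size_t n, size_t d,
                    size_t m, size_t tau, size_t *n_embed)
{
    size_t span;
    size_t rows;
    size_t k;
    size_t lag;
    size_t f;
    double *out;

    if (window == NULL || n_embed == NULL || d == 0 || m == 0 || tau == 0) {
        return NULL;
    }

    /* how far back the oldest delayed coordinate reaches */
    span = (m - 1) * tau;
    if (n <= span) {
        return NULL;
    }

    rows = n - span;
    out = malloc(rows * d * m * sizeof(double));
    if (out == NULL) {
        return NULL;
    }

    for (k = 0; k < rows; k++) {
        size_t t = k + span;

        for (lag = 0; lag < m; lag++) {
            const double *src = window + (t - lag * tau) * d;

            for (f = 0; f < d; f++) {
                out[(k * m + lag) * d + f] = src[f];
            }
        }
    }

    *n_embed = rows;
    return out;
}
```

With `m = 1` the span is zero and the output is a plain copy of the window, which matches the "no embedding" fallback.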

This change also adds two configuration options:

  - `embed_dim`   (int, default: 3)
      Delay embedding dimension m.
      Set to 1 to disable delay embedding.

  - `embed_delay` (int, default: 1)
      Lag τ in samples between successive delays.

The design follows the standard delay embedding approach from Takens'
theorem, which shows that (under mild conditions) the attractor of an
unknown dynamical system can be reconstructed from a single observed
time series via delay coordinates.

Reference
  - F. Takens, "Detecting strange attractors in turbulence",
    in D. Rand and L.-S. Young (eds.), Dynamical Systems and Turbulence,
    Lecture Notes in Mathematics, vol. 898, Springer, 1981, pp. 366-381.

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
…tions

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Labels

docs-required ok-package-test Run PR packaging tests

3 participants