feat: Split +Inf overflow alerts by label-set by vikin91 · Pull Request #5 · stackrox/sensor-metrics-analyzer

vikin91 · 2026-03-13T11:20:09Z

+Inf overflow alerts were hard to act on because they only reported the base histogram metric, not the specific label combinations causing the overflow. In practice, one problematic series was hidden among many label variants, so operators could not quickly identify the offending path.

For example, the issue in:

rox_central_k8s_event_processing_duration_count{Action="SYNC_RESOURCE",Dispatcher="unstructured.Unstructured",Resource="ComplianceOperatorRule"}

was being reported as an issue with

rox_central_k8s_event_processing_duration_count

However, the metric file contained over 1k lines for the metric rox_central_k8s_event_processing_duration_count, so the particular label-set was difficult to find.

We fixed this by evaluating histogram series per label-set and changing reporting behavior:

- RED/YELLOW: emit one alert per problematic label-set, with labels included in the title:
  - <metric>{...labels...} (+Inf overflow check)
- GREEN: keep output concise by emitting a single metric-level summary when all label-sets are healthy.
- Add and render Prometheus HELP as Metric description before Message in markdown/console/TUI.
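
The reporting rules above can be illustrated with a small Python sketch. It is not the analyzer's actual code: the `Result` type, the `evaluate` function, the two thresholds, and the definition of overflow as "share of observations above the largest finite bucket" are all assumptions made for illustration. Only the title format and the green-collapsing rule come from this PR description.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    status: str  # "RED", "YELLOW" or "GREEN"
    title: str


# Hypothetical thresholds: share of observations above the largest finite bucket.
YELLOW_RATIO = 0.01
RED_RATIO = 0.10


def overflow_ratio(buckets):
    """buckets maps an upper bound (math.inf for +Inf) to a cumulative count."""
    total = buckets.get(math.inf, 0)
    finite = [count for le, count in buckets.items() if le != math.inf]
    if total == 0 or not finite:
        return 0.0
    # Counts are cumulative, so the largest finite count belongs to the
    # highest finite bucket; everything above it landed only in +Inf.
    return (total - max(finite)) / total


def evaluate(metric, samples):
    """samples: iterable of (labels, le, cumulative_count); labels excludes "le"."""
    series = defaultdict(dict)
    for labels, le, count in samples:
        key = tuple(sorted(labels.items()))  # one histogram series per label-set
        series[key][float(le)] = count

    problems = []
    for key, buckets in sorted(series.items()):
        ratio = overflow_ratio(buckets)
        if ratio >= RED_RATIO:
            status = "RED"
        elif ratio >= YELLOW_RATIO:
            status = "YELLOW"
        else:
            continue  # healthy label-sets produce no individual result
        rendered = ",".join(f'{k}="{v}"' for k, v in key)
        problems.append(Result(status, f"{metric}{{{rendered}}} (+Inf overflow check)"))

    if problems:
        return problems
    # All label-sets healthy: collapse into one metric-level GREEN summary.
    return [Result("GREEN", f"{metric} (+Inf overflow check)")]
```

In this sketch a metric with many healthy label-sets and one overflowing series yields exactly one RED result whose title names the offending labels, while a fully healthy metric yields a single GREEN line.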

This makes overflow alerts actionable (exact offending series is visible) without adding noise when metrics are healthy.

In addition, the analyzer now tries to guess the histogram unit from the metric name and the help text.
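
A minimal sketch of what such a guesser might look like, in Python. The keyword table and the `guess_unit` function are hypothetical and not taken from the analyzer; the sketch assumes the metric name is checked first (Prometheus naming conventions recommend a unit suffix such as `_seconds` or `_bytes`) and the help text is only a fallback.

```python
import re

# Hypothetical keyword table: unit name -> tokens that hint at it.
UNIT_HINTS = {
    "milliseconds": {"milliseconds", "millis", "ms"},
    "seconds": {"seconds", "secs"},
    "bytes": {"bytes"},
}


def guess_unit(name, help_text=""):
    """Return a guessed unit, or None when neither name nor help text gives a hint."""
    name_tokens = set(name.lower().split("_"))
    help_tokens = set(re.findall(r"[a-z]+", help_text.lower()))
    # The metric name takes precedence over the help text.
    for tokens in (name_tokens, help_tokens):
        for unit, keywords in UNIT_HINTS.items():
            if keywords & tokens:
                return unit
    return None
```

Matching on whole tokens rather than substrings avoids classifying a name containing `milliseconds` as `seconds`. Returning `None` lets the caller fall back to unit-neutral wording instead of assuming the values are durations.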

Emit one RED/YELLOW result per offending histogram series (labels in title), keep one GREEN summary per metric, and render HELP as Metric description before Message for clearer triage. AI-generated: evaluator/report/template/TUI/test updates. User-provided/corrected: output contract (title format, section order, green-collapsing rule).
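
The HELP handling can be sketched in Python as follows. This is an illustration under assumptions, not the analyzer's templates: `parse_help` and `render_markdown` are hypothetical names and the markdown layout is invented. Only the section order (Metric description before Message) comes from this PR description, and the `# HELP <metric> <text>` line shape comes from the Prometheus text exposition format.

```python
def parse_help(exposition_text):
    """Collect `# HELP <metric> <text>` lines into a {metric: text} dict."""
    help_by_metric = {}
    for line in exposition_text.splitlines():
        parts = line.split(" ", 3)  # "#", "HELP", metric name, description
        if len(parts) == 4 and parts[:2] == ["#", "HELP"]:
            help_by_metric[parts[2]] = parts[3]
    return help_by_metric


def render_markdown(title, status, message, description=None):
    """Render one result; the description, when known, precedes the message."""
    lines = [f"### {title}", f"Status: {status}"]
    if description:
        lines.append(f"Metric description: {description}")
    lines.append(f"Message: {message}")
    return "\n".join(lines)
```

The sketch ignores the escape sequences that the exposition format allows inside HELP text and simply omits the description when a metric has no HELP line.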

vikin91 added 7 commits March 13, 2026 12:14

X-Smart-Branch-Parent: main

9309801

Address AI-review issues

311fe1c

Improve the message wording to not assume that values are durations

a996a89

Add histogram unit guesser

9f71401

Add changelog and bump version to 005

6e8baba

Fix e2e tests. Ensure CI runs them

34d3ed9

vikin91 merged commit 438dc67 into main Mar 13, 2026
7 checks passed

vikin91 deleted the feat/inf-bucket-respects-labels branch March 13, 2026 13:31

vikin91 merged 7 commits into main from feat/inf-bucket-respects-labels
