fix: update acstor node-agent pod selector for label changes #1369
base: main
Update the pod selector regex in `acstorMetricsExporterDefaultFile.yml` to support both old and new node-agent pod labels:

- Old: `app.kubernetes.io/name=storage-operator`, `app.kubernetes.io/component=node-agent`
- New: `app.kubernetes.io/name=node-agent`, `app.kubernetes.io/component=node-agent`

This ensures backward compatibility with existing releases while supporting the updated label schema in newer releases.
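For readers unfamiliar with how such a selector is usually written, here is a minimal sketch of a Prometheus `keep` relabel rule that accepts both label combinations. It is an illustration only, not the actual contents of `acstorMetricsExporterDefaultFile.yml`: the job name is hypothetical, and matching on the two pod labels joined by `;` is an assumption about how the selector is structured.

```yaml
# Hypothetical sketch; the real scrape job in the repo may be structured differently.
scrape_configs:
  - job_name: acstor-metrics-exporter   # assumed job name, for illustration only
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Prometheus exposes pod labels as __meta_kubernetes_pod_label_<name>,
      # with characters such as '.' and '/' replaced by '_'.
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_name
          - __meta_kubernetes_pod_label_app_kubernetes_io_component
        separator: ";"
        # Relabel regexes are fully anchored, so this matches exactly
        # "storage-operator;node-agent" (old) or "node-agent;node-agent" (new).
        regex: (storage-operator|node-agent);node-agent
        action: keep
```

Listing the accepted `name` values explicitly, rather than using a wildcard, keeps the rule from selecting unrelated pods that happen to carry the `node-agent` component label.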
/azp run

Azure Pipelines successfully started running 1 pipeline(s).

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 5 days.

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

085a138 to 8d118bc

rebased main on top of this branch.

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 5 days.

This PR was closed because it has been stalled for 12 days with no activity.

/azp run

Azure Pipelines successfully started running 1 pipeline(s).
This PR upgrades the otelcollector to the latest version available for the opentelemetry-collector and opentelemetry-operator. It was automatically generated by the GitHub Actions workflow. The summary of the OSS changelog is below:

# Prometheusreceiver Changes

## v0.136.0 to v0.142.0

Generated on: 2026-01-11 07:06:49

---

### v0.142.0

- [**BREAKING**] `receiver/prometheus`: Promote the receiver.prometheusreceiver.RemoveStartTimeAdjustment feature gate to stable and remove in-receiver metric start time adjustment in favor of the metricstarttime processor, including disabling the created-metric feature gate. ([#44180](open-telemetry/opentelemetry-collector-contrib#44180))

  Previously, users could disable the RemoveStartTimeAdjustment feature gate to temporarily keep the legacy start time adjustment behavior in the Prometheus receiver. With this promotion to stable and bounded registration, that gate can no longer be disabled; the receiver will no longer set StartTime on metrics based on process_start_time_seconds, and users should migrate to the metricstarttime processor for equivalent functionality.

  This change also disables the receiver.prometheusreceiver.UseCreatedMetric feature gate, which previously used the `<metric>_created` series to derive start timestamps for counters, summaries, and histograms when scraping non-OpenMetrics protocols. However, this does not mean that the `_created` series is always ignored: when using the OpenMetrics 1.0 protocol, Prometheus itself continues to interpret the `_created` series as the start timestamp, so only the receiver-side handling for other scrape protocols has been removed.

- [**BREAKING**] `receiver/prometheus`: Native histogram scraping and ingestion is now controlled by the scrape configuration option `scrape_native_histograms`. ([#44861](open-telemetry/opentelemetry-collector-contrib#44861))

  The feature gate `receiver.prometheusreceiver.EnableNativeHistograms` is now stable and enabled by default. Native histograms scraped from Prometheus will automatically be converted to OpenTelemetry exponential histograms. To enable scraping of native histograms, you must configure `scrape_native_histograms: true` in your Prometheus scrape configuration (either globally or per-job). Additionally, the protobuf scrape protocol must be enabled by setting `scrape_protocols` to include `PrometheusProto`.

- [**BREAKING**] `receiver/prometheusremotewrite`: Updated to Remote Write 2.0 spec rc.4, requiring Prometheus 3.8.0 or later ([#44861](open-telemetry/opentelemetry-collector-contrib#44861))

  The upstream Prometheus library updated the Remote Write 2.0 protocol from rc.3 to rc.4 in prometheus/prometheus[#17411](open-telemetry/opentelemetry-collector-contrib#17411). This renamed `CreatedTimestamp` to `StartTimestamp` and moved it from the `TimeSeries` message to individual `Sample` and `Histogram` messages. This is a wire-protocol incompatibility, so Prometheus versions 3.7.x and earlier will no longer work correctly with this receiver. Please upgrade to Prometheus 3.8.0 or later.

- [**OTHER**] `receiver/prometheus`: Deprecate `use_start_time_metric` and `start_time_metric_regex` config in favor of the processor `metricstarttime` ([#44180](open-telemetry/opentelemetry-collector-contrib#44180))

- [**FEATURE**] `receiver/prometheusremotewrite`: Map.PutStr causes excessive memory allocations due to repeated slice expansions ([#44612](open-telemetry/opentelemetry-collector-contrib#44612))

- [**BUG FIX**] `receiver/prometheus`: Fix HTTP response body leak in target allocator when fetching scrape configs fails ([#44921](open-telemetry/opentelemetry-collector-contrib#44921))

  The getScrapeConfigsResponse function did not close resp.Body on error paths. If io.ReadAll or yaml.Unmarshal failed, the response body would leak, potentially causing HTTP connection exhaustion.

- [**BUG FIX**] `receiver/prometheus`: Fixes yaml marshaling of prometheus/common/config.Secret types ([#44445](open-telemetry/opentelemetry-collector-contrib#44445))

### v0.141.0

- [**FEATURE**] `receiver/prometheus`: Add feature gate for extra scrape metrics in Prometheus receiver ([#44181](open-telemetry/opentelemetry-collector-contrib#44181))

  deprecation of extra scrape metrics in Prometheus receiver will be removed eventually.

- [**FEATURE**] `receiver/prometheus`: Support JWT Profile for Authorization Grant (RFC 7523 3.1) ([#44381](open-telemetry/opentelemetry-collector-contrib#44381))

### v0.140.0

- [**BREAKING**] `receiver/prometheus`: The prometheus receiver no longer adjusts the start time of metrics by default. ([#43656](open-telemetry/opentelemetry-collector-contrib#43656))

  Disable the receiver.prometheusreceiver.RemoveStartTimeAdjustment feature gate to temporarily re-enable this functionality. Users that need this functionality should migrate to the metricstarttime processor, and use the true_reset strategy for equivalent behavior.

- [**FEATURE**] `receiver/prometheusremotewrite`: Skip emitting empty metrics. ([#44149](open-telemetry/opentelemetry-collector-contrib#44149))

- [**FEATURE**] `receiver/prometheusremotewrite`: prometheusremotewrite receiver now accepts metric type unspecified histograms. ([#41840](open-telemetry/opentelemetry-collector-contrib#41840))

### v0.139.0

- [**BUG FIX**] `receiver/prometheus`: Fix missing staleness tracking leading to missing no recorded value data points. ([#43893](open-telemetry/opentelemetry-collector-contrib#43893))

- [**BUG FIX**] `receiver/prometheusremotewrite`: Fixed a concurrency bug in the Prometheus remote write receiver where concurrent requests with identical job/instance labels would return empty responses after the first successful request. ([#42159](open-telemetry/opentelemetry-collector-contrib#42159))

### v0.138.0

- [**FEATURE**] `receiver/prometheus`: added NHCB (native histogram with custom buckets) to explicit histogram conversion ([#41131](open-telemetry/opentelemetry-collector-contrib#41131))

## Summary

| Category | Count |
|----------|-------|
| Breaking Changes | 4 |
| Features | 6 |
| Bug Fixes | 4 |
| Other Changes | 1 |
| **Total** | **15** |

# Target-allocator Changes

## v0.136.0 to v0.142.0

Generated on: 2026-01-11 07:07:05

---

### 0.142.0

- [**FEATURE**] `target allocator`: Add support for prometheus scrape classes ([#3600](open-telemetry/opentelemetry-operator#3600))

  Added support for configuring `scrapeClasses` when using the PrometheusCR-feature of the target allocator. The format of the `scrapeClasses` array is exactly the same as `spec.scrapeClasses` of the `Prometheus` CRD.

- [**BUG FIX**] `target allocator`: Fix CA certificate race condition with client cert renewals by extending its duration and renewal attempt. ([#4441](open-telemetry/opentelemetry-operator#4441))

  The CA certificate now has a 2-year duration (instead of the default 90 days) to prevent race conditions where client and server certificates could be signed by different CA versions during simultaneous renewal. This ensures the CA remains stable while dependent certificates renew regularly.

### 0.141.0

- [**FEATURE**] `target allocator`: make evaluation_interval configurable for Prometheus CR watcher ([#4520](open-telemetry/opentelemetry-operator#4520))

### 0.140.0

- [**BUG FIX**] `github action`: Remove unused VERSION and VERSION_DATE environment variables from publish workflows ([#4470](open-telemetry/opentelemetry-operator#4470))

  Removed the unused "Read version" step that set VERSION and VERSION_DATE environment variables in both publish-target-allocator.yaml and publish-operator-opamp-bridge.yaml workflows. These variables were never referenced anywhere in the workflows.

### 0.138.0

- [**BREAKING**] `target allocator`: Remove the operator.collector.targetallocatorcr feature flag ([#2422](open-telemetry/opentelemetry-operator#2422))

  This behavior has been enabled by default since version 0.127.0.

- [**BUG FIX**] `target allocator`: Add missing TA ownership watches to cert-manager Certificate and Issuer ([#4368](open-telemetry/opentelemetry-operator#4368))

### 0.137.0

- [**BREAKING**] `target allocator`: Promote the operator.collector.targetallocatorcr feature flag to Stable ([#2422](open-telemetry/opentelemetry-operator#2422))

  The flag can no longer be disabled. It will be completely removed in 0.138.0.

- [**BUG FIX**] `target allocator, opamp`: Fix version not being updated after version upgrade. ([#4378](open-telemetry/opentelemetry-operator#4378))

- [**BUG FIX**] `target-allocator`: Fixed potential duplicate scrape targets caused by Prometheus relabeling. ([#3617](open-telemetry/opentelemetry-operator#3617))

## Summary

| Category | Count |
|----------|-------|
| Breaking Changes | 2 |
| Features | 2 |
| Bug Fixes | 5 |
| Other Changes | 0 |
| **Total** | **9** |

---------

[comment]: # (Note that your PR title should follow the conventional commit format: https://conventionalcommits.org/en/v1.0.0/#summary)

# PR Description

[comment]: # (The below checklist is for PRs adding new features. If a box is not checked, add a reason why it's not needed.)

# New Feature Checklist

- [ ] List telemetry added about the feature.
- [ ] Link to the one-pager about the feature.
- [ ] List any tasks necessary for release (3P docs, AKS RP chart changes, etc.) after merging the PR.
- [ ] Attach results of scale and perf testing.

[comment]: # (The below checklist is for code changes. Not all boxes necessarily need to be checked. Build, doc, and template changes do not need to fill out the checklist.)

# Tests Checklist

- [ ] Have end-to-end Ginkgo tests been run on your cluster and passed? To bootstrap your cluster to run the tests, follow [these instructions](/otelcollector/test/README.md#bootstrap-a-dev-cluster-to-run-ginkgo-tests).
  - Labels used when running the tests on your cluster:
    - [ ] `operator`
    - [ ] `windows`
    - [ ] `arm64`
    - [ ] `arc-extension`
    - [ ] `fips`
- [ ] Have new tests been added? For features, have tests been added for this feature? For fixes, is there a test that could have caught this issue and could validate that the fix works?
  - [ ] Is a new scrape job needed?
    - [ ] The scrape job was added to the folder [test-cluster-yamls](/otelcollector/test/test-cluster-yamls/) in the correct configmap or as a CR.
  - [ ] Was a new test label added?
    - [ ] A string constant for the label was added to [constants.go](/otelcollector/test/utils/constants.go).
    - [ ] The label and description was added to the [test README](/otelcollector/test/README.md).
    - [ ] The label was added to this [PR checklist](/.github/pull_request_template).
    - [ ] The label was added as needed to [testkube-test-crs.yaml](/otelcollector/test/testkube/testkube-test-crs.yaml).
  - [ ] Are additional API server permissions needed for the new tests?
    - [ ] These permissions have been added to [api-server-permissions.yaml](/otelcollector/test/testkube/api-server-permissions.yaml).
  - [ ] Was a new test suite (a new folder under `/tests`) added?
    - [ ] The new test suite is included in [testkube-test-crs.yaml](/otelcollector/test/testkube/testkube-test-crs.yaml).

Co-authored-by: azure-monitor-assistant[bot] <217255729+azure-monitor-assistant[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Rashmi Chandrashekar <rashmy@microsoft.com>
/azp run

Azure Pipelines successfully started running 1 pipeline(s).

This PR is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 5 days.