Skip to content

OCPBUGS-83267: Use upgrades.Skippable for Gateway API upgrade test skip logic#31000

Open
gcs278 wants to merge 1 commit intoopenshift:mainfrom
gcs278:fix-gwapi-upgrade-ipv6-skip
Open

OCPBUGS-83267: Use upgrades.Skippable for Gateway API upgrade test skip logic#31000
gcs278 wants to merge 1 commit intoopenshift:mainfrom
gcs278:fix-gwapi-upgrade-ipv6-skip

Conversation

@gcs278
Copy link
Copy Markdown
Contributor

@gcs278 gcs278 commented Apr 12, 2026

Summary

  • The Gateway API upgrade test (NE-2561: Add Gateway API OLM to NO-OLM migration upgrade test #30897) called g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.
  • Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup(), avoiding the goroutine panic.
  • Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).

Test plan

  • Verify Gateway API upgrade test is skipped cleanly on IPv6/dual-stack clusters (no panic)
  • [ ] Verify Gateway API upgrade test is skipped cleanly on OKD clusters I don't see any OKD jobs
  • [ ] Verify Gateway API upgrade test is skipped cleanly on unsupported platforms Looks like we support all platforms in the current jobs I can search though
  • Verify Gateway API upgrade test still runs successfully on supported IPv4 clusters
  • Verify non-upgrade Gateway API tests (gatewayapicontroller.go BeforeEach) still skip correctly via g.Skip()

🤖 Generated with Claude Code

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 12, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 12, 2026

@gcs278: This pull request references NE-2292 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

  • The Gateway API upgrade test called g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.
  • Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup(), avoiding the goroutine panic.
  • Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).

Test plan

  • Verify Gateway API upgrade test is skipped cleanly on IPv6/dual-stack clusters (no panic)
  • Verify Gateway API upgrade test is skipped cleanly on OKD clusters
  • Verify Gateway API upgrade test is skipped cleanly on unsupported platforms
  • Verify Gateway API upgrade test still runs successfully on supported IPv4 clusters
  • Verify non-upgrade Gateway API tests (gatewayapicontroller.go BeforeEach) still skip correctly via g.Skip()

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Added a pre-Setup skip hook on Gateway API upgrade tests that performs platform/OLM capability and eligibility checks before Setup. Skip collects precheck errors (stored on the test), performs Ginkgo skip when appropriate, and Setup asserts no precheck error and uses a simplified capability resolver.

Changes

Cohort / File(s) Summary
Gateway API Upgrade test
test/extended/router/gatewayapi_upgrade.go
Added func (t *GatewayAPIUpgradeTest) Skip(_ upgrades.UpgradeContext) bool and precheckErr error. Skip calls shouldSkipGatewayAPITests, stores any error into t.precheckErr (and returns false), issues a Ginkgo skip when skip==true with the provided reason, and when NO-OLM is not enabled performs OLM/Marketplace capability checks (errors recorded into t.precheckErr, non-enabled capability causes skip). Setup now asserts t.precheckErr==nil, uses getPlatformCapabilities, and no longer performs capability-based skipping at Setup time.
Gateway API controller / platform capabilities
test/extended/router/gatewayapicontroller.go
Replaced checkPlatformSupportAndGetCapabilities with shouldSkipGatewayAPITests(oc *exutil.CLI) (bool, string, error) and getPlatformCapabilities(oc *exutil.CLI) (loadBalancerSupported bool, managedDNS bool). shouldSkipGatewayAPITests determines OKD, reads Infrastructure.PlatformStatus, and returns skip decisions/reasons (including unsupported platform types and IPv6/dual-stack) or errors (no Ginkgo/Gomega calls). getPlatformCapabilities only maps PlatformStatus.Type to loadBalancerSupported and computes managedDNS via isDNSManaged(oc), logging results; it no longer makes skip decisions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from Thealisyed and rfredette April 12, 2026 20:57
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 12, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gcs278

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 12, 2026
@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch 2 times, most recently from 73d03cd to 853bb58 Compare April 12, 2026 21:00
@gcs278 gcs278 changed the title NE-2292: Use upgrades.Skippable for Gateway API upgrade test skip logic OCPBUGS-82586: Use upgrades.Skippable for Gateway API upgrade test skip logic Apr 12, 2026
@openshift-ci-robot openshift-ci-robot added the jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. label Apr 12, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@gcs278: This pull request references Jira Issue OCPBUGS-82586, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

  • The Gateway API upgrade test called g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.
  • Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup(), avoiding the goroutine panic.
  • Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).

Test plan

  • Verify Gateway API upgrade test is skipped cleanly on IPv6/dual-stack clusters (no panic)
  • Verify Gateway API upgrade test is skipped cleanly on OKD clusters
  • Verify Gateway API upgrade test is skipped cleanly on unsupported platforms
  • Verify Gateway API upgrade test still runs successfully on supported IPv4 clusters
  • Verify non-upgrade Gateway API tests (gatewayapicontroller.go BeforeEach) still skip correctly via g.Skip()

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Apr 12, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/extended/router/gatewayapi_upgrade.go`:
- Around line 53-60: Replace the helper that panics inside Skip by adding a new
function gatewayAPIWithoutOLMEnabled(oc *exutil.CLI) (bool, error) (in
gatewayapicontroller.go) which performs the same FeatureGates check as
isNoOLMFeatureGateEnabled but returns (bool, error) instead of using o.Expect;
then change the Skip() call site to call gatewayAPIWithoutOLMEnabled(oc), check
the returned error and treat any error as a condition to skip the test
(log/return true) rather than allowing a panic, and ensure no o.Expect-based
helpers are invoked from within Skip().

In `@test/extended/router/gatewayapicontroller.go`:
- Around line 554-587: shouldSkipGatewayAPITests must not invoke
isIPv6OrDualStack because that helper uses o.Expect and can panic; change the
code so it calls a non-panicking variant that returns (bool, error) (either
modify isIPv6OrDualStack to return (bool, error) or add isIPv6OrDualStackSafe),
check the error and on error return (true, fmt.Sprintf("Failed to determine IP
stack: %v", err)), and only skip for IPv6/dual-stack when the boolean is true;
reference the functions shouldSkipGatewayAPITests and isIPv6OrDualStack (or the
new isIPv6OrDualStackSafe) and ensure no o.Expect is used in the path called
from GatewayAPIUpgradeTest.Skip.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c5b789a9-7819-41a8-b936-721a3717190d

📥 Commits

Reviewing files that changed from the base of the PR and between 1a93dad and 090b384.

📒 Files selected for processing (2)
  • test/extended/router/gatewayapi_upgrade.go
  • test/extended/router/gatewayapicontroller.go

@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch from 853bb58 to d765284 Compare April 12, 2026 21:04
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
test/extended/router/gatewayapi_upgrade.go (1)

54-58: ⚠️ Potential issue | 🔴 Critical

Avoid o.Expect-based helper calls inside Skip()

Line 54 calls isNoOLMFeatureGateEnabled(oc), which uses o.Expect internally and can panic. In this pre-Setup() skip path, that can still produce unrecoverable panics and defeats the upgrades.Skippable refactor.

Proposed fix
-	// Skip on clusters missing OLM/Marketplace capabilities if not using NO-OLM mode
-	if !isNoOLMFeatureGateEnabled(oc) {
+	// Skip on clusters missing OLM/Marketplace capabilities if not using NO-OLM mode
+	noOLMEnabled, err := gatewayAPIWithoutOLMEnabled(oc)
+	if err != nil {
+		g.By(fmt.Sprintf("skipping test: failed to read FeatureGates: %v", err))
+		return true
+	}
+	if !noOLMEnabled {
 		enabled, err := exutil.AllCapabilitiesEnabled(oc, olmCapabilities...)
 		if err != nil || !enabled {
 			g.By("skipping test: OLM/Marketplace capabilities are not enabled and GatewayAPIWithoutOLM is not enabled")
 			return true
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/gatewayapi_upgrade.go` around lines 54 - 58, The skip
path calls isNoOLMFeatureGateEnabled(oc) which uses o.Expect and can panic
before Setup; replace that call with a non-panicking check (either introduce a
new helper like isNoOLMFeatureGateEnabledNoExpect(oc) that returns (bool, error)
or inline the feature-gate query without o.Expect), handle any error by treating
the gate as disabled, and then use that boolean in the existing condition
alongside exutil.AllCapabilitiesEnabled; ensure all o.Expect usage remains in
Setup or later so Skip() cannot cause a panic.
test/extended/router/gatewayapicontroller.go (1)

583-584: ⚠️ Potential issue | 🔴 Critical

shouldSkipGatewayAPITests still has a transitive panic path

The function is documented as panic-safe, but Line 583 calls isIPv6OrDualStack(oc), and that helper uses o.Expect (Line 622). This can still panic instead of returning a skip reason.

Proposed fix
-	if isIPv6OrDualStack(oc) {
+	ipv6OrDualStack, err := isIPv6OrDualStack(oc)
+	if err != nil {
+		return true, fmt.Sprintf("Failed to determine IP stack: %v", err)
+	}
+	if ipv6OrDualStack {
 		return true, "Skipping Gateway API tests on IPv6/dual-stack cluster"
 	}
-func isIPv6OrDualStack(oc *exutil.CLI) bool {
+func isIPv6OrDualStack(oc *exutil.CLI) (bool, error) {
 	networkConfig, err := oc.AdminOperatorClient().OperatorV1().Networks().Get(context.Background(), "cluster", metav1.GetOptions{})
-	o.Expect(err).NotTo(o.HaveOccurred(), "Failed to get network config")
+	if err != nil {
+		return false, err
+	}
 
 	for _, cidr := range networkConfig.Spec.ServiceNetwork {
 		if utilnet.IsIPv6CIDRString(cidr) {
-			return true
+			return true, nil
 		}
 	}
-	return false
+	return false, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/gatewayapicontroller.go` around lines 583 - 584,
shouldSkipGatewayAPITests currently calls isIPv6OrDualStack(oc) which uses
o.Expect and can panic; change the helper to a panic-safe signature (e.g.,
isIPv6OrDualStackSafe/ isIPv6OrDualStack returning (bool, error)) that does not
call o.Expect but returns an error on failure, then update
shouldSkipGatewayAPITests to call the new safe helper, handle the error path by
returning a non-skip or a clear skip reason without panicking, and remove any
direct uses of o.Expect inside isIPv6OrDualStack so tests remain panic-safe;
refer to the functions shouldSkipGatewayAPITests and isIPv6OrDualStack and the
use of o.Expect when locating code to modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/extended/router/gatewayapi_upgrade.go`:
- Around line 54-58: The skip path calls isNoOLMFeatureGateEnabled(oc) which
uses o.Expect and can panic before Setup; replace that call with a non-panicking
check (either introduce a new helper like isNoOLMFeatureGateEnabledNoExpect(oc)
that returns (bool, error) or inline the feature-gate query without o.Expect),
handle any error by treating the gate as disabled, and then use that boolean in
the existing condition alongside exutil.AllCapabilitiesEnabled; ensure all
o.Expect usage remains in Setup or later so Skip() cannot cause a panic.

In `@test/extended/router/gatewayapicontroller.go`:
- Around line 583-584: shouldSkipGatewayAPITests currently calls
isIPv6OrDualStack(oc) which uses o.Expect and can panic; change the helper to a
panic-safe signature (e.g., isIPv6OrDualStackSafe/ isIPv6OrDualStack returning
(bool, error)) that does not call o.Expect but returns an error on failure, then
update shouldSkipGatewayAPITests to call the new safe helper, handle the error
path by returning a non-skip or a clear skip reason without panicking, and
remove any direct uses of o.Expect inside isIPv6OrDualStack so tests remain
panic-safe; refer to the functions shouldSkipGatewayAPITests and
isIPv6OrDualStack and the use of o.Expect when locating code to modify.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 85458f16-0cb3-47cb-9f62-6d36fc11beb4

📥 Commits

Reviewing files that changed from the base of the PR and between 090b384 and 853bb58.

📒 Files selected for processing (2)
  • test/extended/router/gatewayapi_upgrade.go
  • test/extended/router/gatewayapicontroller.go

@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch 4 times, most recently from f580b21 to 6a7a94e Compare April 12, 2026 21:09
@openshift-ci-robot
Copy link
Copy Markdown

@gcs278: This pull request references Jira Issue OCPBUGS-82586, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Summary

  • The Gateway API upgrade test (NE-2561: Add Gateway API OLM to NO-OLM migration upgrade test #30897) called g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.
  • Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup(), avoiding the goroutine panic.
  • Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).

Test plan

  • Verify Gateway API upgrade test is skipped cleanly on IPv6/dual-stack clusters (no panic)
  • Verify Gateway API upgrade test is skipped cleanly on OKD clusters
  • Verify Gateway API upgrade test is skipped cleanly on unsupported platforms
  • Verify Gateway API upgrade test still runs successfully on supported IPv4 clusters
  • Verify non-upgrade Gateway API tests (gatewayapicontroller.go BeforeEach) still skip correctly via g.Skip()

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 12, 2026

Most importantly, I should have kicked off "periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6" in #30897 and I missed that:

/payload-job periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-vsphere-ovn-upgrade periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-ovn-upgrade periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 12, 2026

@gcs278: trigger 4 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5cc518b0-36b4-11f1-833a-a696c774475a-0

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 12, 2026

@CodeRabbit does everything look good?

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 12, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh pr diff 31000 --repo openshift/origin

Length of output: 7475


@gcs278 The changes look correct and well-structured overall. Here's a brief summary of what I checked:

✅ Core fix is sound: The Skip() method correctly implements the upgrades.Skippable interface and is called by the disruption framework before Setup() runs in its goroutine. This properly avoids the g.Skip() panic.

shouldSkipGatewayAPITests is safe: No g.Skip() or o.Expect() calls — pure error returns, safe to invoke outside Ginkgo leaf nodes.

getPlatformCapabilities in Setup() is safe in practice: It uses o.Expect(), but by the time Setup() is reached, Skip() has already validated that the infrastructure is accessible and PlatformStatus is non-nil. The only theoretical risk is a TOCTOU race, which is negligible.

gatewayapicontroller.go BeforeEach is correct: shouldSkipGatewayAPITests + g.Skip(reason) + o.Expect(err).NotTo(...) is a clean and idiomatic pattern inside a Ginkgo leaf node.

A few minor observations (no blockers):

  1. isNoOLMFeatureGateEnabled called twice — once in Skip() and once in Setup(). Minor inefficiency.
  2. OLM capability error → skip in Skip(): If AllCapabilitiesEnabled returns an error, the test is silently skipped rather than failing. This is conservative but could mask infrastructure issues; logging the error before skipping would help observability (similar to how shouldSkipGatewayAPITests errors are handled with e2e.Logf).
  3. g.By() in Skip(): Harmless, but By is typically meant for step annotation inside test bodies. A plain e2e.Logf might be more appropriate here since Skip() is not a Ginkgo node.

None of these are correctness issues — the PR achieves its goal of preventing the goroutine panic on IPv6/dual-stack, OKD, and unsupported platforms. Looks good to merge. 🐇

@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch from 6a7a94e to d44a553 Compare April 12, 2026 21:17
@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 12, 2026

/payload-abort

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 12, 2026

@gcs278: aborted 4 active payload job(s) for pull request #31000

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 12, 2026

I hate to kick off so many jobs, but I don't want to miss any jobs that could be failing:
/payload 4.22 nightly informing

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
test/extended/router/gatewayapicontroller.go (1)

556-589: ⚠️ Potential issue | 🔴 Critical

shouldSkipGatewayAPITests still has a panic path.

The contract on Lines 556-560 says this helper is safe outside Ginkgo nodes, but Line 587 still calls isIPv6OrDualStack(oc), and isIPv6OrDualStack (Lines 623-633) uses o.Expect(...) for the network lookup. A transient API error here will panic from GatewayAPIUpgradeTest.Skip() instead of returning an error.

Suggested fix
-	if isIPv6OrDualStack(oc) {
-		return true, "Skipping Gateway API tests on IPv6/dual-stack cluster", nil
-	}
+	ipv6OrDualStack, err := isIPv6OrDualStackSafe(oc)
+	if err != nil {
+		return false, "", fmt.Errorf("failed to determine IP stack: %w", err)
+	}
+	if ipv6OrDualStack {
+		return true, "Skipping Gateway API tests on IPv6/dual-stack cluster", nil
+	}

Then make isIPv6OrDualStackSafe return (bool, error) without using o.Expect, or change isIPv6OrDualStack itself to that signature if you want all callers on the safer contract.

Verify by inspecting the helper chain. Expected: shouldSkipGatewayAPITests still calls isIPv6OrDualStack, and that helper still contains o.Expect(...).

#!/bin/bash
set -euo pipefail

echo "=== shouldSkipGatewayAPITests ==="
nl -ba test/extended/router/gatewayapicontroller.go | sed -n '556,592p'

echo
echo "=== isIPv6OrDualStack ==="
nl -ba test/extended/router/gatewayapicontroller.go | sed -n '621,633p'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/gatewayapicontroller.go` around lines 556 - 589,
shouldSkipGatewayAPITests calls isIPv6OrDualStack which uses o.Expect and can
panic on transient API errors; change the contract so the network helper returns
(bool, error) (either rename to isIPv6OrDualStackSafe or modify
isIPv6OrDualStack) and remove all o.Expect usages inside it—perform API calls
with proper error returns (e.g., query Networks via oc, check status, return
true/false on IPv6/dual-stack or the error), then update
shouldSkipGatewayAPITests to call the new (bool, error) helper and on error
propagate it as (false, "", err) instead of relying on Ginkgo assertions; ensure
all other callers of isIPv6OrDualStack are updated to handle the (bool, error)
signature.
test/extended/router/gatewayapi_upgrade.go (1)

60-71: ⚠️ Potential issue | 🔴 Critical

Avoid o.Expect-based FeatureGate checks in Skip().

Line 61 still calls isNoOLMFeatureGateEnabled(oc), and that helper in test/extended/router/gatewayapicontroller.go (Lines 635-646) uses o.Expect(...) on the FeatureGates lookup. If that API call fails, Skip() can panic outside Ginkgo recovery and reintroduce the unrecoverable failure path this PR is trying to remove. Use an error-returning helper here and surface failures via t.precheckErr.

Suggested fix
-	if !isNoOLMFeatureGateEnabled(oc) {
+	noOLMEnabled, err := isNoOLMFeatureGateEnabledSafe(oc)
+	if err != nil {
+		t.precheckErr = fmt.Errorf("failed to read FeatureGates: %w", err)
+		return false
+	}
+	if !noOLMEnabled {
 		enabled, err := exutil.AllCapabilitiesEnabled(oc, olmCapabilities...)
 		if err != nil {
 			t.precheckErr = fmt.Errorf("failed to check OLM capabilities: %w", err)
 			return false
 		}

Add a non-panicking helper alongside the existing leaf-node helper:

func isNoOLMFeatureGateEnabledSafe(oc *exutil.CLI) (bool, error) {
	fgs, err := oc.AdminConfigClient().ConfigV1().FeatureGates().Get(context.TODO(), "cluster", metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	for _, fg := range fgs.Status.FeatureGates {
		for _, enabledFG := range fg.Enabled {
			if enabledFG.Name == "GatewayAPIWithoutOLM" {
				return true, nil
			}
		}
	}
	return false, nil
}

Verify by inspecting the call path. Expected: Skip() still calls isNoOLMFeatureGateEnabled, and the helper still contains o.Expect(...).

#!/bin/bash
set -euo pipefail

echo "=== GatewayAPIUpgradeTest.Skip ==="
nl -ba test/extended/router/gatewayapi_upgrade.go | sed -n '46,74p'

echo
echo "=== isNoOLMFeatureGateEnabled ==="
nl -ba test/extended/router/gatewayapicontroller.go | sed -n '635,647p'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/gatewayapi_upgrade.go` around lines 60 - 71, Replace the
direct call to the panicking helper isNoOLMFeatureGateEnabled(oc) inside
GatewayAPIUpgradeTest.Skip with a non-panicking variant that returns (bool,
error): call isNoOLMFeatureGateEnabledSafe(oc), and if it returns an error set
t.precheckErr = fmt.Errorf("failed to check feature gate: %w", err) and return
false; if no error use the bool to decide skip. Add the helper
isNoOLFeatureGateEnabledSafe (or isNoOLMFeatureGateEnabledSafe) which uses
oc.AdminConfigClient().ConfigV1().FeatureGates().Get(context.TODO(), "cluster",
metav1.GetOptions{}) and inspects fgs.Status.FeatureGates to detect
"GatewayAPIWithoutOLM", returning (true, nil) if found, (false, nil) otherwise,
and any API error as the returned error; leave the original
isNoOLMFeatureGateEnabled unchanged for other tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/extended/router/gatewayapi_upgrade.go`:
- Around line 60-71: Replace the direct call to the panicking helper
isNoOLMFeatureGateEnabled(oc) inside GatewayAPIUpgradeTest.Skip with a
non-panicking variant that returns (bool, error): call
isNoOLMFeatureGateEnabledSafe(oc), and if it returns an error set t.precheckErr
= fmt.Errorf("failed to check feature gate: %w", err) and return false; if no
error use the bool to decide skip. Add the helper isNoOLFeatureGateEnabledSafe
(or isNoOLMFeatureGateEnabledSafe) which uses
oc.AdminConfigClient().ConfigV1().FeatureGates().Get(context.TODO(), "cluster",
metav1.GetOptions{}) and inspects fgs.Status.FeatureGates to detect
"GatewayAPIWithoutOLM", returning (true, nil) if found, (false, nil) otherwise,
and any API error as the returned error; leave the original
isNoOLMFeatureGateEnabled unchanged for other tests.

In `@test/extended/router/gatewayapicontroller.go`:
- Around line 556-589: shouldSkipGatewayAPITests calls isIPv6OrDualStack which
uses o.Expect and can panic on transient API errors; change the contract so the
network helper returns (bool, error) (either rename to isIPv6OrDualStackSafe or
modify isIPv6OrDualStack) and remove all o.Expect usages inside it—perform API
calls with proper error returns (e.g., query Networks via oc, check status,
return true/false on IPv6/dual-stack or the error), then update
shouldSkipGatewayAPITests to call the new (bool, error) helper and on error
propagate it as (false, "", err) instead of relying on Ginkgo assertions; ensure
all other callers of isIPv6OrDualStack are updated to handle the (bool, error)
signature.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d33e618e-a713-4405-a34c-3a5ed7e425b4

📥 Commits

Reviewing files that changed from the base of the PR and between d60a845 and 57bc1c8.

📒 Files selected for processing (2)
  • test/extended/router/gatewayapi_upgrade.go
  • test/extended/router/gatewayapicontroller.go

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch from 57bc1c8 to a3115fa Compare April 13, 2026 01:36
@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

/payload-abort

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 13, 2026

@gcs278: aborted 13 active payload job(s) for pull request #31000

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

/payload-job periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-aws-upgrade-ovn-single-node periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips-no-nat-instance periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-upgrade-out-of-change periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn-upgrade-out-of-change periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade periodic-ci-openshift-release-main-ci-4.22-e2e-gcp-ovn-upgrade periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-upgrade periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-ovn-upgrade periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-upgrade-ovn-ipv6 periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6 periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-aws-ovn-upgrade periodic-ci-openshift-release-main-ci-4.22-e2e-vsphere-ovn-upgrade periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-vsphere-ovn-upgrade

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 13, 2026

@gcs278: trigger 13 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips-no-nat-instance
  • periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-upgrade-out-of-change
  • periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn-upgrade-out-of-change
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-main-ci-4.22-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-main-ci-4.22-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-vsphere-ovn-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/93775330-36d9-11f1-8316-56200df53e4a-0

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

@CodeRabbit how does it look now?

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 13, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh pr diff 31000 --repo openshift/origin

Length of output: 14312


@gcs278 The latest diff is a clear improvement over what I reviewed before — all three minor observations I raised earlier have been addressed:

  1. isNoOLMFeatureGateEnabled duplication eliminated — It's now called once in Skip() and the result passed into shouldSkipGatewayAPITests. In gatewayapicontroller.go, noOLM is set once in BeforeEach and reused at all five downstream call sites — no repeated API calls.

  2. ✅ OLM capability errors are now surfaced — The !noOLM capability check was moved into shouldSkipGatewayAPITests and returns (false, "", fmt.Errorf(...)) on error, which propagates to t.precheckErr and surfaces as a test failure in Setup() rather than silently skipping.

  3. isIPv6OrDualStack and isNoOLMFeatureGateEnabled no longer use o.Expect() — Both now return errors, making them fully safe to call outside Ginkgo leaf nodes.

The only remaining minor nit is the g.By(fmt.Sprintf("skipping test: %s", reason)) call inside Skip() (line 65 in gatewayapi_upgrade.go), but as noted before it's harmless.

Overall: the PR looks good and is ready to merge. 🐇

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

Unrelated failures:

/test e2e-gcp-ovn

/test e2e-aws-ovn-fips

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

All of the payload jobs looks good (passing on non-IPv6 and skipping on IPv6).

…ip logic

The Gateway API upgrade test was calling g.Skip() from Setup(), which
runs inside a goroutine managed by the disruption framework. Since
g.Skip() panics and Ginkgo can only recover panics inside leaf nodes,
this caused unrecoverable panics on IPv6/dual-stack, OKD, and
unsupported platform clusters.

Implement the upgrades.Skippable interface with a Skip() method that
the disruption framework calls before Setup, avoiding the goroutine
panic. Refactor checkPlatformSupportAndGetCapabilities into
shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and
getPlatformCapabilities (returns LB/DNS support).

https://redhat.atlassian.net/browse/OCPBUGS-83267

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch from a3115fa to 8ef51c3 Compare April 13, 2026 13:03
@gcs278 gcs278 changed the title OCPBUGS-82586: Use upgrades.Skippable for Gateway API upgrade test skip logic OCPBUGS-83267: Use upgrades.Skippable for Gateway API upgrade test skip logic Apr 13, 2026
@openshift-ci-robot openshift-ci-robot added jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. and removed jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Apr 13, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

  • The Gateway API upgrade test (NE-2561: Add Gateway API OLM to NO-OLM migration upgrade test #30897) called g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.
  • Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup(), avoiding the goroutine panic.
  • Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).

Test plan

  • Verify Gateway API upgrade test is skipped cleanly on IPv6/dual-stack clusters (no panic)
  • Verify Gateway API upgrade test is skipped cleanly on OKD clusters
  • Verify Gateway API upgrade test is skipped cleanly on unsupported platforms
  • Verify Gateway API upgrade test still runs successfully on supported IPv4 clusters
  • Verify non-upgrade Gateway API tests (gatewayapicontroller.go BeforeEach) still skip correctly via g.Skip()

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

/jira refresh

@openshift-ci-robot
Copy link
Copy Markdown

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is invalid:

  • expected the bug to target either version "4.22." or "openshift-4.22.", but it targets "4.23.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Apr 13, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Apr 13, 2026
@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@openshift-ci-robot
Copy link
Copy Markdown

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Summary

  • The Gateway API upgrade test (NE-2561: Add Gateway API OLM to NO-OLM migration upgrade test #30897) called g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.
  • Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup(), avoiding the goroutine panic.
  • Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).

Test plan

  • Verify Gateway API upgrade test is skipped cleanly on IPv6/dual-stack clusters (no panic)
  • Verify Gateway API upgrade test is skipped cleanly on OKD clusters
  • Verify Gateway API upgrade test is skipped cleanly on unsupported platforms
  • Verify Gateway API upgrade test still runs successfully on supported IPv4 clusters
  • Verify non-upgrade Gateway API tests (gatewayapicontroller.go BeforeEach) still skip correctly via g.Skip()

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

cluster bootstrap
/e2e-aws-ovn-serial-1of2

@gcs278
Copy link
Copy Markdown
Contributor Author

gcs278 commented Apr 13, 2026

/test e2e-aws-ovn-serial-1of2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants