Skip to content

Conversation

@huangmingxia
Copy link
Contributor

Fix isControlledByMachinePool to use infraID when building the prefix for identifying MachineSets. MachineSet names follow the format {infraID}-{poolName}-{zone}, not {clusterName}-{poolName}-{zone}.

…cation

Fix isControlledByMachinePool to use infraID when building the prefix for
identifying MachineSets. MachineSet names follow the format
{infraID}-{poolName}-{zone}, not {clusterName}-{poolName}-{zone}.
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 13, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 13, 2026

@huangmingxia: This pull request references HIVE-2199 which is a valid jira issue.

Details

In response to this:

Fix isControlledByMachinePool to use infraID when building the prefix for identifying MachineSets. MachineSet names follow the format {infraID}-{poolName}-{zone}, not {clusterName}-{poolName}-{zone}.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@huangmingxia
Copy link
Contributor Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 13, 2026
@openshift-ci openshift-ci bot requested review from 2uasimojo and suhanime January 13, 2026 11:13
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: huangmingxia
Once this PR has been reviewed and has the lgtm label, please assign 2uasimojo for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 13, 2026

@huangmingxia: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@codecov
Copy link

codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 50.40%. Comparing base (b574bb2) to head (43903d4).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2829   +/-   ##
=======================================
  Coverage   50.40%   50.40%           
=======================================
  Files         279      279           
  Lines       34194    34194           
=======================================
  Hits        17235    17235           
  Misses      15595    15595           
  Partials     1364     1364           
Files with missing lines Coverage Δ
...g/controller/machinepool/machinepool_controller.go 63.57% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

// actually managed by Hive. MachineSets managed by Hive (either created by Hive or adopted by Hive)
// will have the HiveManagedLabel.
prefix := strings.Join([]string{cd.Spec.ClusterName, pool.Spec.Name, ""}, "-")
// MachineSet names use infraID, not cluster name (format: {infraID}-{poolName}-{zone})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did you get this from? Assume by looking at the installer code. My concerns:

  • Has it always been this way?
  • Does it apply to all cloud platforms?

I'm also concerned that this can still result in false positives, e.g. if we have pools named worker and worker-2. (Guess I could have caught that on the last PR...)

I wonder if we're trying too hard with the name matching. If we have a "managed" label but no "name" label, that seems like it should be an error. It should be impossible to get into that situation unless the user munged something up.

So here's what I think we should do:

  • Change the spec of this func to include an error return.
  • Error if "managed" is set but "name" is not.
  • I think probably error the other way around as well (if "name" is set but "managed" is not). Can you think of a use case where this would be valid?*
  • Don't attempt to match against the mset name at all, ever.

Thoughts?

*Just to get this on paper: if both labels must either be set or unset, we really don't need the "managed" label at all -- the existence of the "name" label is sufficient to indicate that it's hive-managed. However, for backward-compatibility reasons, speculative/paranoid though they may be, I reckon we should just keep them both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed review! I noticed this when testing the hiveutil machinepool CLI:) I have confirmed from the installer code that all platforms use InfraID instead of ClusterName. All MachineSet names start with {InfraID}-{PoolName}- (except for Nutanix when there is no failure domain, use {InfraID}-{PoolName} ).

When a user manually labels a MachineSet with hive.openshift.io/machine-pool during adoption, there's a brief period before Hive's reconcile logic adds the hive.openshift.io/managed=true label. So name label present but
managed label absent is a valid transitional state.

I agree with your points, I now also feel that the managed label is not necessary - the name label should be sufficient. I'll remove the mset name-matching logic and only check the hive.openshift.io/machine-pool label for control determination. Does this sound good?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a user manually labels a MachineSet with hive.openshift.io/machine-pool during adoption, there's a brief period before Hive's reconcile logic adds the hive.openshift.io/managed=true label. So name label present but managed label absent is a valid transitional state.

This behavior is an artifact of this function, though, right? Since we don't (yet) document that adding the name label is a supported thing, I think we have the flexibility to decide what the UX should be.

I agree with your points, I now also feel that the managed label is not necessary - the name label should be sufficient. I'll remove the mset name-matching logic and only check the hive.openshift.io/machine-pool label for control determination. Does this sound good?

Yes.
I've now been through all the possible permutations in my head :)
I think we should continue adding the "managed" label, just in case anyone is using it. It's also used for other spoke artifacts, like syncset resources -- in that regard it's a good signal for the spoke admin that they shouldn't muck with the object.
BUT
I think we can stop reading it in our code. (AFAICS this is the only spot we were doing that.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior is an artifact of this function, though, right? Since we don't (yet) document that adding the name label is a supported thing, I think we have the flexibility to decide what the UX should be.

Yes, we can update the UX to avoid this issue with the function.
For the adopt process: I've implemented both labels being added together in the hiveutil machinepool adopt CLI, and tests are passing. This avoids the transition state. I'm still thinking through whether this approach is reasonable, but I haven't identified any issues yet :)

For the control logic: remove the MachineSet name matching and check the hive.openshift.io/machine-pool label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants