-
Notifications
You must be signed in to change notification settings - Fork 253
HIVE-2199: Use infraID for MachineSet identification #2829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…cation
Fix isControlledByMachinePool to use infraID when building the prefix for
identifying MachineSets. MachineSet names follow the format
{infraID}-{poolName}-{zone}, not {clusterName}-{poolName}-{zone}.
|
@huangmingxia: This pull request references HIVE-2199 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: huangmingxia The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@huangmingxia: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2829 +/- ##
=======================================
Coverage 50.40% 50.40%
=======================================
Files 279 279
Lines 34194 34194
=======================================
Hits 17235 17235
Misses 15595 15595
Partials 1364 1364
🚀 New features to boost your workflow:
|
| // actually managed by Hive. MachineSets managed by Hive (either created by Hive or adopted by Hive) | ||
| // will have the HiveManagedLabel. | ||
| prefix := strings.Join([]string{cd.Spec.ClusterName, pool.Spec.Name, ""}, "-") | ||
| // MachineSet names use infraID, not cluster name (format: {infraID}-{poolName}-{zone}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where did you get this from? Assume by looking at the installer code. My concerns:
- Has it always been this way?
- Does it apply to all cloud platforms?
I'm also concerned that this can still result in false positives, e.g. if we have pools named worker and worker-2. (Guess I could have caught that on the last PR...)
I wonder if we're trying too hard with the name matching. If we have a "managed" label but no "name" label, that seems like it should be an error. It should be impossible to get into that situation unless the user munged something up.
So here's what I think we should do:
- Change the spec of this func to include an
errorreturn. - Error if "managed" is set but "name" is not.
- I think probably error the other way around as well (if "name" is set but "managed" is not). Can you think of a use case where this would be valid?*
- Don't attempt to match against the mset name at all, ever.
Thoughts?
*Just to get this on paper: if both labels must either be set or unset, we really don't need the "managed" label at all -- the existence of the "name" label is sufficient to indicate that it's hive-managed. However, for backward-compatibility reasons, speculative/paranoid though they may be, I reckon we should just keep them both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed review! I noticed this when testing the hiveutil machinepool CLI:) I have confirmed from the installer code that all platforms use InfraID instead of ClusterName. All MachineSet names start with {InfraID}-{PoolName}- (except for Nutanix when there is no failure domain, use {InfraID}-{PoolName} ).
When a user manually labels a MachineSet with hive.openshift.io/machine-pool during adoption, there's a brief period before Hive's reconcile logic adds the hive.openshift.io/managed=true label. So name label present but
managed label absent is a valid transitional state.
I agree with your points, I now also feel that the managed label is not necessary - the name label should be sufficient. I'll remove the mset name-matching logic and only check the hive.openshift.io/machine-pool label for control determination. Does this sound good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a user manually labels a MachineSet with
hive.openshift.io/machine-poolduring adoption, there's a brief period before Hive's reconcile logic adds thehive.openshift.io/managed=truelabel. So name label present but managed label absent is a valid transitional state.
This behavior is an artifact of this function, though, right? Since we don't (yet) document that adding the name label is a supported thing, I think we have the flexibility to decide what the UX should be.
I agree with your points, I now also feel that the managed label is not necessary - the name label should be sufficient. I'll remove the mset name-matching logic and only check the
hive.openshift.io/machine-poollabel for control determination. Does this sound good?
Yes.
I've now been through all the possible permutations in my head :)
I think we should continue adding the "managed" label, just in case anyone is using it. It's also used for other spoke artifacts, like syncset resources -- in that regard it's a good signal for the spoke admin that they shouldn't muck with the object.
BUT
I think we can stop reading it in our code. (AFAICS this is the only spot we were doing that.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior is an artifact of this function, though, right? Since we don't (yet) document that adding the name label is a supported thing, I think we have the flexibility to decide what the UX should be.
Yes, we can update the UX to avoid this issue with the function.
For the adopt process: I've implemented both labels being added together in the hiveutil machinepool adopt CLI, and tests are passing. This avoids the transition state. I'm still thinking through whether this approach is reasonable, but I haven't identified any issues yet :)
For the control logic: remove the MachineSet name matching and check the hive.openshift.io/machine-pool label.
Fix isControlledByMachinePool to use infraID when building the prefix for identifying MachineSets. MachineSet names follow the format {infraID}-{poolName}-{zone}, not {clusterName}-{poolName}-{zone}.