|
35 | 35 | - [Example Adoption ClusterDeployment](#example-adoption-clusterdeployment) |
36 | 36 | - [Adopting with hiveutil](#adopting-with-hiveutil) |
37 | 37 | - [Transferring ownership](#transferring-ownership) |
| 38 | + - [MachinePool Adoption](#machinepool-adoption) |
38 | 39 | - [Configuration Management](#configuration-management) |
39 | 40 | - [Vertical Scaling](#vertical-scaling) |
40 | 41 | - [SyncSet](#syncset) |
@@ -1212,7 +1213,7 @@ Hive will then: |
1212 | 1213 |
|
1213 | 1214 | It is possible to adopt cluster deployments into Hive. |
1214 | 1215 | This will allow you to manage the cluster as if it had been provisioned by Hive, including: |
1215 | | -- [MachinePools](#machine-pools) |
| 1216 | +- [MachinePools](#machine-pools) - See [MachinePool Adoption](#machinepool-adoption) for how to adopt existing MachineSets when adopting a cluster |
1216 | 1217 | - [SyncSets and SelectorSyncSets](syncset.md) |
1217 | 1218 | - [Deprovisioning](#cluster-deprovisioning) |
1218 | 1219 |
|
@@ -1253,10 +1254,46 @@ spec: |
1253 | 1254 | name: pull-secret |
1254 | 1255 | ``` |
1255 | 1256 |
|
| 1257 | +### Example Adoption ClusterDeployment for vSphere |
| 1258 | +```yaml |
| 1259 | +apiVersion: hive.openshift.io/v1 |
| 1260 | +kind: ClusterDeployment |
| 1261 | +metadata: |
| 1262 | + name: my-vsphere-cluster |
| 1263 | + namespace: mynamespace |
| 1264 | +spec: |
| 1265 | + baseDomain: vsphere.example.com |
| 1266 | + clusterMetadata: |
| 1267 | + adminKubeconfigSecretRef: |
| 1268 | + name: my-vsphere-cluster-adopted-admin-kubeconfig |
| 1269 | + clusterID: f2e99580-389c-4ec5-b07f-4f489d6c0929 |
| 1270 | + infraID: my-vsphere-cluster-khjpw |
| 1271 | + metadataJSONSecretRef: |
| 1272 | + name: my-vsphere-cluster-metadata-json |
| 1273 | + clusterName: my-vsphere-cluster |
| 1274 | + controlPlaneConfig: |
| 1275 | + servingCertificates: {} |
| 1276 | + installed: true |
| 1277 | + preserveOnDelete: true |
| 1278 | + platform: |
| 1279 | + vsphere: |
| 1280 | + certificatesSecretRef: |
| 1281 | + name: my-vsphere-cluster-adopted-vsphere-certificates |
| 1282 | + cluster: <cluster> # vSphere cluster name where VMs are deployed |
| 1283 | + credentialsSecretRef: |
| 1284 | + name: my-vsphere-cluster-adopted-vsphere-credentials |
| 1285 | + datacenter: <vSphere-datacenter-name> |
| 1286 | + defaultDatastore: <default-datastore-name> |
| 1287 | + network: <network name used by the cluster> |
| 1288 | + vCenter: <vCenter server domain name or IP address> |
| 1289 | + pullSecretRef: |
| 1290 | + name: my-vsphere-cluster-adopted-pull-secret |
| 1291 | +``` |
| 1292 | +
|
1256 | 1293 | Note for `metadataJSONSecretRef`: |
1257 | 1294 | 1. If the referenced Secret is available -- e.g. if the cluster was previously managed by hive -- simply copy it in. |
1258 | | -1. If you have the original metadata.json file -- e.g. if the cluster was provisioned directly via openshift-install -- create the Secret from it: `oc create secret generic my-gcp-cluster-metadata-json -n mynamespace --from-file=metadata.json=/tmp/metadata.json` |
1259 | | -1. Otherwise, you may need to compose the file by hand. See the samples below. |
| 1295 | +2. If you have the original metadata.json file -- e.g. if the cluster was provisioned directly via openshift-install -- create the Secret from it: `oc create secret generic my-gcp-cluster-metadata-json -n mynamespace --from-file=metadata.json=/tmp/metadata.json` |
| 1296 | +3. Otherwise, you may need to compose the file by hand. See the samples below. |
1260 | 1297 |
|
1261 | 1298 | If the cluster you are looking to adopt is on AWS and leverages Privatelink, you'll also need to include that setting under `spec.platform.aws` to ensure the VPC Endpoint Service for the cluster is tracked in the ClusterDeployment. |
1262 | 1299 |
|
@@ -1360,6 +1397,274 @@ If you wish to transfer ownership of a cluster which is already managed by hive, |
1360 | 1397 | 1. Edit the `ClusterDeployment`, setting `spec.preserveOnDelete` to `true`. This ensures that the next step will only release the hive resources without destroying the cluster in the cloud infrastructure. |
1361 | 1398 | 1. Delete the `ClusterDeployment` |
1362 | 1399 | 1. From the hive instance that will adopt the cluster, `oc apply` the `ClusterDeployment`, creds and certs manifests you saved in the first step. |
| 1400 | + |
| 1401 | +### MachinePool Adoption |
| 1402 | + |
| 1403 | +When adopting a cluster, you can also adopt existing MachineSets by creating MachinePools that match the existing MachineSets. |
| 1404 | + |
| 1405 | +**Terminology:** |
| 1406 | + |
| 1407 | +In this section, we use the following terms to avoid confusion: |
| 1408 | + |
| 1409 | +- **MachinePool resource name** (`metadata.name`): The Kubernetes resource name of the MachinePool, e.g., `mycluster-worker` |
| 1410 | + - Must follow the pattern: `<clusterdeployment-name>-<pool-spec-name>` |
 | 1411 | +   - This restriction is enforced by webhook validation: a MachinePool whose `metadata.name` does not match this pattern is rejected |
| 1412 | + - Example: If your `ClusterDeployment` is named `mycluster` and `MachinePool.spec.name` is `worker`, then `MachinePool.metadata.name` must be exactly `mycluster-worker` |
| 1413 | +- **Pool spec name** (`spec.name`): The pool name defined in the MachinePool specification, e.g., `worker` |
| 1414 | + - Used in the `hive.openshift.io/machine-pool` label |
| 1415 | + - Used to generate MachineSet names |
| 1416 | + - You can choose any value for `spec.name` - it does NOT need to match existing MachineSet names. The only requirement is that it matches the `hive.openshift.io/machine-pool` label value when adopting existing MachineSets. |
| 1417 | + |
| 1418 | +When adopting MachineSets, the `hive.openshift.io/machine-pool` label value must match the **pool spec name** (`spec.name`), not the MachinePool resource name. |
| 1419 | + |
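The naming rule above can be sketched as a quick check. This is an illustrative approximation only, not Hive's actual webhook code; the function name is hypothetical:

```go
package main

import "fmt"

// validPoolResourceName mirrors the rule described above:
// MachinePool.metadata.name must be exactly
// "<clusterdeployment-name>-<pool-spec-name>".
func validPoolResourceName(metaName, clusterDeploymentName, poolSpecName string) bool {
	return metaName == clusterDeploymentName+"-"+poolSpecName
}

func main() {
	fmt.Println(validPoolResourceName("mycluster-worker", "mycluster", "worker")) // accepted
	fmt.Println(validPoolResourceName("mycluster-infra", "mycluster", "worker"))  // rejected by the webhook
}
```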
| 1420 | +**Environment:** |
| 1421 | + |
| 1422 | +In this section, we distinguish between two cluster environments: |
| 1423 | + |
| 1424 | +- **Hub cluster**: Where Hive is running and where MachinePool resources are created |
| 1425 | +- **Spoke cluster**: The managed cluster where MachineSets exist |
| 1426 | + |
| 1427 | +All commands in this procedure will be clearly marked with either `# On hub cluster` or `# On spoke cluster` to indicate where each command should be executed. |
| 1428 | + |
| 1429 | +Hive supports adopting existing MachineSets into MachinePool management in two scenarios: |
| 1430 | + |
| 1431 | +#### Scenario 1: Adopt MachinePools When Adopting a Cluster |
| 1432 | + |
| 1433 | +This scenario applies when you are adopting a cluster that was previously unmanaged by Hive. After adopting the cluster, you can bring the MachinePools along by labeling the existing MachineSets and creating corresponding MachinePools. |
| 1434 | + |
| 1435 | +Steps: |
| 1436 | + |
| 1437 | +1. Adopt the cluster (see [Cluster Adoption](#cluster-adoption) above) |
| 1438 | +2. Adopt the MachinePools using the [MachinePool Adoption Procedure](#machinepool-adoption-procedure) outlined below |
| 1439 | + - If there are additional MachineSets that should also be managed by Hive, create separate MachinePools for each distinct configuration |
| 1440 | + |
| 1441 | +#### Scenario 2: Adopt Additional MachineSets for a Cluster Already Managed by Hive |
| 1442 | + |
| 1443 | +If you want to adopt additional MachineSets for a cluster that is already managed by Hive, you can do so by creating MachinePools that match the existing MachineSets. |
| 1444 | + |
| 1445 | +Steps: |
| 1446 | +1. Label the existing MachineSets with `hive.openshift.io/machine-pool=<machine-pool-name>`, where `<machine-pool-name>` is the value you will use for `MachinePool.spec.name` (the machine pool name, not the MachinePool resource name). |
| 1447 | +2. Create a corresponding MachinePool in the Hive hub cluster to manage these MachineSets |
| 1448 | + |
| 1449 | +#### MachinePool Adoption Procedure |
| 1450 | + |
| 1451 | +To adopt existing MachineSets: |
| 1452 | + |
| 1453 | +1. Identify and inspect the existing MachineSets in the cluster that you want to manage: |
| 1454 | + ```bash |
| 1455 | + # On spoke cluster - List all MachineSets |
| 1456 | + oc get machinesets -n openshift-machine-api |
| 1457 | + |
| 1458 | + # On spoke cluster - Get detailed information about a specific MachineSet |
| 1459 | + oc get machineset <machineset-name> -n openshift-machine-api -o yaml |
| 1460 | + ``` |
| 1461 | + |
| 1462 | + **Important**: Note the following details for each MachineSet you want to adopt: |
| 1463 | + - Instance type (e.g., `m5.xlarge` for AWS) |
| 1464 | + - Availability zone/failure domain (e.g., `us-east-1a`) |
| 1465 | + - Current replica count |
| 1466 | + - Any platform-specific configurations (root volume settings, etc.) |
| 1467 | + |
| 1468 | +2. **Label the existing MachineSets** with the `hive.openshift.io/machine-pool` label. The label value must match the `spec.name` (machine pool name) you will use in the MachinePool: |
| 1469 | + ```bash |
| 1470 | + # On spoke cluster - Label the MachineSet |
| 1471 | + oc label machineset <machineset-name> -n openshift-machine-api hive.openshift.io/machine-pool=<machine-pool-name> |
| 1472 | + ``` |
| 1473 | + |
| 1474 | + **Note**: You must label each MachineSet you want to adopt. Each MachineSet in each availability zone needs the label. |
| 1475 | + |
| 1476 | +3. **Create a MachinePool** with specifications that exactly match the existing MachineSets: |
| 1477 | + - The `spec.name` (machine pool name) must match the label value you applied in step 2 |
| 1478 | + - The `spec.platform` configuration (instance type, zones, etc.) must exactly match the existing MachineSets. For platform-specific limitations, see [Platform-Specific Limitations](#platform-specific-limitations) |
| 1479 | + - The `spec.replicas` should match the current total replica count across all zones, or you can adjust it and Hive will reconcile |
| 1480 | + - The `spec.platform.<cloud>.zones` array must include all zones where MachineSets are labeled, and the order matters (see [Zone Configuration Warnings](#zone-configuration-warnings) below) |
| 1481 | + |
| 1482 | + Example MachinePool for adopting existing worker MachineSets on AWS: |
| 1483 | + ```yaml |
| 1484 | + apiVersion: hive.openshift.io/v1 |
| 1485 | + kind: MachinePool |
| 1486 | + metadata: |
| 1487 | + name: mycluster-worker # MachinePool resource name |
| 1488 | + namespace: mynamespace |
| 1489 | + spec: |
| 1490 | + clusterDeploymentRef: |
| 1491 | + name: mycluster |
| 1492 | + name: worker # Machine pool name (spec.name) - must match the label value from step 2 |
| 1493 | + platform: |
| 1494 | + aws: |
| 1495 | + type: m5.xlarge # Must exactly match existing MachineSet instance type |
| 1496 | + zones: # Must match all zones where MachineSets are labeled |
| 1497 | + - us-east-1a |
| 1498 | + - us-east-1b |
| 1499 | + - us-east-1c |
| 1500 | + replicas: 3 # Total replicas across all zones |
| 1501 | + ``` |
| 1502 | + Example MachinePool for adopting existing worker MachineSets on GCP: |
| 1503 | + ```yaml |
| 1504 | + apiVersion: hive.openshift.io/v1 |
| 1505 | + kind: MachinePool |
| 1506 | + metadata: |
| 1507 | + name: mihuanggcp-worker |
| 1508 | + spec: |
| 1509 | + clusterDeploymentRef: |
| 1510 | + name: mihuanggcp |
| 1511 | + name: worker |
| 1512 | + platform: |
| 1513 | + gcp: |
| 1514 | + osDisk: |
| 1515 | + diskSizeGB: 128 |
| 1516 | + diskType: pd-ssd |
| 1517 | + type: n1-standard-4 |
| 1518 | + zones: |
| 1519 | + - us-central1-a |
| 1520 | + - us-central1-c |
| 1521 | + - us-central1-f |
| 1522 | + replicas: 3 |
| 1523 | + ``` |
| 1524 | + |
| 1525 | + Example MachinePool for adopting existing worker MachineSets on vSphere: |
| 1526 | + ```yaml |
| 1527 | + apiVersion: hive.openshift.io/v1 |
| 1528 | + kind: MachinePool |
| 1529 | + metadata: |
| 1530 | + name: mihuang-1213a-worker |
| 1531 | + namespace: adopt |
| 1532 | + spec: |
| 1533 | + clusterDeploymentRef: |
| 1534 | + name: mihuang-1213a |
| 1535 | + name: worker |
| 1536 | + platform: |
| 1537 | + vsphere: |
| 1538 | + coresPerSocket: 4 |
| 1539 | + cpus: 8 |
| 1540 | + memoryMB: 16384 |
| 1541 | + osDisk: |
| 1542 | + diskSizeGB: 120 |
| 1543 | + replicas: 2 |
| 1544 | + ``` |
| 1545 | +4. **Apply the MachinePool**: |
| 1546 | + ```bash |
| 1547 | + # On hub cluster - Create the MachinePool |
| 1548 | + oc apply -f machinepool-adopt.yaml |
| 1549 | + ``` |
| 1550 | + |
| 1551 | +5. **Verify the adoption**: |
| 1552 | + ```bash |
| 1553 | + # On hub cluster - Check MachinePool status |
| 1554 | + oc get machinepool mycluster-worker -n mynamespace -o yaml |
| 1555 | + |
| 1556 | + # On spoke cluster - Verify MachineSets were not recreated |
| 1557 | + oc get machinesets -n openshift-machine-api |
| 1558 | + ``` |
| 1559 | + |
| 1560 | +#### Warning: Avoid Unintended Hive Management |
| 1561 | + |
| 1562 | +Hive determines which MachineSets it manages based on two criteria: |
| 1563 | + |
| 1564 | +1. **Name pattern match**: MachineSet name starts with `<cluster-name>-<pool-spec-name>-` (e.g., `mycluster-worker-us-east-1a-xxx`) |
| 1565 | +2. **Label match**: MachineSet has the `hive.openshift.io/machine-pool` label with a value matching the MachinePool's `spec.name` |
| 1566 | +
|
| 1567 | +If a MachineSet meets either of these criteria, Hive will consider it managed and may modify or delete it to match the MachinePool specification. |
| 1568 | +
|
 | 1569 | +**Important**: If you manually create MachineSets that you do NOT want Hive to manage, ensure both of the following: |
 | 1570 | +- The MachineSet name does NOT start with `<cluster-name>-<pool-spec-name>-` |
 | 1571 | +- The MachineSet does NOT have the `hive.openshift.io/machine-pool` label |
| 1572 | +
|
 | 1573 | +If either the naming pattern or the label is present, Hive will assume this is a Hive-managed MachineSet and may modify or delete it to match the MachinePool specification. This can lead to unexpected MachineSet deletion or modification. |
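The two criteria above can be expressed as a small sketch. This is an approximation for illustration; the function and variable names are hypothetical, not Hive's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

const poolLabel = "hive.openshift.io/machine-pool"

// consideredManaged approximates how Hive decides whether a MachineSet
// belongs to a MachinePool: either the name prefix OR the label matches.
func consideredManaged(msName string, msLabels map[string]string, clusterName, poolSpecName string) bool {
	if strings.HasPrefix(msName, clusterName+"-"+poolSpecName+"-") {
		return true
	}
	return msLabels[poolLabel] == poolSpecName
}

func main() {
	// Name matches the pattern, even without the label.
	fmt.Println(consideredManaged("mycluster-worker-us-east-1a-xxx", nil, "mycluster", "worker"))
	// Label matches, even though the name does not.
	fmt.Println(consideredManaged("legacy-nodes", map[string]string{poolLabel: "worker"}, "mycluster", "worker"))
	// Neither matches: Hive leaves this MachineSet alone.
	fmt.Println(consideredManaged("legacy-nodes", nil, "mycluster", "worker"))
}
```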
| 1574 | +
|
| 1575 | +#### Zone Configuration Warnings |
| 1576 | +
|
| 1577 | +Zone configuration (failure domain configuration) is one of the most error-prone aspects of MachinePool adoption. Incorrect zone configuration can cause Hive to create new MachineSets and delete existing ones, leading to unexpected resource creation and potential service disruption. |
| 1578 | +
|
 | 1579 | +**Warning 1: Zone Mismatch Causes New MachineSet Creation** |
| 1580 | +
|
| 1581 | +If the configured zones in `MachinePool.spec.platform.<cloud>.zones` do not match the existing MachineSets' failure domains (availability zones), Hive will: |
| 1582 | +- NOT adopt the existing MachineSets (even if they have the correct label) |
| 1583 | +- Create new MachineSets in the configured zones |
| 1584 | +- This can lead to unexpected resource creation and costs |
| 1585 | + |
| 1586 | +Example of zone mismatch: |
| 1587 | +- Existing MachineSets: in zones `us-east-1a` and `us-east-1f` (with `hive.openshift.io/machine-pool=worker` label) |
| 1588 | +- MachinePool configured with zones: `us-east-1b` and `us-east-1c` |
| 1589 | +- Result: |
| 1590 | + - Existing MachineSets in `us-east-1a` and `us-east-1f` are not adopted (zone mismatch) |
| 1591 | + - If the existing MachineSets have the `hive.openshift.io/machine-pool` label, they will be deleted because they are considered controlled by the MachinePool but don't match the generated MachineSets |
| 1592 | + - New MachineSets are created in `us-east-1b` and `us-east-1c` to match MachinePool config |
| 1593 | +
|
 | 1594 | +**Warning 2: Zone Order Affects Replica Distribution** |
| 1595 | +
|
 | 1596 | +When using fixed replicas (not autoscaling), the order of zones (failure domains) in the array determines how replicas are distributed. Ensure the zone order in `MachinePool.spec.platform.<cloud>.zones` matches the current replica distribution across zones: if the order is wrong, Hive will redistribute replicas, creating Machines in some zones and deleting them in others. |
| 1597 | +
|
| 1598 | +Hive distributes replicas using this algorithm: |
| 1599 | +
|
| 1600 | +```go |
| 1601 | +replicas := int32(total / numOfAZs) |
| 1602 | +if int64(idx) < total % numOfAZs { |
| 1603 | + replicas++ // Earlier zones in the array get extra replicas |
| 1604 | +} |
| 1605 | +``` |
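Expanding the excerpt above into a runnable sketch (the helper name is hypothetical; the arithmetic is the same) makes it easy to check how a given zone order distributes replicas:

```go
package main

import "fmt"

// distributeReplicas applies the arithmetic from the excerpt above:
// each zone gets total/numOfAZs replicas, and zones earlier in the
// slice absorb the remainder, one extra replica each.
func distributeReplicas(total int64, zones []string) map[string]int32 {
	numOfAZs := int64(len(zones))
	out := make(map[string]int32, numOfAZs)
	for idx, zone := range zones {
		replicas := int32(total / numOfAZs)
		if int64(idx) < total%numOfAZs {
			replicas++ // earlier zones in the array get extra replicas
		}
		out[zone] = replicas
	}
	return out
}

func main() {
	// 3 replicas over 2 zones: index 0 gets 2, index 1 gets 1.
	fmt.Println(distributeReplicas(3, []string{"us-east-1f", "us-east-1a"}))
}
```

With `total: 3` and zones ordered `[us-east-1f, us-east-1a]`, this yields 2 replicas in `us-east-1f` and 1 in `us-east-1a`; reversing the order reverses the counts, which is exactly the situation in the worked example below.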
| 1606 | +
|
| 1607 | +Example of zone order impact: |
| 1608 | +
|
| 1609 | +Current state (total: 3 replicas): |
| 1610 | +- `us-east-1f`: 2 replicas |
| 1611 | +- `us-east-1a`: 1 replica |
| 1612 | +
|
| 1613 | +Correct zone order (preserves current distribution): |
| 1614 | +```yaml |
| 1615 | +spec: |
| 1616 | + platform: |
| 1617 | + aws: |
| 1618 | + zones: |
| 1619 | + - us-east-1f # Index 0: gets 2 replicas |
| 1620 | + - us-east-1a # Index 1: gets 1 replica |
| 1621 | + replicas: 3 |
| 1622 | +``` |
| 1623 | +
|
| 1624 | +Incorrect zone order (causes Machine recreation): |
| 1625 | +```yaml |
| 1626 | +spec: |
| 1627 | + platform: |
| 1628 | + aws: |
| 1629 | + zones: |
| 1630 | + - us-east-1a # Index 0: will get 2 replicas |
| 1631 | + - us-east-1f # Index 1: will get 1 replica |
| 1632 | + replicas: 3 |
| 1633 | +``` |
| 1634 | +
|
| 1635 | +Result of incorrect order: |
| 1636 | +- Hive will scale `us-east-1a` from 1 to 2 replicas → 1 new Machine created |
| 1637 | +- Hive will scale `us-east-1f` from 2 to 1 replica → 1 Machine deleted |
| 1638 | +
|
| 1639 | +#### Platform-Specific Limitations |
| 1640 | +
|
| 1641 | +##### Nutanix and vSphere: Multiple Failure Domains |
| 1642 | +
|
| 1643 | +**Note:** vSphere zone support is coming soon but is not yet officially supported. |
| 1644 | +
|
| 1645 | +Nutanix and vSphere follow similar mechanisms for failure domain handling. |
| 1646 | +
|
| 1647 | +**For clusters configured with a single failure domain:** |
| 1648 | +
|
| 1649 | +- Nutanix and vSphere MachineSets can be adopted normally |
| 1650 | +- MachinePool adoption works correctly |
| 1651 | +
|
| 1652 | +**For clusters configured with multiple failure domains (e.g., FD1, FD2):** |
| 1653 | +
|
 | 1654 | +After an OpenShift cluster is created, the failure domain configuration is stored in `Infrastructure.spec.platformSpec.*.failureDomains`. The failure domains in the Infrastructure resource can be modified. |
| 1655 | +
|
| 1656 | +- If a newly added MachineSet in the spoke cluster is in FD1 or FD2, MachinePool adoption and autoscaling work normally. |
| 1657 | +
|
| 1658 | +**Limited Scenario:** |
| 1659 | +
|
| 1660 | +After creating a cluster with multiple failure domains (FD1, FD2) using Hive, if a new MachineSet is added in FD3 on the spoke cluster, it cannot be adopted. |
 | 1661 | +`ClusterDeployment.spec.platform.*.failureDomains` is immutable. Hive uses the ClusterDeployment's failure domains to generate MachineSets. Even if `Infrastructure.spec.platformSpec.*.failureDomains` includes FD3, if the ClusterDeployment's failure domains do not include FD3: |
| 1662 | +
|
| 1663 | +- MachinePool adoption will fail because there is no generated FD3 MachineSet to match against |
| 1664 | +- The FD3 MachineSet will be deleted by Hive because it has the correct `hive.openshift.io/machine-pool` label (making `isControlledByMachinePool` return true) but no matching generated MachineSet exists |
| 1665 | +
|
 | 1666 | +**Note:** There is one difference between Nutanix and vSphere: Nutanix can only be configured with one Prism Central, while vSphere supports configuring multiple vCenters (topology). _After a vSphere cluster is created, adding new vCenters is not supported_; however, new failure domains can be added within existing vCenters. |
| 1667 | +
|
1363 | 1668 | ## Configuration Management |
1364 | 1669 |
|
1365 | 1670 | ### Vertical Scaling |
|