| 1 | +--- |
| 2 | +title: HostedClusterPackage API |
| 3 | +weight: 1200 |
| 4 | +images: [] |
| 5 | +mermaid: true |
| 6 | +--- |
| 7 | + |
| 8 | +The `HostedClusterPackage` API extends Package Operator with progressive |
| 9 | +rollout capabilities for Packages targeting HyperShift Hosted Control Planes |
| 10 | +(HCP). It introduces a cluster-scoped custom resource, `HostedClusterPackage`, |
| 11 | +which governs the lifecycle and update process of Packages across all hosted |
| 12 | +control planes within a HyperShift Management Cluster. |
| 13 | + |
| 14 | +## Overview |
| 15 | + |
| 16 | +This API allows for the gradual rollout of updates to HostedClusters, |
| 17 | +significantly reducing the "blast radius" of failed upgrades compared to |
| 18 | +simultaneous updates. It also simplifies the configuration required to deliver |
| 19 | +objects into an HCP namespace by reducing the dependency on multiple systems |
| 20 | +(Hive, ACM) down to a single API. |
| 21 | + |
| 22 | +```mermaid |
| 23 | +flowchart LR |
| 24 | +  subgraph Hive |
| 25 | +    metrics-sss["metrics-forwarder<br><b>SelectorSyncSet</b>"] |
| 26 | +  end |
| 27 | +  subgraph HyperShift Management Cluster |
| 28 | +    metrics-hsp["metrics-forwarder<br><b>HyperShift Package</b>"] |
| 29 | +    subgraph Namespace: my-cluster-x |
| 30 | +      ns-c1["<b>Package</b>"] |
| 31 | +    end |
| 32 | +    subgraph Namespace: my-cluster-y |
| 33 | +      ns-c2["<b>Package</b>"] |
| 34 | +    end |
| 35 | +  end |
| 36 | +  metrics-sss--->metrics-hsp |
| 37 | +  metrics-hsp--->ns-c1 |
| 38 | +  metrics-hsp--->ns-c2 |
| 39 | +``` |
| 40 | + |
| 41 | +### Key Features |
| 42 | + |
| 43 | +* **Progressive Rollout**: Updates are rolled out gradually to HostedClusters |
| 44 | + rather than all at once. |
| 45 | +* **Lifecycle Automation**: Automatically creates Packages for new |
| 46 | + HostedClusters and deletes them when the cluster is removed. |
| 47 | +* **Status & Monitoring**: Provides metrics and status updates on the number |
| 48 | + of available, unavailable, or updated packages. |
| 49 | +* **Simplified Configuration**: Reduces the configuration surface from |
| 50 | + multiple objects (SelectorSyncSet, Policy, PlacementRule, etc.) to just the |
| 51 | + `HostedClusterPackage` API. |
| 52 | + |
| 53 | +## HostedClusterPackage Resource |
| 54 | + |
| 55 | +The `HostedClusterPackage` resource is the core configuration object for this API. |
| 56 | +It coordinates the rollout process and defines how updates traverse the fleet |
| 57 | +of HostedClusters. |
| 58 | + |
| 59 | +### Targeting Clusters |
| 60 | + |
| 61 | +The API includes an optional label selector to target specific HostedClusters |
| 62 | +within the Management cluster. |
| 63 | + |
| 64 | +### Partitioning |
| 65 | + |
| 66 | +To control the order of updates, you can attach an optional partition configuration |
| 67 | +to the `HostedClusterPackage`. This ensures that all HostedClusters in a group are |
| 68 | +processed before the rollout moves to the next group. |
| 69 | + |
| 70 | +* **Grouping**: The configuration uses labels on the HostedCluster object to |
| 71 | + assign groups (e.g., `hypershift.openshift.io/risk-group`). |
| 72 | +* **Ordering**: Groups can be ordered via a static list or by alphanumeric |
| 73 | + ascending order. |
| 74 | +* **Implicit Handling**: HostedClusters without the specified label or with |
| 75 | + unknown values are placed in an implicit "unknown" group and upgraded last. |
| 76 | +* **Dynamic Regrouping**: If a cluster's label changes to an earlier group |
| 77 | + during an upgrade, the process will jump back to handle that group before |
| 78 | + continuing. |
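The grouping and ordering rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the operator's implementation: the function name `partition_order`, the input shape (a mapping of cluster name to labels), and the literal group name `unknown` are assumptions made for the example.

```python
from collections import defaultdict

UNKNOWN = "unknown"  # illustrative name for the implicit last group


def partition_order(clusters, label_key, static_order=None):
    """Return [(group, [cluster names])] in rollout order.

    clusters: mapping of cluster name -> labels dict (hypothetical shape).
    static_order: explicit list of group names, or None to order the
    groups alphanumerically ascending.
    """
    groups = defaultdict(list)
    for name, labels in clusters.items():
        value = labels.get(label_key)
        # Missing label, or a value not in the static list, goes last.
        if value is None or (static_order is not None and value not in static_order):
            value = UNKNOWN
        groups[value].append(name)

    if static_order is not None:
        known = static_order
    else:
        known = sorted(g for g in groups if g != UNKNOWN)

    ordered = [(g, sorted(groups[g])) for g in known if g in groups]
    if UNKNOWN in groups:
        ordered.append((UNKNOWN, sorted(groups[UNKNOWN])))
    return ordered
```

With a static order of `early`, `normal`, `late`, a cluster labeled `late` is placed after one labeled `early`, and clusters with no label or an unlisted value end up in the trailing `unknown` group.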
| 79 | + |
| 80 | +### Progression Strategies |
| 81 | + |
| 82 | +The API supports configurable progression strategies to control the speed and safety |
| 83 | +of the rollout. |
| 84 | + |
| 85 | +#### Rolling Upgrade |
| 86 | + |
| 87 | +The `rollingUpgrade` strategy is designed to keep service disruptions to a minimum. |
| 88 | + |
| 89 | +* `maxUnavailable`: Configures the maximum number of Package instances that |
| 90 | + can be updating or unavailable at the same time. If a Package is already |
| 91 | + unavailable before the upgrade starts, it counts towards this limit. These |
| 92 | + unavailable packages are prioritized for updates to prevent accumulating |
| 93 | + faulty versions. |
| 94 | +* `minReadySeconds`: Specifies the minimum time to wait before considering a |
| 95 | + package ready (e.g., 60 seconds). |
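To make the `maxUnavailable` accounting concrete, here is a minimal Python sketch. It assumes that updating a Package which is already unavailable consumes no additional budget (it already counts toward the limit), which is one reading of the description above rather than confirmed operator behavior. The function names and the dictionary keys `name`, `updated`, and `available` are hypothetical.

```python
def pick_next_updates(packages, max_unavailable):
    """Return names of Packages that may start updating now.

    packages: list of dicts with 'name', 'updated' (bool) and
    'available' (bool); a hypothetical shape for this sketch.
    """
    unavailable = [p for p in packages if not p["available"]]
    # Already-unavailable Packages count toward the limit.
    budget = max(max_unavailable - len(unavailable), 0)

    # Prioritize outdated Packages that are already unavailable.
    picks = [p["name"] for p in unavailable if not p["updated"]]

    # Spend the remaining budget on healthy but outdated Packages.
    healthy_outdated = [
        p["name"] for p in packages if p["available"] and not p["updated"]
    ]
    picks.extend(healthy_outdated[:budget])
    return picks


def is_ready(available_for_seconds, min_ready_seconds):
    """A Package counts as ready only after staying available long enough."""
    return available_for_seconds >= min_ready_seconds
```

For example, with `max_unavailable=1` and one Package already unavailable, only that Package is picked; the healthy ones wait until it recovers.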
| 96 | + |
| 97 | +## Status & Observability |
| 98 | + |
| 99 | +The `HostedClusterPackage` API exposes status information to help you track the |
| 100 | +progress of a rollout and the health of the fleet. This status shows whether an |
| 101 | +update is proceeding smoothly or has stalled due to errors. |
| 102 | + |
| 103 | +### Rollout State |
| 104 | + |
| 105 | +The status subresource provides high-level metrics regarding the rollout |
| 106 | +process. These fields allow you to quickly assess the distribution of package |
| 107 | +versions across your HostedClusters: |
| 108 | + |
| 109 | +* **Updated Packages**: The number of HostedClusters that have successfully |
| 110 | + received the latest version of the Package. |
| 111 | + |
| 112 | +* **Available Packages**: The number of HostedClusters where the Package is |
| 113 | + currently healthy and serving traffic. |
| 114 | + |
| 115 | +* **Unavailable Packages**: The number of HostedClusters where the Package is |
| 116 | + currently degraded or updating. This count is used to enforce the |
| 117 | + `maxUnavailable` limit defined in the progression strategy. |
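As a rough illustration of how the three counters relate, the sketch below derives them from per-cluster Package states. The `PackageState` type and the dictionary keys are invented for this example; the actual status field names are not shown here and may differ.

```python
from dataclasses import dataclass


@dataclass
class PackageState:
    # Hypothetical per-cluster view of a Package.
    updated: bool    # running the latest version
    available: bool  # currently healthy


def summarize(states):
    """Count updated, available and unavailable Packages."""
    return {
        "updated": sum(s.updated for s in states),
        "available": sum(s.available for s in states),
        "unavailable": sum(not s.available for s in states),
    }
```

Note that the available and unavailable counts always add up to the number of targeted HostedClusters in this model, while the updated count is independent of health.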
| 118 | + |
| 119 | +### Progression Logic |
| 120 | + |
| 121 | +The operator uses the status of individual Packages to determine if the rollout can |
| 122 | +proceed to the next target. |
| 123 | + |
| 124 | +* **Success**: If a targeted HostedCluster successfully updates and becomes |
| 125 | + available, the operator proceeds to select the next cluster in the partition. |
| 126 | + |
| 127 | +* **Failure**: If a Package fails to update or becomes unavailable, the rollout |
| 128 | + pauses for that specific rollout path. This prevents the propagation of errors |
| 129 | + to the rest of the fleet. |
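The success and failure rules can be pictured as a simple gate over the ordered partitions. The sketch below is an assumption-laden model, not the controller's code: `next_group` and the `state` mapping are hypothetical, and a group is treated as finished only when every Package in it is both updated and available.

```python
def next_group(ordered_groups, state):
    """Return the first group that still needs work, or None when done.

    ordered_groups: [(group, [cluster names])] in rollout order.
    state: mapping of cluster name -> {'updated': bool, 'available': bool}.
    """
    for group, names in ordered_groups:
        for name in names:
            pkg = state[name]
            if not (pkg["updated"] and pkg["available"]):
                # The rollout stays on this group until it is healthy.
                return group
    return None
```

Because the groups are re-evaluated from the start on every pass, a cluster that is relabeled into an earlier group is picked up before later groups continue, and an unavailable Package keeps the rollout from advancing past its group.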
| 130 | + |
| 131 | +## Monitoring |
| 132 | + |
| 133 | +In addition to the resource status, the controller exports metrics that |
| 134 | +provide a fleet-wide overview of the rollout state. SREs can use these |
| 135 | +metrics to build dashboards that visualize the number of available versus |
| 136 | +unavailable packages over time. |
| 137 | + |
| 138 | +## Configuration Example |
| 139 | + |
| 140 | +The following YAML example demonstrates a `HostedClusterPackage` configured with |
| 141 | +risk-based partitioning and a rolling upgrade strategy. |
| 142 | + |
| 143 | +```yaml |
| 144 | +apiVersion: package-operator.run/v1alpha1 |
| 145 | +kind: HostedClusterPackage |
| 146 | +metadata: |
| 147 | +  name: example-hosted-cluster-package |
| 148 | +spec: |
| 149 | +  # Partition Configuration |
| 150 | +  partition: |
| 151 | +    labelKey: hypershift.openshift.io/risk-group |
| 152 | +    order: |
| 153 | +      # Static ordering example: |
| 154 | +      static: |
| 155 | +        - early |
| 156 | +        - normal |
| 157 | +        - late |
| 158 | +      # OR Alphanumeric ordering (mutually exclusive with static): |
| 159 | +      # alphanumericAsc: {} |
| 160 | + |
| 161 | +  # Progression Strategy |
| 162 | +  minReadySeconds: 60 |
| 163 | +  strategy: |
| 164 | +    rollingUpgrade: |
| 165 | +      maxUnavailable: 1 # Max packages to update concurrently |
| 166 | +``` |
| 167 | + |
| 168 | +**Note**: The `HostedClusterPackage` API is experimental and subject to |
| 169 | +change in future releases. |