Commit 5055570

feat: [pko-351] added documentation for HostedClusterPackage API
Signed-off-by: Ankit152 <[email protected]>
1 parent a015b2d commit 5055570

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
---
title: HostedClusterPackage API
weight: 1201
images: []
mermaid: true
---

**Note**: The `HostedClusterPackage` API is experimental and subject to
change in the future.

The `HostedClusterPackage` API extends Package Operator with progressive
rollout capabilities for Packages targeting HyperShift Hosted Control Planes
(HCP). It introduces a cluster-scoped custom resource, the
`HostedClusterPackage`, which governs the lifecycle and update process of
Packages across all hosted control planes within a HyperShift Management
Cluster.

## Overview

This API allows for the central rollout of `Packages` to all hosted control
planes on a given management cluster. It lets you control the rollout
strategy, which limits the "blast radius" of a failed upgrade compared to
updating every hosted control plane at once. It also simplifies the
configuration required to deliver objects into an HCP namespace by replacing
a dependency on multiple systems with a single API.

```mermaid
flowchart LR
  subgraph Hive
    metrics-sss["metrics-forwarder<br><b>SelectorSyncSet</b>"]
  end
  subgraph HyperShift Management Cluster
    metrics-hsp["metrics-forwarder<br><b>HyperShift Package</b>"]
    subgraph Namespace: my-cluster-x
      ns-c1["<b>Package</b>"]
    end
    subgraph Namespace: my-cluster-y
      ns-c2["<b>Package</b>"]
    end
  end
  metrics-sss--->metrics-hsp
  metrics-hsp--->ns-c1
  metrics-hsp--->ns-c2
```

### Key Features

* **Progressive Rollout**: Updates are rolled out gradually to `HostedClusters`
  rather than all at once.
* **Lifecycle Automation**: Automatically creates Packages for new
  `HostedClusters` and deletes them when the cluster is removed.
* **Status & Monitoring**: Reports the number of available, unavailable, and
  updated Packages.
* **Simplified Configuration**: Reduces the configuration surface from
  multiple objects to just the `HostedClusterPackage` API.

## HostedClusterPackage Resource

The newly introduced `HostedClusterPackage` resource is the configuration
object for this functionality. It coordinates the rollout process and
determines which `HostedClusters` in the fleet are targeted.

### Targeting Clusters

The API includes an optional label selector to target specific `HostedClusters`
within the management cluster.
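
To illustrate what selecting by label means, here is a minimal Python sketch of label-selector matching. It assumes the selector follows the standard Kubernetes `matchLabels` / `matchExpressions` semantics, which this document does not confirm. The function name `matches` is hypothetical and is not part of Package Operator.

```python
def matches(selector, labels):
    """Return True if `labels` satisfies `selector`.

    Assumption: the selector uses the standard Kubernetes
    matchLabels / matchExpressions shape. An empty selector matches everything.
    """
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key = expr["key"]
        op = expr["operator"]
        values = expr.get("values", [])
        if op == "In" and labels.get(key) not in values:
            return False
        if op == "NotIn" and labels.get(key) in values:
            return False
        if op == "Exists" and key not in labels:
            return False
        if op == "DoesNotExist" and key in labels:
            return False
    return True


# Hypothetical labels, for illustration only.
selector = {"matchLabels": {"env": "prod"}}
print(matches(selector, {"env": "prod", "region": "eu"}))
print(matches(selector, {"env": "dev"}))
```

A `HostedCluster` whose labels do not satisfy the selector would simply not be targeted.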

### Partitioning

To control the order of updates, you can attach an optional partition
configuration to the `HostedClusterPackage`. This ensures that all items
within a specific group are processed before the rollout moves to the next
group.

* **Grouping**: The configuration uses labels on the `HostedCluster` object to
  assign groups (e.g., `hypershift.openshift.io/risk-group`).
* **Ordering**: Groups can be ordered via a static list or by alphanumeric
  ascending order.
* **Implicit Handling**: `HostedClusters` without the specified label or with
  unknown values are placed in an implicit "unknown" group and upgraded last.
* **Dynamic Regrouping**: If a cluster's label changes to an earlier group
  during an upgrade, the rollout returns to that group and handles it before
  continuing.

```yaml
spec:
  partition:
    labelKey: hypershift.openshift.io/risk-group
    order:
      static:
        - early
        - normal
        - late
```
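
The grouping and ordering rules can be sketched in a few lines of Python. This is an illustration of the rules as described here, not the operator's implementation. The function `order_groups`, the `UNKNOWN` constant, and the cluster names are hypothetical, and plain string sorting stands in for "alphanumeric ascending".

```python
UNKNOWN = "unknown"  # hypothetical name for the implicit catch-all group


def order_groups(clusters, label_key, static_order=None):
    """Return [(group, [cluster names])] in rollout order.

    clusters: mapping of cluster name -> dict of labels.
    static_order: explicit list of group names, or None to sort ascending.
    """
    groups = {}
    for name, labels in clusters.items():
        value = labels.get(label_key)
        # A missing label, or a value absent from the static list,
        # lands in the implicit "unknown" group.
        if value is None or (static_order is not None and value not in static_order):
            value = UNKNOWN
        groups.setdefault(value, []).append(name)

    known = [g for g in groups if g != UNKNOWN]
    if static_order is not None:
        known.sort(key=static_order.index)
    else:
        known.sort()  # assumption: plain string comparison

    ordered = [(g, sorted(groups[g])) for g in known]
    if UNKNOWN in groups:
        ordered.append((UNKNOWN, sorted(groups[UNKNOWN])))  # always last
    return ordered


key = "hypershift.openshift.io/risk-group"
clusters = {
    "cluster-a": {key: "late"},
    "cluster-b": {key: "early"},
    "cluster-c": {},
}
print(order_groups(clusters, key, ["early", "normal", "late"]))
```

Because the order is recomputed from the current labels on every call, a cluster relabeled into an earlier group shows up in that earlier group the next time the order is evaluated.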

### Progression Strategies

The API supports configurable progression strategies to control the speed
and safety of the rollout.

#### Rolling Upgrade

The `rollingUpgrade` strategy is designed to keep service disruptions to a
minimum.

* `maxUnavailable`: Configures the maximum number of Package instances that
  can be updating or unavailable at the same time. If a Package is already
  unavailable before the upgrade starts, it counts towards this limit. These
  unavailable Packages are prioritized for updates to prevent accumulating
  faulty versions.

```yaml
spec:
  strategy:
    rollingUpgrade:
      maxUnavailable: 1
```
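
One way to read the `maxUnavailable` rule is as a budget, sketched below in Python. This is an interpretation of the description above, not the controller's code. `PackageState` and `pick_next_updates` are hypothetical names. The sketch also assumes that updating a Package that is already unavailable does not consume extra budget, since it is already counted.

```python
from dataclasses import dataclass


@dataclass
class PackageState:
    name: str
    updated: bool    # already at the desired version
    available: bool  # Package reports Available=True


def pick_next_updates(packages, max_unavailable):
    """Return the names of Packages that may start updating now."""
    # Every unavailable Package counts against the limit, including ones
    # that were already unavailable before the rollout started.
    unavailable = [p for p in packages if not p.available]
    # Outdated and unavailable Packages are prioritized.
    outdated_broken = [p.name for p in unavailable if not p.updated]
    # Healthy Packages may only be touched while budget remains.
    budget = max(0, max_unavailable - len(unavailable))
    outdated_healthy = [p.name for p in packages if p.available and not p.updated]
    return outdated_broken + outdated_healthy[:budget]


fleet = [
    PackageState("pkg-a", updated=True, available=True),
    PackageState("pkg-b", updated=False, available=False),
    PackageState("pkg-c", updated=False, available=True),
]
print(pick_next_updates(fleet, max_unavailable=1))
```

With `max_unavailable=1`, the already unavailable `pkg-b` uses up the whole budget, so the healthy `pkg-c` waits until `pkg-b` recovers.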

## Status & Observability

The `HostedClusterPackage` API exposes status information to help you track the
progress of a rollout and the health of the fleet. This status can help you
understand if an update is proceeding smoothly or if it has stalled due to
errors.

### Rollout State

The status subresource provides high-level counters for the rollout
process. These fields allow you to quickly assess the distribution of Package
versions across your `HostedClusters`:

* **Updated Packages**: The number of `HostedClusters` that have successfully
  received the latest version of the Package.

* **Available Packages**: The number of `HostedClusters` where the Package is
  currently healthy and serving traffic, based on the `Available=True` status
  condition of the Package.

* **Unavailable Packages**: The number of `HostedClusters` where the Package is
  currently degraded or updating. This count is used to enforce the
  `maxUnavailable` limit defined in the progression strategy.
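
As a rough illustration of how these three counters relate to individual Packages, the Python sketch below derives them from a list of per-cluster Package snapshots. The dictionary layout, the `summarize` function, and the returned key names are hypothetical and do not reflect the actual status field names of the API.

```python
def summarize(packages, desired_version):
    """Derive rollout counters from per-cluster Package snapshots.

    packages: list of dicts with a "version" string and a "conditions"
    mapping of condition type -> status (hypothetical layout).
    """
    updated = sum(1 for p in packages if p["version"] == desired_version)
    available = sum(
        1 for p in packages if p["conditions"].get("Available") == "True"
    )
    return {
        "updated": updated,
        "available": available,
        # Anything not reporting Available=True counts as unavailable.
        "unavailable": len(packages) - available,
    }


snapshots = [
    {"version": "v2", "conditions": {"Available": "True"}},
    {"version": "v1", "conditions": {"Available": "True"}},
    {"version": "v2", "conditions": {"Available": "False"}},
]
print(summarize(snapshots, "v2"))
```

Note that "updated" and "available" are independent: a Package can be on the latest version while still unavailable, which is the case a rollout needs to watch for.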

### Progression Logic

The controller uses the status of individual Packages to determine if the
rollout can proceed to the next target.

* **Success**: If the Package for a targeted `HostedCluster` updates
  successfully and becomes available, the operator proceeds to select the next
  cluster in the partition.

* **Failure**: If a Package update fails or the Package becomes unavailable,
  the rollout pauses for that specific rollout path. This prevents the
  propagation of errors to the rest of the fleet.
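
The success and failure rules can be pictured as a small decision function. The Python sketch below is a simplified model that updates one cluster at a time. It is not the controller's actual logic, and `next_step` and its return values are hypothetical.

```python
def next_step(ordered_groups, states):
    """Decide what the rollout does next.

    ordered_groups: [(group, [cluster names])] in rollout order.
    states: cluster name -> (updated, available) for its Package.
    Returns ("update", group, name), ("paused", group, name),
    or ("done", None, None).
    """
    for group, names in ordered_groups:
        # Failure: an updated Package that is unavailable blocks this
        # group and every group after it.
        for name in names:
            updated, available = states[name]
            if updated and not available:
                return ("paused", group, name)
        # Success so far: pick the next outdated cluster in this group.
        for name in names:
            updated, _ = states[name]
            if not updated:
                return ("update", group, name)
    return ("done", None, None)


groups = [("early", ["cluster-a", "cluster-b"]), ("late", ["cluster-c"])]
states = {
    "cluster-a": (True, True),
    "cluster-b": (False, True),
    "cluster-c": (False, True),
}
print(next_step(groups, states))
```

Since the function always scans from the first group, a cluster that moves into an earlier group is handled before the rollout continues with later groups.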

## Configuration Example

The following YAML example demonstrates a `HostedClusterPackage` configured
with risk-based partitioning and a rolling upgrade strategy.

```yaml
apiVersion: package-operator.run/v1alpha1
kind: HostedClusterPackage
metadata:
  name: example-hosted-cluster-package
spec:
  # Partition configuration: group clusters by label, ordered alphanumerically
  partition:
    labelKey: hypershift.openshift.io/risk-group
    order:
      alphanumericAsc: {}
  strategy:
    rollingUpgrade:
      maxUnavailable: 1 # Max Packages updating or unavailable at the same time
```
