diff --git a/hyperfleet/components/adapter/DNS/GCP/cloud-dns-exploration.md b/hyperfleet/components/adapter/DNS-deprecated/GCP/cloud-dns-exploration.md similarity index 100% rename from hyperfleet/components/adapter/DNS/GCP/cloud-dns-exploration.md rename to hyperfleet/components/adapter/DNS-deprecated/GCP/cloud-dns-exploration.md diff --git a/hyperfleet/components/adapter/DNS/GCP/gcp-dns-adapter-spike-report.md b/hyperfleet/components/adapter/DNS-deprecated/GCP/gcp-dns-adapter-spike-report.md similarity index 100% rename from hyperfleet/components/adapter/DNS/GCP/gcp-dns-adapter-spike-report.md rename to hyperfleet/components/adapter/DNS-deprecated/GCP/gcp-dns-adapter-spike-report.md diff --git a/hyperfleet/components/adapter/PullSecret/GCP/gcp-secret-manager-sdk-methods.md b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/gcp-secret-manager-sdk-methods.md similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/gcp-secret-manager-sdk-methods.md rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/gcp-secret-manager-sdk-methods.md diff --git a/hyperfleet/components/adapter/PullSecret/GCP/images/image1.png b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image1.png similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/images/image1.png rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image1.png diff --git a/hyperfleet/components/adapter/PullSecret/GCP/images/image2.png b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image2.png similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/images/image2.png rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image2.png diff --git a/hyperfleet/components/adapter/PullSecret/GCP/images/image3.png b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image3.png similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/images/image3.png rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image3.png diff --git a/hyperfleet/components/adapter/PullSecret/GCP/images/image4.png b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image4.png similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/images/image4.png rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image4.png diff --git a/hyperfleet/components/adapter/PullSecret/GCP/images/image5.png b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image5.png similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/images/image5.png rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/images/image5.png diff --git a/hyperfleet/components/adapter/PullSecret/GCP/pull-secret-requirements.md b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/pull-secret-requirements.md similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/pull-secret-requirements.md rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/pull-secret-requirements.md diff --git a/hyperfleet/components/adapter/PullSecret/GCP/pull-secret-service-ddr.md b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/pull-secret-service-ddr.md similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/pull-secret-service-ddr.md rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/pull-secret-service-ddr.md diff --git 
a/hyperfleet/components/adapter/PullSecret/GCP/pullsecret-adapter-config.yaml b/hyperfleet/components/adapter/PullSecret-deprecated/GCP/pullsecret-adapter-config.yaml similarity index 100% rename from hyperfleet/components/adapter/PullSecret/GCP/pullsecret-adapter-config.yaml rename to hyperfleet/components/adapter/PullSecret-deprecated/GCP/pullsecret-adapter-config.yaml diff --git a/hyperfleet/components/adapter/framework/adapter-config-template-MVP.yaml b/hyperfleet/components/adapter/framework/adapter-config-template-MVP.yaml deleted file mode 100644 index e92bd04..0000000 --- a/hyperfleet/components/adapter/framework/adapter-config-template-MVP.yaml +++ /dev/null @@ -1,250 +0,0 @@ -# HyperFleet Adapter Framework Configuration Template (MVP) -# -# This is a Configuration Template for configuring cloud provider adapters -# using the HyperFleet Adapter Framework with CEL (Common Expression Language). -# -# TEMPLATE SYNTAX: -# ================ -# 1. Go Templates ({{ .var }}) - Variable interpolation throughout -# 2. field: "path" - Simple JSON path extraction (translated to CEL internally) -# 3. expression: "cel" - Full CEL expressions for complex logic -# -# CONDITION SYNTAX (when:): -# ========================= -# Option 1: Expression syntax (CEL) -# when: -# expression: | -# clusterPhase == "Terminating" -# -# Option 2: Structured conditions (field + operator + value) -# when: -# conditions: -# - field: "clusterPhase" -# operator: "equals" -# value: "Terminating" -# -# Supported operators: equals, notEquals, in, notIn, contains, greaterThan, lessThan, exists -# -# CEL OPTIONAL CHAINING: -# ====================== -# Use optional chaining with orValue() to safely access potentially missing fields: -# resources.?clusterNamespace.?status.?phase.orValue("") -# adapter.?executionStatus.orValue("") -# -# Copy this file to your adapter repository and customize for your needs. 
- -apiVersion: hyperfleet.redhat.com/v1alpha1 -kind: AdapterConfig -metadata: - # Adapter name (used as resource name and in logs/metrics) - name: example-adapter - namespace: hyperfleet-system - labels: - hyperfleet.io/adapter-type: example - hyperfleet.io/component: adapter - -# ============================================================================ -# Adapter Specification -# ============================================================================ -spec: - # Adapter Information - adapter: - # Adapter version - version: "0.1.0" - - # ============================================================================ - # HyperFleet API Configuration - # ============================================================================ - hyperfleetApi: - # HTTP client timeout for API calls - timeout: 2s - # Number of retry attempts for failed API calls - retryAttempts: 3 - # Retry backoff strategy: exponential, linear, constant - retryBackoff: exponential - - # ============================================================================ - # Kubernetes Configuration - # ============================================================================ - kubernetes: - apiVersion: "v1" - - # ============================================================================ - # Global params - # ============================================================================ - # params to extract from CloudEvent and environment variables - params: - # Environment variables from deployment - - name: "hyperfleetApiBaseUrl" - source: "env.HYPERFLEET_API_BASE_URL" - type: "string" - description: "Base URL for the HyperFleet API" - required: true - - - name: "hyperfleetApiVersion" - source: "env.HYPERFLEET_API_VERSION" - type: "string" - default: "v1" - description: "API version to use" - - # Extract from CloudEvent data - - name: "clusterId" - source: "event.id" - type: "string" - description: "Unique identifier for the target cluster" - required: true - - - # ============================================================================ - # Global Preconditions - # ============================================================================ - # These preconditions run sequentially and validate cluster state before resource operations - preconditions: - # ========================================================================== - # Step 1: Get cluster status - # ========================================================================== - - name: "clusterStatus" - apiCall: - method: "GET" - # NOTE: API path includes /api/hyperfleet/ prefix - url: "{{ .hyperfleetApiBaseUrl }}/api/hyperfleet/{{ .hyperfleetApiVersion }}/clusters/{{ .clusterId }}" - timeout: 10s - retryAttempts: 3 - retryBackoff: "exponential" - # Capture fields from the API response. Captured values become variables for use in resources section. 
- capture: - - name: "clusterName" - field: "name" - - name: "clusterPhase" - field: "status.phase" - - name: "generationId" - field: "generation" - conditions: - - field: "clusterPhase" - operator: "equals" - value: "NotReady" - - # ============================================================================ - # Resources (Create/Update Resources) - # ============================================================================ - # All resources are created/updated sequentially in the order defined below - resources: - # ========================================================================== - # Resource 1: Cluster Namespace - # ========================================================================== - - name: "clusterNamespace" - manifest: - apiVersion: v1 - kind: Namespace - metadata: - # Use | lower to ensure valid K8s resource name (lowercase RFC 1123) - name: "{{ .clusterId | lower }}" - labels: - hyperfleet.io/cluster-id: "{{ .clusterId }}" - hyperfleet.io/managed-by: "{{ .metadata.name }}" - hyperfleet.io/resource-type: "namespace" - annotations: - hyperfleet.io/created-by: "hyperfleet-adapter" - hyperfleet.io/generation: "{{ .generationId }}" - discovery: - # The "namespace" field within discovery is optional: - # - For namespaced resources: set namespace to target the specific namespace - # - For cluster-scoped resources (like Namespace, ClusterRole): omit or leave empty - # Here we omit it since Namespace is cluster-scoped - bySelectors: - labelSelector: - hyperfleet.io/resource-type: "namespace" - hyperfleet.io/cluster-id: "{{ .clusterId }}" - hyperfleet.io/managed-by: "{{ .metadata.name }}" - - - # ============================================================================ - # Post-Processing - # ============================================================================ - post: - payloads: - # Build status payload inline - - name: "clusterStatusPayload" - build: - # Adapter name for tracking which adapter reported this status - adapter: "{{ .metadata.name }}" - - # Conditions array - each condition has type, status, reason, message - # Use CEL optional chaining ?.orValue() for safe field access - conditions: - # Applied: Resources successfully created - - type: "Applied" - status: - expression: | - resources.?clusterNamespace.?status.?phase.orValue("") == "Active" ? "True" : "False" - reason: - expression: | - resources.?clusterNamespace.?status.?phase.orValue("") == "Active" - ? "NamespaceCreated" - : "NamespacePending" - message: - expression: | - resources.?clusterNamespace.?status.?phase.orValue("") == "Active" - ? "Namespace created successfully" - : "Namespace creation in progress" - - # Available: Resources are active and ready - - type: "Available" - status: - expression: | - resources.?clusterNamespace.?status.?phase.orValue("") == "Active" ? "True" : "False" - reason: - expression: | - resources.?clusterNamespace.?status.?phase.orValue("") == "Active" ? "NamespaceReady" : "NamespaceNotReady" - message: - expression: | - resources.?clusterNamespace.?status.?phase.orValue("") == "Active" ? "Namespace is active and ready" : "Namespace is not active and ready" - - # Health: Adapter execution status (runtime) Don't need to update this. This can be reused from the adapter config. - - type: "Health" - status: - expression: | - adapter.?executionStatus.orValue("") == "success" ? "True" : (adapter.?executionStatus.orValue("") == "failed" ? "False" : "Unknown") - reason: - expression: | - adapter.?errorReason.orValue("") != "" ? 
adapter.?errorReason.orValue("") : "Healthy" - message: - expression: | - adapter.?errorMessage.orValue("") != "" ? adapter.?errorMessage.orValue("") : "All adapter operations completed successfully" - - # Use CEL expression for numeric fields to preserve type (not Go template which outputs strings) - observed_generation: - expression: "generationId" - - # Use Go template with now and date functions for timestamps - observed_time: - value: "{{ now | date \"2006-01-02T15:04:05Z07:00\" }}" - - # Optional data field for adapter-specific metrics extracted from resources - data: - namespace: - name: - expression: | - resources.?clusterNamespace.?metadata.?name.orValue("") - status: - expression: | - resources.?clusterNamespace.?status.?phase.orValue("") - - # ============================================================================ - # Post Actions - # ============================================================================ - # Post actions are executed after resources are created/updated - postActions: - # Report cluster status to HyperFleet API (always executed) - - name: "reportClusterStatus" - apiCall: - method: "POST" - # NOTE: API path includes /api/hyperfleet/ prefix and ends with /statuses - url: "{{ .hyperfleetApiBaseUrl }}/api/hyperfleet/{{ .hyperfleetApiVersion }}/clusters/{{ .clusterId }}/statuses" - body: "{{ .clusterStatusPayload }}" - timeout: 30s - retryAttempts: 3 - retryBackoff: "exponential" - headers: - - name: "Content-Type" - value: "application/json" diff --git a/hyperfleet/components/adapter/framework/adapter-environment-config-template.yaml b/hyperfleet/components/adapter/framework/adapter-environment-config-template.yaml deleted file mode 100644 index 2a86abe..0000000 --- a/hyperfleet/components/adapter/framework/adapter-environment-config-template.yaml +++ /dev/null @@ -1,127 +0,0 @@ -# ============================================================================ -# HyperFleet Environment Configuration Template -# ============================================================================ -# -# Centralized environment-specific configuration shared by all adapters. -# -# Benefits: -# - Single source of truth for environment config -# - No duplication across adapters -# - Easy to update (change ConfigMap, restart pods) -# -# Usage: -# 1. Copy for each environment: hyperfleet-env-{dev|staging|prod}.yaml -# 2. Update values for that environment -# 3. Deploy: kubectl apply -f hyperfleet-env-dev.yaml -# 4. Adapters reference via: envFrom.configMapRef.name: hyperfleet-environment -# -# Configuration Layers: -# 1. Adapter Logic ConfigMap (per adapter) - event filters, resources -# 2. Broker ConfigMap (per environment) - broker connection -# 3. Environment ConfigMap (per environment) - API URL, log level ← THIS FILE -# 4. Deployment env vars (per adapter) - subscription name -# 5. 
Secrets (per environment) - API tokens -# ============================================================================ - ---- -# Development Environment -apiVersion: v1 -kind: ConfigMap -metadata: - name: hyperfleet-environment - namespace: hyperfleet-system - labels: - app.kubernetes.io/name: hyperfleet - app.kubernetes.io/component: environment-config - hyperfleet.io/environment: development -data: - ENVIRONMENT: "development" - - # HyperFleet API - HYPERFLEET_API_BASE_URL: "http://hyperfleet-api.hyperfleet-system.svc.cluster.local:8080" - HYPERFLEET_API_VERSION: "v1" - - # Observability - LOG_LEVEL: "debug" - METRICS_PORT: "9090" - HEALTH_PORT: "8080" - TRACE_ENABLED: "true" - - # Feature Flags - ENABLE_EXPERIMENTAL_FEATURES: "true" - ENABLE_DEBUG_ENDPOINTS: "true" - ---- -# Staging Environment -apiVersion: v1 -kind: ConfigMap -metadata: - name: hyperfleet-environment - namespace: hyperfleet-system - labels: - app.kubernetes.io/name: hyperfleet - app.kubernetes.io/component: environment-config - hyperfleet.io/environment: staging -data: - ENVIRONMENT: "staging" - - HYPERFLEET_API_BASE_URL: "http://hyperfleet-api.hyperfleet-system.svc.cluster.local:8080" - HYPERFLEET_API_VERSION: "v1" - - LOG_LEVEL: "info" - METRICS_PORT: "9090" - HEALTH_PORT: "8080" - TRACE_ENABLED: "true" - - ENABLE_EXPERIMENTAL_FEATURES: "true" - ENABLE_DEBUG_ENDPOINTS: "false" - ---- -# Production Environment -apiVersion: v1 -kind: ConfigMap -metadata: - name: hyperfleet-environment - namespace: hyperfleet-system - labels: - app.kubernetes.io/name: hyperfleet - app.kubernetes.io/component: environment-config - hyperfleet.io/environment: production -data: - ENVIRONMENT: "production" - - HYPERFLEET_API_BASE_URL: "http://hyperfleet-api.hyperfleet-system.svc.cluster.local:8080" - HYPERFLEET_API_VERSION: "v1" - - LOG_LEVEL: "warn" - METRICS_PORT: "9090" - HEALTH_PORT: "8080" - TRACE_ENABLED: "false" - - ENABLE_EXPERIMENTAL_FEATURES: "false" - ENABLE_DEBUG_ENDPOINTS: "false" - ---- -# ============================================================================ -# Usage in Adapter Deployment -# ============================================================================ -# -# spec: -# template: -# spec: -# containers: -# - name: adapter -# envFrom: -# - configMapRef: -# name: hyperfleet-environment # ← Environment config -# - configMapRef: -# name: hyperfleet-broker-config # ← Broker config -# env: -# - name: SUBSCRIPTION_NAME # ← Adapter-specific -# value: "validation-adapter-sub" -# - name: HYPERFLEET_API_TOKEN # ← Secret -# valueFrom: -# secretKeyRef: -# name: hyperfleet-api-token -# key: token -# ============================================================================ diff --git a/hyperfleet/components/adapter/framework/configs/adapter-business-logic-template-MVP.yaml b/hyperfleet/components/adapter/framework/configs/adapter-business-logic-template-MVP.yaml new file mode 100644 index 0000000..24c4aad --- /dev/null +++ b/hyperfleet/components/adapter/framework/configs/adapter-business-logic-template-MVP.yaml @@ -0,0 +1,252 @@ +# HyperFleet Adapter Framework Configuration Template (MVP) +# +# This is a Configuration Template for configuring cloud provider adapters +# using the HyperFleet Adapter Framework with CEL (Common Expression Language). +# +# TEMPLATE SYNTAX: +# ================ +# 1. Go Templates ({{ .var }}) - Variable interpolation throughout +# 2. field: "path" - Simple JSON path extraction (translated to CEL internally) +# 3. 
expression: "cel" - Full CEL expressions for complex logic +# ============================================================================ +# CEL OPTIONAL CHAINING: +# ====================== +# Use optional chaining with orValue() to safely access potentially missing fields: +# resources.?clusterNamespace.?status.?phase.orValue("") +# adapter.?executionStatus.orValue("") +# +# Copy this file to your adapter repository and customize for your needs. + +# ============================================================================ +# Global params +# ============================================================================ +# params to extract from CloudEvent and environment variables +params: + # Environment variables from deployment + - name: "hyperfleetApiBaseUrl" + source: "config.hyperfleetApiBaseUrl" + type: "string" + description: "Base URL for the HyperFleet API" + required: true + + - name: "hyperfleetApiVersion" + source: "config.hyperfleetApiVersion" + type: "string" + default: "v1" + description: "API version to use" + + # Extract from CloudEvent data + - name: "clusterId" + source: "event.id" + type: "string" + description: "Unique identifier for the target cluster" + required: true + + +# ============================================================================ +# Global Preconditions +# ============================================================================ +# These preconditions run sequentially and validate cluster state before resource operations +preconditions: + # ========================================================================== + # Step 1: Get cluster status + # ========================================================================== + - name: "clusterStatus" + apiCall: + method: "GET" + # NOTE: API path includes /api/hyperfleet/ prefix + url: "{{ .hyperfleetApiBaseUrl }}/api/hyperfleet/{{ .hyperfleetApiVersion }}/clusters/{{ .clusterId }}" + timeout: 10s + retryAttempts: 3 + retryBackoff: "exponential" + # Capture fields from the API response. Captured values become variables for use in resources section. 
+ capture: + - name: "clusterName" + field: "name" + - name: "generationId" + field: "generation" + - name: "placementClusterName" + field: "status.conditions.placement.data.clusterName" # Example of how to capture the placement cluster name from the cluster status + - name: "namespaceName" + field: "conditions.landingzone.data.namespace.name" + conditions: + - field: "namespaceName" + operator: "notExists" + +# ============================================================================ +# Resources (Create/Update Resources) +# ============================================================================ +# All resources are created/updated sequentially in the order defined below +resources: + # ========================================================================== + # Resource 1: Cluster Namespace + # ========================================================================== + - name: "clusterNamespace" + transport: + client: "kubernetes" + manifests: + - name: "clusterNamespace" + manifest: + apiVersion: v1 + kind: Namespace + metadata: + name: "{{ .clusterId | lower }}" + labels: + hyperfleet.io/cluster-id: "{{ .clusterId }}" + hyperfleet.io/managed-by: "{{ .metadata.name }}" + hyperfleet.io/resource-type: "namespace" + annotations: + hyperfleet.io/managed-by: "{{ .metadata.name }}" + hyperfleet.io/generation: "{{ .generationId }}" + discovery: + # The "namespace" field within discovery is optional: + # - For namespaced resources: set namespace to target the specific namespace + # - For cluster-scoped resources (like Namespace, ClusterRole): omit or leave empty + # Here we omit it since Namespace is cluster-scoped + bySelectors: + labelSelector: + hyperfleet.io/resource-type: "namespace" + hyperfleet.io/cluster-id: "{{ .clusterId }}" + hyperfleet.io/managed-by: "{{ .metadata.name }}" + - name: "agentNamespaceManifestWork" + transport: + client: "maestro" + maestro: + targetCluster: "{{ .placementClusterName }}" + # manifestWork supports both inline configuration and ref approaches + # ref is recommended, as an external file is more readable and maintainable. 
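+        # Illustrative assumption: the ref path is resolved relative to this
+        # config file, so "./manifestwork-ref.yaml" points at the template that
+        # ships alongside this business logic config (see configs/manifestwork-ref.yaml).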
+ manifestWork: + ref: "./manifestwork-ref.yaml" + # apiVersion: work.open-cluster-management.io/v1 + # kind: ManifestWork + # metadata: + # PUTYOURMETADATAHERE + # spec: + # workload: + # manifests: {{ .resources.agentNamespaceManifestWork.manifests | toJson }} + # deleteOption: + # propagationPolicy: "Foreground" + # gracePeriodSeconds: 30 + # manifestConfigs: + # - resourceIdentifier: + # group: "" + # resource: "namespaces" + # name: "{{ .clusterId | lower }}" + manifests: + - name: "agentNamespace" + # The manifest below can be rendered from a template file or written inline as shown + # Optionally, nest it inside the manifestWork ref template so it does not have to be maintained separately + manifest: + apiVersion: v1 + kind: Namespace + metadata: + name: "{{ .clusterId | lower }}" + labels: + hyperfleet.io/cluster-id: "{{ .clusterId }}" + hyperfleet.io/managed-by: "{{ .metadata.name }}" + hyperfleet.io/resource-type: "namespace" + annotations: + hyperfleet.io/created-by: "hyperfleet-adapter" + hyperfleet.io/generation: "{{ .generationId }}" + discovery: + # The "namespace" field within discovery is optional: + # - For namespaced resources: set namespace to target the specific namespace + # - For cluster-scoped resources (like Namespace, ClusterRole): omit or leave empty + # Here we omit it since Namespace is cluster-scoped + bySelectors: + labelSelector: + hyperfleet.io/resource-type: "namespace" + hyperfleet.io/cluster-id: "{{ .clusterId }}" + hyperfleet.io/managed-by: "{{ .metadata.name }}" + + +# ============================================================================ +# Post-Processing +# ============================================================================ +post: + payloads: + # Build status payload inline + - name: "clusterStatusPayload" + build: + # Adapter name for tracking which adapter reported this status + adapter: "{{ .metadata.name }}" + + # Conditions array - each condition has type, status, reason, message + # Use CEL optional chaining ?.orValue() for safe field access + conditions: + # Applied: Resources successfully created + - type: "Applied" + status: + expression: | + resources.?agentNamespaceManifestWork.?agentNamespace.?status.?phase.orValue("") == "Active" ? "True" : "False" + reason: + expression: | + resources.?agentNamespaceManifestWork.?agentNamespace.?status.?phase.orValue("") == "Active" + ? "NamespaceCreated" + : "NamespacePending" + message: + expression: | + resources.?agentNamespaceManifestWork.?agentNamespace.?status.?phase.orValue("") == "Active" + ? "Namespace created successfully" + : "Namespace creation in progress" + + # Available: Resources are active and ready + - type: "Available" + status: + expression: | + resources.?agentNamespaceManifestWork.?agentNamespace.?status.?phase.orValue("") == "Active" ? "True" : "False" + reason: + expression: | + resources.?agentNamespaceManifestWork.?agentNamespace.?status.?phase.orValue("") == "Active" ? "NamespaceReady" : "NamespaceNotReady" + message: + expression: | + resources.?agentNamespaceManifestWork.?agentNamespace.?status.?phase.orValue("") == "Active" ? "Namespace is active and ready" : "Namespace is not active and ready" + + # Health: adapter execution status (runtime). No need to update this; it can be reused from the adapter config. + - type: "Health" + status: + expression: | + adapter.?executionStatus.orValue("") == "success" ? "True" : (adapter.?executionStatus.orValue("") == "failed" ? "False" : "Unknown") + reason: + expression: | + adapter.?errorReason.orValue("") != "" ? 
adapter.?errorReason.orValue("") : "Healthy" + message: + expression: | + adapter.?errorMessage.orValue("") != "" ? adapter.?errorMessage.orValue("") : "All adapter operations completed successfully" + + # Use CEL expression for numeric fields to preserve type (not Go template which outputs strings) + observed_generation: + expression: "generationId" + + # Use Go template with now and date functions for timestamps + observed_time: + value: "{{ now | date \"2006-01-02T15:04:05Z07:00\" }}" + + # Optional data field for adapter-specific metrics extracted from resources + data: + namespace: + name: + expression: | + resources.?clusterNamespace.?metadata.?name.orValue("") + status: + expression: | + resources.?clusterNamespace.?status.?phase.orValue("") + + # ========================================================================== + # Post Actions + # ========================================================================== + # Post actions are executed after resources are created/updated + postActions: + # Report cluster status to HyperFleet API (always executed) + - name: "reportClusterStatus" + apiCall: + method: "POST" + # NOTE: API path includes /api/hyperfleet/ prefix and ends with /statuses + url: "{{ .hyperfleetApiBaseUrl }}/api/hyperfleet/{{ .hyperfleetApiVersion }}/clusters/{{ .clusterId }}/statuses" + body: "{{ .clusterStatusPayload }}" + timeout: 30s + retryAttempts: 3 + retryBackoff: "exponential" + headers: + - name: "Content-Type" + value: "application/json" diff --git a/hyperfleet/components/adapter/framework/adapter-consumer-configmap-template.yaml b/hyperfleet/components/adapter/framework/configs/adapter-consumer-configmap-template.yaml similarity index 100% rename from hyperfleet/components/adapter/framework/adapter-consumer-configmap-template.yaml rename to hyperfleet/components/adapter/framework/configs/adapter-consumer-configmap-template.yaml diff --git a/hyperfleet/components/adapter/framework/configs/adapter-deployment-config-template.yaml b/hyperfleet/components/adapter/framework/configs/adapter-deployment-config-template.yaml new file mode 100644 index 0000000..93067a0 --- /dev/null +++ b/hyperfleet/components/adapter/framework/configs/adapter-deployment-config-template.yaml @@ -0,0 +1,107 @@ +# HyperFleet Adapter Framework Configuration Template with Maestro Transport +# +# This is an enhanced Configuration Template for configuring cloud provider adapters +# using the HyperFleet Adapter Framework with Maestro transport capabilities for +# remote cluster resource management. 
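+#
+# This file pairs with the business logic config
+# (configs/adapter-business-logic-template-MVP.yaml): static infrastructure
+# settings live here, while per-resource transport selection lives in the
+# business logic config.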
+# +# TRANSPORT CLIENTS: +# ================ +# - hyperfleet: HyperFleet API client +# - maestro: Maestro transport client +# +apiVersion: hyperfleet.redhat.com/v1alpha1 +kind: AdapterConfig +metadata: + # Adapter name (used as resource name and in logs/metrics) + name: aro-hcp-adapter + namespace: hyperfleet-system + labels: + hyperfleet.io/adapter-type: aro-hcp + hyperfleet.io/component: adapter + hyperfleet.io/transport: maestro + +# ============================================================================ +# Adapter Specification +# ============================================================================ +spec: + # Adapter Information + adapter: + # Adapter version + # Only the major and minor versions are checked for compatibility, + # so the PATCH version can be omitted, e.g. "0.2" + version: "0.2.0" + + # ============================================================================ + # NEW: Clients Configuration + # ============================================================================ + # Defines how resources are managed (HyperFleet API client vs transport client) + + clients: + # Maestro-specific configuration. This is optional and only used if the adapter uses Maestro transport + maestro: + # Maestro gRPC server connection (for ManifestWork operations) + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_MAESTRO_GRPC_SERVER_ADDRESS or flag --maestro-grpc-server-address + grpcServerAddress: "{{ .config.maestro.grpcServerAddress }}" + # Maestro HTTPS server connection (for REST API operations) + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_MAESTRO_HTTP_SERVER_ADDRESS or flag --maestro-http-server-address + httpServerAddress: "{{ .config.maestro.httpServerAddress }}" + # Source identifier for CloudEvents routing and status subscription + # it is the adapter name; it must be unique across adapters to avoid CloudEvent conflicts + sourceId: "{{ .metadata.name }}" + + # Authentication configuration + auth: + # Authentication type: "tls" (certificate-based mTLS) + type: "tls" + + # TLS certificate configuration + tlsConfig: + # Certificate Authority file path + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_MAESTRO_CA_FILE or flag --maestro-ca-file + caFile: "{{ .config.maestro.caFile }}" + # Client certificate file path + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_MAESTRO_CERT_FILE or flag --maestro-cert-file + certFile: "{{ .config.maestro.certFile }}" + # Client private key file path + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_MAESTRO_KEY_FILE or flag --maestro-key-file + keyFile: "{{ .config.maestro.keyFile }}" + # Server name for TLS verification (used for both gRPC and HTTPS) + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_MAESTRO_SERVER_NAME or flag --maestro-server-name + serverName: "{{ .config.maestro.serverName }}" + + # Connection and timeout settings + timeout: "30s" + retryAttempts: 3 + retryBackoff: "exponential" + + # Keep-alive settings for long-lived gRPC connections + keepalive: + time: "30s" + timeout: "10s" + permitWithoutStream: true + + # ============================================================================ + # HyperFleet API Configuration + # ============================================================================ + httpAPI: + # HTTP client timeout for API calls + timeout: 2s + # Number of retry attempts for failed API calls + 
retryAttempts: 3 + # Retry backoff strategy: exponential, linear, constant + retryBackoff: exponential + + kubernetes: + apiVersion: "v1" + # optional: kubeconfig file path + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_KUBECONFIG or flag --kubeconfig + # when kubeconfig is not set, the adapter checks whether host and token are set; if not, it falls back to the in-cluster service account token for authentication + kubeconfig: "{{ .config.kubeconfig }}" + + # optional: host and token for direct authentication + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_KUBECONFIG_HOST or flag --kubeconfig-host + host: "{{ .config.host }}" + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_KUBECONFIG_TOKEN or flag --kubeconfig-token + token: "{{ .config.token }}" + # it can be set by the adapter deployment configmap environment variable HYPERFLEET_KUBECONFIG_CA_FILE or flag --kubeconfig-ca-file + caFile: "{{ .config.caFile }}" \ No newline at end of file diff --git a/hyperfleet/components/adapter/framework/adapter-observability-config-template.yaml b/hyperfleet/components/adapter/framework/configs/adapter-observability-config-template.yaml similarity index 100% rename from hyperfleet/components/adapter/framework/adapter-observability-config-template.yaml rename to hyperfleet/components/adapter/framework/configs/adapter-observability-config-template.yaml diff --git a/hyperfleet/components/adapter/framework/configs/manifestwork-ref.yaml b/hyperfleet/components/adapter/framework/configs/manifestwork-ref.yaml new file mode 100644 index 0000000..4bbb3a4 --- /dev/null +++ b/hyperfleet/components/adapter/framework/configs/manifestwork-ref.yaml @@ -0,0 +1,115 @@ +# ManifestWork Template for External Reference +# File: manifestwork-ref.yaml +# +# This template file defines the ManifestWork structure that wraps Kubernetes manifests +# for deployment via Maestro transport. It's referenced from business logic configs +# using the 'ref' approach for clean separation of concerns. 
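+#
+# Referenced from the business logic config via:
+#   transport:
+#     maestro:
+#       manifestWork:
+#         ref: "./manifestwork-ref.yaml"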
+# +# Template Variables Available: +# - .clusterId: Target cluster identifier +# - .generationId: Resource generation for conflict resolution +# - .adapterName: Name of the adapter creating this ManifestWork +# - .placementCluster: Target cluster name (becomes ManifestWork namespace) +# - .timestamp: Creation timestamp +# - .manifests: Array of rendered Kubernetes manifests (injected by framework) + +apiVersion: work.open-cluster-management.io/v1 +kind: ManifestWork +metadata: + # ManifestWork name - must be unique within consumer namespace + name: "hyperfleet-cluster-setup-{{ .clusterId }}" + + # Labels for identification, filtering, and management + labels: + # HyperFleet tracking labels + hyperfleet.io/cluster-id: "{{ .clusterId }}" + hyperfleet.io/adapter: "{{ .adapterName }}" + hyperfleet.io/component: "infrastructure" + hyperfleet.io/generation: "{{ .generationId }}" + hyperfleet.io/resource-group: "cluster-setup" + + # Maestro-specific labels + maestro.io/source-id: "{{ .adapterName }}" + maestro.io/resource-type: "manifestwork" + maestro.io/priority: "normal" + + # Standard Kubernetes application labels + app.kubernetes.io/name: "aro-hcp-cluster" + app.kubernetes.io/instance: "{{ .clusterId }}" + app.kubernetes.io/version: "v1.0.0" + app.kubernetes.io/component: "infrastructure" + app.kubernetes.io/part-of: "hyperfleet" + app.kubernetes.io/managed-by: "hyperfleet-adapter" + app.kubernetes.io/created-by: "{{ .adapterName }}" + + # Annotations for metadata and operational information + annotations: + # Tracking and lifecycle + hyperfleet.io/created-by: "hyperfleet-adapter-framework" + hyperfleet.io/managed-by: "{{ .adapterName }}" + hyperfleet.io/generation: "{{ .generationId }}" + hyperfleet.io/cluster-name: "{{ .clusterId }}" + hyperfleet.io/deployment-time: "{{ .timestamp }}" + + # Maestro-specific annotations + maestro.io/applied-time: "{{ .timestamp }}" + maestro.io/source-adapter: "{{ .adapterName }}" + + # Operational annotations + deployment.hyperfleet.io/strategy: "rolling" + deployment.hyperfleet.io/timeout: "300s" + monitoring.hyperfleet.io/enabled: "true" + + # Documentation + description: "Complete cluster setup including namespace, configuration, and RBAC" + documentation: "https://docs.hyperfleet.io/adapters/aro-hcp" + +# ManifestWork specification +spec: + # ============================================================================ + # Workload - Contains the Kubernetes manifests to deploy + # ============================================================================ + workload: + # Kubernetes manifests array - injected by framework from business logic config + manifests: {{ .resources.agentNamespaceManifestWork.manifests | toJson }} + + # ============================================================================ + # Delete Options - How resources should be removed + # ============================================================================ + deleteOption: + # Propagation policy for resource deletion + # - "Foreground": Wait for dependent resources to be deleted first + # - "Background": Delete immediately, let cluster handle dependents + # - "Orphan": Leave resources on cluster when ManifestWork is deleted + propagationPolicy: "Foreground" + + # Grace period for graceful deletion (seconds) + gracePeriodSeconds: 30 + + # ============================================================================ + # Manifest Configurations - Per-resource settings for update and feedback + # ============================================================================ + 
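+  # Illustrative note: the JSONPaths feedback rules below are what surface
+  # .status.phase back through Maestro, so business logic CEL can read it as
+  #   resources.?agentNamespaceManifestWork.?agentNamespace.?status.?phase.orValue("")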
manifestConfigs: + # ======================================================================== + # Configuration for Namespace resources + # ======================================================================== + - resourceIdentifier: + group: "" # Core API group (empty for v1 resources) + resource: "namespaces" # Resource type + name: "{{ .clusterId | lower }}" # Specific resource name + updateStrategy: + type: "ServerSideApply" # Use server-side apply for namespaces + serverSideApply: + fieldManager: "hyperfleet-adapter" # Field manager name for conflict resolution + force: false # Don't force conflicts (fail on conflicts) + feedbackRules: + - type: "JSONPaths" # Use JSON path expressions for status feedback + jsonPaths: + - name: "phase" # Namespace phase (Active, Terminating) + path: ".status.phase" + - name: "conditions" # Namespace conditions array + path: ".status.conditions" + - name: "creationTimestamp" # When namespace was created + path: ".metadata.creationTimestamp" + + \ No newline at end of file diff --git a/hyperfleet/components/adapter/hypershift/GCP/hypershift-controlplane-adapter-spike.md b/hyperfleet/components/adapter/hypershift-deprecated/GCP/hypershift-controlplane-adapter-spike.md similarity index 100% rename from hyperfleet/components/adapter/hypershift/GCP/hypershift-controlplane-adapter-spike.md rename to hyperfleet/components/adapter/hypershift-deprecated/GCP/hypershift-controlplane-adapter-spike.md diff --git a/hyperfleet/components/adapter/maestro-cli/maestro-adapter-integration-strategy.md b/hyperfleet/components/adapter/maestro-cli-deprecated/maestro-adapter-integration-strategy.md similarity index 100% rename from hyperfleet/components/adapter/maestro-cli/maestro-adapter-integration-strategy.md rename to hyperfleet/components/adapter/maestro-cli-deprecated/maestro-adapter-integration-strategy.md diff --git a/hyperfleet/components/adapter/maestro-cli/maestro-cli-implementation.md b/hyperfleet/components/adapter/maestro-cli-deprecated/maestro-cli-implementation.md similarity index 100% rename from hyperfleet/components/adapter/maestro-cli/maestro-cli-implementation.md rename to hyperfleet/components/adapter/maestro-cli-deprecated/maestro-cli-implementation.md diff --git a/hyperfleet/components/adapter/maestro-integration/SPIKE-maestro-adapter-integration.md b/hyperfleet/components/adapter/maestro-integration/SPIKE-maestro-adapter-integration.md new file mode 100644 index 0000000..9ce0596 --- /dev/null +++ b/hyperfleet/components/adapter/maestro-integration/SPIKE-maestro-adapter-integration.md @@ -0,0 +1,429 @@ +# SPIKE: Maestro Client Integration for HyperFleet Adapter Framework + +## Table of Contents + +1. [Overview](#overview) +2. [Current State](#current-state) +3. [Problem Statement](#problem-statement) +4. [Proposed Solution: Maestro Transport Integration](#proposed-solution-maestro-transport-integration) +5. [Design Details](#design-details) + - [1. Updated DSL Structure](#1-updated-dsl-structure) + - [2. Authentication Configuration](#2-authentication-configuration) + - [3. Implementation Components](#3-implementation-components) + - [4. Status Handling & Reporting Cycle](#4-status-handling--reporting-cycle-adaptation) + - [5. Error Handling & Edge Cases](#5-error-handling--edge-cases) + - [6. Deployment Configuration](#6-deployment-configuration) +6. [Implementation Strategy](#implementation-strategy) +7. [Risks & Mitigations](#risks--mitigations) +8. [Success Criteria](#success-criteria) +9. 
[Alternative Approaches](#alternative-approaches) + - a) [Ultra-High-Volume Watch Processing](#alternative-approach-a) + - b) [Sentinel-Only Polling - SELECTED](#alternative-approach-b) + - c) [Bidirectional Event-Driven Architecture](#alternative-approach-c) + +--- + +## Overview + +This SPIKE document outlines the integration of Maestro client capabilities into the HyperFleet Adapter Framework to enable remote cluster resource management through CloudEvents transport. The integration will allow adapters to create and manage Kubernetes resources on remote clusters via the Maestro server infrastructure. + +## Current State + +### Existing Adapter Framework Architecture +- **CloudEvent Processing**: Adapters consume CloudEvents from Sentinel every 30 minutes (configurable when deploying Sentinel) when clusters are ready +- **Direct K8s API**: Current adapters directly manage resources on local/accessible clusters +- **Status Reporting**: Adapters report status back to HyperFleet API via HTTP REST calls +- **DSL-Based Configuration**: Declarative YAML configuration drives adapter behavior + +### Resource Reporting Cadence +- **Pre-Ready Polling**: Sentinel posts CloudEvents every 10 seconds before cluster becomes ready +- **Ready Polling**: Sentinel posts CloudEvents every 30 minutes when cluster is ready +- **Implications**: Status update frequency depends on cluster readiness state + +## Problem Statement + +While remote clusters could be accessed via kubeconfig, for ARO-HCP and similar architectures, using Maestro transport provides significant advantages: + +1. **Security**: No need to distribute and manage kubeconfig credentials for remote clusters +2. **Push-based efficiency**: Maestro pushes changes to agents instead of adapters polling remote clusters +3. **Centralized management**: ManifestWork provides a unified way to manage resources across multiple clusters +4. **Maintain adapter execution model** with minimal changes to existing DSL +5. **Work within reporting cycles** imposed by Sentinel (10s pre-ready, 30min ready) + +## Proposed Solution: Maestro Transport Integration + +### High-Level Architecture + +```text +┌─────────────────┐ ┌─────────────────┐ ┌──────────────────-┐ +│ Sentinel │────▶│ HyperFleet │────▶│ Deployment Cluster│ +│ (CloudEvents) │ │ Adapter │ │ (via k8s API) │ +│ Every 30min │ │ │ │ │ +└─────────────────┘ └─────────────────┘ └──────────────────-┘ + │ + ▼ + ┌─────────────────┐ ┌──────────────────┐ + │ Maestro Server │────▶│ Maestro Agent │ + │ (CloudEvents) │ │ (Remote Cluster) │ + └─────────────────┘ └──────────────────┘ +``` + +### Integration Components + +#### 1. Maestro Client Integration +- **Maestro gRPC Source Client**: Create/Update/Delete ManifestWorks +- **Maestro Watch Client**: Monitor resource status changes +- **Connection Management**: Handle gRPC connections and reconnection logic + +#### 2. Transport Abstraction Layer +- **Transport Interface**: Abstract transport (Direct K8s API vs Maestro) +- **Maestro Transport Implementation**: Wrap Maestro client operations +- **Resource Conversion**: Transform DSL resources to ManifestWork format + +#### 3. Authentication & Configuration +- **Maestro Auth Config**: gRPC endpoints, certificates, authentication tokens +- **Consumer Identity**: Cluster identity for Maestro subscription targeting + +## Design Details + +### 1. Updated DSL Structure + +#### Deployment Configuration (adapter-deployment-config-template.yaml) + +Infrastructure settings for Maestro connection. 
This section is **OPTIONAL** - only configure when maestro transport is used in business logic. + +> **See full example:** [`../framework/configs/adapter-deployment-config-template.yaml`](../framework/configs/adapter-deployment-config-template.yaml) + +#### Business Logic Configuration (adapter-business-logic-template-MVP.yaml) + +Per-resource transport configuration with `targetCluster` resolved from captured params. + +> **See full example:** [`../framework/configs/adapter-business-logic-template-MVP.yaml`](../framework/configs/adapter-business-logic-template-MVP.yaml) + +#### Key Design Decisions + +| Aspect | Location | Notes | +|--------|----------|-------| +| Manifest format | Business logic config | Same K8s manifest format for both `direct` and `maestro` transport | +| Generation management | Both transports | Same behavior: check annotation generation, apply update when generation differs | +| Maestro server connection | Deployment config | Static infrastructure settings | +| Authentication (TLS) | Deployment config | Managed via Helm/secrets | +| consumerName/targetCluster | Business logic config | Dynamic, resolved from precondition captures | +| Transport type per resource | Business logic config | `direct` or `maestro` per resource | +| sourceId | Deployment config | Unique identifier for CloudEvents routing | + +### 2. Authentication Configuration + +#### TLS Certificate-Based Authentication (mTLS) +Authentication is handled via TLS client certificates. Configuration includes: + +- **CA Certificate**: Server certificate authority for verification +- **Client Certificate**: Client certificate for mutual TLS authentication +- **Private Key**: Client private key corresponding to certificate +- **Server Name**: Server hostname for TLS verification + +Connection settings include timeouts, keepalive parameters, and retry configuration. + +### 3. 
Implementation Components + +#### Transport Interface +```go +// Transport defines the unified interface for resource operations +// Implemented by both DirectTransport (K8s API) and MaestroTransport (CloudEvents) +type Transport interface { + // Create or update a resource + Apply(ctx context.Context, resource *Resource) (*Resource, error) + + // Get current resource (full object with status) + Get(ctx context.Context, resource *Resource) (*Resource, error) + + // Delete a resource + Delete(ctx context.Context, resource *Resource) error + + // List resources via filter + List(ctx context.Context, options ListOptions) ([]*Resource, error) +} + +// Resource represents a full Kubernetes resource with metadata, spec, and status +type Resource struct { + Name string + Namespace string + Object *unstructured.Unstructured // Full K8s object (apiVersion, kind, metadata, spec, status) + Transport TransportMeta +} + +// TransportMeta contains transport-specific information +type TransportMeta struct { + Type string // "direct" or "maestro" + TargetCluster string // For Maestro: consumer name + ManifestWorkName string // For Maestro: ManifestWork name +} + +// ListOptions for both K8s and Maestro list operations +type ListOptions struct { + TargetCluster string // For Maestro: consumer name (target cluster) + LabelSelector string + FieldSelector string + Limit int64 +} + +// DirectTransport implements Transport for direct K8s API access +type DirectTransport struct { + client dynamic.Interface + discovery discovery.DiscoveryInterface + mapper meta.RESTMapper +} + +// MaestroTransport implements Transport for Maestro CloudEvents transport +type MaestroTransport struct { + client workv1client.WorkV1Interface + watcher watch.Interface + config *MaestroConfig + sourceID string +} +``` + +#### Maestro Client Wrapper +```go +// MaestroClientManager handles Maestro client connections and ManifestWork operations +type MaestroClientManager struct { + client workv1client.WorkV1Interface + config *MaestroConfig + watchers map[string]watch.Interface + mu sync.RWMutex +} + +// Methods: +// - CreateManifestWork(ctx, consumerName, resource) (*workv1.ManifestWork, error) +// - GetManifestWork(ctx, consumerName, workName) (*workv1.ManifestWork, error) +// - DeleteManifestWork(ctx, consumerName, workName) error +// - WatchManifestWorkStatus(ctx, consumerName) (watch.Interface, error) +``` + +### 4. Status Handling & Reporting Cycle Adaptation + +#### Unified Status Access + +The transport layer abstracts status retrieval - business logic accesses `resources.?resourceName.?status...` regardless of transport type. The framework handles: + +- **Direct transport**: Status from K8s API response +- **Maestro transport**: Status extracted from ManifestWork conditions + +> **See full example:** [`../framework/configs/adapter-business-logic-template-MVP.yaml`](../framework/configs/adapter-business-logic-template-MVP.yaml) (post section) + +#### Reporting Cycle Considerations (Sentinel timer is configurable via deployment) +- **Pre-Ready (10s polling)**: More frequent status updates during cluster provisioning +- **Ready (30min polling)**: Standard reporting cadence once cluster is ready +- **Timeout handling**: Framework manages timeouts per transport type + +### 5. 
Error Handling & Edge Cases + +#### Connection Failures +- **Retry with backoff**: Configurable max retries and exponential backoff +- **Fallback mode**: Optional fallback to direct API if Maestro connection fails +- **Health checks**: Periodic connection health verification + +#### Status Synchronization Issues +- **Partial ManifestWork failures**: Report degraded status with specific failure reasons +- **Late status updates**: Extend timeout for next cycle if critical resources are pending +- **Lost CloudEvents**: Implement status reconciliation on next cycle + +### 6. Deployment Configuration + +#### Configuration Approach + +Maestro transport settings are configured in `AdapterConfig` (see `adapter-deployment-config-template.yaml`), not a separate ConfigMap. This provides: +- Single source of truth for adapter configuration +- CRD schema validation +- Helm values override support + +#### Required Secret Mounts + +TLS certificates must still be mounted from Kubernetes Secrets: + +```yaml +spec: + containers: + - name: adapter + volumeMounts: + - name: maestro-certs + mountPath: /etc/maestro/certs + readOnly: true + volumes: + - name: maestro-certs + secret: + secretName: maestro-client-certs +``` + +#### Secret Structure +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: maestro-client-certs + namespace: hyperfleet-system +type: Opaque +data: + ca.crt: <base64-encoded CA certificate> + hyperfleet-client.crt: <base64-encoded client certificate> + hyperfleet-client.key: <base64-encoded client private key> +```
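+
+Putting these together, here is a minimal sketch of how an adapter might build gRPC transport credentials from the mounted files. Assumptions: standard `crypto/tls`, `crypto/x509`, and `google.golang.org/grpc/credentials`; the function name and exact framework wiring are illustrative, not the actual implementation.
+
+```go
+// maestroTLSCredentials builds mTLS credentials from the mounted secret files.
+func maestroTLSCredentials(caFile, certFile, keyFile, serverName string) (credentials.TransportCredentials, error) {
+	caPEM, err := os.ReadFile(caFile)
+	if err != nil {
+		return nil, err
+	}
+	pool := x509.NewCertPool()
+	if !pool.AppendCertsFromPEM(caPEM) {
+		return nil, fmt.Errorf("no valid CA certificates in %s", caFile)
+	}
+	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
+	if err != nil {
+		return nil, err
+	}
+	return credentials.NewTLS(&tls.Config{
+		Certificates: []tls.Certificate{cert}, // client certificate for mTLS
+		RootCAs:      pool,                    // verify the Maestro server
+		ServerName:   serverName,              // must match serverName in the deployment config
+	}), nil
+}
+```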
+ +## Implementation Strategy + +### Phase 1: Transport Abstraction +1. **Create Transport Interface**: Define abstraction layer for resource operations (Apply, Get, Delete, List) +2. **Implement DirectTransport**: Wrap K8s dynamic client to implement Transport interface +3. **K8s Client Setup**: Configure K8s client with kubeconfig or service account authentication + +### Phase 2: Maestro Transport Implementation +1. **Maestro Client Integration**: Implement MaestroTransport with gRPC client +2. **Configuration Extension**: Add maestro section to adapter DSL +3. **Authentication Handling**: Implement TLS-based authentication +4. **Status Mapping**: Convert ManifestWork status to adapter status + +### Phase 3: Config Loader and Executor Implementation +1. **Config Loader**: Parse and validate AdapterConfig and business logic YAML +2. **Resource Executor**: Execute resource operations based on business logic config +3. **Transport Selection**: Route operations to DirectTransport or MaestroTransport based on config +4. **Status Builder**: Build status payloads from CEL expressions in config + +### Phase 4: Example Implementation & Helm Integration +1. **Example Adapter**: Implement reference adapter using Maestro transport +2. **Business Logic Example**: Create sample business logic config with both direct and maestro resources +3. **Helm Charts Update**: Add Maestro configuration options to adapter Helm charts +4. **Secret Management**: Helm templates for Maestro TLS certificates + +## Risks & Mitigations + +### Technical Risks + +| Risk | Impact | Mitigation | +|------|--------|------------| +| **Maestro Connection Failures** | Medium - Unable to manage remote resources | Retry with backoff, connection health monitoring | +| **ManifestWork Status Delays** | Low - Late status reporting | Sentinel timer syncs on next polling cycle | +| **Resource Conversion Errors** | Medium - Failed resource creation | Same K8s manifest format for both transports, validation at config load | +| **Reporting Cycle Timing** | Low - Status update latency for customers | Pre-ready 10s polling minimizes latency during provisioning; 30min acceptable when cluster is stable | +| **Traceability Gap** | Medium - Hard to trace resource full cycle | Maestro lacks OTel trace_id/span_id support; use resource labels and adapter logs for correlation | + +### Operational Risks + +| Risk | Impact | Mitigation | +|------|--------|------------| +| **Maestro Server Downtime** | High - Complete adapter failure | Multi-region Maestro deployment, circuit breaker pattern | +| **CloudEvent Message Loss** | Low - Missed status updates | Maestro has built-in event management; Sentinel timer syncs status on next polling cycle | +| **Resource Drift** | Medium - Inconsistent cluster state | Implement drift detection, reconciliation cycles | + +## Success Criteria + +### Functional Requirements +- ✅ **Remote Resource Management**: Successfully create/update/delete resources on remote clusters +- ✅ **Status Reporting**: Accurate status reporting within 30-minute windows +- ✅ **Authentication**: Secure connection to Maestro infrastructure +- ✅ **Backward Compatibility**: Existing adapters continue working without changes +- ✅ **Error Handling**: Graceful failure handling and recovery + +### Operational Requirements +- ✅ **Deployment**: Simple configuration and deployment process +- ✅ **Monitoring**: Comprehensive metrics and health checks +- ✅ **Troubleshooting**: Clear error messages and debugging capabilities +- ✅ **Documentation**: Complete development and operational documentation + +## Alternative Approaches + + + +### Alternative Approach a) Ultra-High-Volume Watch Processing + +#### Purpose +This approach aims to resolve status update latency by keeping a watch open on ManifestWork status changes for a period of time, providing near real-time status updates instead of waiting for Sentinel polling cycles. 
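+
+For concreteness, a minimal sketch of the long-lived watch loop this approach would require (assuming the open-cluster-management work clientset from the Implementation Components section; reconnection, backoff, and worker fan-out are omitted, and the function name is illustrative):
+
+```go
+func watchManifestWorkStatus(ctx context.Context, workClient workv1client.WorkV1Interface, consumerName string, report func(*workv1.ManifestWork)) error {
+	// Open a long-lived watch for all ManifestWorks targeting this consumer.
+	watcher, err := workClient.ManifestWorks(consumerName).Watch(ctx, metav1.ListOptions{})
+	if err != nil {
+		return err
+	}
+	defer watcher.Stop()
+	for event := range watcher.ResultChan() {
+		work, ok := event.Object.(*workv1.ManifestWork)
+		if !ok {
+			continue // skip non-ManifestWork events (e.g. bookmarks)
+		}
+		// Near real-time, but every update needs its own concurrency-safe
+		// handling and API report -- the complexity this SPIKE avoids.
+		report(work)
+	}
+	return ctx.Err()
+}
+```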
+ +#### How It Works +- Keep a Watch connection open to Maestro for ManifestWork status changes +- Process status updates as they arrive rather than polling +- Report status immediately when changes are detected + +#### Why We Don't Recommend This Approach + +| Concern | Impact | +|---------|--------| +| **Goroutine Management Complexity** | Multiple goroutines for watch, processing, and reporting require careful lifecycle management | +| **Resource Cost** | Long-lived connections and continuous processing consume significant memory and CPU | +| **Framework Instability** | Complex concurrency patterns increase risk of goroutine leaks, deadlocks, and race conditions | +| **Traceability Gap** | Status updates from Maestro broker are not correlated with Sentinel events, making it hard to trace the full request lifecycle | +| **Insufficient CloudEvent Parameters** | Maestro CloudEvents don't contain enough parameters to build payloads; can use ManifestWork labels as workaround, but further implementation needed if more parameters introduced | +| **Diminishing Returns** | Pre-ready 10s polling already provides acceptable latency during provisioning | + +#### Recommendation +Use the standard polling-based approach with Sentinel timer: +- **Pre-ready (10s polling)**: Acceptable latency during cluster provisioning +- **Ready (30min polling)**: Sufficient for stable clusters +- **Simpler architecture**: No goroutine management complexity +- **Predictable resource usage**: No long-lived watch connections +- **Better traceability**: All operations triggered by Sentinel events, easier to trace full lifecycle + + + +### Alternative Approach b) Sentinel-Only Polling - SELECTED + +#### Purpose +Only subscribe to Sentinel events for triggering resource operations and status updates. No Maestro broker watching - status updates are fully controlled by Sentinel polling cycles. + +#### How It Works +```text +Sentinel Event → Adapter → Apply Resources via Maestro → Get Status via Maestro → Report to HyperFleet API +``` + +- **Single event source**: Only Sentinel CloudEvents trigger adapter execution +- **Polling-based status**: Status fetched from Maestro when Sentinel triggers, not pushed +- **Same pattern as K8s**: Follows same logic as direct K8s client - apply and get status on demand + +#### Why We Recommend This Approach + +| Benefit | Description | +|---------|-------------| +| **Simplicity** | Single goroutine per broker subscription, no complex watch management | +| **Traceability** | All operations tied to Sentinel events, easy to trace full request lifecycle | +| **Consistency** | Same execution pattern for both direct K8s and Maestro transports | +| **Predictable** | No background processes, resource usage scales with Sentinel polling rate | + +#### Trade-off + +| Concern | Impact | Mitigation | +|---------|--------|------------| +| **Status Update Latency** | Customers see status updates only when Sentinel polls | Pre-ready 10s polling minimizes latency during provisioning; 30min acceptable for stable clusters | + +#### Why We Selected This Approach + +This option is the **simplest solution that will work for MVP**. 
+
+#### Why We Recommend This Approach
+
+| Benefit | Description |
+|---------|-------------|
+| **Simplicity** | Single goroutine per broker subscription, no complex watch management |
+| **Traceability** | All operations are tied to Sentinel events, making the full request lifecycle easy to trace |
+| **Consistency** | Same execution pattern for both direct K8s and Maestro transports |
+| **Predictable** | No background processes; resource usage scales with the Sentinel polling rate |
+
+#### Trade-off
+
+| Concern | Impact | Mitigation |
+|---------|--------|------------|
+| **Status Update Latency** | Customers see status updates only when Sentinel polls | Pre-ready 10s polling minimizes latency during provisioning; 30min is acceptable for stable clusters |
+
+#### Why We Selected This Approach
+
+This option is the **simplest solution that works for the MVP**, and it keeps the system straightforward:
+
+- **Centralized decision making**: Decisions about when to report status to the API are made by the Sentinel
+- **Reactive adapters**: Adapters react only to Sentinel pulses, with no autonomous background processing
+- **Flexible polling frequency**: The Sentinel can increase its pulse frequency for Ready clusters if required
+- **Future extensibility**: A move to more frequent, real-time updates from Maestro will be considered after the MVP
+
+### Alternative Approach c) Bidirectional Event-Driven Architecture
+
+#### Purpose
+Create a fully event-driven system in which adapters only process events and publish status back through broker infrastructure, with the Sentinel handling all HyperFleet API calls.
+
+#### How It Works
+```text
+Maestro Events → Adapter (Event Processor) → HyperFleet Broker → Sentinel → HyperFleet API
+```
+
+- **Adapter**: Pure event transformer, no direct API calls
+- **Broker**: Central event hub for status distribution
+- **Sentinel**: Centralized API client for HyperFleet
+
+#### Why We Don't Recommend This Approach
+
+| Concern | Impact |
+|---------|--------|
+| **Infrastructure Complexity** | Requires additional broker infrastructure and topic management |
+| **Increased Latency** | Multiple event hops add latency to status reporting |
+| **Event Ordering Challenges** | Maintaining event order across distributed components is complex |
+| **Operational Overhead** | More components to monitor, debug, and maintain |
+| **Over-Engineering** | Current scale doesn't justify this level of decoupling |
\ No newline at end of file
diff --git a/hyperfleet/components/adapter/maestro-cli/maestro-architecture.md b/hyperfleet/components/adapter/maestro-integration/maestro-architecture-introduction.md
similarity index 72%
rename from hyperfleet/components/adapter/maestro-cli/maestro-architecture.md
rename to hyperfleet/components/adapter/maestro-integration/maestro-architecture-introduction.md
index 1da0f18..8dca4d9 100644
--- a/hyperfleet/components/adapter/maestro-cli/maestro-architecture.md
+++ b/hyperfleet/components/adapter/maestro-integration/maestro-architecture-introduction.md
@@ -111,6 +111,95 @@ Communication Flow:
 ---
 
+## Maestro Event Flow and Processing Patterns
+
+### CloudEvents Data Flow Architecture
+
+```
+Maestro Server ←→ gRPC CloudEvents Stream ←→ Client (Watch API)
+```
+
+**Key Finding**: In **gRPC mode**, Maestro uses an **integrated gRPC broker**; no external message broker is required. See the [Deployment Modes](#deployment-modes) section for broker-specific details.
+
+### Watch Processing Patterns
+
+#### **Watch Implementation Architecture**
+```
+workClient.ManifestWorks(consumerName).Watch(ctx, metav1.ListOptions{})
+```
+
+**Key Finding**: Watch uses a **hybrid approach**: an HTTP REST API call for the initial state, then CloudEvents for live updates.
+
+**Watch Processing Flow:**
+1. **Initial List**: HTTP REST API call to `/api/maestro/v1/resource-bundles`
+2. **CloudEvents Subscription**: Subscribe for real-time ManifestWork changes
+3. **Event Handler**: Process incoming CloudEvents and forward them to the watch channel
+
+**Connection & Protocol Details:**
+- **Protocol**: gRPC CloudEvents streaming (not HTTP polling)
+- **Authentication**: TLS/mTLS or token-based
+- **Filtering**: By consumerName (target cluster) and sourceID (client identifier)
+- **Performance**: Synchronous initial load + asynchronous live updates
+
+#### **Direct CloudEvents Subscription (Alternative)**
+```
+Client → CloudEventsClient.Subscribe() → CloudEvents Stream (Skip Watch API)
+```
+
+**Characteristics:**
+- **Transport**: Direct CloudEvents subscription (bypasses the Watch wrapper)
+- **Data Source**: Live CloudEvent stream only (no initial REST API call)
+- **Performance**: Highest throughput, but loses initial state synchronization
+- **Use Case**: Event-driven processing where the current state is not required
+
+### Event Processing Volume Analysis
+
+#### **High-Volume Event Scenarios**
+Based on our analysis of workloads that generate thousands of events every 10 seconds:
+
+**Processing Approaches:**
+
+1. **Sequential Processing** (Standard)
+   - Process each Watch event individually
+   - **Capacity**: 50-100 events per 10-second window
+   - **Bottleneck**: Sequential event handling blocks subsequent events
+
+2. **Parallel Processing** (High-Volume; see the sketch after this list)
+   - Single Watch goroutine + multiple worker goroutines
+   - **Capacity**: 1,000+ events per 10-second window
+   - **Architecture**: One connection feeding multiple processors
+
+3. **Event-Driven Processing** (Enterprise Scale)
+   - Pure event subscription without the Watch API
+   - **Capacity**: 10,000+ events per second
+   - **Architecture**: Direct broker subscription with event transformation
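+
+A minimal sketch of the parallel pattern from item 2 above: one watch reader fanning events out to a small worker pool. This is illustrative only; `process` is a hypothetical handler, and the buffer and worker counts are arbitrary.
+
+```go
+package adapter
+
+import (
+	"context"
+	"sync"
+
+	"k8s.io/apimachinery/pkg/watch"
+	workv1 "open-cluster-management.io/api/work/v1"
+)
+
+// runWorkerPool fans Watch events out to `workers` goroutines so that a
+// slow handler cannot block the single watch channel.
+func runWorkerPool(ctx context.Context, watcher watch.Interface, workers int) {
+	events := make(chan *workv1.ManifestWork, 1024) // buffer absorbs bursts
+
+	var wg sync.WaitGroup
+	for i := 0; i < workers; i++ {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			for mw := range events {
+				process(mw) // hypothetical adapter-specific handler
+			}
+		}()
+	}
+
+	// Single reader: only this goroutine touches the watch channel.
+	for {
+		select {
+		case <-ctx.Done():
+			close(events)
+			wg.Wait()
+			return
+		case event, ok := <-watcher.ResultChan():
+			if !ok { // watch closed; the caller decides whether to re-establish it
+				close(events)
+				wg.Wait()
+				return
+			}
+			if mw, ok := event.Object.(*workv1.ManifestWork); ok {
+				events <- mw
+			}
+		}
+	}
+}
+
+func process(mw *workv1.ManifestWork) { /* transform and report status */ }
+```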
+
+### Connection Management
+
+#### **gRPC Mode (Recommended)**
+- **One gRPC connection per client** to the Maestro server
+- **SourceID-based filtering**: Each client receives events filtered by its sourceID
+- **No message competition**: Each client receives an independent event stream
+- **Authentication**: TLS/mTLS or token-based
+
+#### **MQTT/Pub-Sub Mode**
+- See the [Deployment Modes](#deployment-modes) section for broker-specific details
+- **Key difference**: Potential message competition, depending on topic design
+
+### Event Deduplication and Filtering
+
+#### **Event Characteristics**
+- **Status Updates**: Frequent condition changes generate multiple events
+- **Generation Tracking**: API generation correlation for conflict resolution
+- **Event Ordering**: CloudEvents provide sequencing and delivery guarantees
+
+#### **Filtering Strategies**
+- **Label-based**: Filter by cluster ID, resource type, or adapter name
+- **Generation-based**: Process only newer-generation events
+- **SourceID-based**: Events filtered by the client's sourceID parameter
+
+---
+
 ## Communication Protocols
 
 ### HTTP REST API
@@ -231,8 +320,7 @@ This becomes a Maestro `resource` with:
 2. **gRPC Authentication**
    - TLS/mTLS for transport security
-   - Token-based authentication
-   - Certificate-based client auth
+   - Certificate-based client authentication (mTLS)
 3. **Agent Authentication**
    - mTLS between server and agents
diff --git a/hyperfleet/components/adapter/validation/GCP/gcp-validation-adapter-spike-report.md b/hyperfleet/components/adapter/validation-deprecated/GCP/gcp-validation-adapter-spike-report.md
similarity index 100%
rename from hyperfleet/components/adapter/validation/GCP/gcp-validation-adapter-spike-report.md
rename to hyperfleet/components/adapter/validation-deprecated/GCP/gcp-validation-adapter-spike-report.md