Description
tyk-operator-conf secret creation via post-install hook breaks atomic deploys and prevents adding operator to existing installations
Describe the bug
The tyk-operator-conf secret is created by the tyk-bootstrap chart via a post-install Helm hook. This design causes two significant issues:
- Atomic deploys fail: When using helm install --atomic, the tyk-operator deployment starts before the post-install hook runs, causing pods to fail because the tyk-operator-conf secret doesn't exist yet. This results in a failed atomic install.
- Cannot add operator to existing installation: Since post-install hooks only run during helm install (not helm upgrade), users who initially deployed without the operator cannot later enable it; the bootstrap job won't run again to create the required secret.
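For reference, a post-install hook Job of the kind described above generally looks like the sketch below. This is not copied from the tyk-bootstrap chart: the Job name and image are placeholders, and only the hook annotation itself is standard Helm.

```yaml
# Sketch only: name and image are placeholders, not taken from the tyk-bootstrap chart.
apiVersion: batch/v1
kind: Job
metadata:
  name: bootstrap-post-install            # hypothetical name
  annotations:
    # Runs after the release's resources have been created by `helm install`.
    # It is not triggered by `helm upgrade`.
    "helm.sh/hook": post-install
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: bootstrap
          image: example.invalid/bootstrap:placeholder   # placeholder image
```

Because the hook fires only after the regular resources exist, anything those resources need at startup (here, the tyk-operator-conf secret) is missing until the Job finishes.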
Expected behavior
- Atomic deploys should work when global.components.operator: true
- Users should be able to enable the operator on existing installations via helm upgrade
Current behavior
- Operator pods fail on initial install until the post-install hook completes (breaks --atomic)
- Enabling operator on existing installation requires manually creating the tyk-operator-conf secret
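As a stopgap, the manual secret creation mentioned above might look like the following manifest. Treat it as a sketch under assumptions: TYK_AUTH and TYK_ORG are the values the bootstrap job fetches from Dashboard (per this issue), while TYK_MODE and TYK_URL are keys I am assuming the operator also expects; all values shown are placeholders.

```yaml
# Sketch of a manually created secret; all values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: tyk-operator-conf
  namespace: tyk                      # namespace used in the helm commands in this issue
type: Opaque
stringData:
  TYK_AUTH: "<dashboard-api-key>"     # normally fetched from Dashboard by the bootstrap job
  TYK_ORG: "<organisation-id>"        # normally fetched from Dashboard by the bootstrap job
  # Assumed keys, not confirmed in this issue:
  TYK_MODE: "pro"
  TYK_URL: "http://<dashboard-service>:<port>"
```

Applying this with kubectl apply -f before running helm upgrade should let the operator pods start, provided the key names match what the deployed operator version reads.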
Steps to reproduce
Scenario 1 - Atomic install failure
helm install tyk-stack tyk-helm/tyk-stack -n tyk --atomic \
--set global.components.operator=true \
--set global.secrets.useSecretName=my-tyk-secrets
# ... other values
Result: Install fails because operator pods can't start without the tyk-operator-conf secret.
Note: Even though the operator license is provided via global.secrets.useSecretName, the bootstrap job still creates a separate tyk-operator-conf secret that the operator deployment requires.
Scenario 2 - Adding operator to existing installation
# Initial install without operator
helm install tyk-stack tyk-helm/tyk-stack -n tyk \
--set global.components.operator=false \
--set global.secrets.useSecretName=my-tyk-secrets
# Later, try to enable operator
helm upgrade tyk-stack tyk-helm/tyk-stack -n tyk \
--set global.components.operator=true \
--set global.secrets.useSecretName=my-tyk-secrets
Result: Operator pods fail because the tyk-operator-conf secret is never created (the post-install hook doesn't run on upgrade).
The secret provided via global.secrets.useSecretName contains the operator license key, but the bootstrap job that reads this and creates tyk-operator-conf only runs on initial install.
Suggested solutions
- Use pre-install,pre-upgrade hook instead of post-install: Change the bootstrap job to run as a pre-install,pre-upgrade hook. This ensures the tyk-operator-conf secret exists before the operator deployment is created. The challenge here is that the bootstrap job currently waits for Dashboard to be ready to fetch TYK_AUTH and TYK_ORG; this would need to be redesigned (e.g., create the secret with known values from global.secrets.useSecretName rather than fetching from the Dashboard API).
- Create the secret via a pre-install hook with values from the user-provided secret: Since users already provide credentials via global.secrets.useSecretName, a pre-install hook could create tyk-operator-conf by copying relevant values from that secret, rather than bootstrapping Dashboard first.
- Document the limitation: At minimum, document that:
  - --atomic installs are not supported when operator is enabled
  - Adding operator to existing installations requires manual secret creation
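For the hook-based options, the annotation change could be as small as the fragment below. The hook names and delete policies are standard Helm; the weight value is an arbitrary illustration, and the Job body would still have to be reworked so it does not depend on a running Dashboard.

```yaml
# Fragment of a hook Job's metadata; weight value is illustrative only.
metadata:
  annotations:
    # Run before resources are created on install AND before they are updated on upgrade.
    "helm.sh/hook": pre-install,pre-upgrade
    # Lower weights run first; only relevant if several hooks share the same event.
    "helm.sh/hook-weight": "-5"
    # Remove the previous Job before re-running, and clean up after success.
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
```

With pre-upgrade included, enabling the operator later via helm upgrade would trigger the Job as well, which addresses scenario 2 without a separate mechanism.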
Environment
- Helm chart version: tyk-stack (latest)
- Kubernetes version: N/A (design issue)
Additional context
The core issue is that the bootstrap job runs as a post-install hook, meaning it executes after all resources (including the operator deployment) are created. The operator deployment immediately fails because it references a secret that doesn't exist yet.
The tyk-k8s-bootstrap application has logic to detect existing organizations and skip recreation, so idempotency is already handled. However, the fundamental timing issue (post vs pre) needs to be addressed for atomic deploys to work.
For scenario 2 (adding operator to existing installation), adding post-upgrade to the hook would help, but only if combined with making the operator's secretRef optional, so that the initial pod failure is handled gracefully.
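An optional secretRef could look like the container fragment below. This assumes the operator deployment loads tyk-operator-conf through envFrom, which is not confirmed in this issue; the container name is a placeholder.

```yaml
# Fragment of a Deployment pod spec; container name is a placeholder.
containers:
  - name: manager                     # hypothetical container name
    envFrom:
      - secretRef:
          name: tyk-operator-conf
          # With optional: true the pod is allowed to start even if the secret is absent.
          optional: true
```

Note that environment variables sourced from a secret are read at container start, so the operator pods would still need a restart once the hook has created the secret.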