Validate AWS account consistency before Karpenter operations #2892
Prevent accidental cross-account resource deployment by verifying that the current AWS credentials and the target EKS cluster belong to the same AWS account.
This check runs in both the `install` and `uninstall` commands right after the clients are built. In the install path, a mismatch is a hard error. In the uninstall path, a confirmed mismatch is a hard error, but an unreachable cluster (e.g. already deleted) produces a warning so cleanup can still proceed.
Also extract a shared GetAWSAccountID helper to deduplicate the STS GetCallerIdentity boilerplate that was repeated in install and uninstall.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
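The uninstall behavior described above (fail on a confirmed mismatch, warn when the cluster cannot be checked) can be sketched with a typed error and `errors.As`. This is a minimal sketch, not the PR's code: `AccountMismatchError` is named in the PR description, but its fields, the `uninstallDecision` helper, and `errClusterUnreachable` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// AccountMismatchError is the typed error named in the PR description.
// The field names here are assumptions made for this sketch.
type AccountMismatchError struct {
	CredentialsAccount string
	ClusterAccount     string
}

func (e *AccountMismatchError) Error() string {
	return fmt.Sprintf("AWS credentials belong to account %s but the cluster belongs to account %s",
		e.CredentialsAccount, e.ClusterAccount)
}

// errClusterUnreachable is a hypothetical example of a "cannot verify" failure.
var errClusterUnreachable = errors.New("cluster unreachable")

// uninstallDecision is a hypothetical helper. It turns a validation error into
// either a fatal error (confirmed mismatch) or a warning (cannot verify).
func uninstallDecision(validationErr error) (warning string, err error) {
	if validationErr == nil {
		return "", nil
	}
	var mismatch *AccountMismatchError
	if errors.As(validationErr, &mismatch) {
		// Confirmed mismatch: stop before touching any resources.
		return "", validationErr
	}
	// Anything else means the check could not be completed, so cleanup proceeds.
	return fmt.Sprintf("could not verify AWS account consistency: %v", validationErr), nil
}

func main() {
	mismatch := fmt.Errorf("validating accounts: %w", &AccountMismatchError{
		CredentialsAccount: "111111111111",
		ClusterAccount:     "222222222222",
	})
	for _, e := range []error{nil, mismatch, errClusterUnreachable} {
		warning, err := uninstallDecision(e)
		fmt.Printf("warning=%q err=%v\n", warning, err)
	}
}
```

Because `errors.As` walks the wrap chain, callers can add context with `%w` without hiding the mismatch from the uninstall flow.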
@codex review
Codecov Report
❌ Your patch status has failed because the patch coverage (14.92%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2892      +/-   ##
==========================================
- Coverage   40.06%   40.05%   -0.01%
==========================================
  Files         319      319
  Lines       28039    28676     +637
==========================================
+ Hits        11233    11487     +254
- Misses      15983    16360     +377
- Partials      823      829       +6
... and 9 files with indirect coverage changes
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 29130a60da
cluster, err := cli.EKS.DescribeCluster(ctx, &eks.DescribeClusterInput{
	Name: awssdk.String(clusterName),
})
Validate cluster identity from kubeconfig, not by name lookup
This check can return a false “accounts match” result because it resolves the cluster via DescribeCluster using the same AWS credentials being validated. If the wrong AWS account has a cluster with the same name (a common pattern like prod/staging), DescribeCluster returns that account’s cluster ARN, so the comparison always passes and cross-account install/uninstall can still proceed against the kubeconfig-selected Kubernetes cluster. To actually prevent cross-account operations, the cluster account must be derived from kubeconfig context data (for example, the context ARN when present) or another identity source independent of the current AWS credentials.
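The false positive described in this comment can be reproduced with a toy model. The sketch below does not call the AWS SDK: `fakeEKS`, `describeCluster`, `naiveCheck`, and `independentCheck` are all hypothetical names, and the in-memory map only stands in for the fact that a name lookup is scoped to the caller's account.

```go
package main

import (
	"fmt"
	"strings"
)

// fakeEKS maps account ID -> cluster name -> cluster ARN. It is a toy
// stand-in for the EKS API, where both accounts own a cluster named "prod".
var fakeEKS = map[string]map[string]string{
	"111111111111": {"prod": "arn:aws:eks:us-east-1:111111111111:cluster/prod"},
	"222222222222": {"prod": "arn:aws:eks:us-east-1:222222222222:cluster/prod"},
}

// describeCluster mimics a name lookup that only sees the caller's own account.
func describeCluster(callerAccount, name string) (string, bool) {
	arn, ok := fakeEKS[callerAccount][name]
	return arn, ok
}

// accountOf returns the fifth colon-separated field of an ARN, or "" if absent.
func accountOf(arn string) string {
	parts := strings.Split(arn, ":")
	if len(parts) < 5 {
		return ""
	}
	return parts[4]
}

// naiveCheck compares the caller's account with the account of the cluster
// found by name using that same caller's credentials.
func naiveCheck(callerAccount, clusterName string) bool {
	arn, ok := describeCluster(callerAccount, clusterName)
	return ok && accountOf(arn) == callerAccount
}

// independentCheck compares the caller's account with an ARN obtained from a
// source that does not depend on the credentials (e.g. the kubeconfig).
func independentCheck(callerAccount, kubeconfigARN string) bool {
	return accountOf(kubeconfigARN) == callerAccount
}

func main() {
	caller := "111111111111" // account of the current AWS credentials
	kubeconfigARN := "arn:aws:eks:us-east-1:222222222222:cluster/prod"

	fmt.Println("naive check passes:", naiveCheck(caller, "prod"))
	fmt.Println("independent check passes:", independentCheck(caller, kubeconfigARN))
}
```

In this model the name-based check passes even though the kubeconfig points at a cluster in another account, while the check against the kubeconfig ARN reports the mismatch.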
Fixed in a3b59f4.
The validation now extracts the AWS account ID from the kubeconfig context ARN (via GetAccountIDFromKubeconfig) instead of relying on DescribeCluster with the credentials being validated. This is independent of the AWS credentials and cannot be fooled by same-named clusters across accounts.
When the kubeconfig context is not an ARN (eksctl format, plain name), the validation falls back to DescribeCluster; this known limitation is documented in the code.
The previous approach used DescribeCluster with the same AWS credentials being validated, which gives a false positive when both accounts have a cluster with the same name (e.g. "prod").
Now extract the account ID directly from the kubeconfig context when it is an EKS ARN (arn:aws:eks:region:account:cluster/name). This source is independent of the AWS credentials and cannot be fooled by same-named clusters.
Fall back to DescribeCluster only when the kubeconfig context is not an ARN (eksctl format, plain name).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
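Extracting the account ID from a context of the form arn:aws:eks:region:account:cluster/name might look roughly like the following. This is a sketch under assumptions, not the PR's `getAccountIDFromKubeconfig`: the function name `accountIDFromContext` is hypothetical, it takes the context name as a plain string rather than loading a kubeconfig, and the 12-digit check is an extra guard added here.

```go
package main

import (
	"fmt"
	"strings"
)

// accountIDFromContext is a hypothetical helper. It returns the account ID
// when contextName is an EKS cluster ARN, and ok=false for any other format,
// which is where a caller would fall back to another strategy.
func accountIDFromContext(contextName string) (accountID string, ok bool) {
	// arn:<partition>:eks:<region>:<account>:cluster/<name>
	parts := strings.SplitN(contextName, ":", 6)
	if len(parts) != 6 || parts[0] != "arn" || parts[2] != "eks" {
		return "", false
	}
	if !strings.HasPrefix(parts[5], "cluster/") {
		return "", false
	}
	// AWS account IDs are 12 decimal digits.
	if len(parts[4]) != 12 {
		return "", false
	}
	for _, r := range parts[4] {
		if r < '0' || r > '9' {
			return "", false
		}
	}
	return parts[4], true
}

func main() {
	for _, name := range []string{
		"arn:aws:eks:us-east-1:123456789012:cluster/prod",
		"arn:aws-us-gov:eks:us-gov-west-1:210987654321:cluster/prod",
		"admin@prod.us-east-1.eksctl.io",
		"prod",
	} {
		id, ok := accountIDFromContext(name)
		fmt.Printf("%s -> account=%q ok=%v\n", name, id, ok)
	}
}
```

The partition field is deliberately not compared against a fixed value, so GovCloud (`aws-us-gov`) and China (`aws-cn`) ARNs parse the same way as standard ones.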
Extract shared helper to avoid duplicating kubeconfig loading and context resolution between GetClusterNameFromKubeconfig and GetAccountIDFromKubeconfig.
Also rename the ctx variable to kubeCtx so it does not clash with the conventional name for a context.Context, and remove inline comments that repeated the function's docstring.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nsistency
Make getAccountIDFromKubeconfig private and call it from inside ValidateAWSAccountConsistency instead of requiring callers to pre-compute and pass the kubeconfig account ID. This simplifies the call sites and encapsulates the two-strategy validation logic (kubeconfig ARN vs DescribeCluster fallback).
Also inline the trivial validateAccountIDs helper and remove its now-unnecessary tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
What does this PR do?
Adds a guard that verifies the AWS credentials in use belong to the same AWS account as the target EKS cluster before any `kubectl datadog autoscaling cluster` operation (install/uninstall). This prevents accidentally creating CloudFormation stacks in one AWS account while deploying Helm charts to an EKS cluster in another.
Motivation
The install and uninstall commands create both AWS resources (CloudFormation stacks, IAM roles) and Kubernetes resources (Helm chart, Karpenter CRDs). The AWS clients are configured from the default AWS credential chain, while the Kubernetes clients come from kubeconfig. Without validation, a misconfigured environment could silently target a different AWS account for each.
Additional Notes
- `GetAWSAccountID` helper to deduplicate the `STS.GetCallerIdentity` boilerplate that was repeated in install and uninstall.
- `AccountMismatchError` typed error so the uninstall flow can distinguish "confirmed mismatch" from "cannot verify".
Minimum Agent Versions
N/A: this is a `kubectl-datadog` plugin change, not an agent change.
Describe your test plan
- `validateAccountIDs` pure logic covering: matching accounts, mismatched accounts, GovCloud/China partitions, and invalid ARN format.
- `go build ./cmd/kubectl-datadog/...` passes.
- `go test ./cmd/kubectl-datadog/autoscaling/cluster/common/clients/...` passes.
- `go vet ./cmd/kubectl-datadog/...` passes.
Checklist
- `bug`, `enhancement`, `refactoring`, `documentation`, `tooling`, and/or `dependencies`
- `qa/skip-qa` label
🤖 Generated with Claude Code