Add Fargate profile for Karpenter controller by L3n41c · Pull Request #2891 · DataDog/datadog-operator

L3n41c · 2026-04-10T16:54:29Z

What does this PR do?

Adds AWS Fargate support to the kubectl datadog autoscaling cluster install command so the Karpenter controller runs on dedicated Fargate nodes instead of regular EC2 nodes.

Motivation

Karpenter's Helm chart includes a default node affinity (karpenter.sh/nodepool: DoesNotExist) that prevents the controller from running on nodes it manages. This creates a migration problem: when migrating all workloads from pre-existing nodes to Karpenter-managed nodes, the controller itself cannot be migrated, forcing users to keep some pre-existing nodes.

Running Karpenter on Fargate solves this by providing dedicated serverless compute, separate from both pre-existing and Karpenter-managed EC2 nodes. Users can then fully decommission their pre-existing node groups.

Additional Notes

Changes:

- Private subnet discovery from cluster VPC route tables (Fargate requires private subnets)
- FargatePodExecutionRole and FargateProfile added to CloudFormation template (conditional)
- --no-fargate flag for opt-out
- Controller resource requests set when running on Fargate (for proper Fargate sizing)
- Existing Fargate profiles are preserved on re-run if subnet discovery fails transiently
- Fargate profile displayed in uninstall resource summary

Fargate compatibility:

- Karpenter is a regular Deployment (not a DaemonSet), so it works on Fargate
- No privileged containers or persistent volumes needed
- Fargate nodes don't have the karpenter.sh/nodepool label, so the default affinity is satisfied
- Resource needs (1 vCPU, 1Gi memory) are within Fargate limits
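
To make the affinity point concrete, here is a small sketch of how a `DoesNotExist` requirement on karpenter.sh/nodepool evaluates against node labels. The `matchesDoesNotExist` helper is hypothetical (the real evaluation happens in the Kubernetes scheduler), and the label values for the three nodes are illustrative assumptions.

```go
package main

import "fmt"

// matchesDoesNotExist reports whether a node's labels satisfy a
// "key DoesNotExist" node selector requirement: the key must be absent,
// whatever its value would be.
func matchesDoesNotExist(nodeLabels map[string]string, key string) bool {
	_, present := nodeLabels[key]
	return !present
}

func main() {
	const key = "karpenter.sh/nodepool"

	// Illustrative nodes; label values are assumptions for this example.
	nodes := []struct {
		name   string
		labels map[string]string
	}{
		{"pre-existing EC2 node", map[string]string{"eks.amazonaws.com/nodegroup": "ng-1"}},
		{"Karpenter-managed node", map[string]string{key: "default"}},
		{"Fargate node", map[string]string{"eks.amazonaws.com/compute-type": "fargate"}},
	}
	for _, n := range nodes {
		fmt.Printf("%s: schedulable=%v\n", n.name, matchesDoesNotExist(n.labels, key))
	}
}
```

Only the Karpenter-managed node is rejected, which is why the controller can land on Fargate without changing the chart's default affinity.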

Minimum Agent Versions

N/A — this change affects the kubectl plugin only, not the operator or agent.

Describe your test plan

- Unit tests: go test ./cmd/kubectl-datadog/autoscaling/cluster/install/guess/... (12 test cases for subnet filtering)
- Build: go build ./cmd/kubectl-datadog/...
- Manual: Run kubectl datadog autoscaling cluster install --cluster-name <name> on a test EKS cluster and verify:
  - Fargate profile is created in the EKS console
  - Karpenter pods run on Fargate nodes (kubectl get pods -n dd-karpenter -o wide shows fargate-* node names)
  - kubectl datadog autoscaling cluster uninstall cleans up the Fargate profile via CloudFormation

Checklist

PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
PR has a milestone or the qa/skip-qa label
All commits are signed (see: signing commits)

🤖 Generated with Claude Code

Karpenter's Helm chart includes a default node affinity (karpenter.sh/nodepool: DoesNotExist) that prevents the controller from running on nodes it manages. This creates a migration problem: users must keep pre-existing nodes just for the Karpenter controller.

Create an AWS Fargate profile for the Karpenter namespace so the controller runs on dedicated serverless compute, separate from both pre-existing and Karpenter-managed EC2 nodes. This allows users to fully migrate workloads to Karpenter-managed nodes.

Changes:
- Add private subnet discovery from cluster VPC route tables
- Add FargatePodExecutionRole and FargateProfile to CloudFormation
- Add --no-fargate flag for opt-out
- Set controller resource requests when running on Fargate
- Preserve existing Fargate profile on re-run if subnet discovery fails
- Display Fargate profile in uninstall resource summary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix govet shadow error: use = instead of := for err in createCloudFormationStacks to avoid shadowing named return value
- Fix gci import ordering in uninstall.go: ec2/types before eks
- Remove trailing blank line in privatesubnets.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
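
For readers unfamiliar with the govet shadow finding mentioned in this commit, here is a generic illustration of the pattern. It is not the actual body of createCloudFormationStacks, which is not shown in this conversation; the `step`, `shadowed`, and `fixed` functions are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical operation that always fails.
func step() error { return errors.New("stack creation failed") }

// shadowed declares a new err with := inside the inner block, so the
// named return value is never assigned and the failure is lost.
func shadowed() (err error) {
	{
		err := step() // new variable, scoped to this block only
		if err != nil {
			// the inner err is handled or logged here, but the outer one stays nil
		}
	}
	return err // named return value, still nil
}

// fixed assigns to the named return value with =, so the error propagates.
func fixed() (err error) {
	{
		err = step() // assigns to the named return value
	}
	return err
}

func main() {
	fmt.Println("shadowed:", shadowed()) // prints: shadowed: <nil>
	fmt.Println("fixed:", fixed())       // prints: fixed: stack creation failed
}
```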

L3n41c · 2026-04-10T17:14:22Z

@codex review

codecov-commenter · 2026-04-10T17:16:15Z

Codecov Report

❌ Patch coverage is 23.38710% with 95 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.00%. Comparing base (5adfc81) to head (5b9a129).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
...ctl-datadog/autoscaling/cluster/install/install.go	0.00%	46 Missing ⚠️
...utoscaling/cluster/install/guess/privatesubnets.go	42.64%	39 Missing ⚠️
...datadog/autoscaling/cluster/uninstall/uninstall.go	0.00%	10 Missing ⚠️

❌ Your patch status has failed because the patch coverage (23.38%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2891      +/-   ##
==========================================
- Coverage   40.06%   40.00%   -0.06%     
==========================================
  Files         319      320       +1     
  Lines       28039    28153     +114     
==========================================
+ Hits        11233    11262      +29     
- Misses      15983    16068      +85     
  Partials      823      823

Flag	Coverage Δ
unittests	`40.00% <23.38%> (-0.06%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
...datadog/autoscaling/cluster/uninstall/uninstall.go	`0.00% <0.00%> (ø)`
...utoscaling/cluster/install/guess/privatesubnets.go	`42.64% <42.64%> (ø)`
...ctl-datadog/autoscaling/cluster/install/install.go	`13.18% <0.00%> (-2.01%)`	⬇️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5adfc81...5b9a129. Read the comment docs.


datadog-prod-us1-3 · 2026-04-10T17:16:36Z

🎯 Code Coverage (details)
• Patch Coverage: 22.88%
• Overall Coverage: 40.08% (-0.07%)

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 5b9a129

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b9a12952d


chatgpt-codex-connector · 2026-04-10T17:25:03Z

cmd/kubectl-datadog/autoscaling/cluster/install/install.go

 		"DeployPodIdentityAddon": strconv.FormatBool(!isUnmanagedEKSPIAInstalled),
 		"DeployNodeAccessEntry":  strconv.FormatBool(supportsAPIAuth),
+		"DeployFargateProfile":   strconv.FormatBool(deployFargate),
+		"FargateSubnets":         fargateSubnets,


Keep Fargate subnets stable across re-installs

This always sends a freshly discovered FargateSubnets value during stack updates, which can break idempotency: in AWS::EKS::FargateProfile, Subnets is a replacement-only property, while the template fixes FargateProfileName to ${ClusterName}-karpenter. If the discovered subnet set/order changes after first install (for example after VPC/subnet changes), CloudFormation needs to replace a custom-named profile and the update fails, so kubectl datadog autoscaling cluster install can fail on re-run. To avoid this, keep using the existing profile subnets whenever the profile already exists (not only on discovery errors), or stop pinning the profile name.


L3n41c added the enhancement New feature or request label Apr 10, 2026

github-actions bot added the team/container-autoscaling label Apr 10, 2026

chatgpt-codex-connector bot reviewed Apr 10, 2026


L3n41c added this to the v1.27.0 milestone Apr 10, 2026

L3n41c wants to merge 2 commits into main from lenaic/karpenter-fargate-profile
