Skip to content

Conversation

@charliedmcb
Copy link
Contributor

@charliedmcb charliedmcb commented Sep 18, 2025

Fixes #

Description
Add 90-day expiration logic for Linux images to comply with AKS requirements. Images older than 90 days are automatically upgraded to the latest available version.

See this comment for additional option on setting unreadiness, rather than forced upgrade:
#1169 (comment)

How was this change tested?

Note: did notice a new flaky unit test failure, but doesn't seem related to this change:
image
https://github.com/Azure/karpenter-provider-azure/actions/runs/17959552654/job/51079421795?pr=1169#step:6:3093

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

Release Note


charliedmcb and others added 9 commits September 18, 2025 23:32
Add 90-day expiration logic for Linux images to comply with AKS requirements. Images older than 90 days are automatically upgraded to the latest available version.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Separate version parsing logic into dedicated parseLinuxImage function that returns year, month, day, patch, and error for better testability and maintainability.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…n check

Pass context to upgradeAnyExpiredImages function and use logging.ImageID constant for consistent logging across the codebase.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Expire images 30 minutes before the actual 90-day limit to provide a buffer and ensure we don't accidentally try to provision with an image that expires during the provisioning process.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…s TODO

Generalize function name while maintaining Linux-only implementation. Add clear TODO comment indicating where Windows image version format (OS.PATCH.YYMMDD) support should be added.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Add OverrideNodeImageVersions field and Reset method to NodeImageVersionsAPI for isolated test data. Expose NodeImageVersionsAPI in test Environment and integrate Reset into Environment.Reset().

Makes nodeImageVersions private to prevent accidental modification.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Add test coverage for expired image upgrade functionality including:
- Runtime-generated image versions with mixed ages (expired, recent, malformed)
- SIG configuration with proper subscription ID format
- Exact image comparison validation for upgrade scenarios
- Test isolation using override mechanism in fake API

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Remove the expired image upgrade functionality to isolate it on a separate branch.
This reverts the images controller and tests back to the main branch state
while keeping the fake API improvements on the current branch.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Add Error field to NodeImageVersionsAPI to enable testing of error scenarios.
The Error field takes precedence over normal data when set, and is cleared
during Reset() calls for proper test isolation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@charliedmcb charliedmcb changed the base branch from main to charliedmcb/extendListNodeImagesFake September 19, 2025 22:04
Base automatically changed from charliedmcb/extendListNodeImagesFake to main September 19, 2025 22:43
charliedmcb and others added 6 commits September 19, 2025 15:46
Add isLinuxImageVersion function to detect Linux image format (YYYYMM.DD.PATCH)
using regex validation. Update isImageExpired to return early with "not expired"
for non-Linux images, avoiding parsing errors for Windows or other image formats.

This ensures only Linux images are processed for 90-day expiration checking
while gracefully handling other image formats.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Restructure isImageExpired to pre-declare year/month/day variables and use
conditional logic for Linux vs Windows image format detection. Return proper
errors for unsupported formats instead of silently treating them as non-expired.

This maintains error handling for malformed images while providing a clear
structure for adding Windows image support in the future.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Improve isLinuxImageVersion to use regex capture groups and return parsed
components directly. Simplify isImageExpired to use direct assignment and
early return pattern. Remove unused parseLinuxImage function.

This eliminates redundant parsing and makes the code more efficient and
readable while maintaining the same functionality.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
goalImages = overrideAnyGoalStateVersionsWithExisting(nodeClass, goalImages)
updatedImages := overrideAnyGoalStateVersionsWithExisting(nodeClass, goalImages)
// Scenario C: Upgrade any images that are more than 90 days old (AKS compliance requirement)
goalImages = upgradeAnyExpiredImages(ctx, updatedImages, goalImages)
Copy link
Contributor Author

@charliedmcb charliedmcb Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of expiring the images, we could set the images not ready with reason ImagesExpired to prevent provisioning, and only bump them out of this particular unready state when a K8s upgrade occurs, or a maintenance window opens.

Something like:

nodeClass.StatusConditions().SetFalse(v1beta1.ConditionTypeImagesReady, "ImagesExpired", "Images are older than 90 days, and have expired. Must open a maintenance window or preform a Kubernetes upgrade")

charliedmcb and others added 3 commits September 23, 2025 21:08
Reset Azure-specific context options to defaults in BeforeEach to prevent
SIG/CIG configuration from bleeding between tests. This fixes test failures
that occurred when running full test suites where some tests configure
SIG mode while others expect CIG mode.

The issue was that nested BeforeEach blocks were setting UseSIG=true but
the top-level BeforeEach wasn't resetting Azure options back to defaults,
causing context pollution between test runs.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants