feat: enable localdns hosts plugin to cache critical AKS FQDNs #7639
base: main
Pull request overview
This PR introduces a periodic “local DNS cache” for mcr.microsoft.com that writes resolved IPs into /etc/hosts.testing and wires CoreDNS/localdns to consult that file before forwarding, improving reliability and latency for MCR image pulls when LocalDNS is enabled (especially in the scriptless path).
Changes:
- Adds `mcr-hosts-setup` script, systemd service, and timer into the VHD build pipeline and node provisioning flow, including a new `shouldEnableMCRHostsSetup` helper and CSE wiring to enable the timer when LocalDNS (scriptless) is enabled.
- Updates the localdns CoreDNS template and associated tests to add a `hosts /etc/hosts.testing` plugin block so MCR lookups can be served from the generated hosts file before going to Azure DNS.
- Adds targeted shellspec coverage for the new `mcr-hosts-setup` behavior and for enabling its timer, and refreshes baked `CustomData` blobs used in VHD-related tests to include the new artifacts.
Reviewed changes
Copilot reviewed 31 out of 83 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `vhdbuilder/packer/vhd-image-builder-mariner*.json`, `vhdbuilder/packer/vhd-image-builder-flatcar*.json`, `vhdbuilder/packer/vhd-image-builder-cvm.json`, `vhdbuilder/packer/vhd-image-builder-base.json` | Ensures new `mcr-hosts-setup.sh`, `.service`, and `.timer` artifacts are copied into `/home/packer` during various VHD builds so they can be installed into the image. |
| `vhdbuilder/packer/vhd-image-builder-arm64-gen2.json` | Same as above for the ARM64 Gen2 image; one object is slightly misformatted compared to the rest of the JSON. |
| `vhdbuilder/packer/packer_source.sh` | Copies `mcr-hosts-setup.sh` to `/opt/azure/containers` and installs the corresponding systemd service and timer units into `/etc/systemd/system` with appropriate permissions. |
| `parts/linux/cloud-init/artifacts/mcr-hosts-setup.sh` | New script that resolves A/AAAA records for `mcr.microsoft.com` via `dig` and writes them into `/etc/hosts.testing`, logging both summary counts and the concrete IPs. |
| `parts/linux/cloud-init/artifacts/mcr-hosts-setup.service` | One-shot systemd unit that runs the `mcr-hosts-setup.sh` script after `network-online.target` is reached. |
| `parts/linux/cloud-init/artifacts/mcr-hosts-setup.timer` | Systemd timer that triggers `mcr-hosts-setup.service` at boot and every 5 minutes thereafter, with jitter and ordering relative to `localdns.service`. |
| `parts/linux/cloud-init/artifacts/cse_config.sh` | Adds `shouldEnableMCRHostsSetup`, which uses `systemctlEnableAndStart mcr-hosts-setup.timer 30` to enable/start the timer and logs descriptive messages. |
| `parts/linux/cloud-init/artifacts/cse_main.sh` | Integrates `shouldEnableMCRHostsSetup` into the base provisioning flow, calling it when `SHOULD_ENABLE_LOCALDNS` is true so the timer is only enabled alongside LocalDNS scriptless corefile generation. |
| `pkg/agent/baker.go` | Extends the LocalDNS CoreDNS template so that, when `$isRootDomain` is true, a `hosts /etc/hosts.testing { fallthrough }` block is inserted before the Azure DNS forwarder. |
| `pkg/agent/baker_test.go` | Updates expected localdns corefile strings in tests to include the new `hosts /etc/hosts.testing` stanza, ensuring the template change is validated. |
| `spec/parts/linux/cloud-init/artifacts/mcr_hosts_setup_spec.sh` | New shellspec tests that (by re-simulating the logic) verify hosts file generation and content based on mocked `dig` output; currently they do not execute the real script, which has maintainability implications. |
| `spec/parts/linux/cloud-init/artifacts/cse_config_spec.sh` | Adds tests to ensure `shouldEnableMCRHostsSetup` echoes the expected messages and calls `systemctlEnableAndStart mcr-hosts-setup.timer 30`. |
| `pkg/agent/testdata/CustomizedImage*/CustomData` | Refreshes gzipped `CustomData` payloads to include the new artifacts and behavior, keeping VHD-related tests aligned with the new provisioning logic. |
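
The hosts-file generation summarized for `mcr-hosts-setup.sh` can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the script from the PR: only the use of `dig`, the `mcr.microsoft.com` name, and the `/etc/hosts.testing` path come from the summary. The function names `resolve`, `filter_ips`, and `write_hosts_file`, the write-to-temp-then-rename step, and the choice to keep the old file when resolution returns nothing are all hypothetical.

```bash
#!/usr/bin/env bash
set -uo pipefail

# Assumed defaults; the real script's variable names may differ.
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts.testing}"
FQDN="${FQDN:-mcr.microsoft.com}"

# Hypothetical wrapper: print answers for one record type ($2) of a name ($1).
# Keeping it in a function lets tests replace it with canned output.
resolve() {
    dig +short "$2" "$1" 2>/dev/null
}

# Keep only lines that look like IPv4 or IPv6 literals.
# dig +short also prints CNAME targets, which must not land in a hosts file.
filter_ips() {
    grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$|^[0-9A-Fa-f:]*:[0-9A-Fa-f:]*$' || true
}

# Resolve A and AAAA records and write "<ip> <fqdn>" lines to HOSTS_FILE.
write_hosts_file() {
    local ips tmp
    ips="$( { resolve "$FQDN" A; resolve "$FQDN" AAAA; } | filter_ips )"
    if [ -z "$ips" ]; then
        # Assumption: a failed lookup should not wipe previously cached entries.
        echo "no addresses resolved for ${FQDN}; leaving ${HOSTS_FILE} unchanged" >&2
        return 1
    fi
    # Write to a temp file in the same directory, then rename, so a reader
    # never sees a half-written file.
    tmp="$(mktemp "${HOSTS_FILE}.XXXXXX")" || return 1
    printf '%s\n' "$ips" | while IFS= read -r ip; do
        printf '%s %s\n' "$ip" "$FQDN"
    done > "$tmp"
    chmod 0644 "$tmp"
    mv -f "$tmp" "$HOSTS_FILE"
    echo "wrote $(wc -l < "$HOSTS_FILE") entries for ${FQDN} to ${HOSTS_FILE}"
}
```

Calling `write_hosts_file` from a timer-driven unit would refresh the file on each run. Defining `resolve` as an overridable function is one way a spec could exercise the real logic with mocked resolver output instead of re-simulating it.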
Copilot reviewed 31 out of 83 changed files in this pull request and generated 4 comments.
…dEnableAKSHostsSetup.
Copilot reviewed 31 out of 83 changed files in this pull request and generated 2 comments.
Copilot reviewed 31 out of 84 changed files in this pull request and generated 3 comments.
Copilot reviewed 31 out of 84 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <[email protected]>
Copilot reviewed 31 out of 84 changed files in this pull request and generated no new comments.
Replace bash-specific [[ ]] with POSIX [ ] to resolve shellcheck SC3010 warnings. Variables are properly quoted so [ ] works equivalently.
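
As a small illustration of that substitution (the helper name `localdns_enabled` is hypothetical and not taken from the PR), a quoted POSIX test behaves the same as the bash form for simple string comparisons:

```bash
# Hypothetical helper: succeeds only when its first argument is exactly "true".
localdns_enabled() {
    # Bash-only form that shellcheck flags as SC3010 in sh scripts:
    #   [[ $1 == "true" ]]
    # POSIX form. The quotes keep the test well-formed when the value
    # is empty or contains spaces.
    [ "${1:-}" = "true" ]
}
```

Without the quotes, an empty or multi-word value would change the number of arguments passed to `[` and could produce a syntax error instead of a clean false result.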
Copilot reviewed 31 out of 84 changed files in this pull request and generated 3 comments.
```bash
# Create test script that overrides HOSTS_FILE before running the main logic
cat > "${TEST_SCRIPT}" << EOF
#!/usr/bin/env bash
set -uo pipefail
source "${TEST_DIR}/mock_nslookup.sh"
HOSTS_FILE="${HOSTS_FILE}"
EOF
# Append the original script content, skipping the shebang, set options, and HOSTS_FILE declaration (first 8 lines)
tail -n +9 "${SCRIPT_PATH}" >> "${TEST_SCRIPT}"
```
Copilot
AI
Feb 3, 2026
These tests rely on a hard-coded `tail -n +9` offset to strip the shebang, options, and `HOSTS_FILE` declaration from `aks-hosts-setup.sh`. This tightly couples the specs to the exact line count at the top of the script, so adding or removing a comment or option in the first few lines of the script will silently break all these tests. Consider making the cut point resilient by slicing based on a marker (for example, removing everything up to and including the `HOSTS_FILE=` line with `sed`/`awk`), or centralizing this "build test script" logic in a helper so the magic number is defined in a single place.
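
One way the marker-based cut suggested in this comment might look is sketched below. The helper name `strip_preamble` is hypothetical, and the sketch assumes the script declares `HOSTS_FILE=` at the start of a line exactly once before the main logic; neither detail is confirmed by the PR.

```bash
# Hypothetical helper: print everything after the first line that starts with
# "HOSTS_FILE=". Exits non-zero if the marker is missing, so a changed
# preamble fails loudly instead of silently producing a wrong test script.
strip_preamble() {
    awk '
        found { print }                          # past the marker: emit the line
        /^HOSTS_FILE=/ && !found { found = 1 }   # marker itself is dropped
        END { if (!found) exit 1 }
    ' "$1"
}
```

In the spec, the fixed-offset line could then become `strip_preamble "${SCRIPT_PATH}" >> "${TEST_SCRIPT}"`, which keeps working when comments or options are added above the `HOSTS_FILE=` declaration.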
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Requirements:
Special notes for your reviewer:
Release note: