-
Notifications
You must be signed in to change notification settings - Fork 478
Fix race condition in SecretManager secret caching #11560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Co-authored-by: fabiocav <2507935+fabiocav@users.noreply.github.com>
|
@copilot please also update the release_notes.md file with a description of this change (along to a reference to this PR) |
Co-authored-by: fabiocav <2507935+fabiocav@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a race condition in SecretManager where multiple concurrent threads could serially acquire locks and redundantly load secrets from storage, causing excessive logging and potential DataProtection library failures. The fix implements double-check locking in both GetHostSecretsAsync and GetFunctionSecretsAsync methods.
Changes:
- Added double-check locking pattern in
GetHostSecretsAsyncto verify_hostSecretsis null after lock acquisition - Added double-check locking pattern in
GetFunctionSecretsAsyncto verify function secrets aren't already cached after lock acquisition - Added concurrency tests to validate that secrets are loaded only once despite concurrent calls
- Updated release notes with entry describing the fix
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/WebJobs.Script.WebHost/Security/KeyManagement/SecretManager.cs | Implements double-check locking in both secret loading methods to prevent redundant loads |
| test/WebJobs.Script.Tests/Security/SecretManagerTests.cs | Adds concurrency tests validating that loading log messages appear exactly once |
| release_notes.md | Documents the race condition fix |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
/azp run host.public |
|
Azure Pipelines successfully started running 1 pipeline(s). |
release_notes.md
Outdated
| <!-- Please add your release notes in the following format: | ||
| - My change description (#PR) | ||
| --> | ||
| - Fix race condition in SecretManager secret caching with double-check locking pattern (#TBD) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this TBD here?
|
/azp run host.public |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| <!-- Please add your release notes in the following format: | ||
| - My change description (#PR) | ||
| --> | ||
| --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Update the release notes and set the issue number to 11560
Issue describing the changes in this PR
Under concurrent load, multiple threads acquire the
_hostSecretsLockserially, each performing redundant secret loading operations. This causes excessive log noise and potentially contributes to DataProtection library failures due to high call volume.Changes:
GetHostSecretsAsync- After acquiring lock, verify_hostSecretsis still null before loadingGetFunctionSecretsAsync- After acquiring lock, verify function secrets aren't already cached before loadingrelease_notes.mddescribing the fixPull request checklist
IMPORTANT: Currently, changes must be backported to the
in-procbranch to be included in Core Tools and non-Flex deployments.in-procbranch is not requiredrelease_notes.mdAdditional information
This addresses CRI #51000000793837 where 72 threads serially loaded secrets, each logging "Loading host secrets" and stressing the DataProtection library. The fix reduces both computational overhead and log volume while potentially resolving related DataProtection failures.
Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.