-
Notifications
You must be signed in to change notification settings - Fork 874
Fix SynchronizationLockException exception when using IMDS for credentials #4211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development
Are you sure you want to change the base?
Conversation
…before the adapter plugins register.
| if (!_imdsRefreshFailed && | ||
| _lastRetrievedCredentials.IsExpiredWithin(TimeSpan.Zero)) | ||
| logger.DebugFormat("[Background Timer] Waiting on lock to refresh ECS IMDS"); | ||
| if (_credentialsSemaphore.Wait(_credentialsLockTimeout)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could _credentialsSemaphore already be disposed here before it tries to wait on it? Yes the exception handler would eat the exception but if we had a check here for _isDisposed we could may be able to avoid an unnecessary exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| { | ||
| try | ||
| logger.DebugFormat("Waiting on lock to refresh ECS IMDS"); | ||
| if (_credentialsSemaphore.Wait(_credentialsLockTimeout)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't this try to wait on a disposed semaphore? Dispose and GetCredentials/GetCredentialsAsync could be called at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be the caller try to call a method on a disposed object which is invalid. Same as a user calling read operations on a disposed Stream. The extra protection done for dispose in the RenewCredentials was done because that is called via the timer and there could be timing issues shutting down the timer and disposing the semaphore.
| // lower refresh rate to speed test | ||
| var provider = DefaultInstanceProfileAWSCredentials.CreateTestDefaultInstanceProfileAWSCredentials(stub, TimeSpan.FromSeconds(5)); | ||
|
|
||
| // fix semaphore (production code's semaphore is created with initialCount:0 in this branch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the comment. The code does initialCount: 1. Also what is "this branch"? Note there are a few places where this comment exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the EnsureSemaphoreReleased and ResetPreviousRefreshFailedFlag as they were unnecessary artifacts when I use the AI agent to do the first pass at creating unit tests.
Draft Mode: I put it in draft mode because I want to do some testing on an actual EC2 instance to confirm the behavior of the unit tests.
Description
The
DefaultInstanceProfileAWSCredentialsused as the underlying "singleton" across service clients to fetch AWS credentials from EC2 IMDS used aReaderWriterLockSlimto control the number of threads accessing IMDS. In V3 this was okay because only the sync version was implemented but in V4 we added the async version andReaderWriterLockSlimis not valid to use across async/await call contexts. I also noticed in the previous logic within the scope of the reader lock it would also initiate a fetch credentials fetch essential the "write" logic that should have only been done when a Write lock was obtained.I reworked the logic to rely on a
SemaphoreSlimto control access and fixed it so the fetch credentials was only done when a thread had acquired the semaphore.The reason more users haven't run into the issue since V4 was GA last spring is generally the
GetCredentialsandGetCredentialsAsyncshould only be returning credentials that were previously fetched during the background call toRenewCredentials. TheRenewCredentialsmethod uses the sync calls. When @msab-john ran into the issue there must have been some other issue between the EC2 instance and IMDS that caused the backgroundRenewCredentialsto fail to acquire credentials and then whenGetCredentialsAsyncwas called it reverted to it's failsafe of directly trying to get credentials that triggered theSynchronizationLockExceptionI also did some refactoring of the
DefaultInstanceProfileAWSCredentialsto make it more friendly for unit testing.Motivation and Context
#4199
Testing
Add new unit tests
Dry Run: Success (DRY_RUN-6ef5e975-cba6-4794-939b-83e6620f3b50)
Direct EC2 Testing: Pending