-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: get rid of MemoryStream in KeyRingBasedDataProtector
#59322
base: main
Are you sure you want to change the base?
perf: get rid of MemoryStream in KeyRingBasedDataProtector
#59322
Conversation
var keySize = sizeof(Guid); | ||
int totalPurposeLen = 4 + keySize + 4; | ||
|
||
var purposeLengthsPool = ArrayPool<int>.Shared.Rent(purposes.Length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you measure if it was worth renting an array only to prevent an extra call to SecureUtf8Encoding.GetByteCount
and Measure7BitEncodedUIntLength
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean just pay the cost twice,? Yeah, should probably measure it.
Side thought: I'm guessing we don't actually expect many purposes. Another thing we could do here, if you're concerned about ArrayPool
overhead, is use the stack - something like:
int[]? lease = null;
Span<int> lengths = purposes.Length <= 32 ? stackalloc int[purposes.Length] : (lease = ArrayPool<int>.Shared.Rent(purposes.Length)).AsSpan(0, purposes.Length);
...
if (lease is not null)
{
ArrayPool<int>.Shared.Return(lease);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method | Mean | Error | StdDev | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|---|
MemoryStream | 141.4 ns | 2.62 ns | 3.31 ns | 0.0067 | - | - | 512 B |
Manual | 101.8 ns | 0.98 ns | 0.92 ns | 0.0023 | 0.0002 | 0.0002 | 320 B |
Manual_CalculateByteCountEveryTime | 124.7 ns | 1.62 ns | 1.52 ns | 0.0017 | - | - | 128 B |
Manual_StackAllocForSmallPurposeArrays | 122.4 ns | 0.70 ns | 0.54 ns | 0.0014 | - | - | 128 B |
It seems like that last option Marc suggested is the most balanced - we indeed dont expect many purposes (I think), and this solution does not contest for the ArrayPool
rent.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the last two doing the same work as Manual
? it seems odd that the perf is worse than Manual
- we're doing less work (by not touching the pool at all) - why is it taking longer? question: can you try this with [SkipLocalsInit]
on the method? or the entire type for fairness (this is applied globally in aspnetcore, I believe, but I don't know whether it'll apply to your benchmark which looks to be custom).
I also don't know why the memory is less if it is doing the same work - the only non-amortized allocation here is the result byte[]
; so I assume this is a different data test than Manual
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you try this with [SkipLocalsInit] on the method? or the entire type for fairness (this is applied globally in aspnetcore, I believe
Don't think it is. #26586
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have talked with Marc: he launched the benchmarks on his machine and there is no crazy allocs which are confusing. I think we should merge with the current state of code (stackalloc for under 32 dynamic length of purposes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you try this with [SkipLocalsInit] on the method? or the entire type for fairness (this is applied globally in aspnetcore, I believe
Don't think it is. #26586
We should revisit this, IMHO.
src/DataProtection/DataProtection/src/KeyManagement/KeyRingBasedDataProtector.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/KeyManagement/KeyRingBasedDataProtector.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/KeyManagement/KeyRingBasedDataProtector.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/KeyManagement/KeyRingBasedDataProtector.cs
Outdated
Show resolved
Hide resolved
...taProtection/test/Microsoft.AspNetCore.DataProtection.Tests/Internal/Int7BitEncodingUtils.cs
Show resolved
Hide resolved
...ft.AspNetCore.DataProtection.Tests/KeyManagement/AdditionalAuthenticatedDataTemplateTests.cs
Show resolved
Hide resolved
@@ -367,7 +367,8 @@ internal static byte[] BuildAadTemplateBytes(string[] purposes) | |||
var keySize = sizeof(Guid); | |||
int totalPurposeLen = 4 + keySize + 4; | |||
|
|||
var purposeLengthsPool = ArrayPool<int>.Shared.Rent(purposes.Length); | |||
int[]? lease = null; | |||
Span<int> purposeLengthsPool = purposes.Length <= 32 ? stackalloc int[purposes.Length] : (lease = ArrayPool<int>.Shared.Rent(purposes.Length)).AsSpan(0, purposes.Length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 from me, interested in @sebastienros thoughts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing we don't actually expect many purposes
I don't know exactly where this DP is used, in the case of antiforgery we saw a single purpose
. But assuming this is the default in any ASP.NET it may be used for way more things. Would be good to get this data for the benchmarks, because we might just screw other scenarios. At least we could add a [Parameter]
to the benchmark with other numbers of purposes. (1, 8, 64) for starter? But it might also be ever better for other scenarios where the allocations would be worse and this would make even more sense than just for AF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there shouldn't really be any allocation here - or a least, it is amortized over the pool usage; we aren't borrowing it for an extended period of time or anything, so: the only time we can really get into problems here is if the pool itself is already totally borked and drowning, in which case we'd be the victim, not the perpetrator
Since the data protector is supposed to be reused, and it is for AF in aspnetcore (https://github.com/dotnet/aspnetcore/blob/74cdf3811fc59b29c07b49a949c5e335da870beb/src/Antiforgery/src/Internal/DefaultAntiforgeryTokenSerializer.cs is registered as a singleton), there shouldn't be any measurable advantage in improving the creation of this "template" field. It shows in the micro benchmark, but in practice this shouldn't in an actual app (unless users create this protector again and again). We should just be saving a single allocation for the first request. If we are seeing these streams allocated in a web benchmarks then maybe we should find the source, and change it to reuse the data protector. |
Yup, already discussed offline. If we're happy that this is functionally correct, suggest we get this merged, and move on to look deeper at the per-data protect/unprotect paths. |
/azp run |
Azure Pipelines successfully started running 3 pipeline(s). |
While taking a look at allocations in DataProtection layer executed on the linux machine in benchmarks (see #59287), I found it is using a
MemoryStream
, which i decided to improve (get rid of). Benchmark codeNew code is set to be executed only under NET10_OR_GREATER because `Span<>'ish APIs are not supported for netstandard2.0 \ netFx.
Notes:
string[] purposes
is having multiple items where string values are of a "big" length, and when it is a singlepurpose
with lets say "SamplePurpose" value. Here are the results when there are 9 purposes:big thanks to @mgravell for help
Related to #59287