Reducing allocations by caching with StringBuilderCache: A deep dive on StringBu...
source link: https://andrewlock.net/a-deep-dive-on-stringbuilder-part-5-reducing-allocations-by-caching-stringbuilders-with-stringbuildercache/
So far in this series we've looked in detail at StringBuilder, and how it works under-the-hood. In this post I look at a different type, the internal StringBuilderCache type. This type is used internally in .NET Core and .NET Framework to reduce the cost of creating a StringBuilder. In this post I describe why it's useful, run a small benchmark to see its impact, and walk through the code to show how it works.
Reducing allocations to improve performance
In the first post in this series, I discussed how .NET has focused on performance recently, with a particular focus on reducing allocations. This isn't a new problem for .NET, so in .NET 1.1 the StringBuilder class was introduced. This lets you efficiently concatenate strings, characters, and ToString()ed objects without creating a lot of intermediate strings.
However, StringBuilder itself is a class that is allocated on the heap. As we've seen throughout this series, internally, the StringBuilder uses a char[] and a linked list of StringBuilders to store the intermediate values. All of these are allocated on the heap.
In cases where you're doing a lot of string concatenation, the instances of the StringBuilder class (including the internal linked values) and the internal char[] buffer can put some pressure on the GC. That's where StringBuilderCache comes in.
Using StringBuilderCache to reduce StringBuilder allocations
StringBuilderCache is an internal class that has been present in .NET Framework and .NET Core for a looong time (I couldn't figure out exactly when, but it has been there since at least 2014, so .NET 4.5-ish). Being internal, it's not directly usable by user code, but it's used by various classes in the heart of .NET.
The observation behind StringBuilderCache is that in most cases where we need to build up a string, the size of the string will be relatively small. For example, when formatting dates and times, you expect the final string to be relatively small. There are many other cases like this, where you know the final string is going to be relatively small, but you also know the function will be called relatively frequently.
StringBuilderCache works (perhaps unsurprisingly) by caching a StringBuilder instance, and "loaning" it out whenever a StringBuilder is required. Calling code can request a StringBuilder instance and return it to the cache when it's finished with it. That means only a single instance of StringBuilder needs to be created per thread, as it can keep being re-used, reducing GC pressure on the app.
If your first thought is "that doesn't sound thread-safe", don't worry. As you'll see later, there's a single StringBuilder per thread, so that isn't a problem.
Let's take this toy sample which concatenates a user's name using the StringBuilderCache.

var user = new User
{
    FirstName = "Andrew",
    LastName = "Lock",
    Nickname = "Sock",
};

int requiredCapacity = user.FirstName.Length
    + user.LastName.Length
    + user.Nickname.Length
    + 3;

// Fetch a StringBuilder of the required capacity. Instead of
// var sb = new StringBuilder(requiredCapacity);
StringBuilder sb = StringBuilderCache.Acquire(requiredCapacity);

sb.Append(user.FirstName);
sb.Append(user.LastName);
sb.Append(" (");
sb.Append(user.Nickname);
sb.Append(')');

// return the StringBuilder to the cache and retrieve the string. Instead of
// string fullName = sb.ToString();
string fullName = StringBuilderCache.GetStringAndRelease(sb);
As you can see, using StringBuilderCache is pretty simple, and mostly analogous to using a StringBuilder directly. The question is, does it improve performance?
Benchmarking StringBuilderCache
To see the impact of using StringBuilderCache over StringBuilder directly for a simple snippet like the above, I turned to BenchmarkDotNet. I copied the .NET 5 implementation of StringBuilderCache into my project (we'll look at the implementation shortly), and created the following simple benchmark, directly analogous to the above example:

[MemoryDiagnoser]
public class StringBuilderBenchmark
{
    private const string FirstName = "Andrew";
    private const string LastName = "Lock";
    private const string Nickname = "Sock";

    [Benchmark(Baseline = true)]
    public string UsingStringBuilder()
    {
        var sb = new StringBuilder();
        sb.Append(FirstName);
        sb.Append(LastName);
        sb.Append(" (");
        sb.Append(Nickname);
        sb.Append(')');
        return sb.ToString();
    }

    [Benchmark]
    public string UsingStringBuilderCache()
    {
        var sb = StringBuilderCache.Acquire();
        sb.Append(FirstName);
        sb.Append(LastName);
        sb.Append(" (");
        sb.Append(Nickname);
        sb.Append(')');
        return StringBuilderCache.GetStringAndRelease(sb);
    }
}
The results, running on my relatively old home laptop, are as follows:
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.1052 (20H2/October2020Update)
Intel Core i7-7500U CPU 2.70GHz (Kaby Lake), 1 CPU, 4 logical and 2 physical cores
.NET SDK=5.0.104
[Host] : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
DefaultJob : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
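| Method                  | Mean     | Error    | StdDev    | Ratio | Gen 0  | Gen 1 | Allocated |
|-------------------------|----------|----------|-----------|-------|--------|-------|-----------|
| UsingStringBuilder      | 87.76 ns | 1.832 ns | 2.382 ns  | 1.00  | 0.1262 | -     | 264 B     |
| UsingStringBuilderCache | 67.56 ns | 3.670 ns | 10.588 ns | 0.69  | 0.0267 | -     | 56 B      |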
As you can see, using the StringBuilderCache gives a relative speed boost of around 30% (a Ratio of 0.69) and allocates a fraction as much (56 vs 264 bytes).
Obviously, these are small speedups, but on a hot path, these sorts of micro-optimisations can be worthwhile.
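The 56 B figure for the cached version is consistent with the returned string being the only allocation left. As a rough sanity check, here is a small sketch of the arithmetic. The layout constants are assumptions about 64-bit .NET (they are not part of the benchmark output), and StringSizeEstimate is a hypothetical helper name used only for illustration.

```csharp
// Rough size estimate for a string on 64-bit .NET. The layout constants are
// assumptions about the runtime, not values taken from the benchmark.
static class StringSizeEstimate
{
    public static int EstimateBytes(int length)
    {
        const int objectOverhead = 16; // assumed: object header + method table pointer
        const int lengthField = 4;     // int holding the character count
        const int terminator = 2;      // trailing null char
        int raw = objectOverhead + lengthField + terminator + (2 * length);
        return (raw + 7) & ~7;         // round up to an 8-byte boundary
    }

    // "AndrewLock (Sock)" is the 17-character string built by the benchmark.
    public static int ResultSize() => EstimateBytes("AndrewLock (Sock)".Length);
}
```

Under those assumptions, StringSizeEstimate.ResultSize() comes out at 56, matching the Allocated column for UsingStringBuilderCache.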
We've looked at the benefit StringBuilderCache can bring. The next question is: how does it do it?
Looking at the implementation of StringBuilderCache
You can find the latest implementation of StringBuilderCache for .NET on GitHub, which is the implementation I show below. I'll give the whole implementation first, and then walk through it.
This version uses nullable reference types. You can also find an implementation for .NET Framework on https://referencesource.microsoft.com.
namespace System.Text
{
    /// <summary>Provide a cached reusable instance of stringbuilder per thread.</summary>
    internal static class StringBuilderCache
    {
        // The value 360 was chosen in discussion with performance experts as a compromise between using
        // as little memory per thread as possible and still covering a large part of short-lived
        // StringBuilder creations on the startup path of VS designers.
        internal const int MaxBuilderSize = 360;
        private const int DefaultCapacity = 16; // == StringBuilder.DefaultCapacity

        [ThreadStatic]
        private static StringBuilder? t_cachedInstance;

        /// <summary>Get a StringBuilder for the specified capacity.</summary>
        /// <remarks>If a StringBuilder of an appropriate size is cached, it will be returned and the cache emptied.</remarks>
        public static StringBuilder Acquire(int capacity = DefaultCapacity)
        {
            if (capacity <= MaxBuilderSize)
            {
                StringBuilder? sb = t_cachedInstance;
                if (sb != null)
                {
                    // Avoid stringbuilder block fragmentation by getting a new StringBuilder
                    // when the requested size is larger than the current capacity
                    if (capacity <= sb.Capacity)
                    {
                        t_cachedInstance = null;
                        sb.Clear();
                        return sb;
                    }
                }
            }

            return new StringBuilder(capacity);
        }

        /// <summary>Place the specified builder in the cache if it is not too big.</summary>
        public static void Release(StringBuilder sb)
        {
            if (sb.Capacity <= MaxBuilderSize)
            {
                t_cachedInstance = sb;
            }
        }

        /// <summary>ToString() the stringbuilder, Release it to the cache, and return the resulting string.</summary>
        public static string GetStringAndRelease(StringBuilder sb)
        {
            string result = sb.ToString();
            Release(sb);
            return result;
        }
    }
}
The code is helpfully heavily commented, but let's walk through it anyway. I'm going to start at the end, and look at the GetStringAndRelease and Release methods first.
internal const int MaxBuilderSize = 360;

[ThreadStatic]
private static StringBuilder? t_cachedInstance;

public static string GetStringAndRelease(StringBuilder sb)
{
    string result = sb.ToString();
    Release(sb);
    return result;
}

public static void Release(StringBuilder sb)
{
    if (sb.Capacity <= MaxBuilderSize)
    {
        t_cachedInstance = sb;
    }
}
The GetStringAndRelease() method is very simple: it calls ToString() on the provided StringBuilder, calls Release() on the builder, and then returns the string.
The Release method is where the "caching" happens. The method checks whether the provided StringBuilder's current capacity is less than or equal to the MaxBuilderSize constant (360), and if it is, it stores the StringBuilder in the [ThreadStatic] t_cachedInstance field.
As mentioned in the code comments, the value of 360 was chosen to be large enough to be useful, but not so large that a lot of memory is used per thread. If this check wasn't here, and you released a StringBuilder with a large capacity, then you'd keep holding on to that memory without ever releasing it, essentially causing a memory leak.
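To make the capacity check concrete, here is a minimal sketch of the same condition in isolation. ReleaseCheck and WouldBeCached are hypothetical names used only for illustration; the 360 limit mirrors MaxBuilderSize from the code above.

```csharp
using System.Text;

static class ReleaseCheck
{
    private const int MaxBuilderSize = 360; // same limit as StringBuilderCache

    // Mirrors the condition inside Release(): only small builders are kept.
    public static bool WouldBeCached(StringBuilder sb) => sb.Capacity <= MaxBuilderSize;

    public static bool SmallBuilderIsCached()
    {
        var sb = new StringBuilder();
        sb.Append("AndrewLock (Sock)"); // 17 chars, capacity stays well under 360
        return WouldBeCached(sb);
    }

    public static bool LargeBuilderIsCached()
    {
        var sb = new StringBuilder();
        sb.Append('x', 500); // Capacity is always >= Length, so it now exceeds 360
        return WouldBeCached(sb);
    }
}
```

SmallBuilderIsCached() returns true and LargeBuilderIsCached() returns false, so a builder that has grown past the limit is simply dropped and left for the GC.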
Marking the t_cachedInstance field as [ThreadStatic] means that each separate thread in your application will see a different StringBuilder instance in t_cachedInstance. This avoids any chance of concurrency issues due to multiple threads accessing the field.
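The per-thread behaviour of [ThreadStatic] can be seen with a small sketch. ThreadStaticDemo and t_slot are hypothetical names for illustration and are not part of StringBuilderCache.

```csharp
#nullable enable
using System;
using System.Text;
using System.Threading;

static class ThreadStaticDemo
{
    // Each thread gets its own copy of this field, initially null.
    [ThreadStatic]
    private static StringBuilder? t_slot;

    // Runs two threads: each one finds the slot empty and stores its own builder.
    public static bool EachThreadSeesOwnSlot()
    {
        bool firstSawEmpty = false, secondSawEmpty = false;
        StringBuilder? first = null, second = null;

        var t1 = new Thread(() =>
        {
            firstSawEmpty = t_slot == null;
            t_slot = new StringBuilder("one");
            first = t_slot;
        });
        var t2 = new Thread(() =>
        {
            secondSawEmpty = t_slot == null; // not affected by t1's assignment
            t_slot = new StringBuilder("two");
            second = t_slot;
        });

        // Run sequentially so t1 has definitely written before t2 reads.
        t1.Start(); t1.Join();
        t2.Start(); t2.Join();

        return firstSawEmpty && secondSawEmpty && !ReferenceEquals(first, second);
    }
}
```

Even though the first thread has already assigned the field, the second thread still sees null, which is why the cache needs no locking.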
That covers the release part of the cache. Let's look at the acquire part now:
internal const int MaxBuilderSize = 360;
private const int DefaultCapacity = 16; // == StringBuilder.DefaultCapacity

[ThreadStatic]
private static StringBuilder? t_cachedInstance;

public static StringBuilder Acquire(int capacity = DefaultCapacity)
{
    if (capacity <= MaxBuilderSize)
    {
        StringBuilder? sb = t_cachedInstance;
        if (sb != null)
        {
            // Avoid stringbuilder block fragmentation by getting a new StringBuilder
            // when the requested size is larger than the current capacity
            if (capacity <= sb.Capacity)
            {
                t_cachedInstance = null;
                sb.Clear();
                return sb;
            }
        }
    }

    return new StringBuilder(capacity);
}
When you call Acquire, you request a capacity for the StringBuilder. If the capacity is bigger than the cache's maximum capacity, then we bypass the cached value entirely, and just return a new StringBuilder. Similarly, if we haven't cached a StringBuilder yet, you just get a new one. For these cases, the StringBuilderCache doesn't add any value.
We also check whether the requested capacity is less than or equal to the cached StringBuilder's capacity. As mentioned in the comment, if we return a StringBuilder with a capacity that's smaller than the requested capacity, we can be pretty much certain we're going to have to grow the StringBuilder. That's fine, but it has a performance impact, so it's better in these cases to just return a new StringBuilder.
If you're in the sweet spot (requesting a capacity no larger than MaxBuilderSize and no larger than the cached StringBuilder.Capacity) then you can reuse the cached instance. The cache field is set to null (so if you call Acquire again before Release, you don't get the same builder twice), and the StringBuilder is "reset" by calling Clear(). You can then use the StringBuilder as normal, finally calling GetStringAndRelease() to retrieve your built value, and to (potentially) add the builder to the cache.
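The three Acquire paths can be exercised with a short sketch. LocalStringBuilderCache is a condensed local copy of the logic shown above, renamed to make clear it is not the internal framework type, and AcquireDemo is a hypothetical helper for illustration only.

```csharp
#nullable enable
using System;
using System.Text;

// Condensed local copy of the logic shown above (not the framework type).
static class LocalStringBuilderCache
{
    private const int MaxBuilderSize = 360;

    [ThreadStatic]
    private static StringBuilder? t_cachedInstance;

    public static StringBuilder Acquire(int capacity = 16)
    {
        if (capacity <= MaxBuilderSize)
        {
            StringBuilder? sb = t_cachedInstance;
            if (sb != null && capacity <= sb.Capacity)
            {
                t_cachedInstance = null; // the builder is now on loan
                sb.Clear();
                return sb;
            }
        }
        return new StringBuilder(capacity);
    }

    public static string GetStringAndRelease(StringBuilder sb)
    {
        string result = sb.ToString();
        if (sb.Capacity <= MaxBuilderSize) t_cachedInstance = sb;
        return result;
    }
}

static class AcquireDemo
{
    // Release then Acquire on the same thread hands back the same, cleared instance.
    public static bool ReusesReleasedBuilder()
    {
        StringBuilder first = LocalStringBuilderCache.Acquire();
        first.Append("hello");
        LocalStringBuilderCache.GetStringAndRelease(first);
        StringBuilder second = LocalStringBuilderCache.Acquire();
        bool same = ReferenceEquals(first, second) && second.Length == 0;
        LocalStringBuilderCache.GetStringAndRelease(second);
        return same;
    }

    // Two Acquire calls with no Release in between yield distinct builders.
    public static bool NestedAcquireGetsNewBuilder()
    {
        StringBuilder outer = LocalStringBuilderCache.Acquire();
        StringBuilder inner = LocalStringBuilderCache.Acquire();
        bool distinct = !ReferenceEquals(outer, inner);
        LocalStringBuilderCache.GetStringAndRelease(inner);
        LocalStringBuilderCache.GetStringAndRelease(outer);
        return distinct;
    }

    // Asking for more than the cached builder's capacity bypasses the cache.
    public static bool LargerRequestGetsNewBuilder()
    {
        StringBuilder cached = LocalStringBuilderCache.Acquire();
        LocalStringBuilderCache.GetStringAndRelease(cached);
        StringBuilder bigger = LocalStringBuilderCache.Acquire(cached.Capacity + 1);
        return !ReferenceEquals(cached, bigger);
    }
}
```

All three methods return true. The nested case is worth noting: because the cache holds only one builder per thread, an inner Acquire simply allocates, and whichever builder is released last is the one that stays cached.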
That's all there is to it: a simple, single-value cache for StringBuilders. In the worst case it's no worse than using new StringBuilder(), and in the best case you can avoid a few allocations.
Using StringBuilderCache in your own projects
The only downside to StringBuilderCache is that you can't easily use it in your own projects! StringBuilderCache is internal, so there's no way to use it directly outside the core .NET libraries.
Luckily, the code is simple enough (and the license permissive enough) that you can generally copy-paste the implementation into your own code. As an example, we use a similar implementation in the Datadog .NET Tracer library.
Another possibility, if you're trying to reduce the impact of StringBuilders on a hot path, is to look at another internal type, ValueStringBuilder. I'll look at this type in another post.
Summary
In this post I discussed the need to reduce allocations for performance reasons, and the role of StringBuilder in helping with that. However, the StringBuilder class itself must be allocated. StringBuilderCache provides a way to reduce the impact of allocating a StringBuilder by reusing a single StringBuilder instance per thread. I showed in a micro-benchmark that this can reduce allocation and improve performance. I then walked through the code to show how it was achieved.