Sep 0.8.0 - SepWriter Replace StringBuilder with ArrayPool Array

Sep 0.8.0 was released January 19th, 2025 - earlier this year - with two notable changes:

  • 🎯 Remove net7.0 target
  • ✨ SepWriter.Col: Replace StringBuilder with ArrayPool array and DefaultInterpolatedStringHandler

See the v0.8.0 release for all changes and the Sep README on GitHub for full details. Below is a quick (belated) blog post to explain the changes a bit.

SepWriter vs TextWriter

SepWriter hasn’t gotten as much attention as SepReader here, which is partly intentional, as SepWriter is not so much about performance and speed but more about convenience, ease of use and change. And not much has changed about that since Sep was introduced. If you want the best speed for writing you would be better off simply using TextWriter directly (if done correctly).

Let’s do a quick code comparison of SepWriter and TextWriter. Given:

const string ColNameA = "A";
const string ColNameB = "B";

ReadOnlySpan<int> values = [1, 2, 3];

we want to write a multiple of each value to CSV, one multiple per column. With Sep this can be done like below. The main takeaway here is that with Sep you do not have to separate the writing of the header (column names) from the writing of the values.

using var sepWriter = Sep.Default.Writer().ToText();
foreach (var v in values)
{
    using var row = sepWriter.NewRow();
    row[ColNameA].Format(v * 10);
    row[ColNameB].Format(v * 100);
}
Console.WriteLine(sepWriter.ToString());

One way to do this with StringWriter (aka TextWriter) is shown below. While this clearly is longer, the other issue is how you have two separate parts for first writing the header and then writing the rows. Not a big issue here but when you have many columns keeping things in sync can be a challenge and a known source of dev churn.

const char Separator = ';';
using var textWriter = new StringWriter();
// Header
textWriter.Write(ColNameA);
textWriter.Write(Separator);
textWriter.Write(ColNameB);
textWriter.WriteLine();
// Rows
foreach (var v in values)
{
    textWriter.Write(v * 10);
    textWriter.Write(Separator);
    textWriter.Write(v * 100);
    textWriter.WriteLine();
}
Console.WriteLine(textWriter.ToString());

SepWriter.Col: StringBuilder Issue

The above TextWriter code is basically what Sep does under the hood. However, for each column (e.g. var col = row[ColNameA];) Sep would store the column value in a StringBuilder until the row was completed by calling Dispose() on it, at which point the contents of each StringBuilder were written to the underlying TextWriter that SepWriter works over. This way Sep could use all the StringBuilder functionality to support Format (e.g. for ISpanFormattable) and similar. Additionally, Sep would use a pool of StringBuilders to reduce repeated allocations.

StringBuilder does have an underlying issue, though, in that it is basically implemented as a linked list of StringBuilders. This means that in order to write all of its contents to a TextWriter without creating a string, one has to enumerate its chunks like:

foreach (var chunk in sb.GetChunks())
{
    _writer.Write(chunk.Span);
}

Since long columns are rare, it is similarly rare for there to be multiple chunks. Hence, the enumeration mostly just adds overhead, which causes a bit of a performance hit.
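To make the chunking concrete, here is a minimal sketch that counts the chunks of a StringBuilder before and after appending a long value. The helper name CountChunks is made up for illustration, and the exact chunk sizes are an implementation detail of the runtime, so the sketch only reports counts:

```csharp
using System;
using System.Text;

// A default StringBuilder starts with a small buffer (currently 16 chars).
var sb = new StringBuilder();
sb.Append("short");
int shortChunks = CountChunks(sb);

// Appending well beyond the initial capacity makes StringBuilder
// add a new chunk rather than copy into one larger array.
sb.Append(new string('x', 100));
int longChunks = CountChunks(sb);

Console.WriteLine($"short: {shortChunks} chunk(s), long: {longChunks} chunk(s)");

// Hypothetical helper (not part of Sep): counts the chunks that
// GetChunks() enumerates for the given builder.
static int CountChunks(StringBuilder builder)
{
    int count = 0;
    foreach (ReadOnlyMemory<char> chunk in builder.GetChunks())
    {
        count++;
    }
    return count;
}
```

For typical short column values the content fits in a single chunk, so the enumerator machinery is set up for a loop that runs only once.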

SepWriter.Col: Replace StringBuilder with ArrayPool array and DefaultInterpolatedStringHandler

Performance is a feature, and while not the top priority for SepWriter, 0.8.0 addresses this issue by swapping out the internal StringBuilder with a char[] from ArrayPool. To implement formatting Sep then relies on DefaultInterpolatedStringHandler.

However, DefaultInterpolatedStringHandler doesn’t have public APIs for using and managing arrays from the ArrayPool. I was then faced with a choice of either copying the entire implementation of DefaultInterpolatedStringHandler or finding another way. That other way was to use the UnsafeAccessor attribute to access the internal state of DefaultInterpolatedStringHandler, as shown below, and reuse the array from ArrayPool. This is a bit of a hack, but it works and is fast.

// Avoid recreating DefaultInterpolatedStringHandler while being
// able to reuse the array from ArrayPool by using UnsafeAccessor to
// access its internal state. This works fine for net8.0 and
// net9.0 but there are no guarantees that this won't change in the
// future. If it does, consider using #if NET10_0_OR_GREATER or similar
// to address any changes, or consider copying the entire
// DefaultInterpolatedStringHandler source code and adapting it as needed.
 
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[UnsafeAccessor(UnsafeAccessorKind.Field, Name = "_arrayToReturnToPool")]
static extern ref char[]? ArrayToReturnToPool(ref DefaultInterpolatedStringHandler handler);
 
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[UnsafeAccessor(UnsafeAccessorKind.Field, Name = "_pos")]
static extern ref int Position(ref DefaultInterpolatedStringHandler handler);

The downside is that UnsafeAccessor is only supported on net8.0 and above. Given net7.0 is no longer supported I decided it was time to drop support for it.
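For readers who want the general idea without UnsafeAccessor, below is a minimal sketch of formatting a value into a char[] rented from ArrayPool and writing the resulting span to a TextWriter, using public APIs only. It is not Sep's implementation: the helper name WriteFormatted, the initial size of 32 and the doubling strategy are assumptions made for illustration, and it rents per call rather than keeping an array per column.

```csharp
using System;
using System.Buffers;
using System.Globalization;
using System.IO;

using var writer = new StringWriter();
WriteFormatted(writer, 10);
writer.Write(';');
WriteFormatted(writer, 100);
Console.WriteLine(writer.ToString());

// Hypothetical helper (not part of Sep): formats value into a pooled
// char[] and writes the formatted chars to the writer without
// creating an intermediate string.
static void WriteFormatted<T>(TextWriter writer, T value)
    where T : ISpanFormattable
{
    char[] buffer = ArrayPool<char>.Shared.Rent(32);
    try
    {
        int charsWritten;
        // TryFormat returns false when the destination is too small,
        // so swap in a larger pooled array and retry.
        while (!value.TryFormat(buffer, out charsWritten, default,
                                CultureInfo.InvariantCulture))
        {
            char[] larger = ArrayPool<char>.Shared.Rent(buffer.Length * 2);
            ArrayPool<char>.Shared.Return(buffer);
            buffer = larger;
        }
        ReadOnlySpan<char> formatted = buffer.AsSpan(0, charsWritten);
        writer.Write(formatted);
    }
    finally
    {
        ArrayPool<char>.Shared.Return(buffer);
    }
}
```

This only covers single values that implement ISpanFormattable. Handling interpolated strings, alignment and format strings is what DefaultInterpolatedStringHandler provides, which is why reusing it is attractive compared to hand-rolling the formatting.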

I don’t have detailed benchmarks here, but the end result is that for a simple case of writing multiple short columns SepWriter is 10-15% faster while still having zero allocations after warmup/first rows. Additionally, the code is simpler, even with the UnsafeAccessor code.

For more details take a look at the pull request SepWriter.Col: Replace StringBuilder with ArrayPool array and DefaultInterpolatedStringHandler.

That’s all!

2025.05.07