Sep 0.9.0 - Async Support

Sep 0.9.0 was released on February 1st, 2025, earlier this year, with a major new feature: async support for both SepReader and SepWriter.

See the v0.9.0 release for all changes (the release includes a few other niceties) and the Sep README on GitHub for full details. This (belated) blog post focuses on the pragmatic approach used to add async support. First, however, comes a copy of the README section on async support as an introduction, followed by details on how the support was added.

Async Support

Sep supports efficient ValueTask based asynchronous reading and writing.

However, both SepReader.Row and SepWriter.Row are ref structs, since they point to internal state and should only be used one at a time. async/await usage is therefore only supported on C# 13.0+, which added support for “ref and unsafe in iterators and async methods” as covered in What’s new in C# 13. Please consult that document for the limitations and constraints that follow from this.

Similarly, SepReader only implements IAsyncEnumerable<SepReader.Row> (and IEnumerable<SepReader.Row>) on .NET 9.0+/C# 13.0+, since only from then on are the interfaces annotated with allows ref struct for T.

Async support is provided on the existing SepReader and SepWriter types similar to how TextReader and TextWriter support both sync and async usage. This means you as a developer are responsible for calling async methods and using await when necessary. See below for a simple example and consult tests on GitHub for more examples.

var text = """
           A;B;C;D;E;F
           Sep;🚀;1;1.2;0.1;0.5
           CSV;;2;2.2;0.2;1.5
           
           """; // Empty line at end is for line ending

using var reader = await Sep.Reader().FromTextAsync(text);
await using var writer = reader.Spec.Writer().ToText();
await foreach (var readRow in reader)
{
    await using var writeRow = writer.NewRow(readRow);
}
Assert.AreEqual(text, writer.ToString());

Note how, for SepReader, FromTextAsync is suffixed with Async to indicate async creation. This is because the reader has to read the first row of the source at creation to determine both the separator and, if the file has a header, the column names. The From*Async call therefore has to be awaited. After that, rows can be enumerated asynchronously simply by putting await before foreach. If you forget to do that, the rows are enumerated synchronously.

For SepWriter the usage is reversed. The To* methods have no Async variants, since creation is synchronous; for example, a StreamWriter is created by a simple constructor call. Nothing is written until a header or row is defined and Dispose/DisposeAsync is called on the row.

For the reader nothing needs to be asynchronously disposed, so using does not require await. For SepWriter, however, dispose may have to write/flush data to the underlying TextWriter, so it should go through DisposeAsync, which means you must use await using.

To support cancellation, many methods have overloads that accept a CancellationToken, such as the From*Async methods for creating a SepReader or NewRow for SepWriter. Consult the Public API Reference for the full set of available methods.

Additionally, both SepReaderOptions and SepWriterOptions feature the bool AsyncContinueOnCapturedContext option, which is forwarded to internal ConfigureAwait calls. See the ConfigureAwait FAQ for details.
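To illustrate what forwarding such a bool to ConfigureAwait can look like, here is a minimal sketch. It is not Sep's internal code: the ChunkedReads class, the CountCharsAsync method and its continueOnCapturedContext parameter are hypothetical names used only for illustration. Only TextReader.ReadAsync and Task<T>.ConfigureAwait are real .NET APIs here.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// false means continuations do not have to resume on the captured context.
var charCount = await ChunkedReads.CountCharsAsync(
    new StringReader("A;B;C"), continueOnCapturedContext: false);
Console.WriteLine(charCount); // 5

// Hypothetical helper for illustration only, not part of Sep.
static class ChunkedReads
{
    public static async ValueTask<int> CountCharsAsync(
        TextReader reader, bool continueOnCapturedContext)
    {
        var buffer = new char[1024];
        var total = 0;
        int read;
        // The bool is forwarded unchanged to each internal ConfigureAwait call.
        while ((read = await reader.ReadAsync(buffer, 0, buffer.Length)
                    .ConfigureAwait(continueOnCapturedContext)) > 0)
        {
            total += read;
        }
        return total;
    }
}
```

Library code typically passes false so that it does not depend on a UI or request context, while exposing the choice as an option leaves that decision to the caller.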

Pragmatic Async Support Implementation

async/await is viral. For any async method call, however deep it may be, every method from the top down to that deepest async method needs to be async too. Supporting both sync and async therefore becomes problematic: you either try to refactor, while still having to copy entire method chains, or you copy-paste everything.

For Sep I chose the latter, with a twist: isolate the methods that call IO, e.g. methods using TextReader for SepReader and TextWriter for SepWriter. Then create two separate files for these methods: one for the synchronous methods and one for the asynchronous ones. These files are nearly identical, differing only in a preprocessor directive defined at the top of each file. For example:

src/Sep/SepReader.IO.Async.cs

//#define SYNC

src/Sep/SepReader.IO.Sync.cs

#define SYNC

The rest of each file is identical. Within each method, #if/#else/#endif preprocessor switches handle the differences in method signature and implementation. All async method names are suffixed with Async to differentiate them from the sync methods. For example:

#if SYNC
    internal void Initialize(in SepReaderOptions options)
#else
    internal async ValueTask InitializeAsync(SepReaderOptions options,
        CancellationToken cancellationToken)
#endif
    {
#if SYNC
        if (MoveNext())
#else
        if (await MoveNextAsync(cancellationToken))
#endif

This makes the code easy to maintain. For consistency, unit tests ensure the two files are kept in sync in the face of changes by comparing each pair of files and checking that all lines are the same except for the first line.
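Such a consistency check can be sketched roughly as follows. This is an illustration under assumptions, not Sep's actual test code: the FilePairCheck class, the MatchExceptFirstLine method and the Example.IO.*.cs file names are hypothetical, and the sketch writes its own throwaway file pair to a temp directory instead of reading real source files.

```csharp
using System;
using System.IO;
using System.Linq;

// Create a throwaway pair of files that stands in for a real Sync/Async pair.
var dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(dir);
var syncFile = Path.Combine(dir, "Example.IO.Sync.cs");
var asyncFile = Path.Combine(dir, "Example.IO.Async.cs");
File.WriteAllLines(syncFile, new[] { "#define SYNC", "// shared body" });
File.WriteAllLines(asyncFile, new[] { "//#define SYNC", "// shared body" });

Console.WriteLine(FilePairCheck.MatchExceptFirstLine(syncFile, asyncFile)); // True

Directory.Delete(dir, recursive: true);

// Hypothetical helper for illustration only, not part of Sep.
static class FilePairCheck
{
    // True when both files have the same number of lines and every line
    // after the first is identical; the first line is deliberately ignored.
    public static bool MatchExceptFirstLine(string syncPath, string asyncPath)
    {
        var syncLines = File.ReadAllLines(syncPath);
        var asyncLines = File.ReadAllLines(asyncPath);
        return syncLines.Length > 0
            && syncLines.Length == asyncLines.Length
            && syncLines.Skip(1).SequenceEqual(asyncLines.Skip(1));
    }
}
```

In a real test project the same comparison would sit inside a test method and assert on the result for each pair of source files.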

This approach “avoids” duplicating logic while maintaining separate implementations for sync and async operations. The shared logic is preserved by keeping the core functionality identical, with only the method signatures and specific async-related keywords differing.

By using preprocessor directives, each variant has only the relevant code for either sync or async, ensuring no unnecessary runtime checks or sync over async.
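The pattern described above can be tried out in a single self-contained file. The following is a minimal sketch, not code from Sep: LineCounter, CountLines and CountLinesAsync are hypothetical names, and a single file is toggled by hand here, whereas Sep keeps two files that differ only in their first line.

```csharp
#define SYNC // comment out this line to compile the async variant instead

using System;
using System.IO;
using System.Threading.Tasks;

var reader = new StringReader("A;B;C\n1;2;3\n");
#if SYNC
var lines = LineCounter.CountLines(reader);
#else
var lines = await LineCounter.CountLinesAsync(reader);
#endif
Console.WriteLine(lines); // 2

// Hypothetical helper for illustration only, not part of Sep.
static class LineCounter
{
#if SYNC
    public static int CountLines(TextReader reader)
#else
    public static async ValueTask<int> CountLinesAsync(TextReader reader)
#endif
    {
        var count = 0;
        // Only the IO call differs; the surrounding logic is shared text.
#if SYNC
        while (reader.ReadLine() is not null)
#else
        while (await reader.ReadLineAsync() is not null)
#endif
        {
            ++count;
        }
        return count;
    }
}
```

Whichever variant is compiled contains no trace of the other, so there is no runtime branching between sync and async paths.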

Performance

All async methods are implemented using ValueTask to avoid the overhead of allocating Task instances when not needed (e.g. if data is already available). The overhead is therefore minimal, as can be seen in the benchmark results further down (based on in-memory data in the form of a StringReader).

Sep_Async is only 1.07x slower than sync Sep at the very lowest level of simply parsing the CSV file. For any real workload the difference is negligible.
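Why ValueTask helps when data is already available can be shown with a small sketch. It is not Sep's implementation: BufferedCharSource and ReadCharAsync are hypothetical names, and the buffering scheme is an assumption chosen only to show a synchronous fast path next to an asynchronous refill.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

var source = new BufferedCharSource(new StringReader("A;B"));
var first = await source.ReadCharAsync();          // refills the buffer
var second = source.ReadCharAsync();               // served from the buffer
Console.WriteLine((char)first);                    // A
Console.WriteLine(second.IsCompletedSuccessfully); // True
Console.WriteLine((char)second.Result);            // ;

// Hypothetical type for illustration only, not part of Sep.
sealed class BufferedCharSource
{
    readonly TextReader _reader;
    readonly char[] _buffer = new char[4096];
    int _index;
    int _count;

    public BufferedCharSource(TextReader reader) => _reader = reader;

    // Returns the next char as an int, or -1 at end of input.
    public ValueTask<int> ReadCharAsync()
    {
        // Fast path: the char is already buffered, so the result is wrapped
        // directly in a ValueTask and no Task object is allocated.
        if (_index < _count)
        {
            return new ValueTask<int>((int)_buffer[_index++]);
        }
        // Slow path: the buffer is empty and has to be refilled from the reader.
        return FillAndReadAsync();
    }

    async ValueTask<int> FillAndReadAsync()
    {
        _count = await _reader.ReadAsync(_buffer, 0, _buffer.Length);
        _index = 0;
        return _count > 0 ? (int)_buffer[_index++] : -1;
    }
}
```

With a buffer of a few thousand chars almost every call takes the fast path, which is consistent with the small overhead reported for in-memory data.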

BenchmarkDotNet v0.14.0, Windows 10 (10.0.19044.3086/21H2/November2021Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 9.0.102
  [Host]     : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
  Job-RANURT : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2

Job=Job-RANURT  EnvironmentVariables=DOTNET_GCDynamicAdaptationMode=0  Runtime=.NET 9.0  
Toolchain=net90  InvocationCount=Default  IterationTime=350ms  
MaxIterationCount=15  MinIterationCount=5  WarmupCount=6  
Quotes=False  Reader=String  

| Method       | Scope | Rows    | Mean         | Ratio | MB  | MB/s    | ns/row | Allocated     | Alloc Ratio |
|------------- |------ |-------- |-------------:|------:|----:|--------:|-------:|--------------:|------------:|
| Sep______    | Row   | 50000   |     2.230 ms |  1.00 |  29 | 13088.4 |   44.6 |       1.09 KB |        1.00 |
| Sep_Async    | Row   | 50000   |     2.379 ms |  1.07 |  29 | 12264.0 |   47.6 |       1.02 KB |        0.93 |
| Sep_Unescape | Row   | 50000   |     2.305 ms |  1.03 |  29 | 12657.6 |   46.1 |       1.02 KB |        0.93 |

That’s all!

2025.05.08