Sep 0.9.0 - Async Support
Sep 0.9.0 was released February 1st, 2025 - earlier this year - with a major new feature: async support for both SepReader and SepWriter.
See the v0.9.0 release for all changes (the release includes a few other niceties) and the Sep README on GitHub for full details. Below is a (belated) blog post focusing on the pragmatic approach used to add async support. First, however, here is a copy of the section on async support from the Sep README as an introduction, followed by details on how this support was added.
Async Support
Sep supports efficient ValueTask-based asynchronous reading and writing.
However, given that both SepReader.Row and SepWriter.Row are ref structs, as they point to internal state and should only be used one at a time, async/await usage is only supported on C# 13.0+ as this has support for “ref and unsafe in iterators and async methods” as covered in What’s new in C# 13. Please consult the details there for limitations and constraints due to this.
Similarly, SepReader only implements IAsyncEnumerable<SepReader.Row> (and IEnumerable<SepReader.Row>) for .NET 9.0+/C# 13.0+ since then the interfaces have been annotated with allows ref struct for T.
Async support is provided on the existing SepReader and SepWriter types similar to how TextReader and TextWriter support both sync and async usage. This means you as a developer are responsible for calling async methods and using await when necessary. See below for a simple example and consult tests on GitHub for more examples.
var text = """
           A;B;C;D;E;F
           Sep;🚀;1;1.2;0.1;0.5
           CSV;✅;2;2.2;0.2;1.5

           """; // Empty line at end is for line ending

using var reader = await Sep.Reader().FromTextAsync(text);
await using var writer = reader.Spec.Writer().ToText();
await foreach (var readRow in reader)
{
    await using var writeRow = writer.NewRow(readRow);
}
Assert.AreEqual(text, writer.ToString());
Note how for SepReader the FromTextAsync method is suffixed with Async to indicate async creation. This is because the reader has to read the first row of the source at creation to determine both the separator and, if the file has a header, the column names of the header. The From*Async call then has to be awaited. After that, rows can be enumerated asynchronously simply by putting await before foreach. If one forgets to do that, the rows will be enumerated synchronously.
For SepWriter the usage is kind of reversed. To* methods have no Async variants, since creation is synchronous. That is, StreamWriter is created by a simple constructor call. Nothing is written until a header or row is defined and Dispose/DisposeAsync is called on the row.
For the reader nothing needs to be asynchronously disposed, so using does not require await. However, for SepWriter disposal may have to write/flush data to the underlying TextWriter and hence should use DisposeAsync, so you must use await using.
To support cancellation, many methods have overloads that accept a CancellationToken, like the From*Async methods for creating a SepReader or, for example, NewRow for SepWriter. Consult the Public API Reference for the full set of available methods.
Additionally, both SepReaderOptions and SepWriterOptions feature the bool AsyncContinueOnCapturedContext option that is forwarded to internal ConfigureAwait calls; see the ConfigureAwait FAQ for details on that.
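As a rough sketch of how cancellation and the AsyncContinueOnCapturedContext option might be combined, the earlier example can be wrapped in a method that takes a token. This assumes the FromTextAsync and NewRow overloads accept a CancellationToken as described above, and that Writer accepts an options delegate in the same way Reader does; SepAsyncCopy and CopyAsync are hypothetical names used only for illustration. Check the Public API Reference before relying on any of these signatures.

```csharp
using System.Threading;
using System.Threading.Tasks;
using nietras.SeparatedValues;

// Hypothetical helper for illustration; not part of Sep.
public static class SepAsyncCopy
{
    public static async Task<string> CopyAsync(
        string text, CancellationToken cancellationToken)
    {
        // Option is forwarded to the library's internal ConfigureAwait calls.
        using var reader = await Sep
            .Reader(o => o with { AsyncContinueOnCapturedContext = false })
            .FromTextAsync(text, cancellationToken);

        // Writer creation is synchronous, but disposal may flush asynchronously.
        await using var writer = reader.Spec
            .Writer(o => o with { AsyncContinueOnCapturedContext = false })
            .ToText();

        await foreach (var readRow in reader)
        {
            // Assumed overload: copies the read row and observes the token.
            await using var writeRow = writer.NewRow(readRow, cancellationToken);
        }
        return writer.ToString();
    }
}
```

Like the example above, this needs C# 13.0+ and .NET 9.0+ because the rows are ref structs used inside an async method.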
Pragmatic Async Support Implementation
async/await is viral. For any async method call, however deep it may be, all methods from the top down to the deepest async method need to be async too. This means supporting both sync and async becomes problematic, as you are faced with either trying to refactor, while still needing to copy entire method chains, or copy-pasting everything.
For Sep I chose the latter, with a twist: isolate the IO-calling methods, e.g. methods using TextReader for SepReader and TextWriter for SepWriter. Then create two separate files for these methods: one for synchronous methods and one for asynchronous methods. These files are nearly identical, differing only by a preprocessor directive defined at the top of each file. For example:
In the async file:
//#define SYNC
In the sync file:
#define SYNC
The rest of these files are then identical, but for each method in these files #if/#else/#endif preprocessor switches are used to handle differences in the method signature and implementation. All async method names are then suffixed with Async to differentiate them from the sync methods. For example:
#if SYNC
    internal void Initialize(in SepReaderOptions options)
#else
    internal async ValueTask InitializeAsync(SepReaderOptions options,
        CancellationToken cancellationToken)
#endif
    {
#if SYNC
        if (MoveNext())
#else
        if (await MoveNextAsync(cancellationToken))
#endif
This makes the code easy to maintain. For consistency, unit tests ensure the two files of each pair are kept in sync in the face of changes, simply by comparing them and checking that all lines are the same except for the first line.
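The consistency check described above could look roughly like the following. This is a minimal sketch and not Sep's actual test code: SyncAsyncFilePair and FirstDifferingLine are hypothetical names, and the sketch takes file contents as strings so it stays independent of any particular test framework or file layout.

```csharp
using System;

// Hypothetical helper for illustration; not Sep's actual test code.
public static class SyncAsyncFilePair
{
    // Returns 0 if the two sources are identical apart from line 1,
    // otherwise the 1-based number of the first differing line.
    public static int FirstDifferingLine(string syncSource, string asyncSource)
    {
        var syncLines = SplitLines(syncSource);
        var asyncLines = SplitLines(asyncSource);
        var common = Math.Min(syncLines.Length, asyncLines.Length);
        // Index 0 holds `#define SYNC` or `//#define SYNC` and is skipped.
        for (var i = 1; i < common; i++)
        {
            if (!string.Equals(syncLines[i], asyncLines[i], StringComparison.Ordinal))
            {
                return i + 1;
            }
        }
        // A length mismatch counts as a difference at the first extra line.
        return syncLines.Length == asyncLines.Length ? 0 : common + 1;
    }

    // Normalize line endings so a CRLF/LF mismatch is not reported as a change.
    static string[] SplitLines(string text) =>
        text.ReplaceLineEndings("\n").Split('\n');
}
```

A test would read both files of a pair, call FirstDifferingLine, and fail with the reported line number when the result is non-zero.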
This approach “avoids” duplicating logic while maintaining separate implementations for sync and async operations. The shared logic is preserved by keeping the core functionality identical, with only the method signatures and specific async-related keywords differing.
By using preprocessor directives, each variant has only the relevant code for either sync or async, ensuring no unnecessary runtime checks or sync over async.
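To make the pattern concrete, here is a small, complete file written in the same style. It is only a sketch: LineCounter is a hypothetical type, not part of Sep, and it assumes .NET 7 or later for the ReadLineAsync overload that takes a CancellationToken. As written it is the async variant; the sync twin would be the same file with the first line uncommented.

```csharp
//#define SYNC
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical type for illustration only; not part of Sep.
public static partial class LineCounter
{
#if SYNC
    public static int CountLines(TextReader reader)
#else
    public static async ValueTask<int> CountLinesAsync(
        TextReader reader, CancellationToken cancellationToken = default)
#endif
    {
        var count = 0;
#if SYNC
        while (reader.ReadLine() is not null)
#else
        while (await reader.ReadLineAsync(cancellationToken).ConfigureAwait(false) is not null)
#endif
        {
            ++count; // Shared logic, identical in both files
        }
        return count;
    }
}
```

Only the signature and the IO call sit inside preprocessor switches; the loop body and return are written once per file and stay line-for-line identical across the pair.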
Performance
All Async methods are implemented using ValueTask to avoid the overhead of allocating Task instances if not needed (e.g. if data is already available). This means the overhead is minimal, which can be seen in benchmarks (based on in-memory data in the form of StringReader) as shown below.
Sep_Async is only 1.07x slower than sync Sep at the very lowest level of simply parsing the CSV file. For any real workload the difference is negligible.
BenchmarkDotNet v0.14.0, Windows 10 (10.0.19044.3086/21H2/November2021Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 9.0.102
[Host] : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
Job-RANURT : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
Job=Job-RANURT EnvironmentVariables=DOTNET_GCDynamicAdaptationMode=0 Runtime=.NET 9.0
Toolchain=net90 InvocationCount=Default IterationTime=350ms
MaxIterationCount=15 MinIterationCount=5 WarmupCount=6
Quotes=False Reader=String
| Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
|------------- |------ |-------- |-------------:|------:|----:|--------:|-------:|--------------:|------------:|
| Sep______ | Row | 50000 | 2.230 ms | 1.00 | 29 | 13088.4 | 44.6 | 1.09 KB | 1.00 |
| Sep_Async | Row | 50000 | 2.379 ms | 1.07 | 29 | 12264.0 | 47.6 | 1.02 KB | 0.93 |
| Sep_Unescape | Row | 50000 | 2.305 ms | 1.03 | 29 | 12657.6 | 46.1 | 1.02 KB | 0.93 |
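The ValueTask point can be illustrated with a small sketch of a buffered source that completes synchronously whenever data is already available and only takes the async path when it has to refill. BufferedCharSource is a hypothetical type written for this post, not Sep's implementation.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical buffered source for illustration only; not Sep's implementation.
public sealed class BufferedCharSource
{
    readonly TextReader _reader;
    readonly char[] _buffer = new char[4096];
    int _position;
    int _length;

    public BufferedCharSource(TextReader reader) => _reader = reader;

    // Returns the next char as an int, or -1 at end of input.
    public ValueTask<int> ReadCharAsync(CancellationToken cancellationToken = default)
    {
        // Fast path: data is already buffered, so wrap the result directly.
        // No Task is allocated and no async state machine is started.
        if (_position < _length)
        {
            return new ValueTask<int>((int)_buffer[_position++]);
        }
        // Slow path: go async only when the buffer must be refilled.
        return RefillAndReadAsync(cancellationToken);
    }

    async ValueTask<int> RefillAndReadAsync(CancellationToken cancellationToken)
    {
        _length = await _reader.ReadAsync(_buffer.AsMemory(), cancellationToken)
            .ConfigureAwait(false);
        _position = 0;
        return _length > 0 ? (int)_buffer[_position++] : -1;
    }
}
```

With an in-memory reader nearly every call takes the fast path, which is consistent with the small gap between the sync and async rows in the benchmark above.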
That’s all!