DEV Community

vbilopav

What Makes Norm Micro ORM for .NET as Fast as Raw DataReader

Norm has been one of my favorite side projects, and it recently got very serious.

Basically, it's just yet another micro ORM library (a Dapper clone) for .NET, with a few extra tricks.

Originally, I wrote it so that I could use modern C# features such as tuples and async streaming, for example:

// Map single values from a tuple to variables:
var (id, foo, bar) = connection.Single<int, string, string>("select id, foo, bar from my_table limit 1");

// Map to enumerable of named tuples:
IEnumerable<(int id, string foo, string bar)> results = connection.Read<int, string, string>("select id, foo, bar from my_table");

// Asynchronously stream values directly from database
await foreach(var (id, foo, bar) in connection.ReadAsync<int, string, string>("select id, foo, bar from my_table"))
{
    //...
}

// etc...

That works fine. However, to be a real micro ORM, it needs a proper object mapper that maps the results of your query to your classes.

And it did have one. However, the first versions, although fairly decent, were not nearly as performant as the famous mapper from Dapper, which is labeled the "King of Micro ORM".

Recently, I got some insights and ideas that I wanted to try out, so I rewrote the Norm mapper from scratch.

The results surprised even me: I wasn't expecting performance so good that it is basically indistinguishable from the raw DataReader.

How is that possible?

Let's dive deep into Norm:

Reading the data

Norm implements one basic extension that is used for reading data in preparation for mapping:

public static IEnumerable<(string name, object value)[]> Read(this DbConnection connection, string command) 

As we can see, it returns an enumerable that yields arrays of name and value tuples.

It does not create a list of any kind; it simply uses yield to return each value once the enumeration is triggered.

Each item of that enumeration is an array of fields in the form of name and value tuples, where name is the field name and value is the actual field value typed as object (which requires casting later).
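
To make the yield-based reading concrete, here is a minimal sketch of the idea. It is not Norm's actual code: `ReadSketch`, `ReadRows` and `SampleReader` are hypothetical names, and the helper takes an already opened `DbDataReader` (backed here by an in-memory `DataTable`) so that it runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;

public static class ReadSketch
{
    // Hypothetical helper, not Norm's actual code: lazily yields one
    // (name, value)[] array per row of an already opened reader.
    public static IEnumerable<(string name, object value)[]> ReadRows(DbDataReader reader)
    {
        while (reader.Read())
        {
            var row = new (string name, object value)[reader.FieldCount];
            for (var i = 0; i < reader.FieldCount; i++)
            {
                row[i] = (reader.GetName(i), reader.GetValue(i));
            }
            // Execution pauses here until the caller asks for the next row.
            yield return row;
        }
    }

    // Builds an in-memory reader so the sketch runs without a database.
    public static DbDataReader SampleReader()
    {
        var table = new DataTable();
        table.Columns.Add("id", typeof(int));
        table.Columns.Add("foo", typeof(string));
        table.Columns.Add("bar", typeof(string));
        table.Rows.Add(1, "foo1", "bar1");
        table.Rows.Add(2, "foo2", "bar2");
        return table.CreateDataReader();
    }
}
```

Calling `ReadSketch.ReadRows(ReadSketch.SampleReader())` on its own reads nothing; rows are only pulled from the reader when the result is enumerated, for example with foreach.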

The mapping itself is implemented as an extension on that same structure:

public static IEnumerable<T> Map<T>(this IEnumerable<(string name, object value)[]> tuples)

Similarly, this mapping extension will also yield the mapped result from enumeration, rather than creating some sort of list.

That's why, when working with Norm, you have to use the LINQ extension ToList to create an actual list, which triggers the enumeration and the mapping as well:

// build the enumerable; this does not start reading yet
// (you can also use connection.Query<MyClass>(query);)
var enumeration = connection.Read(query).Map<MyClass>();
// start the actual reading from the database and create a list
var results = enumeration.ToList();

This is neat because now I can build my LINQ expressions before any serialization or mapping takes place.
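
The effect of this deferred execution can be shown with a small sketch that needs no database. `DeferredSketch` and `FakeRead` are hypothetical stand-ins for a real read; the counter only exists to show how many rows were actually produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Counts how many rows have actually been produced so far.
    public static int RowsProduced;

    // Hypothetical stand-in for a database read: yields rows lazily.
    public static IEnumerable<(string name, object value)[]> FakeRead()
    {
        for (var id = 1; id <= 1000; id++)
        {
            RowsProduced++;
            yield return new (string name, object value)[] { ("id", id), ("foo", "foo" + id) };
        }
    }

    public static List<int> FirstThreeEvenIds()
    {
        // Nothing is read here; this only composes the pipeline.
        var query = FakeRead()
            .Select(row => (int)row[0].value)
            .Where(id => id % 2 == 0)
            .Take(3);

        // Enumeration starts here and stops once three ids are found.
        return query.ToList();
    }
}
```

`FirstThreeEvenIds()` returns 2, 4 and 6, and `RowsProduced` stays far below 1000 because the enumeration stops as soon as `Take(3)` is satisfied.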

So now, all we have to do is map (string name, object value)[] to an actual instance.

Mapping the data

Mapping, in a nutshell, would be just copying values from this array (string name, object value)[] to a class instance.

Let's say for example that we have a simple class:

public class MyClass 
{
    public int Id { get; init; }
    public string Foo { get; init; }
    public string Bar { get; init; }
}

And naturally, the database query returns id, foo, and bar.

In order to map those values, we would have to compare names for each iteration step to map id to Id, foo to Foo, and bar to Bar.

But what if I already know that the field id is always the first value, foo is always the second value, and so on? That would be much more efficient, because we wouldn't have to compare name strings on each iteration.

That's mapping by position, which is much more efficient than the mapping by name that is normally used.

However, that is a suboptimal solution, because if we switch the order in the query (or in the class, for that matter) while leaving the names unchanged, it will introduce errors and confusion.

For example, trying to map the query select foo, id, bar to the class in this example would break the program.
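
Here is a rough sketch of what purely positional mapping could look like and why it is fragile. `PositionalSketch` and `MapByPosition` are hypothetical names used only for illustration, and `MyClass` is repeated so that the sketch compiles on its own.

```csharp
using System;

public class MyClass
{
    public int Id { get; init; }
    public string Foo { get; init; }
    public string Bar { get; init; }
}

public static class PositionalSketch
{
    // Hypothetical positional mapper: assumes the columns always
    // arrive in the order id, foo, bar and never looks at the names.
    public static MyClass MapByPosition((string name, object value)[] row) => new MyClass
    {
        Id = (int)row[0].value,
        Foo = (string)row[1].value,
        Bar = (string)row[2].value
    };
}
```

With a row ordered as id, foo, bar this works and does no string comparison at all. With a row ordered as foo, id, bar, the first cast tries to unbox a string as an int and throws an InvalidCastException.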

But what if we do the mapping by name only on the first record and remember the position indexes, so that we can reuse them in all later iterations for the same type? We would be mapping by name only once for each type. That would be fast, right?

And that is precisely how Norm mapping works.
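
The idea can be sketched roughly as follows. This is a simplified illustration and not Norm's actual implementation: `CachedMapSketch` is a hypothetical name, it uses plain reflection, it remembers the positions per call rather than per type, and it does not handle DBNull values. `MyClass` is repeated so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class MyClass
{
    public int Id { get; init; }
    public string Foo { get; init; }
    public string Bar { get; init; }
}

public static class CachedMapSketch
{
    // Hypothetical sketch: resolves names on the first record only.
    public static IEnumerable<T> Map<T>(IEnumerable<(string name, object value)[]> rows)
        where T : class, new()
    {
        // properties[i] is the property that column i maps to,
        // or null when no property matches that column name.
        PropertyInfo[] properties = null;
        foreach (var row in rows)
        {
            if (properties == null)
            {
                // First record: compare names once and remember the positions.
                properties = new PropertyInfo[row.Length];
                for (var i = 0; i < row.Length; i++)
                {
                    properties[i] = typeof(T).GetProperty(row[i].name,
                        BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase);
                }
            }

            var instance = new T();
            for (var i = 0; i < row.Length; i++)
            {
                // Every record: use the remembered position, no name comparison.
                properties[i]?.SetValue(instance, row[i].value);
            }
            yield return instance;
        }
    }
}
```

Because the positions are derived from the names in the first record, a query that returns foo, id, bar maps just as correctly as one that returns id, foo, bar.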

This also gives me extra breathing room, so now I can do more complex mappings, for example ones that handle camel case naming or normal case naming, without sacrificing performance, because mapping by name is now pretty cheap.
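
As an illustration of such a more forgiving name match, here is one possible sketch. The normalization rule (drop underscores, ignore case) is my assumption for the example and not necessarily what Norm does; `NameMatchSketch` and `Person` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical class used only for this example.
public class Person
{
    public int Id { get; init; }
    public string FirstName { get; init; }
}

public static class NameMatchSketch
{
    // Assumed rule: "first_name", "firstName" and "FirstName" compare equal.
    public static string Normalize(string name) =>
        name.Replace("_", "").ToLowerInvariant();

    // Meant to run once per column on the first record,
    // so its cost is not paid again for every row.
    public static PropertyInfo FindProperty(Type type, string columnName)
    {
        var wanted = Normalize(columnName);
        return type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .FirstOrDefault(p => Normalize(p.Name) == wanted);
    }
}
```

A lookup like this is slower than a plain string comparison, but since it would only run for the first record, the extra cost is spread across the whole result set.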

Object construction
