
Aman Agrawal for Coolblue

Originally published at amanagrawal.blog

Using C# Source Generators to Generate Data Transfer Objects (DTOs) – Part 2

In part 1, I created a very basic DTO generator that could only work with primitive types. In this final and very looong part, I will try and extend it to be more useful by supporting generic types, complex types and generating mapping methods.

First, though, I am going to tackle the mapping extension methods because that can enhance the usability of the current generator quite a bit with minimal work (ye olde 80/20 rule). What I am after is something that looks like this:
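As a rough sketch of that shape (the Employee entity and its Id and Name properties are assumed here purely for illustration, not taken from the generator’s real output):

```csharp
#nullable enable
using System;

// Hypothetical domain entity; the property names are assumptions.
public class Employee
{
    public Guid Id { get; set; }
    public string Name { get; set; } = "";
}

// The DTO mirrors the entity's properties by name and type.
public class EmployeeDto
{
    public Guid Id { get; set; }
    public string Name { get; set; } = "";
}

public static class EmployeeMappingExtensions
{
    // Copies each property across; a null entity maps to a null DTO.
    public static EmployeeDto? ToDto(this Employee? entity)
    {
        if (entity is null) return null;

        return new EmployeeDto
        {
            Id = entity.Id,
            Name = entity.Name
        };
    }
}
```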

This is not uncommon for a mapping method. I have written tons of mappers like this and from experience I can say unequivocally that they never get much smarter than this, with the exception of null input handling. There shouldn’t be any smarts in the DTOs or the mappers anyway; it’s an anti-pattern and a design smell because DTOs are only meant as data vessels that get serialised over the network. Nothing more!

To keep the listing clean, I will leave out the code I had already written for the basic generator and only show the code added at the end:

Much of the code should be pretty self-explanatory: I am simply generating a static class with an extension method in it to convert from the domain entity to the DTO. But let’s unpack:

  1. I am all for contextual names for classes and functions etc., but in this case if I just give the class the name EntityExtensions or something along those lines, then the names will clash with the other extension classes that I will create for other complex types later on. It’s possible to put all extension methods in one class but for now, I’d rather keep them per DTO. The impact on compilation should be minimal so there is little incentive to bung them all in one class. Therefore, I am just going to append a “-” stripped Guid to the class name so they are all unique.
  2. Next I will define the signature of the extension method, which accepts an instance of the domain entity type and returns an instance of the DTO type. The TypeDeclarationSyntax instance will give me the name of the domain entity type I am creating the extension method on.
  3. Then I am going to loop over all the property members of the current domain entity type and add assignment statements that copy the value of each domain entity property into the corresponding DTO property. Once again, this is driven by convention as opposed to configuration, i.e. the properties on the DTOs are assumed to have the same name and type as the corresponding properties on the domain entities. This will ensure type safety and keep the generation code simple.
  4. Finally, I close out the method, class and namespace. Note that I am adding the extension class and methods to the same namespace as that of the DTO for simplicity reasons.
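The four steps above can be sketched with plain string building. This is only an approximation under stated assumptions: the Build method and its parameters are hypothetical, and a real generator would read the entity name and property names from the TypeDeclarationSyntax rather than take them as strings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MapperSourceBuilder
{
    // dtoNamespace, entityName and propertyNames are hypothetical inputs standing in
    // for what the generator reads from the syntax tree.
    public static string Build(string dtoNamespace, string entityName, IEnumerable<string> propertyNames)
    {
        var sb = new StringBuilder();

        // Step 1: the "N" format renders the Guid without hyphens, so the class name
        // is both unique and a valid identifier.
        var className = $"EntityExtensions{Guid.NewGuid():N}";

        sb.AppendLine($"namespace {dtoNamespace}");
        sb.AppendLine("{");
        sb.AppendLine($"    public static class {className}");
        sb.AppendLine("    {");

        // Step 2: extension method from the entity type to its DTO type.
        sb.AppendLine($"        public static {entityName}Dto ToDto(this {entityName} entity)");
        sb.AppendLine("        {");
        sb.AppendLine($"            return new {entityName}Dto");
        sb.AppendLine("            {");

        // Step 3: one assignment per property, same name on both sides by convention.
        foreach (var property in propertyNames)
        {
            sb.AppendLine($"                {property} = entity.{property},");
        }

        // Step 4: close the initializer, method, class and namespace.
        sb.AppendLine("            };");
        sb.AppendLine("        }");
        sb.AppendLine("    }");
        sb.AppendLine("}");

        return sb.ToString();
    }
}
```

Calling `MapperSourceBuilder.Build("ConsoleApp9.Domain.Dtos", "Employee", new[] { "Id", "Name" })` returns the source text that would then be handed to `context.AddSource(...)`.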

Once this is done, I will build the solution and inspect my consuming app ConsoleApp9 for any generated code and sure enough, I see it (if the build succeeded):

Note that I didn’t have to restart Visual Studio for these changes to reflect. It turns out that if you create a new source generator, i.e. for the first time, and do a build, VS picks it up. It’s only for subsequent changes to the types or the generated code that it needs to be restarted.

The generated code also looks correct. If the solution builds, that is a good indicator that the code is syntactically correct; the original build would have failed if I had made a typo whilst generating code.

I can easily show this: I will fudge up a semi-colon in the return statement and re-build the solution (normal build will not throw up errors):

But when I go to the generated entity, the semi-colon is still there!!🤔 Of course, I need to restart VS to see that, don’t I? 💡🤦‍♂️

Now I can start using this mapper from my consumer app because the ToDto extension method just magically appears (that’s not to say I don’t need to import the ConsoleApp9.Domain.Dtos namespace where all this generated code lives, I absolutely do, but I will let ReSharper and/or IntelliSense help me do that!):

Just to make sure it works as well as it looks, I will simply JSON-ify the DTO (the ultimate destiny for almost all DTOs anyway) and dump it on the console:
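A minimal sketch of that step, assuming System.Text.Json as the serialiser and a hypothetical EmployeeDto with Id and Name properties (the DtoPrinter helper is illustrative only):

```csharp
using System;
using System.Text.Json;

// Hypothetical DTO standing in for the generated one.
public class EmployeeDto
{
    public Guid Id { get; set; }
    public string Name { get; set; } = "";
}

public static class DtoPrinter
{
    // Serialises the DTO as indented JSON so it is easy to eyeball on the console.
    public static string ToJson(EmployeeDto dto) =>
        JsonSerializer.Serialize(dto, new JsonSerializerOptions { WriteIndented = true });
}
```

In the consumer app this would be used as `Console.WriteLine(DtoPrinter.ToJson(employee.ToDto()));`.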

Looks like it!

So far so good! I’ve got the basic DTO and mapper working but I am not out of the woods yet. Say now the domain asks me to record an employee’s address. For this I will create a value type Address and add a nullable property of that type to the Employee domain entity (it’s not required to have a home address right from the start; an employee can always add their home address once they have a permanent place to stay):

I will just do a quick re-build at this stage to see what the generator outputs (if anything):

The new types have been added, so yay!

If I open the EmployeeDto class, at first blush everything seems fine, but there are two problems, both highlighted in orange:

  1. The DTO mis-identified the type of the HomeAddress property as Address? as opposed to AddressDto?, and
  2. The mapping function is directly assigning the entity property to the DTO property, which will not work since the type is a complex type and needs to be further converted to a DTO. Because of the mis-identification of the property type in problem 1, the build didn’t fail either: the mapper is assigning a value of an assignable type, i.e. Address?.

To fix these I essentially need to:

a. Detect if the type of the domain entity property being evaluated is a complex type or not.

b. If it’s a complex type, then 1) use AddressDto? as the property type (for nullable types) instead of Address?, and 2) instead of directly assigning (as I had been doing thus far), invoke the corresponding ToDto() method on the domain entity property. This will convert Address to AddressDto, for example. Otherwise, do what I am doing currently because the property is not a complex type.

For determining whether the type is a complex type or not, I will be using the semantic model exposed by GeneratorExecutionContext, because the semantic model is the one that contains information on what things mean, e.g. whether something is a reference type and a class, which is what I need to find out. I will modify the BuildDtoProperty() method and add two convenience extension methods as shown in the gist below:

As it turns out, this semantic information about properties lives inside the semantic model as ISymbol instances, and for properties more specifically as IPropertySymbol instances, which expose type information. The IsOfTypeClass method checks that the property type is a reference type, that its kind is class and that the property type is in the same namespace as the entity that declares the property. This last one is important, i.e. both types should be in the same namespace. This means no external types are allowed, because it is hard to be certain whether such a type is controlled by the client application or not, and hence it might be difficult to decorate it with custom attributes and appropriately convert it to a DTO. For e.g. if I create a String property in my domain entity, without this check the generator will create a property of type StringDto, which makes no sense since String is a .NET type, not a custom domain type, and is therefore not controlled by the consuming application.

The IsOfTypeStruct method is mostly the same except that it checks whether the type is a struct. If either of these is true, then I want to suffix the original type name in the property with the word “Dto” to reference the DTO class. Whilst at it, I will also take care of nullable types! It would appear that IPropertySymbol.Type.Name excludes the “?” from nullable types whereas IPropertySymbol.Type.ToDisplayString() includes it. The former is useful for complex types because I need to suffix “Dto” for the DTO property, whilst the latter works for primitive types because the type name can go into the DTO verbatim. Using the display string for a complex type could result in a type name like Address?Dto?, which is syntactically wrong and will fail to compile.
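A sketch of what such helpers could look like, not the post’s actual gist. The domainNamespace parameter and the BuildDtoTypeName helper are assumptions for illustration, and the nullable handling here only covers nullable reference types (a nullable struct is a Nullable<T> and would need unwrapping first):

```csharp
#nullable enable
using Microsoft.CodeAnalysis;

// Needs the Microsoft.CodeAnalysis.CSharp package that generator projects reference.
public static class PropertySymbolExtensions
{
    // A class declared in the domain namespace counts as a complex type.
    public static bool IsOfTypeClass(this IPropertySymbol property, INamespaceSymbol domainNamespace) =>
        property.Type.IsReferenceType
        && property.Type.TypeKind == TypeKind.Class
        && InNamespace(property.Type, domainNamespace);

    // Same idea for structs declared in the domain namespace.
    public static bool IsOfTypeStruct(this IPropertySymbol property, INamespaceSymbol domainNamespace) =>
        property.Type.IsValueType
        && property.Type.TypeKind == TypeKind.Struct
        && InNamespace(property.Type, domainNamespace);

    // Hypothetical helper: complex types get a "Dto" suffix, everything else is
    // emitted verbatim via the display string.
    public static string BuildDtoTypeName(this IPropertySymbol property, INamespaceSymbol domainNamespace)
    {
        bool isComplex = property.IsOfTypeClass(domainNamespace)
                         || property.IsOfTypeStruct(domainNamespace);
        if (!isComplex) return property.Type.ToDisplayString();

        // Type.Name has no "?", so the suffix goes on first and the "?" is re-added after it.
        bool isNullable = property.Type.NullableAnnotation == NullableAnnotation.Annotated;
        return property.Type.Name + "Dto" + (isNullable ? "?" : string.Empty);
    }

    private static bool InNamespace(ITypeSymbol type, INamespaceSymbol ns) =>
        SymbolEqualityComparer.Default.Equals(type.ContainingNamespace, ns);
}
```

The IPropertySymbol itself would come from `semanticModel.GetDeclaredSymbol(propertyDeclarationSyntax)`, and the namespace from the entity’s own symbol.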

⚠ A lot of this code is trial and error. Exploring the Roslyn syntax/semantics API can help in understanding which types contain what information, but good ol’ trial and error is less painful than trying to debug the source generator. It’s doable by calling Debugger.Launch() in the Initialize method, but I’ve found that it tends to create a vicious debug cycle where VS prompts the UAC dialog every time something causes the debugger to run, e.g. any time you change anything in the code. Dismissing that dialog half a dozen times every time you alter a single letter in code is a NIGHTMARE so I wouldn’t recommend that approach!

Finally, I will change the mapper generation to include the ToDto invocation against any complex type properties. This is straightforward since it builds on the work already done above. For this I will modify the member loop in the main Execute method to do the same complex type vs primitive type check, and for a complex type I will append the null-conditional operator (?.) and the “ToDto()” suffix at the end (to make it null safe):
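The decision for each assignment line can be sketched as a small pure function. The AssignmentBuilder name and the two boolean flags are hypothetical; in the generator they would be derived from the semantic model checks described earlier:

```csharp
public static class AssignmentBuilder
{
    // isComplex / isNullable are assumed inputs, not part of the original generator's API.
    public static string Build(string propertyName, bool isComplex, bool isNullable)
    {
        // Primitive (or external) types are copied across as-is.
        if (!isComplex) return $"{propertyName} = entity.{propertyName},";

        // Complex types are converted; "?." keeps the call null safe for nullable properties.
        var access = isNullable ? "?." : ".";
        return $"{propertyName} = entity.{propertyName}{access}ToDto(),";
    }
}
```

For example, `AssignmentBuilder.Build("HomeAddress", true, true)` yields `HomeAddress = entity.HomeAddress?.ToDto(),`.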

Build the solution to generate the updated code:

That’s more like it!

And now run the consumer app to make sure that it’s all working:

And the serialised version of the DTO agrees!

If the address was never set, the serialised value will simply be null but the app won’t crash due to a null-ref exception like it would have done if I hadn’t made the DTO conversion null safe for nullable types.

Finally, the domain is asking me to change the Employee definition to keep track of all the assets an employee has been issued by the company, e.g. business phones, laptops etc.

To accommodate this request I will make 2 changes to the domain model:

  1. Create 2 new value types called CompanyAsset and AssetCode in the domain and decorate them with the GenerateMappedDto attribute. An asset MUST have a code associated with it. This is just to see how code gen will work with nested complex types; domain modelling is outside the scope of this blog series.
  2. Add an IReadOnlyCollection<CompanyAsset> property called AssetsAllocated and expose a method on the Employee class to add assets to the collection when they are allocated to our employee. So now the entity class looks like:
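A sketch of a domain model along those lines. Only the names CompanyAsset, AssetCode, AssetsAllocated and GenerateMappedDto come from the text; the attribute below is a stand-in for the one defined in part 1, and members such as Value, Code and AllocateAsset are assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the marker attribute from part 1.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public sealed class GenerateMappedDtoAttribute : Attribute { }

[GenerateMappedDto]
public class AssetCode
{
    public AssetCode(string value) =>
        Value = value ?? throw new ArgumentNullException(nameof(value));

    public string Value { get; }
}

[GenerateMappedDto]
public class CompanyAsset
{
    // An asset must always have a code.
    public CompanyAsset(AssetCode code) =>
        Code = code ?? throw new ArgumentNullException(nameof(code));

    public AssetCode Code { get; }
}

[GenerateMappedDto]
public class Employee
{
    private readonly List<CompanyAsset> assets = new List<CompanyAsset>();

    // Exposed read-only; callers go through AllocateAsset to change it.
    public IReadOnlyCollection<CompanyAsset> AssetsAllocated => assets.AsReadOnly();

    public void AllocateAsset(CompanyAsset asset)
    {
        if (asset is null) throw new ArgumentNullException(nameof(asset));
        assets.Add(asset);
    }
}
```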

I’ll be able to build on the work done so far for much of the remaining challenge, but generic types still need to be handled properly, more specifically generic collection types as in this case. If I were to build the code in its current form, any generic collection properties would have the same problem of mis-identified types. So to address this, what I want to do is: a) add a DTO property with type IReadOnlyCollection<CompanyAssetDto>, and b) invoke the ToDto() method on each item of this collection in the mapper extension, and so on down.

The challenge now is to detect if the property type is a generic type and suffix all complex type arguments with “Dto” so, IReadOnlyCollection<CompanyAsset> will become IReadOnlyCollection<CompanyAssetDto>.

⚡!!! You are now entering messy, hacky code territory!!! ⚡

Turns out this is quite a bit more difficult to achieve using the semantic model alone, so I will also use the syntactic model (please read the inline comments in the code to get some idea of what the hell is happening):

The way I figured out which syntax types I need to use is with this nifty little tool called Syntax Visualizer. You can install it by modifying your VS installation, via the Visual Studio Installer app, to add the .NET Compiler Platform SDK component. The way this works is by simply clicking on the type in your code that you want to visualise, and the visualiser will automatically refresh and open up the corresponding node in the syntax tree:

What I am interested in is the TypeArgumentList node of the GenericNameSyntax node for this property.

Basically it comes down to which types in the type argument list should have the Dto suffix and which shouldn’t. All custom types i.e. the ones defined in the Domain.Dtos namespace, need a Dto suffix whereas all .NET types don’t. In the BuildTypeName() method, INamedTypeSymbol.TypeArguments will carry all type arguments listed on the generic type whereas the node under consideration only refers to one type at a time, so I’ve got to do a “lookup”, determine whether the type in the type argument list is custom or not, and then return appropriately suffixed DTO type names.
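One way to combine the two models, sketched here as a simplified alternative rather than the post’s own BuildTypeName(): walk the GenericNameSyntax type arguments recursively and ask the semantic model whether each one lives in the domain namespace. The DtoTypeNameBuilder class and the domainNamespace parameter are assumptions, and qualified or nullable type syntax is deliberately left unhandled:

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Needs the Microsoft.CodeAnalysis.CSharp package that generator projects reference.
public static class DtoTypeNameBuilder
{
    // Rewrites e.g. IReadOnlyCollection<CompanyAsset> to IReadOnlyCollection<CompanyAssetDto>.
    public static string BuildTypeName(TypeSyntax typeSyntax, SemanticModel model, INamespaceSymbol domainNamespace)
    {
        if (typeSyntax is GenericNameSyntax generic)
        {
            // Recurse into each type argument so nested generics are covered too.
            var arguments = generic.TypeArgumentList.Arguments
                .Select(argument => BuildTypeName(argument, model, domainNamespace));
            return $"{generic.Identifier.Text}<{string.Join(", ", arguments)}>";
        }

        // The semantic model tells us where the type is declared.
        var type = model.GetTypeInfo(typeSyntax).Type;
        if (type != null && SymbolEqualityComparer.Default.Equals(type.ContainingNamespace, domainNamespace))
        {
            return type.Name + "Dto"; // custom domain type: point at its DTO
        }

        return typeSyntax.ToString(); // .NET or other external type: keep verbatim
    }
}
```

The typeSyntax argument would be the `Type` of the PropertyDeclarationSyntax being processed.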

Ok! Property type name sorted, onto the mapper method…

This is getting hackier (or at least uglier) by the minute because I am focussing on getting it to work first. I will eventually put a more cleaned-up version of the code up on GitHub, but for now I will highlight the chunk that fixes the conversion methods for properties with generic collection types.

Essentially, if the generic type argument is a primitive type then conversion is basically direct assignment from entity to DTO. But if any type argument is a custom type, then I will attach the “ToDto()” call to the assignment to convert from the entity type to the DTO type. I am also making an assumption about the entity and the DTO, namely that generic type arguments are only used with collection types like the ones I mentioned previously (so no Task<T> in domain entities, for example). Therefore, if I find generic types with complex types as arguments, then I will also generate extension methods to convert a collection of entity types to a collection of DTO types:
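The kind of code this produces could look roughly like the following. It is a hand-written approximation, not the generator’s real output, and CompanyAsset is simplified to a single string Code property for brevity:

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the domain type and its generated DTO.
public class CompanyAsset { public string Code { get; set; } = ""; }
public class CompanyAssetDto { public string Code { get; set; } = ""; }

public static class CompanyAssetMappingExtensions
{
    // Single-item mapper, as generated for any decorated type.
    public static CompanyAssetDto ToDto(this CompanyAsset entity) =>
        new CompanyAssetDto { Code = entity.Code };

    // Collection overload: converts each item with the single-item ToDto().
    public static IReadOnlyCollection<CompanyAssetDto> ToDto(this IReadOnlyCollection<CompanyAsset> entities) =>
        entities.Select(entity => entity.ToDto()).ToList();
}
```

With the collection overload in place, the Employee mapper only needs `AssetsAllocated = entity.AssetsAllocated.ToDto()`, which keeps the per-property assignment logic the same as for single complex types.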

I am having to handle Dictionaries differently because they have 2 type arguments as opposed to just one and either TKey or TValue could be a custom type. I still have to fix this bit (hence the 🤷‍♂️) but at this stage I am wondering if this whole thing is worth it in the first place? I mean just look at the code so far!! Horribly unreadable mess!

Anyway, this results in the EmployeeDto class that also has extension methods to convert collection type properties in the domain entity to their DTO counterpart:

Finally!

HOLY CRAP! That was a lot! Am I done though? For this particular source generator, I think yes. What’s “outstanding”, i.e. niggling at the back of my mind? Well, a couple of things at least:

  1. Putting DTOs closer to where they are used: currently the code puts the generated DTOs in a sub-namespace within the domain and this could be a bit of a problem because DTOs serve a different purpose than domain entities, so they should be colocated with the thing that uses them. In this case, that should be the host project, e.g. a web API. I’ve not yet found a way to put the generated code in a custom location, or whether it’s even possible. If it is, then a custom namespace could be passed to the attribute for the generator to use, but at the moment I am not sure.
  2. Performance profiling of the build with and without source generation: To be perfectly honest, in my sample scenario, I didn’t notice a whole lot of build slowdown. A couple of seconds to do a clean build doesn’t sound like a whole lot, though of course this is going to be solution dependent. Given a large enough solution and a dog slow build machine, things could change. The source generator’s Execute method itself takes < 20 ms on my laptop when doing builds inside Visual Studio (I’ve added a little bit of timing code that roughly measures this).
  3. Testability of source generators: Because throughout this entire exercise my focus was on exploration and trying to see what’s possible, I didn’t really TDD it (sue me! It’s perfectly fine to not write tests when you are exploring/sketching because you don’t know how it will pan out!). I will tackle testing in a later post (accompanied by a fully refactored version of the code), assuming I haven’t given up on this problem by then! By the looks of things, this might be possible; I will have to see.
  4. Debuggability of source generators: One way to debug a source generator is to output another .cs file with logs written out as C# comments. The process of emitting this is no different from what I have shown here. Key thing to remember: the hintName argument in context.AddSource(...) should be whatever you want to name the generated .cs file, and the encoding MUST be UTF8; don’t let the optionality of that parameter fool you. F5 debugging of source generators is horrible, as I have already mentioned in a preceding section.
  5. Some edge case domain entity structures might not be covered by the current generator or might not produce the correct output: In order to keep the generator relatively simple and not have it do too much, I would keep special customisations out of it. So no ability to inject custom behaviour into the DTOs and/or extensions.
  6. Ignoring properties that I don’t want mapped: This is fairly straightforward to do and can be achieved by decorating such properties with another custom attribute, maybe [ExcludeFromMapping] or something. I might do this by the time you read this post.
  7. Un-mapping DTOs: i.e. if you don’t want an entity to be mapped to a DTO anymore, just remove the GenerateMappedDto attribute from the class and the generator will not generate code for it thereby effectively removing it. The generated code doesn’t get checked into the source control, so no harm either way.
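The logging idea from point 4 can be sketched as a tiny helper that collects messages and renders them as C# comments. The GeneratorLog class and the "GeneratorLog.g.cs" file name are hypothetical; only AddSource and the UTF8 requirement come from the text above:

```csharp
using System.Collections.Generic;
using System.Text;

public sealed class GeneratorLog
{
    private readonly List<string> lines = new List<string>();

    public void Write(string message) => lines.Add(message);

    // Renders every message as line comments so the emitted file always compiles,
    // even when a message spans several lines.
    public string Render()
    {
        var sb = new StringBuilder();
        sb.AppendLine("// <auto-generated/>");
        foreach (var line in lines)
        {
            foreach (var part in line.Split('\n'))
            {
                sb.AppendLine("// " + part.TrimEnd('\r'));
            }
        }
        return sb.ToString();
    }
}

// Inside Execute (needs Microsoft.CodeAnalysis and Microsoft.CodeAnalysis.Text):
// context.AddSource("GeneratorLog.g.cs", SourceText.From(log.Render(), Encoding.UTF8));
```

The log then shows up next to the other generated files, which makes it easy to see what the generator decided for each type without attaching a debugger.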

Conclusion

I do see the value of source generators in affording productivity gains for repetitive tasks that don’t change all that much from one instance to the next. For e.g. generating mapping code like the one I have shown in these posts, the canonical example of automatically generating implementations for interfaces (stubs etc.), and another one that I would like to try out: auto-generating tests for a public API, although this might also mean somehow auto-generating the whole test project and then generating test code into that project.

I find it a bit limiting that only new code could be created but existing code couldn’t be modified, although I can see where they are coming from on this. Allowing source generators to modify engineer written code can be risky due to potential flakiness and stability risks.

I also find the limited debugging options a real pain, along with the fact that I have to restart VS multiple times to see changes reflected, but I am hoping these are just teething problems, because VS Code is a much better experience. It doesn’t, however, have the capability of showing the generated code, so it’s a bit like flying blind.

Discovering the Roslyn syntax APIs by trial and error is quite time consuming, but tools like Syntax Visualizer help, and once you’ve used the APIs you get some sense of what you need to use. Then it’s just a matter of Ctrl + . exploration to find the right method/property to invoke.

Anyway, this has been fun, the code is on GitHub!
