NUnit to xUnit automatic test conversion: pattern match

#csharp #refactoring #roslyn #dotnet

In the previous post I described how to use the Roslyn API to find code patterns in the C# AST and how to change the AST to rewrite the original code into something else. The goal was to automate the conversion of NUnit tests to xUnit. The approach I used was quite tedious, as I had to write a very long chain of ifs and typecasts to get the job done. Let's try to do better this time, starting with just the search part of our search-and-replace tool.

What would be great is to be able to specify structural patterns like this:

Assert.That(_, Is.EqualTo(_))
Assert.That(_, Is.EqualTo(true))
Assert.That(_, Throws.TypeOf<_>())

And they would match the actual code:

// Matched by 'Assert.That(_, Is.EqualTo(_))'
Assert.That(account.Id, Is.EqualTo(id))
Assert.That("".ToBytes(), Is.EqualTo(new byte[] {}))

// Matched by 'Assert.That(_, Is.EqualTo(true))'
Assert.That(info.IsMd5, Is.EqualTo(true));
Assert.That(token.BoolAt(path, true), Is.EqualTo(true));

// Matched by 'Assert.That(_, Throws.TypeOf<_>())'
Assert.That(() => Quad[-1], Throws.TypeOf<ArgumentOutOfRangeException>())
Assert.That(() => access(token, path), Throws.TypeOf<JTokenAccessException>())

At first this looks like quite a difficult task. But as it turns out, in its simple form it is not even that hard. I first got the idea when I was generating code for AST replacement with Roslyn Quoter. Looking at its source code, I discovered a bunch of Parse* methods in the SyntaxFactory class.

So basically one function call will parse the snippet and return an AST for the given pattern:

var patternAst = SyntaxFactory.ParseExpression("Assert.That(_, Is.EqualTo(_))");

The one line above is equivalent to a wall of code like this:

var patternAst =
    InvocationExpression(
        MemberAccessExpression(
            SyntaxKind.SimpleMemberAccessExpression,
            IdentifierName("Assert"),
            IdentifierName("That")))
    .WithArgumentList(
        ArgumentList(
            SeparatedList<ArgumentSyntax>(
                new SyntaxNodeOrToken[]{
                    Argument(
                        IdentifierName("_")),
                    Token(SyntaxKind.CommaToken),
                    Argument(
                        InvocationExpression(
                            MemberAccessExpression(
                                SyntaxKind.SimpleMemberAccessExpression,
                                IdentifierName("Is"),
                                IdentifierName("EqualTo")))
                        .WithArgumentList(
                            ArgumentList(
                                SingletonSeparatedList<ArgumentSyntax>(
                                    Argument(
                                        IdentifierName("_"))))))})));

It feels like a total win already and we have not even done anything useful yet. But let's find this pattern in a source AST. First, we need to parse the file we're searching in:

var sourceAst = CSharpSyntaxTree.ParseText(File.ReadAllText(filename));

This gives us the list of all expression nodes in the AST:

var nodes = sourceAst.GetRoot().DescendantNodes().OfType<ExpressionSyntax>();
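To get a feel for what this enumeration returns, here is a minimal sketch that runs the same two calls on a small in-memory source and prints every expression node. The class name ListExpressions and the sample source are made up for illustration, and the string stands in for File.ReadAllText(filename).

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ListExpressions
{
    public static void Main()
    {
        // Hypothetical in-memory source standing in for File.ReadAllText(filename).
        const string source = @"
class Tests
{
    void Test()
    {
        Assert.That(account.Id, Is.EqualTo(id));
    }
}";

        var sourceAst = CSharpSyntaxTree.ParseText(source);
        var nodes = sourceAst.GetRoot().DescendantNodes().OfType<ExpressionSyntax>();

        // Print the syntax kind and the source text of every expression node.
        foreach (var e in nodes)
            Console.WriteLine($"{e.Kind(),-30} {e}");
    }
}
```

The list contains nested expressions as well as top-level ones, so a pattern can match deep inside a larger expression. Type syntax nodes such as void also derive from ExpressionSyntax, so they show up in the list too.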

And now we find the nodes that match:

foreach (var e in nodes)
{
    if (Ast.Match(e, patternAst))
    {
        // Line is zero-based, so add 1 to get the number an editor would show
        var line = e.GetLocation().GetLineSpan().StartLinePosition.Line + 1;
        var code = e.NormalizeWhitespace();
        Console.WriteLine($"  {line}: {code}");
    }
}

Obviously, the Ast.Match function is the tricky part, though it's not that tricky, really. We recursively traverse both ASTs in parallel and check that they match node by node:

public static bool Match(SyntaxNode code, SyntaxNode pattern)
{
    // A placeholder matches anything
    if (IsPlaceholder(pattern))
        return true;

    // Node types don't match. Clearly not a match.
    if (code.GetType() != pattern.GetType())
        return false;

    switch (code)
    {
    case ArgumentSyntax c:
        {
            var p = (ArgumentSyntax)pattern;
            return Match(c.Expression, p.Expression);
        }
    case ArgumentListSyntax c:
        {
            var p = (ArgumentListSyntax)pattern;
            return Match(c.OpenParenToken, p.OpenParenToken)
                && Match(c.Arguments, p.Arguments)
                && Match(c.CloseParenToken, p.CloseParenToken);
        }
    case IdentifierNameSyntax c:
        {
            var p = (IdentifierNameSyntax)pattern;
            return Match(c.Identifier, p.Identifier);
        }
    case InvocationExpressionSyntax c:
        {
            var p = (InvocationExpressionSyntax)pattern;
            return Match(c.Expression, p.Expression)
                && Match(c.ArgumentList, p.ArgumentList);
        }
    case LiteralExpressionSyntax c:
        {
            var p = (LiteralExpressionSyntax)pattern;
            return Match(c.Token, p.Token);
        }
    case MemberAccessExpressionSyntax c:
        {
            var p = (MemberAccessExpressionSyntax)pattern;
            return Match(c.Expression, p.Expression)
                && Match(c.Name, p.Name);
        }
    case GenericNameSyntax c:
        {
            var p = (GenericNameSyntax)pattern;
            return Match(c.Identifier, p.Identifier)
                && Match(c.TypeArgumentList, p.TypeArgumentList);
        }
    case TypeArgumentListSyntax c:
        {
            var p = (TypeArgumentListSyntax)pattern;
            return Match(c.LessThanToken, p.LessThanToken)
                && Match(c.Arguments, p.Arguments)
                && Match(c.GreaterThanToken, p.GreaterThanToken);
        }
    default:
        return false;
    }
}
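The function above relies on IsPlaceholder and on Match overloads for tokens and argument lists, none of which are listed in the post. Here is a minimal sketch of how they might look. It assumes that a placeholder is a bare identifier spelled _, that tokens match when their kind and text agree, and that lists match element by element. The class name AstSketch is hypothetical, and the node-level Match is trimmed to two cases so the block compiles on its own.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class AstSketch
{
    // Assumption: a placeholder is a bare identifier spelled "_".
    public static bool IsPlaceholder(SyntaxNode pattern) =>
        pattern is IdentifierNameSyntax name && name.Identifier.Text == "_";

    // Assumption: tokens match when their kind and source text agree.
    // Text excludes trivia, so whitespace and comments are ignored.
    public static bool Match(SyntaxToken code, SyntaxToken pattern) =>
        code.RawKind == pattern.RawKind && code.Text == pattern.Text;

    // Lists match when they have the same length and match element by element.
    public static bool Match<T>(SeparatedSyntaxList<T> code, SeparatedSyntaxList<T> pattern)
        where T : SyntaxNode
    {
        if (code.Count != pattern.Count)
            return false;

        for (var i = 0; i < code.Count; i++)
        {
            if (!Match(code[i], pattern[i]))
                return false;
        }

        return true;
    }

    // Trimmed stand-in for the node-level Match: only two cases,
    // just enough to exercise the helpers above.
    public static bool Match(SyntaxNode code, SyntaxNode pattern)
    {
        if (IsPlaceholder(pattern))
            return true;
        if (code.GetType() != pattern.GetType())
            return false;

        switch (code)
        {
        case ArgumentSyntax c:
            return Match(c.Expression, ((ArgumentSyntax)pattern).Expression);
        case IdentifierNameSyntax c:
            return Match(c.Identifier, ((IdentifierNameSyntax)pattern).Identifier);
        default:
            return false;
        }
    }

    public static void Main()
    {
        var code = SyntaxFactory.ParseArgumentList("(a, b)").Arguments;

        Console.WriteLine(Match(code, SyntaxFactory.ParseArgumentList("(a, _)").Arguments)); // True
        Console.WriteLine(Match(code, SyntaxFactory.ParseArgumentList("(a, c)").Arguments)); // False
        Console.WriteLine(Match(code, SyntaxFactory.ParseArgumentList("(a)").Arguments));    // False
    }
}
```

Comparing the token text is the strict choice: foo and @foo, or 1 and 1.0, count as different tokens. Comparing ValueText instead would be more lenient, and which one fits better depends on the patterns you want to write.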

So it's basically a giant switch with a case for every node type. Far from every type is covered here, just those I needed to get my examples to work. I imagine that to cover most of the C# syntax I'd have to tediously write a couple of thousand lines of repetitive code. I'm not going to do all of it any time soon, just the parts I need for my use cases.

With a few more lines of code, this already becomes a useful tool for searching for code patterns in a codebase. Next time we'll see how to implement the replace part. The goal was to refactor, not just to search, wasn't it? I have some ideas on how it could be done. See you next time.

Conclusion

Thanks to Roslyn's awesome API, with just 172 lines of code we have a pretty advanced code grep. Sure, it's just a toy and a proof of concept at the moment, and it would take serious effort to make it something more than that. But I'm happy with what is possible with so little effort. Amazing.

Originally published on detunized.net
