Arithmetic expressions
Let's say we want to represent a simple arithmetic expression in C#, and then evaluate it. For example, evaluating 1 + (2 + 3)
would produce 6
. What's the best way to do this?
The obvious approach is to define a class hierarchy. We start with a base interface that represents all expressions:
interface IExpr
{
int Eval();
}
The Eval
method is what we'll use to evaluate an expression. For our simple case, we have two concrete implementations of this interface. The first represents a literal integer value (e.g. 5
):
class Literal : IExpr
{
public Literal(int n)
{
N = n;
}
public int N { get; }
public int Eval() => N;
}
Evaluating a literal is trivial: we simply return the value of the literal itself.
Next, we can represent addition like this:
class Add : IExpr
{
public Add(IExpr a, IExpr b)
{
A = a;
B = b;
}
public IExpr A { get; }
public IExpr B { get; }
public int Eval() => A.Eval() + B.Eval();
}
To evaluate an addition expression, we recursively evaluate both of its subexpressions and then add those two values together.
We're ready to try it out:
// 1 + (2 + 3)
public static IExpr CreateTestExpr()
=> new Add(
new Literal(1),
new Add(
new Literal(2),
new Literal(3)));
public static void Test()
{
var expr = CreateTestExpr();
Console.WriteLine(expr.Eval());
}
This creates our 1 + (2 + 3)
test expression and then evaluates it, writing 6
to the console.
Extensibility
That's great, but what if we want to extend our system? For example, let's add a new expression type that represents multiplication. This is easy - all we need is a new concrete implementation of IExpr
:
class Mult : IExpr
{
public Mult(IExpr a, IExpr b)
{
A = a;
B = b;
}
public IExpr A { get; }
public IExpr B { get; }
public int Eval()
=> A.Eval() * B.Eval();
}
But what if we want to add new behavior to our expressions instead? Let's say we want to be able to print an arbitrary expression by converting it to a string. For example, printing the expression 1 + (2 + 3)
would result in the string "(1 + (2 + 3))"
. To make this work, we have to add a new Print
method to the base IExpr
interface:
interface IExpr
{
int Eval();
string Print(); // NEW
}
But that means that we'd have to modify all the existing expression types (Literal
, Add
, etc.) to support printing. Our requirement instantly cascades through the entire code base. We might not even own all the code that has to change, in which case we'd be stuck. Is there any way to add printing ability without modifying existing code?
The Visitor pattern
The typical way to solve this problem is with the Visitor pattern. To make this work, we have to start over with a different IExpr
interface:
interface IExpr
{
TResult Accept<TResult>(IVisitor<TResult> visitor);
}
Instead of exposing an explicit Eval
method, this new interface now exposes a generic Accept
method that supports any "visitor" that implements a separate interface called IVisitor
:
interface IVisitor<TResult>
{
TResult VisitLiteral(Literal literal);
TResult VisitAdd(Add add);
}
Each concrete IExpr
subtype then invokes the corresponding visitor method inside its Accept
method:
class Literal : IExpr
{
public Literal(int n)
{
N = n;
}
public int N { get; }
public TResult Accept<TResult>(IVisitor<TResult> visitor)
=> visitor.VisitLiteral(this);
}
class Add : IExpr
{
public Add(IExpr a, IExpr b)
{
A = a;
B = b;
}
public IExpr A { get; }
public IExpr B { get; }
public TResult Accept<TResult>(IVisitor<TResult> visitor)
=> visitor.VisitAdd(this);
}
Using this approach, different visitor implementations can implement different behavior. For example, here are two visitors - one that evaluates an expression, and a different one that prints an expression to a string:
class EvalVisitor : IVisitor<int>
{
public int VisitLiteral(Literal literal)
=> literal.N;
public int VisitAdd(Add add)
=> add.A.Accept(this) + add.B.Accept(this);
}
class PrintVisitor : IVisitor<string>
{
public string VisitLiteral(Literal literal)
=> literal.N.ToString();
public string VisitAdd(Add add)
=> String.Format($"({add.A.Accept(this)} + {add.B.Accept(this)})");
}
That's terrific - we can now add behavior to our system without modifying existing code. But... what if we still want to be add a new expression type like Mult
as well? In order to support that, we have to add it to the base IVisitor
interface:
interface IVisitor<TResult>
{
TResult VisitLiteral(Literal literal);
TResult VisitAdd(Add add);
TResult VisitMult(Mult mult); // NEW
}
But that means that we'd have to modify all the existing visitors (evaluation, printing, etc.) to support multiplication. Our requirement instantly cascades through the entire code base. We might not even own all the code that has to change, in which case we'd be stuck. Does this sound familiar?
The expression problem
This conundrum is called the "expression problem". Is there a good way to design a system so that it's easy to both a) add new subtypes, and b) add new behaviors, without modifying existing code?
It turns out that there is a rather nifty way to accomplish this, called "object algebras". We start with an "object algebra interface" that looks similar to the Visitor pattern:
interface IExprAlgebra<T>
{
T Literal(int n);
T Add(T a, T b);
}
We're going to use this interface more like a Factory pattern than a visitor, though. For example, we can create the expression 1 + (2 + 3)
like this:
public static T CreateTestExpr<T>(IExprAlgebra<T> factory)
=> factory.Add(
factory.Literal(1),
factory.Add(
factory.Literal(2),
factory.Literal(3)));
To create an expression we take a "factory" that implements our expression algebra, and ask it to construct an object of arbitrary type T
that represents 1 + (2 + 3)
. This type can be whatever the factory wants it to be.
For example, to create an algebra that can evaluate expressions we define an IEvalExpr
interface that represents any evaluable object:
interface IEvalExpr
{
int Eval();
}
class EvalExpr : IEvalExpr
{
public EvalExpr(Func<int> eval)
{
_eval = eval;
}
Func<int> _eval;
public int Eval()
=> _eval();
}
The EvalExpr
class implements this interface by wrapping an arbitrary function. We can then use these to create a concrete algebra:
class EvalAlgebra : IExprAlgebra<IEvalExpr>
{
public IEvalExpr Literal(int n)
=> new EvalExpr(() => n);
public IEvalExpr Add(IEvalExpr a, IEvalExpr b)
=> new EvalExpr(() => a.Eval() + b.Eval());
}
Note that each method returns an evaluable object of type IEvalExpr
. We can then test this approach, as follows:
IEvalExpr expr = CreateTestExpr(new EvalAlgebra());
Console.WriteLine(expr.Eval());
The first line creates an evaluation algebra and uses that as a factory to represent the expression as an evaluable object. The second line evaluates that object and writes the result to the console (6
, as before).
Extensibility, again
So far, so good. But how extensible is this approach? Can we implement printing without changing any of the existing code? Yes. As before, we first define an interface for printable objects:
interface IPrintExpr
{
string Print();
}
class PrintExpr : IPrintExpr
{
public PrintExpr(Func<string> print)
{
_print = print;
}
Func<string> _print;
public string Print()
=> _print();
}
And then implement a new print algebra that returns printable objects:
class PrintAlgebra : IExprAlgebra<IPrintExpr>
{
public IPrintExpr Literal(int n)
=> new PrintExpr(() => n.ToString());
public IPrintExpr Add(IPrintExpr a, IPrintExpr b)
=> new PrintExpr(() => $"{a.Print()} + {b.Print()}");
}
And can we also create a new expression type, like multiplication, without modifying existing code? Again, the answer is yes. To handle this, we first extend our algebraic interface:
interface IExprAlgebraExt<T> : IExprAlgebra<T>
{
T Mult(T a, T b);
}
class EvalAlgebraExt : EvalAlgebra, IExprAlgebraExt<IEvalExpr>
{
public IEvalExpr Mult(IEvalExpr a, IEvalExpr b)
=> new EvalExpr(() => a.Eval() * b.Eval());
}
This defines a new interface called IExprAlgebraExt
that includes all of IExprAlgebra
(i.e. Literal
and Add
) and also defines a new method called Mult
that follows the same pattern. The implementation of this interface is EvalAlgebraExt
, which similarly inherits from our existing EvalAlgebra
while also implementing the new Mult
requirement.
We can test it out on an expression like 4 * (5 + 6)
that contains literals, addition, and multiplication:
public static T CreateTestExprExt<T>(IExprAlgebraExt<T> factory)
=> factory.Mult(
factory.Literal(4),
factory.Add(
factory.Literal(5),
factory.Literal(6)));
public static void Test()
{
IEvalExpr expr =
CreateTestExprExt(new EvalAlgebraExt());
Console.WriteLine(expr.Eval());
}
Note that CreateTestExprExt
takes the extended version of the algebraic interface, which ensures that it can handle multiplication, but still returns a plain old evaluable object of type IEvalExpr
. We can expose this new interface to our entire system, or we can keep it local only - it's up to us. There's no cascade of changes that we're forced to make.
So, we've successfully solved the expression problem! This approach is more flexible than visitors, but doesn't add significant additional overhead or complexity. I'd call that a win!
References
For more information:
- The original paper that introduced object algebras: Extensibility for the Masses: Practical Extensibility with Object Algebras
- From Object Algebras to Finally Tagless Interpreters, which walks through this approach in Java and Haskell. I used this article as a basis for my post.
- My GitHub repository containing the code from this post.
Top comments (5)
Great, thank you :-)
Hi, I got this account just to be able to comment on this.
I really liked your series!
I know I'm a few months late, but I'm really enticed by the prospect of getting around the visitor pattern. I can see that this use case is very specific, so I was wondering if by any chance you have more examples using this approach? I'm trying to transpose your code to scenarios I can better understand but I'm having a hard time escaping the frame.
What I'm especially interested in is object modeling. The classic example is the C# sum type, which I wish was as easy as an F# discriminated union, but if course I end up using some sort of visitor pattern, either by interface or by inheritance.
So.. my question to you (which might seem stupid if I have gathered nothing from this last part of the series):
Very simply, I want to model a Fruit. It's either a banana, an orange or an apple.
Fruit { Banana | Orange | Apple }
I don't want to use the visitor pattern, I want to do what your title suggests. How would I go about doing that using your approach? (While being able to solve the expression problem like you did, if possible)
Best regards, and thanks again for the series!
Hi. I'm glad you like the series. The expression type discussed in my article is a sum type:
Expr { Literal of int | Add of (Expr * Expr) }
. You can see this in the F# implementation in my repository. So you can implement a fruit algebra the same way, like this:Hi! So grateful that you replied!
For some reason I'm still having a hard time extending the types of the example you provided, although I really like the example. Very clear and instructive.
I did something similar myself, with the goal of being able to extend types selectively, and I would love to hear your comments on the approach (heavily using the code you originally provided)
Especially if you have any thoughts on the downside of using this approach :)
What I think I've done is extensible types, extensible functionality, and it's all up to the caller to decide what to call. As well as there's only one generic method to get whatever we're interested in, from the objects that can provide it.
Output:
Areas
SumWithoutVisitor3.Rectangle : 200
SumWithoutVisitor3.Triangle : 100
SumWithoutVisitor3.Circle : 314,1592653589793
Circumferences
SumWithoutVisitor3.Circle : 62,83185307179586
Descriptions
SumWithoutVisitor3.Triangle : A mighty fine triangle!
Great series. Thanks for taking the time to write all these examples.