## DEV Community is a community of 905,285 amazing developers

We're a place where coders share, stay up-to-date and grow their careers.

Dmitry Dorogoy

Posted on

# Fun with C# Regex based Expression calculator

Once I was asked to write an expression calculator. And my first impulse was to write a classical calculator based on two stacks, also known as shunting yard algorithm. In this approach, one stack is used for values and the other for operators. It's a classical solution but I wanted to do something more interesting and inspiring and fun...

## Intro

We have a mathematical expression like: `2 + 2 * (2 + (2))` and we need to calculate the result. There are some problems:

• Resolving brackets
• Resolving operation priorities

## Let's do it with regular expressions!

### Bracket expander

The regular expression could be easily used for finding some patterns in the string.
The first simple patter is a pattern for expanding brackets:
Like this: `\(\d+\)`
This pattern detects an open bracket followed by any number of digits and finally followed with a closed bracket.
This simple regular expression is able to catch the expression: `(42)`.
But how could we extract the number and reject the brackets?
Hopefully, the regular expressions have the ability to catch results into the named groups.
Let's update our regex example: `\((?<number>\d+)\)`
With this regex, we could find numbers with brackets and take only a number.

Let's check more complex operations. Addition.
For adding two numbers we have to write a regular expression to find the number plus number.
`(?<numberA>\d+)\+(?<numberB>\d+)`
Here we have two named capture groups and a "hardcoded" plus symbol.
This expression allows us to detect expressions like `1+1`.

## Multiplication

What's the difference between addition and multiplication? Actually only an operation. So, we use almost the same regular expression to detect multiplication. And the same for division and subtraction

## Let's create an expression solver

Let's assume we've got an expression `2*2+(2+2)`

At this moment we have got a bunch of regular expressions and small operation evaluators.
In a simple way we've got regular expressions for:
*( number ) => number
*numberA + numberB => sum
*numberA - numberB => sub
*numberA * numberB => mul
*numberA / numberB => div

### Step by step calculation

`2*2+(2+2)`

1. 2*2+(2+2)
2. 4+(2+2)
3. 4+(2+2)
4. 4+(4)
5. 4+(4)
6. 4+4
7. 4+4 => 8

It's a common question while we are talking about expression evaluators.
And now we could use an interesting feature of regular expressions: Negative lookahead.
The main idea is to avoid matching num + num surrounding with high priority operators (* /).
A bit simplified pattern:
`(?![*/])(?<numberA>-?\d+\.?\d*)\+(?<numberB>-?\d+\.?\d*)(?![*/])`
Easy as pie. (c) Daddy Pig :-)

The most common constant is Pi., (3.14). In my solution, we could support constants in the same manner as the opening brackets.
`(?i)PI` The regular expression "?i" is a case insensitive switch.

## How to add math function support?

What is the main difference between a constant Pi and, for example, Sinus function?
Theis only one difference - brackets. And we should construct a regular expression to detect text like `Sin(number)`.
One important moment - we already have an open bracket function. And we also have a potentially conflicted Sin() function. And we have a simple way to solve this kind of problem - expression order!
Let's check an example:`(Sin(Pi))+1`
To evaluate this function correctly we have to apply regexp substitutions in a particular order:

1. Replace Pi (Sin(Pi))+1 => (Sin(3.14))+1; Match: Pi
2. Calculate Sin(3.14) (Sin(3.14))+1 => (0.00)+1; Match: Sin(3.14)
3. Open brackets (0.00)+1 => 0.00+1; Match: (0.00)
4. Calculate Sum 0.00+1 => 1.00; Match: 0.00+1

I show only small examples with calculating only a simple math expression. Let imagine that our expression is a real DSL (Domain Specific Language) where Sin(x) is not a fast, but really time-consuming function. In this case, we could use the regular expressions ability to give us not only the first but all possible matches. For example, for function `Sin(0.5)+Sin(0.5)` we detect two Sin function at once. As a result, we could use a parallel calculation.

## Experiments!

To continue my experiments, I wrote a classical two-stacks-based evaluator. After I created a random expression generator and run both calculators over the randomized expression until both could calculate expression without errors.

### I got expected results:

*( 7 * 6 ) * 2 = 84
*4 ( - ( 5 ) ) = -1
*3 * ( 8 ) * 8 = 192
*7 + ( 1 - 8 ) * 9 = -56

But! also a completely incorrect expression like this:
*8 3 + - ( 9 ) = 2

Regular expression based calculator allows us to see the whole evaluation path step by step:
8 3 + - ( 9 ) => 8 3 + -9; Match: ( 9 )
8 3 + -9 => 8*-6; Match: **3 + -9*
8-6 => 2; Match: 8-6

The classical two stacks algorithm interprets this expression as
`(3+8)-9` with the same result 2.

## Conclusions

It's possible to create a simple calculator-like expression evaluator based on a regular expression. It's fun.
Classical shunting yard algorithm could evaluate erroneous expressions.