Hello dear readers!
Once I was asked to write an expression calculator. And my first impulse was to write a classical calculator based on two stacks, also known as shunting yard algorithm. In this approach, one stack is used for values and the other for operators. It's a classical solution but I wanted to do something more interesting and inspiring and fun...
We have a mathematical expression like:
2 + 2 * (2 + (2)) and we need to calculate the result. There are some problems:
- Resolving brackets
- Resolving operation priorities
The regular expression could be easily used for finding some patterns in the string.
The first simple patter is a pattern for expanding brackets:
This pattern detects an open bracket followed by any number of digits and finally followed with a closed bracket.
This simple regular expression is able to catch the expression:
But how could we extract the number and reject the brackets?
Hopefully, the regular expressions have the ability to catch results into the named groups.
Let's update our regex example:
With this regex, we could find numbers with brackets and take only a number.
Let's check more complex operations. Addition.
For adding two numbers we have to write a regular expression to find the number plus number.
Here we have two named capture groups and a "hardcoded" plus symbol.
This expression allows us to detect expressions like
What's the difference between addition and multiplication? Actually only an operation. So, we use almost the same regular expression to detect multiplication. And the same for division and subtraction
Let's assume we've got an expression
At this moment we have got a bunch of regular expressions and small operation evaluators.
In a simple way we've got regular expressions for:
*( number ) => number
*numberA + numberB => sum
*numberA - numberB => sub
*numberA * numberB => mul
*numberA / numberB => div
- 4+4 => 8
It's a common question while we are talking about expression evaluators.
And now we could use an interesting feature of regular expressions: Negative lookahead.
The main idea is to avoid matching num + num surrounding with high priority operators (* /).
A bit simplified pattern:
Easy as pie. (c) Daddy Pig :-)
The most common constant is Pi., (3.14). In my solution, we could support constants in the same manner as the opening brackets.
(?i)PI The regular expression "?i" is a case insensitive switch.
What is the main difference between a constant Pi and, for example, Sinus function?
Theis only one difference - brackets. And we should construct a regular expression to detect text like
One important moment - we already have an open bracket function. And we also have a potentially conflicted Sin() function. And we have a simple way to solve this kind of problem - expression order!
Let's check an example:
To evaluate this function correctly we have to apply regexp substitutions in a particular order:
- Replace Pi (Sin(Pi))+1 => (Sin(3.14))+1; Match: Pi
- Calculate Sin(3.14) (Sin(3.14))+1 => (0.00)+1; Match: Sin(3.14)
- Open brackets (0.00)+1 => 0.00+1; Match: (0.00)
- Calculate Sum 0.00+1 => 1.00; Match: 0.00+1
I show only small examples with calculating only a simple math expression. Let imagine that our expression is a real DSL (Domain Specific Language) where Sin(x) is not a fast, but really time-consuming function. In this case, we could use the regular expressions ability to give us not only the first but all possible matches. For example, for function
Sin(0.5)+Sin(0.5) we detect two Sin function at once. As a result, we could use a parallel calculation.
To continue my experiments, I wrote a classical two-stacks-based evaluator. After I created a random expression generator and run both calculators over the randomized expression until both could calculate expression without errors.
*( 7 * 6 ) * 2 = 84
*4 ( - ( 5 ) ) = -1
*3 * ( 8 ) * 8 = 192
*7 + ( 1 - 8 ) * 9 = -56
But! also a completely incorrect expression like this:
*8 3 + - ( 9 ) = 2
Regular expression based calculator allows us to see the whole evaluation path step by step:
8 3 + - ( 9 ) => 8 3 + -9; Match: ( 9 )
8 3 + -9 => 8*-6; Match: **3 + -9*
8-6 => 2; Match: 8-6
The classical two stacks algorithm interprets this expression as
(3+8)-9 with the same result 2.
It's possible to create a simple calculator-like expression evaluator based on a regular expression. It's fun.
Classical shunting yard algorithm could evaluate erroneous expressions.
P.S. Feel free to comment/contact me if you like this work.