An expression parser for MiniScript

#miniscript #minimicro #programming #library

Last week, we talked about the Pratt parsing algorithm. Prior to learning about that, I had always written recursive-descent parsers, including one a couple years ago that could evaluate simple numeric expressions.

Inspiring by Pratt parsing, I decided to take another pass at the expression evaluator. I'm pleased to report that it went swimmingly, and we now have a simple, efficient, and easily extensible little module for evaluating numeric expressions in MiniScript!

What's it do?

The new module provides an eval method that takes a string and returns a number. The string can contain any numeric expression, like "(2+3)*5" or "200 - 5!". It supports all the standard numeric operators, including ^ (power), % (mod), and ! (factorial).

eval("6 % 5")  -->  1
eval("17.45 % 12")  -->  5.45
eval("2^2^3")  -->  256
eval("(2^2)^3")  --> 64

The module contains one constant, pi, and a bunch of numeric functions: rnd, round, floor, ceil, log, sign, sqrt, sin, cos, tan, asin, acos, and atan (whew!). Because this is an expression (not a statement) that you're evaluating, arguments to a function must be enclosed in parentheses. But you can invoke rnd just fine with no parentheses.

eval("floor(rnd*6+1)")  -->  4 (varies from 1 to 6)
eval("round(cos(45*pi/180)*100)")  -->  71

You can even assign variables, and use them in later evaluations.

eval("deg = pi/180")  -->  null
eval("x = 45*deg")  -->  null
eval("cos(x)")  --> 0.707107

It's mainly intended to be invoked from your own code, wherever you need to compute the value of some expression at runtime. But if you just need a quick calculator, you can run eval.ms as the main program, and (after some unit tests) it will drop into a little REPL where you can just type expressions and see the value.

Unit testing: eval
All tests passed.
Enter expression to evaluate, or `quit` to quit.
eval> (365+1/4) * 24 * 60 * 60
31557600
eval>

How it works

I'm really pleased with how neat and clean the code came out, largely thanks to Pratt parsing. It's all in a single file, eval.ms, but it's divided into several sections. Click the link to open the code in another tab, and you will find:

Some general utility functions at the top (lines 1-18).
A simple lexer, which divides a string up into a list of tokens (lines 19-58).
Storage for variables and functions (lines 59-75). This is where you could add more constants or functions, or hook into variables in your host program.
Operations (lines 76-111): this is the code that actually computes and returns the "value" of a number, identifier, or some operation like addition or subtraction. The current code computes numeric values, but by replacing this Op map, you could make it build an abstract syntax tree, or emit bytecodes, or anything else.
Precedence levels are defined in a little map on lines 112-127.
The heart of the Pratt parser is "Parselets" -- little objects that each know how to handle one situation, such as "parse a binary operator, given an already-parsed left-hand side and a stream of tokens starting the right-hand side." These are defined in lines 128-162. There are only a handful of these and they're pretty simple. What the parselets actually do is defined by referencing something in the Op map (above).
The, uh, muscles and sinew of the Pratt parser are the methods in the Parse map (lines 168-238). These methods actually instantiate and configure Parselets, and for special cases, override the simple Parselet parsing to handle more complex situations. For example, Parse.Binop instantiates a unique InfixParselet for each operator; but Parse.Group, which handles parentheses, just handles the parsing itself, as there is nothing else quite like it. The most complex case is Parse.Identifier, as it needs to figure out whether the identifier is the start of an assignment, or a function call, or a regular variable look-up.
Finally, the language our parser parses is defined mainly by a set of "token effects" in lines 239-253. Each type of token can have one effect when it's found at the start of an expression, and a different effect when it's in the middle after some other value has been seen. Most tokens only have one of these effects, but for example, - has both, since it can be both a unary operator (as in "-x") and a binary operator ("x - y").

And that's pretty much it. The rest of the code is just a couple of minor helper methods, the unit tests, and the main loop.

Directions for Use

To use this code in your own projects, just download eval.ms from the GitHub repo, and place it next to your own script or in some convenient "lib" folder. Then you can just import "eval" to load the eval module, and then call eval.eval(s) to evaluate string s.

import "eval"
print eval.eval("round(100*rnd)") + "% chance of rain"

If you want to set or inspect variables accessable to the eval calls, you can do that as eval.vars.

eval.vars.foo = 6
eval.eval("bar = foo * 7")
print eval.vars.bar

If you want to provide a "reset" capability, restoring the vars to their initial state, then you should save a copy of it right after the import, and then reapply it later to reset. Remember that an easy way to copy a map is to just add it to an empty map.

import "eval"
// save the "clean" variables right after import
cleanVars = eval.vars + {}
// let's change the value of pi to a prettier number
eval.eval("pi = 4")
// no wait, that was a horrible idea -- reset!
eval.vars = cleanVars + {}
// ah, back to the original value
eval.eval("pi")

Limitations

The current version of this eval library only handles functions that take a single argument. So you can't currently say something like "atan(8,2)" or "round(pi,3)".

Also, it only handles numeric expressions. If you want to be able to evaluate expressions involving strings, you'll need to extend both the lexer and the operations to handle those.

If these limitations affect you, leave a comment or reach out to me on Discord, and I'll guide you through the necessary changes.

If you need to evaluate expressions involving lists or maps, or basically arbitrary MiniScript code, then this little parser is probably not for you. In that case consider the ms.ms project instead.