DEV Community

LarmkaartDev
LarmkaartDev

Posted on

The parser (Part 1: Equations)

The parser is a pretty complex component of the compiler. It will be responsible for converting the tokens into a format the Semantic analyzer can read and execute.

The parser will 'parse' all of the tokens the lexer generated. Parsing means that the parser will process the tokens and generate an Abstract Syntax Tree (AST) with them like described previously.

Let's see how the parser does this.

1. Equations

NOTE: In the following paragraph I'll be simplifying all tokens to just their Value properties, so if i write 1 + 2 i actually mean {"Value":"1","Type":"number"},{"Value":"+","Type":"operator"},{"Value":"2","Type":"number"}
This is just to make the entire example easier to follow.

Equations are pretty difficult to parse because you have to take order of operations in account. Instead of normal mathicmatical notation, the parser will make use of a clever notation called Postfix notation (click for more details). Postfix notation is a type of notation that gets rid of our order of operation problem. Instead of putting the operator between the terms, postfix will put it behind them.

For example the postfix notation will write

1 + 2 as 1 2 +

1 + 2 * 3 as 1 2 3 * +

(1 + 2) * 3 as 1 2 + 3 *

This notation might seem very confusing, but it will allow computers to calculate mathematical equations way quicker and easier.

To convert our standard notation into the postfix notation I will make use of the 'Shunting yard algorithm'. I won't go into details of implementation, but if you want to know more details and how to implement the algorithm, you should check out the Wikipedia page.

Now that we have the equation in postfix notation, we need to convert it into an AST so the Semantic analyzer will be able process and solve equation. for this I'll be using the following piece of code:

local queue = generatePostfix(tokens)

local function parsePostfix()
    local currentPostfixIndex = 1 -- current location in queue
    local operator = findNextOperator() -- find next operator

    while operator ~= nil do -- while there still exists an operator
        currentPostfixIndex -= 1 -- Look at the previous token
        local rhs = peekQueue(currentPostfixIndex) -- right hand side
        removeQueue(currentPostfixIndex) -- remove the processed token
        v -= 1 -- look at the previous token
        local lhs = peekQueue(currentPostfixIndex) -- left hand side
        removeQueue(currentPostfixIndex) -- remove the processed token

        local equation = {Type = "equation", Operator = operator.Value, Left = lhs, Right = rhs}

        if rhs.Type == "number" and lhs.Type == "number" then -- the equation only has numbers as terms?
            insertQueue(currentPostfixIndex, solveEquation(equation)) -- Insert the solved equation in the queue
        else
            insertQueue(currentPostfixIndex, equation) -- Insert the unsolved equationn in the queue
        end
        -- This is a quick optimisation so the Semantic analyzer doesn't have to calculate as much equations

        operator = findNextOperator() -- find the next operator
    end

    return queue
end
Enter fullscreen mode Exit fullscreen mode

The code might seem a bit overwhelming at first so let's go through the code step by step parsing the equation
(1 + 2) * (3 + 4)
the equation in postfix notation will be
1 2 + 3 4 + *

  1. Look for the next operator, in our case this will be + at index 3.
  2. Put the previous token 2 in the right side of the equation
  3. Put the previous token 1in the left side of the equation
  4. Check if the equation is already solvabe, if so, put the result of the equation into the queue. Otherwise put the unsolved equation in the queue. The queue should now look like 3 3 4 + *.
  5. repeat steps 1-4 untill no more operators are left. The entire equation should now be solved and the value in the queue should be 21
  6. all that is left is to return the queue so the equation AST can be processed further.

Lets look a bit closer at step 5. in the first iteration the queue should be
3 3 4 + *
Now we iterate through step 1-4 again
3 7 *
one more time
21

Now we have succesfully solved the equation using the postfix notation!

Top comments (0)