I realized I left a big part out of my last article so today I'm here to fill you all in.
One of the big changes I mentioned over part 1 was the switch from using Wasm to using LLVM and I accidentally completely glossed over it. Super sorry about that one.
LLVM is a pseudo-assembly and program collection that takes this "intermediate representation" as they call it and does the actual compilation into machine code. It's very type strong and looks familiar yet foreign.
And once the program is able to produce an AST from the code, it's the Slate compiler's job to then produce LLVM IR.
This gist contains both the code we're working on and the desired LLVM IR output wer're going to produce. Thankfully, the LLVM project provides a Go library we can use. Actually generating IR takes many things to consider, so today is going to cover the first part.
In our code we have 4
const declarations that equal to functions with some calls in each one. This is no matter, except for if we go to compile these in the order they're written. If this is the case, then we'll create a function object for
main, but then when we try to find the function object for
First we set up two variables. The first is a
completed map that associates
const names to their IR objects. This will contain the declarations that have been completed and can be referenced later at any time.
At the start this will be empty. Secondly, we also need a
todo string array of the uncompiled
const declarations. It will contain their names so long as they still need to processed. At the start all declarations will be placed into
todo. (For our example this will be
[main, print, write, syscall3])
Then, it loops through the
todo array and calculates the "dependencies" of each item. Remember earlier when we were going to parse
main and needed
main: [print] print: [write] write: [syscall3] syscall3: 
writecalls syscall3 because the convention is the direct call-path for x86_64 Linux, which is what this base example was made for.
syscall3has no "dependencies" because the only call it makes is inline assembly and will just about be directly inserted into the binary with its arguments.
Now, it will make the IR conversion for all AST objects whose calculated dependencies are all elements of the
completed map object and remove them from
For our example, that will be
syscall3 will then get sent to the LLVM Go library and placed into
completed. The compiler will then repeat this loop until
todo is empty and finally exit.
This approach worked because the example was very direct and there are no imports or anything like that. Additionally, it's not very type-safe.
I'm conflicted on whether I want to eventually allow name overloading, but I want it to be able to throw an error if
print(string) exists but you try and call
On top of that, tree shaking for the benefit of smaller binary outputs would be great. This could be achieved by starting at
main and diving into each dependency until it parses all that have none. It would then work backwards and only compile the functions/objects referenced through the
main pathway. This would have the added benefit of giving me dead code analysis for free.
I hope this explains the internals a lot better than the last article and I'll try my best to leave out jarring wholes next time. Blogging is very much its own skill and kudos to those who do this regularly.
Thanks for reading, and I'll see you next time ❤