For the past week I've been working on a game with a little compiler in Roblox Studio. I want to create a tool that helps beginners get into coding. The compiler will work with a high-level, Lua-like language. The goal of the game is to steer a little robot and make it do various tasks, like navigating through a maze.
I've chosen Roblox Studio because of its user-friendly nature. It will be easy for beginners to just start up the game and start coding! Roblox also handles most of the user interface by itself, so I will be able to focus more on coding the actual logic. Another reason is that I already have a bunch of experience with Roblox Studio.
In this blog I want to focus on my journey of creating my first-ever compiler and explain my thought process. But before we start, I want to make a few things clear:
- Most of the code shown in this blog will be a simplified version of the real code to make it easier to understand
- Most of the code shown will be taken out of context
- All code will be shown in either Luau, Roblox's main coding language, or my own language. Additional comments will be provided whenever necessary
- This compiler is made by me, so it might differ from official methods
- This series is still WIP. I am writing this while still working on the compiler, so some things are bound to change. Feedback is greatly appreciated!
Now that that's out of the way, let's start!
Chapter 1: Main overview
You can make a compiler in a few different ways. For example, an assembler translates assembly language into machine code, which the processor can then execute directly. In my approach I will be making use of an Abstract Syntax Tree (AST for short). I will explain more about this in a later chapter.
Now let's see a quick overview of my compiler!
1. The Lexer
The lexer, also known as a tokeniser, will convert the raw code input into tokens. Just like a sentence has words, each with a different function, each word in a line of code has its own function.
Let's take the following line as an example:
var y = x + 2
Let's look at the function of each part:
- var is used to create a variable named y.
- = is used to assign a value to y.
- + is used to add two values together.
- x and y are variables, and 2 is a number.
The lexer will go through all of the code like this and generate tokens which define the function of all parts of the code. The parser will be able to process these tokens further.
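To make this more concrete, here is a minimal sketch of a lexer that can handle the example line. It is not the real lexer from my project: the KEYWORDS and OPERATORS tables, the tokenize function and the "keyword" and "operator" token types are names assumed purely for illustration. Only the { Type = ..., Value = ... } shape follows the language description shown in Chapter 2.

```lua
-- Hypothetical keyword and operator sets; a real language would have more.
local KEYWORDS = { var = true }
local OPERATORS = { ["="] = true, ["+"] = true }

-- Walks through the source text and returns a list of token tables.
local function tokenize(source)
    local tokens = {}
    local pos = 1
    while pos <= #source do
        local char = source:sub(pos, pos)
        if char:match("%s") then
            -- Whitespace only separates tokens, so skip it.
            pos = pos + 1
        elseif char:match("%d") then
            -- A run of digits becomes one number token.
            local text = source:match("^%d+", pos)
            table.insert(tokens, { Type = "number", Value = tonumber(text) })
            pos = pos + #text
        elseif char:match("[%a_]") then
            -- A word is either a keyword or a variable name.
            local text = source:match("^[%a_][%w_]*", pos)
            if KEYWORDS[text] then
                table.insert(tokens, { Type = "keyword", Value = text })
            else
                table.insert(tokens, { Type = "variable", Value = text })
            end
            pos = pos + #text
        elseif OPERATORS[char] then
            table.insert(tokens, { Type = "operator", Value = char })
            pos = pos + 1
        else
            error("unexpected character '" .. char .. "' at position " .. pos)
        end
    end
    return tokens
end

-- Print the type and value of every token in the example line.
for _, token in ipairs(tokenize("var y = x + 2")) do
    print(token.Type, token.Value)
end
```

The sketch reads the input one character at a time and decides what kind of token starts there, which keeps every branch small and easy to debug.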
2. The Parser
The parser will take the previously described tokens and create an Abstract Syntax Tree (AST). An AST is an abstract representation of code. An AST representation of the previous example might look like this:
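Written out as a nested Luau table, a tree for var y = x + 2 could be sketched roughly as follows. The "equation", "variable" and "number" nodes follow the shapes from the language description in Chapter 2. The "declaration" node with its Name field and the describe helper are assumptions made up for this example, so the real tree may look different.

```lua
-- AST sketch for: var y = x + 2
local ast = {
    Type = "declaration", -- hypothetical node type for "var NAME = VALUE"
    Name = "y",
    Value = {
        Type = "equation",
        Operator = "+",
        Left = { Type = "variable", Value = "x" },
        Right = { Type = "number", Value = 2 },
    },
}

-- Hypothetical helper: turns a tree back into text, which is handy
-- for checking that the tree has the structure you expect.
local function describe(node)
    if node.Type == "declaration" then
        return "var " .. node.Name .. " = " .. describe(node.Value)
    elseif node.Type == "equation" then
        return "(" .. describe(node.Left) .. " " .. node.Operator
            .. " " .. describe(node.Right) .. ")"
    end
    -- number and variable nodes only carry a Value
    return tostring(node.Value)
end

print(describe(ast))
```

The nesting is the important part: the addition sits inside the declaration, so whatever processes the tree has to handle x + 2 before it can store anything in y.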
Using semantic analysis, the computer will be able to run the AST.
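As a rough idea of what "running" a tree can mean, here is a small sketch that walks an expression tree and calculates its value. This is not the semantic analyzer from my project (that part is still WIP). The evaluate function and the variables table are hypothetical names, and only the + operator is handled.

```lua
-- Recursively calculates the value of an expression node.
-- `variables` maps variable names to their current values.
local function evaluate(node, variables)
    if node.Type == "number" then
        return node.Value
    elseif node.Type == "variable" then
        local value = variables[node.Value]
        if value == nil then
            error("undefined variable: " .. node.Value)
        end
        return value
    elseif node.Type == "equation" then
        -- Evaluate both sides first, then combine them.
        local left = evaluate(node.Left, variables)
        local right = evaluate(node.Right, variables)
        if node.Operator == "+" then
            return left + right
        end
        error("unsupported operator: " .. tostring(node.Operator))
    end
    error("unknown node type: " .. tostring(node.Type))
end

-- The expression x + 2, with x set to 3.
local expression = {
    Type = "equation",
    Operator = "+",
    Left = { Type = "variable", Value = "x" },
    Right = { Type = "number", Value = 2 },
}
print(evaluate(expression, { x = 3 }))
```

Because every node is handled by the same function, nested expressions work without any extra code: an equation can contain another equation on either side.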
3. The Semantic analyzer
** WIP **
Chapter 2: Before starting
Before you start actually writing a compiler, there are a few things you should take care of.
1. Create a language description
Creating a description of the grammar of the language in a text file is an important thing to do before writing the compiler. This will provide a structure to your language and make coding it easier. Each entry in the description should consist of a name, a description and a token representation. Here is a part of my own description:
--------------------------] Types
number - any type of number { Type = number, Value = NUMBER_VALUE }
string - a string of text { Type = string, Value = STRING_VALUE }
variable - an object with any type of value { Type = variable, Value = VARIABLE_NAME }
equation - an equation { Type = equation, Operator = OPERATOR, Left = LHS, Right = RHS }
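One way a description like this can pay off later is as a checklist in code. The sketch below mirrors the four types above in a Luau table and checks whether a token or node has the fields its type requires. The REQUIRED_FIELDS table and the isValidNode function are hypothetical names for this example and are not part of my compiler.

```lua
-- Fields each type must have besides Type, taken from the description above.
local REQUIRED_FIELDS = {
    number = { "Value" },
    string = { "Value" },
    variable = { "Value" },
    equation = { "Operator", "Left", "Right" },
}

-- Returns true when `node` is a table of a known type with all
-- required fields present.
local function isValidNode(node)
    if type(node) ~= "table" then
        return false
    end
    local fields = REQUIRED_FIELDS[node.Type]
    if fields == nil then
        return false -- type is not in the description
    end
    for _, field in ipairs(fields) do
        if node[field] == nil then
            return false
        end
    end
    return true
end

print(isValidNode({ Type = "number", Value = 2 }))
print(isValidNode({ Type = "equation", Operator = "+" }))
```

A check like this is cheap to run while debugging and catches tokens that were built with a missing field before they reach the parser.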
2. Create a comfortable work environment
Before creating the compiler, you should make yourself a comfortable work environment. Create a basic code editor and console. This will make debugging and working with the compiler much easier.
Now you should be ready to start working!