DEV Community

Simmypeet

Crafting a Compiler in Rust: Introduction

Have you ever wondered how the code we write turns into a program or application? The short explanation is that a program designed specifically for the job translates high-level code into machine code that computers can understand and execute. This program is called a compiler.

Hi! In this dev-blog series, I'll write and document the process of creating a compiler for my programming language, Pernix. The series will continue until the compiler is self-hosted, meaning the Pernix compiler can be written in Pernix and built with the Pernix compiler itself. This series will not be a line-by-line tutorial; instead, it will show what each component of the compiler does and how the components interact.

About my Programming Language and Compiler

The programming language is called Pernix, a Latin word meaning agile, quick, and swift. The language is statically typed and compiled, with a syntax similar to C. I'll implement the compiler in the Rust programming language with the help of LLVM.

Compiler Basics

Before writing a compiler, let's cover some basics. Generally speaking, a compiler is split into a front-end and a back-end. The front-end converts the source code into an intermediate representation and checks the program for errors. The back-end then translates the intermediate representation produced by the front-end into machine code.

Typically, each CPU architecture has its own instruction set, and each operating system has its own conventions (such as its executable format and calling conventions), so the compiler has to handle the differences between them.
Fortunately, LLVM does most of the heavy lifting in the back-end. Therefore, I'll focus on developing the front-end.
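
To make the front-end/back-end split a bit more concrete, here is a minimal sketch. It is not how Pernix or LLVM works internally: the `Ir` type, the `Backend` trait, and the two targets with their instruction names are all made up for illustration. The point is only that one intermediate representation can be handed to several back-ends, each producing output for its own instruction set.

```rust
/// A toy intermediate representation (IR): instructions for a simple
/// stack machine. Hypothetical; real IRs such as LLVM IR are far richer.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ir {
    Push(i64),
    Add,
    Mul,
}

/// Hypothetical back-end interface: each target translates the same IR
/// into its own textual "machine code".
trait Backend {
    fn emit(&self, program: &[Ir]) -> Vec<String>;
}

/// Two made-up targets with different (invented) instruction names.
struct TargetA;
struct TargetB;

impl Backend for TargetA {
    fn emit(&self, program: &[Ir]) -> Vec<String> {
        program
            .iter()
            .map(|inst| match inst {
                Ir::Push(n) => format!("push {n}"),
                Ir::Add => "add".to_string(),
                Ir::Mul => "mul".to_string(),
            })
            .collect()
    }
}

impl Backend for TargetB {
    fn emit(&self, program: &[Ir]) -> Vec<String> {
        program
            .iter()
            .map(|inst| match inst {
                Ir::Push(n) => format!("LOAD_CONST #{n}"),
                Ir::Add => "ADD_TOP2".to_string(),
                Ir::Mul => "MUL_TOP2".to_string(),
            })
            .collect()
    }
}

/// Stand-in for a front-end: returns hand-written IR for `1 + 2 * 3`.
fn front_end_output() -> Vec<Ir> {
    vec![Ir::Push(1), Ir::Push(2), Ir::Push(3), Ir::Mul, Ir::Add]
}

/// The driver only knows about the `Backend` trait, not a concrete target.
fn compile_with(backend: &impl Backend, ir: &[Ir]) -> Vec<String> {
    backend.emit(ir)
}

fn main() {
    let ir = front_end_output();
    println!("target A: {:?}", compile_with(&TargetA, &ir));
    println!("target B: {:?}", compile_with(&TargetB, &ir));
}
```

The front-end never needs to know which target is in use; supporting a new one means adding another `Backend` implementation, which is roughly the job LLVM takes over in a real compiler.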

Compiler Front-End Overview

The compiler front-end can itself be broken down into several phases: lexical analysis, syntactic analysis, and semantic analysis.

  • Lexical Analysis breaks the source code into manageable tokens (the "words" of the language).
  • Syntactic Analysis checks the grammar of the language and turns the token stream from the previous phase into an abstract syntax tree (AST).
  • Semantic Analysis performs the checks that the Syntactic Analysis phase can't, such as type-checking and symbol look-up, since the previous phase can only validate the grammatical structure of the program.
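
The three phases can be sketched for a tiny made-up language that only has integers, variable names, and `+`. This is a hedged illustration, not Pernix's actual design: the `Token` and `Expr` types and the `lex`, `parse`, and `check` functions are hypothetical names, and the semantic phase here only does symbol look-up (no type-checking).

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
enum Token {
    Int(i64),
    Ident(String),
    Plus,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Lexical analysis: split the source text into tokens.
fn lex(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '+' {
            chars.next();
            tokens.push(Token::Plus);
        } else if c.is_ascii_digit() {
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if !d.is_ascii_digit() { break; }
                text.push(d);
                chars.next();
            }
            let value = text.parse::<i64>().map_err(|e| e.to_string())?;
            tokens.push(Token::Int(value));
        } else if c.is_ascii_alphabetic() {
            let mut name = String::new();
            while let Some(&d) = chars.peek() {
                if !d.is_ascii_alphanumeric() { break; }
                name.push(d);
                chars.next();
            }
            tokens.push(Token::Ident(name));
        } else {
            return Err(format!("unexpected character '{c}'"));
        }
    }
    Ok(tokens)
}

/// Syntactic analysis: check the grammar `operand ('+' operand)*`
/// and build a left-associative AST from the token stream.
fn parse(tokens: &[Token]) -> Result<Expr, String> {
    let mut iter = tokens.iter();
    let mut expr = parse_operand(iter.next())?;
    while let Some(token) = iter.next() {
        if *token != Token::Plus {
            return Err(format!("expected '+', found {token:?}"));
        }
        let rhs = parse_operand(iter.next())?;
        expr = Expr::Add(Box::new(expr), Box::new(rhs));
    }
    Ok(expr)
}

fn parse_operand(token: Option<&Token>) -> Result<Expr, String> {
    match token {
        Some(Token::Int(n)) => Ok(Expr::Int(*n)),
        Some(Token::Ident(name)) => Ok(Expr::Var(name.clone())),
        other => Err(format!("expected an operand, found {other:?}")),
    }
}

/// Semantic analysis (symbol look-up only): every variable must be declared.
fn check(expr: &Expr, declared: &HashSet<&str>) -> Result<(), String> {
    match expr {
        Expr::Int(_) => Ok(()),
        Expr::Var(name) if declared.contains(name.as_str()) => Ok(()),
        Expr::Var(name) => Err(format!("undeclared variable '{name}'")),
        Expr::Add(lhs, rhs) => {
            check(lhs, declared)?;
            check(rhs, declared)
        }
    }
}

fn main() {
    let declared = HashSet::from(["x"]);
    for source in ["x + 1", "x + + 1", "x + y"] {
        let result = lex(source)
            .and_then(|tokens| parse(&tokens))
            .and_then(|ast| check(&ast, &declared));
        println!("{source:?} -> {result:?}");
    }
}
```

With only `x` declared, `x + 1` passes all three phases, `x + + 1` lexes fine but is rejected by the parser, and `x + y` is grammatically valid yet fails in the semantic phase because `y` was never declared.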

Figure: Brief diagram of the compiler front-end

I'll go into more detail about each phase in its own blog post.
