JairusSW

Part 01: Why should we understand compilers?

"Knowledge is not power, it is potential. Only when one applies that knowledge, it is power."

As developers, we use compilers every. single. day.
The limitation is that we are usually taught to read and write programming languages without gaining a deep understanding of how our code actually runs. This series of articles will guide you through the inner workings of the compilers we use so that we can become more proficient developers.

So, let's dive in and code like a wizard. 🧙‍♂️

What even is a compiler?

You (the reader) probably already know what a compiler is, but for newer developers, I'll explain it in one sentence.

"A compiler is a tool that translates human-readable code into machine-readable code."

This definition shows us three things about ourselves.

  1. Knowing a language barely scratches the surface
  2. Comprehending the compiler broadens our insight
  3. Superior knowledge can help us innovate our understanding

So what next?

In this series, we will be writing our own compiler from the ground up as a learning experience. On the road to that destination, we'll learn the basics of low-level operations, get to know the parts of a compiler, and gain a broader understanding of other compilers.

We will:

  • Understand how compilers are written
  • Write our own syntax and parser
  • Create an Abstract Syntax Tree
  • Achieve "Hello World" in our compiler

So, buckle up and get ready to take this journey with me. Let's dive into the world of compilers together as we become stronger developers. 🪄

Part 01: Understanding the compiler

The most basic compilers are created in stages.

  1. Lexical Analysis - The compiler reads the source code character by character and groups the characters into tokens: keywords such as import, fn, and mut, plus identifiers, numbers, and symbols.
  2. Syntax Analysis - Also known as parsing. The compiler checks that the tokens conform to the language's syntax specification and outputs an Abstract Syntax Tree (AST), a tree-shaped representation of the code's structure.
  3. Semantic Analysis - This stage makes sure that the AST actually makes sense. It may check types and verify that variables are properly declared. Some compilers also detect dead code here, although eliminating it is usually treated as an optimization.
  4. Code Generation - The compiler translates the AST into the syntax of another language (usually a lower-level one). That output will eventually be compiled down further (ASM, x86, HLC...) by similar compilers until it becomes binary the machine can run.
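
To make the first stage a little more concrete, here is a minimal sketch of a lexer written in Python. The token names and rules (KEYWORD, IDENT, ARROW, and so on) are my own assumptions for illustration and are not Zep's real grammar.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical rules for illustration only; not Zep's actual grammar.
KEYWORDS = {"fn", "rt", "import", "mut"}
TOKEN_SPEC = [
    ("ARROW", r"->"),                       # must come before SYMBOL so "->" stays whole
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("SYMBOL", r"[{}():,+\-*/=]"),
    ("SKIP", r"\s+"),                       # whitespace is matched, then thrown away
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> List[Token]:
    """Walk the source left to right and group characters into tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # identifiers that are reserved words become keywords
        tokens.append(Token(kind, text))
    return tokens


source = """fn add(a: i32, b: i32) -> i32 {
    rt a + b
}"""
tokens = tokenize(source)
print([t.text for t in tokens[:4]])  # ['fn', 'add', '(', 'a']
```

The parser in stage 2 would then consume this flat list of tokens instead of raw characters, which keeps each stage small and focused.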

A compiler may take some source code like this (written in Zep, my language):

fn add(a: i32, b: i32) -> i32 {
    rt a + b
}

And translate it into the corresponding WAT (WebAssembly Text Format, the human-readable representation of WebAssembly):

(module
    (func (export "add")
        (param $a i32) (param $b i32)
        (result i32)
        local.get $a
        local.get $b
        i32.add
    )
)
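
As a rough sketch of the code generation stage, here is how a hand-built AST for the add function could be turned into that WAT text in Python. The node classes (Var, BinaryOp, Param, Function) and the helper names are hypothetical and chosen for illustration; this is not how Zep is actually implemented, and it assumes every operand has the same type as the function's return type.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical AST node types for illustration; not Zep's real AST.
@dataclass
class Var:
    name: str


@dataclass
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Var, BinaryOp]


@dataclass
class Param:
    name: str
    type: str


@dataclass
class Function:
    name: str
    params: List[Param]
    return_type: str
    body: Expr  # the single expression the function returns


WAT_OPS = {"+": "add", "-": "sub", "*": "mul"}


def emit_expr(expr: Expr, type_: str) -> List[str]:
    """Emit stack-machine instructions: operands first, then the operator."""
    if isinstance(expr, Var):
        return [f"local.get ${expr.name}"]
    if isinstance(expr, BinaryOp):
        # Simplifying assumption: both operands share the type `type_`.
        return (emit_expr(expr.left, type_)
                + emit_expr(expr.right, type_)
                + [f"{type_}.{WAT_OPS[expr.op]}"])
    raise TypeError(f"unknown AST node: {expr!r}")


def emit_function(fn: Function) -> str:
    """Wrap one exported function in a WAT module."""
    params = " ".join(f"(param ${p.name} {p.type})" for p in fn.params)
    lines = [
        "(module",
        f'    (func (export "{fn.name}")',
        f"        {params}",
        f"        (result {fn.return_type})",
    ]
    lines += [f"        {instr}" for instr in emit_expr(fn.body, fn.return_type)]
    lines += ["    )", ")"]
    return "\n".join(lines)


add_fn = Function(
    name="add",
    params=[Param("a", "i32"), Param("b", "i32")],
    return_type="i32",
    body=BinaryOp("+", Var("a"), Var("b")),
)
wat = emit_function(add_fn)
print(wat)
```

Notice that the generator visits the left operand, then the right operand, and only then emits the operator. That ordering is what a stack-based target like WebAssembly expects.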
Delving deeper

All computers utilize hardware and software to function. Your processor hardware understands a language (Machine Code) that is hard for us to grasp as humans. Instead, we code in high-level "languages" that eventually become Machine Code.

Stage 01 (8,000m Mt. Everest): Too-High Level Language
Too-High Level Languages include block coding and other drag-and-drop style "languages". The best example here is Scratch. This is understood by young children.

Stage 02 (0m Sea Level): High Level Language
High Level Languages, abbreviated HLL, are understandable by humans. Some examples are Rust, C++, and JavaScript. (WAT is human-readable too, though it sits much closer to the assembly level.) This is understood by most developers.

Stage 03 (-300m deep mineshafts): Assembly Languages
Only the brave adventurers delve this deep. Assembly is neither an HLL nor actual Machine Code, but resides in the no man's land between the two. It is not commonly written by hand. Low-level developers or those interested in esoteric languages (Brainf***) understand this.

Stage 04 (-65536m Edge of Hell): Machine Code
Almost nobody goes this deep, yet it is still important to understand. Machine Code represents the actual operations performed by the CPU and is understood by the processor (as well as Minecraft redstoners who design their own computers, like @mattbattwings). Whether a language is compiled ahead of time or run through an interpreter, its programs ultimately execute as Machine Code on the machine.
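
To see how thin the line between Stage 03 and Stage 04 is, here is a tiny sketch that "assembles" one x86 instruction into its raw bytes. It only handles a 32-bit register-to-register ADD with four registers, and the function name assemble_add is my own invention for illustration. A real assembler covers far more cases.

```python
# Register numbers as they appear in the x86 ModRM byte (small subset only).
REGISTERS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}

ADD_RM32_R32 = 0x01  # x86 opcode for "ADD r/m32, r32"


def assemble_add(dest: str, src: str) -> bytes:
    """Encode `add dest, src` for two 32-bit registers (hypothetical helper)."""
    # ModRM layout: mod (2 bits) | reg (3 bits) | rm (3 bits).
    # mod = 0b11 means both operands are registers; reg holds the source
    # register and rm holds the destination for this opcode.
    modrm = (0b11 << 6) | (REGISTERS[src] << 3) | REGISTERS[dest]
    return bytes([ADD_RM32_R32, modrm])


code = assemble_add("eax", "ebx")
print(code.hex(" "))  # 01 d8
```

The Assembly line `add eax, ebx` and the two bytes `01 d8` describe the same operation. Assembly just gives the bytes readable names.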

Outro

That's all for now. In future articles, we will begin to build our own compiler and eventually achieve "Hello World".

If you would like to see more articles in this series, feel free to comment below, provide suggestions for future articles, and discuss questions. As always, thanks for reading! ✌️
