Notes:
- This article is less of an instructional one, like previous articles, and more of a question + fun narration of my struggles with compilers so far. More instructional articles will come later on.
- This is about compilers only, not about Interpreters. I have no idea how interpreters work yet.
The Background
Out of all the classes one takes as a CS student in university, some useless and some not so much, it might come to you as a surprise that I actually enjoyed the classes about programming languages and system software - it was certainly a surprise to me because I had no idea I had an interest in them!
Particularly, I was interested in learning how languages worked. All I knew beforehand is that programming languages are translated to machine code so that the computer understands what to do, because the computer doesn't understand JSFuck. However, I had no idea how we go from human-readable programming language to machine-language, I thought of it for a while as translating Spanish to English. After I learned that was not the case, I simply thought of it being a black box of computer science magic and didn't push the subject.
The Systems Software class at my university taught us a bit more about what was inside the black-box, they had us build a compiler for a simplified version of language called PL/0. It was the second time I took the class when it actually clicked for me and started being interesting, I can thank the professor for that. It was in said class where I learned about the steps of compiling a language (parser, analyzer, all that jazz) what an Abstract Syntax Tree (AST) was and how it was an intermediary step of compiling and a very powerful tool when understanding how a language works, and how a computer can understand it.
All that backstory to bring you to this Tweet:
Which came right at a time when I wanted to delve deeper into developer experience tools. At the same time, as if by divine timing, my brother shared with me this repo about learning to build your own anything:
codecrafters-io / build-your-own-x
Master programming by recreating your favorite technologies from scratch.
Build your own <insert-technology-here>
This repository is a compilation of well-written, step-by-step guides for re-creating our favorite technologies from scratch.
What I cannot create, I do not understand — Richard Feynman.
It's a great way to learn.
- 3D Renderer
- Augmented Reality
- BitTorrent Client
- Blockchain / Cryptocurrency
- Bot
- Command-Line Tool
- Database
- Docker
- Emulator / Virtual Machine
- Front-end Framework / Library
- Game
- Git
- Network Stack
- Neural Network
- Operating System
- Physics Engine
- Programming Language
- Regex Engine
- Search Engine
- Shell
- Template Engine
- Text Editor
- Visual Recognition System
- Voxel Engine
- Web Browser
- Web Server
- Uncategorized
Tutorials
Build your own 3D Renderer
- C++: Introduction to Ray Tracing: a Simple Method for Creating 3D Images
- C++: How OpenGL works: software rendering in 500 lines of code
- C++: Raycasting engine of Wolfenstein 3D
- C++: Physically Based Rendering:From Theory To Implementation
- C++: Ray Tracing in One Weekend
- C++: Rasterization: a Practical Implementation
- C#…
Which links to this repo about building your own compiler in JavaScript:
jamiebuilds / the-super-tiny-compiler
⛄ Possibly the smallest compiler ever
Welcome to The Super Tiny Compiler!
This is an ultra-simplified example of all the major pieces of a modern compiler written in easy to read JavaScript.
Reading through the guided code will help you learn about how most compilers work from end to end.
Why should I care?
That's fair, most people don't really have to think about compilers in their day jobs. However, compilers are all around you, tons of the tools you use are based on concepts borrowed from compilers.
But compilers are scary!
Yes, they are. But that's our fault (the people who write compilers), we've taken something that is reasonably straightforward and made it so scary that most think of it as this totally unapproachable thing that only the nerdiest of the nerds are able to understand.
Okay so where
…Which came from that fantastic article by @mattsparks:
🧠 Learn! 5 Github Repos With Great Free & Open Source Resources
Matt Sparks ・ May 29 '19
The Current Situation
So, I started going over the Super Tiny Compiler to understand how it works, and converting it to TypeScript to prove to myself that I understand what is happening in the different stages of the compiler.
I have been doing just that in a repository is a fork of the previously mentioned one, and you can find it here:
SammyIsra / the-super-tiny-compiler
⛄ Possibly the smallest compiler ever
My first attempt (out of school) to learn how compilers work, and what tf an AST is
This version will be in TypeScript mostly because that is a good way for me to properly understand how it works
Original README below this line
Welcome to The Super Tiny Compiler!
This is an ultra-simplified example of all the major pieces of a modern compiler written in easy to read JavaScript.
Reading through the guided code will help you learn about how most compilers work from end to end.
Why should I care?
That's fair, most people don't really have to think about compilers in their day jobs. However, compilers are all around you, tons of the tools you use are based on concepts borrowed from compilers.
But compilers are scary!
Yes, they are. But that's…
The first two steps were not so bad:
- The first step (the Tokenizer) turns the input string in the starting language to a series of tokens, each token identified as a Parenthesis, Number, String, or Name.
- The second step (the Parser) navigates the token list and creates a simple AST by analyzing the token list and determining what kind of node each token is and if the following ones are children nodes.
I'll go into a deeper discussion on the matter in more instructional articles, and they will come with drawings since that is how I first understood these topics.
The Blocker
Steps #3 and #4 (the Traverser and the Transformer) is where I am currently stuck at. These steps seem to live together in the sense that they perform the next step in the compiling process.
However, I am at a loss when trying to understand how we get the simple AST from step 2 and turn it into the more complicated one that the Code Generation step can use to create the compiler output, and how to turn that knowledge into the Types created by said steps.
The way I understand this section of the compiler is that the Transformer takes in the Traverser and the Simple AST, creates a "visitor" which dictates what happens when finding a node of a specific type in the Simple AST, and it outputs the New AST (as is referred to in the code), which is the input the Code Generation step uses to create the output code.
So this is where I lay my question and my request for help, the Traverser and the Transformer steps of the compiler are not the clearest subjects for me, so if you have any knowledge about them or if you have a similar experience to mine, I'd love to hear about it.
Quick note. After writing this article, I have since found this video from the same author: https://www.youtube.com/watch?v=Tar4WgAfMr4
Top comments (0)