Crafting IDE-Ready Compilers

#rust #compilers #series #languagedesign

This is the introductory post of a series, so normally you would find a motivational section here. Why should you learn how to build a compiler? I'm not going to bother with that; Crafting Interpreters did it better than I could, anyway. The short of it is, well, writing compilers is fun, and covers a lot of techniques for solving problems.

We're going to start by implementing an interpreter for the Tiny-C language. Tiny-C is a significantly stripped down C-style language where there is only one data type, no variable declarations, no functions, the only control flow options are if and while, and the only expressions are +, -, and <. Because of this, the grammar is simple enough I can include it here inline:

Program = Statement;
Statement =
  | If:{ "if" cond:ParenExpr then:Statement else:{ "else" then:Statement } }
  | While:{ "while" cond:ParenExpr then:Statement }
  | Block:{ "{" then:Statement* "}" }
  | Expr:{ then:Statement? ";" }
  ;
ParenExpr = "(" Expr ")";
Expr =
  | Assign:{ id:Id "=" val:Expr }
  | Test:{ lhs:Expr "<" rhs:Expr }
  | Sum:{ lhs:Expr "+" rhs:Term }
  | Diff:{ lhs:Expr "-" rhs:Term }
  ;
Term =
  | Id:{ 'a'..='z'+ }
  | Int:{ '0'..='9'+ }
  | Expr:ParenExpr
  ;

(Note that this is slightly modified from the original Tiny-C to better fit how we are going to build the compiler.) The simplicity of the language allows us to focus on the structure of the compiler rather than the intricacies of the language. If you don't fully understand the above grammar, don't worry: we'll be covering it in more detail in the future.

Additionally, we'll be writing the compiler in a fully IDE-ready style. This means our compiler will also serve as the intelligence engine powering a language server protocol server. Though our language is simple, we want to write the compiler as if it were a full-fat compiler for a full-fat language. By doing so, we can transfer what we learn here to real languages' intelligence engines.

Because we're trying to build a real compiler, we'll be taking full advantage of the available tooling for building compilers in Rust. Rust's ownership semantics make the compartmentalization and complex data flows through a compiler easier to see. For libraries, we'll be using at least rowan, salsa, and lsp-server.

With any luck, once we've implemented the interpreter for Tiny-C, we'll continue on to write a backend with Cranelift, and maybe even continue on to designing and writing a full language 👀

I plan to have a post in the series out at least every other Friday. Next time: we set up the repository structure and lex the language into its component tokens. We'll also be getting our first venture into code generation along the way.

Discuss this below! | on Reddit! | on Twitter! | on Discord! | Watch me live!

Top comments (2)

Jonathan Apodaca • Sep 28 '19

Oh man, am I looking forward to this series! Especially the LSP portion. Do you plan to hand-write the parser?

Christopher Durham • Sep 28 '19

I do plan on hand-writing the parser, as for the kind of error-tolerance, you want in an IDE environment, there's no real great modern parser generator.

That's the big reason for implementing such a simple language as Tiny-C: it's simple enough to do everything by hand and understand how everything works.

DEV Community

Crafting IDE-Ready Compilers

Related Reading

Top comments (2)

Read next

5. Mocking usePathName, useSearchParams and useRouter with Jest in Next 15

Terraform stories.

LeetCode Meditations: Number of 1 Bits

My Journey to Creating a Supportive Community for Freelancers & Small Business Owners - #1: The Vision