When you hear Abstract syntax trees, what is the first thought that occurs in your mind?
Something to do with compilers? Some complex tree manipulation? Bit manipulations? ๐ค
At the beginning of my career, this AST seemed like a complex term with low level compilers and transpilers magic sprinkled in it.
๐ก Motivation
The motivation behind writing this blog is to make it easy for everyone to understand what Abstract syntax trees are and how they play an important part in most of the tools we use on a daily basis.
Be it babel, webpack, parcel, eslint, codemods, css parsers, css in js - all of these tools use the magic of AST's to manipulate our code and transform it into something else.
In this post, we will unravel this magic and in the process learn to write some super simple babel plugins. Yeah ๐
๐ค What is an AST?
Like every new concept, we will start with a concrete definition.
According to wikipedia
An abstract syntax tree is a tree representation of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code.
To understand this, imagine we write a simple line of code in our editor
const a = 5 + 3;
This is a very simple variable assignment and addition of two numbers.
This simple operation goes through a process of Tokenization
and Parsing
.
๐น๏ธ Tokenization (Lexical Analysis)
Tokenization or Lexical analysis is the step where a function reads the code as a string and splits it into a list of tokens.
For the sake of simplicity, let's assume every token has the following interface
interface Token {
type: string,
value: string
}
Our code goes through process of lexical analysis and gets broken down into tokens.
๐งต Parsing (Syntactic Analysis)
Post lexical analysis gives us an array of tokens, we pass that through an AST parser(babylon or acorn or espree) which converts it into a tree of AST nodes by establishing dependencies between them.
The super simple code that we wrote gets converted into a tree of nodes which we call an Abstract syntax tree.
And that entire tree is represented as a json in the following manner
{
"type": "VariableDeclaration",
"declarations": [{
"type": "VariableDeclarator",
"id": {
"type": "Identifier",
"name": "a"
},
"init": {
"type": "BinaryExpression",
"left": {
"type": "Literal",
"value": 5
},
"operator": "+",
"right": {
"type": "Literal",
"value": 3
}
}
}],
"kind": "const"
}
In this json object we notice a param named type
. We call them AST node types.
Multiple types of AST nodes exist and for babel we can refer to the following
Babel AST Node Types
For espree parser(the one eslint uses) we can refer here
Eslint AST Node Types
Babel, webpack, parcel and all of these tools use a common approach.
They first convert our code to an AST tree, then apply some transformation on it(add, edit, update, delete), create a new tree out of those transformations and then convert it back to human readable code.
To understand what will the AST tree representation of a particular line of code look like, I would recommend to always check AST Explorer
Now, without wasting any further time, we will write our very first babel plugin.
This plugin will remove any debugger statements we might have in our code.
๐ Babel Plugin - Remove Debugger
Consider having following code in multiple locations of your repo
function test() {
const a = 5;
debugger;
const b = 6;
}
It is quite obvious that we would not like this debugger statements to end up in our production app.
(Note: In a real world app, we would have some bundler or some deployment pipeline step which can help us avoid such mistakes, but for the sake of this example let's assume we do not have any such deployment pipeline).
So we write a babel plugin to do the same for us.
Writing a babel plugin
Step 1: Identify the AST node type we want to target. If we go to AST Explorer and click on line 2, we will notice that the node type gets highlighted in yellow and it shows us that the AST node we have to target is DebuggerStatement
.
Step 2: Fire up your editor and create new file. Let's name it removeDebugger.js
. This will be our plugin file.
Every babel plugin we write from now on will follow a common pattern
module.exports = function(babel) {
return {
name: "remove-debugger-plugin", // this is optional
visitor: {
}
};
};
We are returning an object with another nested object with the key visitor
in it. It is named visitor because of the visitor pattern.
Step 3: We know the node type that we wish to target is DebuggerStatement
So our code will look like this now
module.exports = function(babel) {
return {
name: "remove-debugger-plugin", // this is optional
visitor: {
DebuggerStatement: function(path) {
}
}
};
};
Every node that we wish to target has to be a key inside our visitor object.
Step 4: Now the only step remaining in this babel plugin is to remove debugger statement node and we do it like this:
module.exports = function(babel) {
return {
name: "remove-debugger-plugin", // this is optional
visitor: {
DebuggerStatement: function(path) {
path.remove();
}
}
};
};
And that my friends was our first babel plugin.
This babel plugin explains us how to manipulate an AST by removing a node from it.
The next plugin we will learn will explain us how to edit an existing node and convert it into something else
๐ Babel Plugin - Alert To Console
So in this we will convert every alert
statement into a console.warn
statement.
So our code we wish to change would look something like this
function test() {
const a = 5;
alert(a);
}
We will convert this to
function test() {
const a = 5;
console.warn(a);
}
Step 1: Identify the AST node type we want to target. Going to AST explorer and copy paste our from
code and click on alert
. It will highlight the node type on the right. We see that the node type to target now is called CallExpression
.
So any function call is a CallExpression
and any function call on an object is called MemberExpression
. So alert
is CallExpression
and console.warn
is MemberExpression
.
A MemberExpression always will have an Object(console) and a property (warn) in our case.
Step 2: Once again fire up your editor and create new file. Let's name it convertAlertToConsole.js
.
Like before we start our plugin with the skeleton code
module.exports = function(babel) {
const t = babel.types;
return {
name: "convert-alert-to-console", // this is optional
visitor: {
}
};
};
Step 3: So now since we know that the node we have to target is a CallExpression
, lets write our code
module.exports = function(babel) {
const t = babel.types;
return {
name: "convert-alert-to-console", // this is optional
visitor: {
CallExpression: function(path)
}
}
};
};
Step 4: Since we do not wish to target every other function call, let us put a if condition to specify that we only wish to target a call expression with the name alert
module.exports = function(babel) {
const t = babel.types;
return {
name: "convert-alert-to-console", // this is optional
visitor: {
CallExpression: function(path) {
if (path.node.callee.name === "alert") {
}
}
}
};
};
Now the only part remaining is figuring out what to replace it with.
Step 5: We go back to AST explorer and copy our to
code this time and clicking on console.warn will tell us that we need to replace it with another call expression as all function calls are call expressions but since this is an object property function call
that is why it needs a call expression with a member expression inside it as it's callee.
module.exports = function(babel) {
const t = babel.types;
return {
name: "convert-alert-to-console", // this is optional
visitor: {
CallExpression: function(path) {
if (path.node.callee.name === "alert") {
const args = path.node.arguments;
path.replaceWith(
t.callExpression(
t.memberExpression(t.identifier("console"), t.identifier("warn")),
args
)
);
}
}
}
};
};
And that's it. We wrote our second plugin too ๐ฅณ. Isn't all of this too easy? ๐ค
๐ Bonus Plugin - Remove data-test-id from react app
In this we will remove every data-test-id
attribute from our react app. As data-test-id
is usually needed only in the dev env, our plugin can safely remove this from our production bundle.
So our code we wish to change would look something like this
import React from "react";
const App = () => {
return (
<div data-test-id="test">Hello World</div>
);
}
We will convert this to
import React from "react";
const App = () => {
return <div>Hello World</div>;
};
Step 1: Identify the AST node type we want to target. Going to AST explorer and copy paste our from
code and click on data-test-id
. It will highlight the node type on the right. We see that the node type to target now is called JSXAttribute
.
Step 2: Fire up your editor and create new file. Let's name it removeDataAttribute.js
.
Like before we start our plugin with the skeleton code
module.exports = function(babel) {
return {
name: "remove-date-test-id", // this is optional
visitor: {
}
};
};
Step 3: So now since we know that the node we have to target is a JSXAttribute
, lets write our code
So our code will look like this now
module.exports = function(babel) {
return {
name: "remove-date-test-id", // this is optional
visitor: {
JSXAttribute: function(path) {
}
}
};
};
Step 4: Now the only step remaining in this babel plugin is to remove this jsx attribute node and we do it like this:
module.exports = function(babel) {
return {
name: "remove-date-test-id", // this is optional
visitor: {
JSXAttribute: function(path) {
if(path.node.name.name === "data-test-id") {
path.remove();
}
}
}
};
};
And that's it. We wrote another plugin ๐ฅณ.
Github Repo: https://github.com/vivek12345/webcamp-zagreb-demo
๐ฌ Conclusion
I hope this helps us all in understanding that AST's are not complicated and all of us can improve our developer tools ecosystem by either making linter plugins or writing babel plugins or doing large scale refactors using codemods. And if you are in the mood, you could also write a css in js library.
๐ References
- Leveling up parsing game by Vaidehi Joshi
- AST explorer by Felix Kling
- Babel handbook
- Step by step guide to write a babel transformation
- Magical land of AST's
Top comments (5)
Hi Vivek,
Very interesting read.
I was aware of AST but the examples in the article made it simple to understand.
Do you have a working demo of those babel plugins?
You have provided some reference but still am asking.
Hi Akash,
You can find the code here
github.com/vivek12345/webcamp-zagr...
Thank you
Thanks Vivek.
I will give the demo a try.
sooo good! thx
I will share the greatness of your post with Koreans.
itchallenger.tistory.com/709
Such beautifil diagrams :) What tool do you use to "draw" them ?