Eugene Chernyshov

Analyzing AST in Go with JSON tools

Many specific tasks, once automated, can significantly improve the ongoing maintenance of a big project. Some of them require building tools that can analyze or change the source code written by developers.

For example, such tools could:

  • gather metadata from comments,
  • gather strings that need to be translated,
  • understand the structure of the code to calculate complexity metrics or build explanatory diagrams,
  • or even apply automatic code optimizations and refactoring patterns.

Solving such tasks seems to lead us to the complicated topics of compilers and parsers. But in 2022 every modern programming language comes with batteries included: the structure of the code, in the form of an AST that is ready to be searched and manipulated, is available through a built-in library. Basically, parsing source files and searching for specific things is not much harder than doing the same for JSON or XML.

In this article, we will cover AST analysis in Go.

Existing approaches

Go ships with the standard package go/ast, which defines the structs of AST nodes; its companion packages go/parser and go/token parse source files into that tree. Writing such a tool is quite easy and straightforward for experienced Go developers. There is also the go/printer package, which can convert an AST back into source code.
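As a minimal sketch of this standard-library route, the snippet below parses a small source string, collects its string literals (the "strings that need to be translated" case), and prints the AST back as source. The helper names parse, stringLiterals, and render are made up for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
)

const src = `package main

import "fmt"

func main() {
	fmt.Println("hello world")
}
`

// parse turns Go source text into an AST plus the file set that holds positions.
func parse(source string) (*token.FileSet, *ast.File, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", source, parser.ParseComments)
	return fset, file, err
}

// stringLiterals returns every string literal in the source, skipping import paths.
func stringLiterals(source string) ([]string, error) {
	_, file, err := parse(source)
	if err != nil {
		return nil, err
	}
	var out []string
	ast.Inspect(file, func(n ast.Node) bool {
		switch node := n.(type) {
		case *ast.ImportSpec:
			return false // do not descend into import paths
		case *ast.BasicLit:
			if node.Kind == token.STRING {
				out = append(out, node.Value) // Value keeps the surrounding quotes
			}
		}
		return true
	})
	return out, nil
}

// render parses the source and converts the AST back into Go code.
func render(source string) (string, error) {
	fset, file, err := parse(source)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := printer.Fprint(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	lits, err := stringLiterals(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(lits)

	code, err := render(src)
	if err != nil {
		panic(err)
	}
	fmt.Print(code)
}
```

Note that the search logic has to be expressed as a type switch over Go node types, which is exactly where knowledge of the AST structure becomes necessary.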

Here is a list of articles describing how to manipulate AST in golang:

One caveat: you need to know the structure of the Go AST. When I first dove into the topic, my problem was understanding how nodes are combined and figuring out what exactly I needed to search for in terms of node structure. Of course, you can print the AST using the built-in capabilities (ast.Print), but the output comes in a rather unusual format:

   0  *ast.File {
   1  .  Package: 1:1
   2  .  Name: *ast.Ident {
   3  .  .  NamePos: 1:9
   4  .  .  Name: "main"
   5  .  }
   6  .  Decls: []ast.Decl (len = 2) {
   7  .  .  0: *ast.GenDecl {
   8  .  .  .  TokPos: 3:1
   9  .  .  .  Tok: import
  10  .  .  .  Lparen: 3:8
....

Also, since the format is so specific, you can’t use any tools to navigate it other than text search. Tools like goast-viewer can help with this, but their capabilities are limited.
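For reference, a dump in the format shown above can be produced with the standard library alone. This is a small sketch; the dump helper is a name chosen for this example, and it uses ast.Fprint with ast.NotNilFilter, which is what ast.Print does when writing to standard output.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// dump parses Go source and returns the textual AST dump.
// An empty file name makes positions print as "line:column".
func dump(source string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "", source, 0)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	// NotNilFilter omits fields that are nil, keeping the dump shorter.
	if err := ast.Fprint(&buf, fset, file, ast.NotNilFilter); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := dump("package main\n\nimport \"fmt\"\n\nfunc main() { fmt.Println(\"hello world\") }\n")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```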

goast-viewer example

Proposed solution

I started thinking about a library that would convert the AST into a very conventional format like JSON. JSON is easy to manipulate, and many tools (like jq) and approaches exist to search and modify it.

So, what I ended up with is asty.

Asty is a small library written in Go that parses source code and presents it as a JSON structure. Moreover, it can also do the reverse conversion. This means that you can now manipulate Go code with a tool or algorithm developed in any programming language.

You can use it as a Go package, as a standalone executable, or even as a Docker container. Try this page to experiment with asty in WebAssembly.

Example Go code:

package main

import "fmt"

func main() {
    fmt.Println("hello world")
}

Example JSON output:

{
  "NodeType": "File",
  "Name": {
    "NodeType": "Ident",
    "Name": "main"
  },
  "Decls": [
    {
      "NodeType": "GenDecl",
      "Tok": "import",
      "Specs": [
        {
          "NodeType": "ImportSpec",
          "Name": null,
          "Path": {
            "NodeType": "BasicLit",
            "Kind": "STRING",
            "Value": "\"fmt\""
          }
        }
      ]
    },
    {
      "NodeType": "FuncDecl",
      "Recv": null,
      "Name": {
        "NodeType": "Ident",
        "Name": "main"
      },
      "Type": {
        "NodeType": "FuncType",
        "TypeParams": null,
        "Params": {
          "NodeType": "FieldList",
          "List": null
        },
        "Results": null
      },
      "Body": {
        "NodeType": "BlockStmt",
        "List": [
          {
            "NodeType": "ExprStmt",
            "X": {
              "NodeType": "CallExpr",
              "Fun": {
                "NodeType": "SelectorExpr",
                "X": {
                  "NodeType": "Ident",
                  "Name": "fmt"
                },
                "Sel": {
                  "NodeType": "Ident",
                  "Name": "Println"
                }
              },
              "Args": [
                {
                  "NodeType": "BasicLit",
                  "Kind": "STRING",
                  "Value": "\"hello world\""
                }
              ]
            }
          }
        ]
      }
    }
  ]
}

asty can also output comments, positions of tokens in the original source, and reference IDs. In some places, the Go AST is not actually a tree but rather a DAG, so several nodes in the JSON may carry the same ID.
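Once the AST is plain JSON, searching it needs no knowledge of Go's node types. The sketch below decodes a trimmed fragment of the JSON output above into generic maps and slices and collects the Value of every node whose NodeType is BasicLit and whose Kind is STRING. It relies only on encoding/json, not on asty itself; the function names and the fragment constant are assumptions of this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// fragment is a trimmed piece of the JSON output shown above.
const fragment = `{
  "NodeType": "ExprStmt",
  "X": {
    "NodeType": "CallExpr",
    "Fun": {
      "NodeType": "SelectorExpr",
      "X": {"NodeType": "Ident", "Name": "fmt"},
      "Sel": {"NodeType": "Ident", "Name": "Println"}
    },
    "Args": [
      {"NodeType": "BasicLit", "Kind": "STRING", "Value": "\"hello world\""}
    ]
  }
}`

// walk visits every object and array in decoded JSON and appends the Value
// of each string literal node to out.
func walk(v any, out *[]string) {
	switch node := v.(type) {
	case map[string]any:
		if node["NodeType"] == "BasicLit" && node["Kind"] == "STRING" {
			if s, ok := node["Value"].(string); ok {
				*out = append(*out, s)
			}
		}
		for _, child := range node {
			walk(child, out)
		}
	case []any:
		for _, child := range node {
			walk(child, out)
		}
	}
}

// stringLitsFromJSON decodes AST JSON and returns its string literals, sorted,
// because Go's map iteration order is not specified.
func stringLitsFromJSON(data string) ([]string, error) {
	var root any
	if err := json.Unmarshal([]byte(data), &root); err != nil {
		return nil, err
	}
	var out []string
	walk(root, &out)
	sort.Strings(out)
	return out, nil
}

func main() {
	lits, err := stringLitsFromJSON(fragment)
	if err != nil {
		panic(err)
	}
	fmt.Println(lits)
}
```

The same traversal could be written in any language with a JSON library, or replaced by a jq query, which is the main point of the JSON representation.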

Development principles and constraints

While developing asty I tried to follow some rules:

  • Make the JSON output as close to the real Go structures as possible. No additional logic is introduced: no normalization, no reinterpretation. The only things introduced are the names of some enum values. Even the field names are preserved exactly as they exist in the go/ast package.
  • Make it very explicit. No reflection. No listing of fields. This is done to facilitate future maintenance: if something changes in future versions of Go, this code will probably break at compile time. Literally, asty contains two copies of each AST node struct to define marshaling and unmarshaling of JSON.
  • Keep polymorphism in the JSON structure. If a field references an expression, the particular type is determined from the object type name stored in a separate field, NodeType. This is tricky to achieve, so if you want something like it for other tasks I would recommend checking out this example: https://github.com/karaatanassov/go_polymorphic_json
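To illustrate the last rule, here is a minimal sketch of discriminator-based decoding with encoding/json: read NodeType first, then unmarshal into the matching concrete type. The Expr interface and the three node structs are simplified stand-ins defined for this example; this is not asty's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Expr is a simplified stand-in for ast.Expr.
type Expr interface{ exprNode() }

type Ident struct {
	Name string
}

type BasicLit struct {
	Kind  string
	Value string
}

type SelectorExpr struct {
	X   Expr // polymorphic field: needs the discriminator to decode
	Sel *Ident
}

func (*Ident) exprNode()        {}
func (*BasicLit) exprNode()     {}
func (*SelectorExpr) exprNode() {}

// decodeExpr peeks at NodeType and then decodes into the matching struct.
func decodeExpr(data []byte) (Expr, error) {
	var head struct{ NodeType string }
	if err := json.Unmarshal(data, &head); err != nil {
		return nil, err
	}
	switch head.NodeType {
	case "Ident":
		n := &Ident{}
		if err := json.Unmarshal(data, n); err != nil {
			return nil, err
		}
		return n, nil
	case "BasicLit":
		n := &BasicLit{}
		if err := json.Unmarshal(data, n); err != nil {
			return nil, err
		}
		return n, nil
	case "SelectorExpr":
		// Keep X as raw JSON so it can be decoded recursively by its own NodeType.
		var raw struct {
			X   json.RawMessage
			Sel *Ident
		}
		if err := json.Unmarshal(data, &raw); err != nil {
			return nil, err
		}
		x, err := decodeExpr(raw.X)
		if err != nil {
			return nil, err
		}
		return &SelectorExpr{X: x, Sel: raw.Sel}, nil
	default:
		return nil, fmt.Errorf("unknown NodeType %q", head.NodeType)
	}
}

func main() {
	e, err := decodeExpr([]byte(`{"NodeType":"SelectorExpr","X":{"NodeType":"Ident","Name":"fmt"},"Sel":{"NodeType":"Ident","Name":"Println"}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T\n", e)
}
```

An unknown NodeType is reported as an error rather than silently skipped, in keeping with the explicitness rule above.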

Other solutions

I looked hard for existing implementations of this approach. Unfortunately, there is not much to be found.

One project, goblin, which I tried to use for a while, is quite good and mature, but it lacks the backward conversion from JSON to AST. It also reinterprets some AST structures to (I guess) simplify them and make them more human-readable, which in my personal opinion is not a good thing. The main issue with it, however, is the lack of maintenance: it was developed a long time ago for version 1.16 and has not been updated since. You can, however, find a relatively up-to-date fork of it.

Another project, go2json, generates JSON from Go code. It also lacks the backward conversion and is poorly maintained. In addition, it is implemented as a standalone parser in JavaScript, which I think makes it very hard to maintain and to keep up with new features in Go.

Further work

I am looking to cooperate with other developers interested in building language tools. Meanwhile, you can check out another repository with examples where I experiment with AST JSON in Python.
