DEV Community

David MacEachern
David MacEachern

Posted on

How to query JSON data in the terminal

Today we will be using jq to work with JSON data, you should be able to get it here.

tl;dr

Return top-level keys from multiple json files in present working directory

jq -rs 'reduce .[] as $item ({}; . * $item) | keys[]' ./*.json
Enter fullscreen mode Exit fullscreen mode

Get the data

Go ahead and clone the assets if you want to work with this too.

git clone https://github.com/davidmaceachern/playground-query-json.git
Enter fullscreen mode Exit fullscreen mode

Navigate to where the JSON documents are.

cd playground-query-json/assets/requests
Enter fullscreen mode Exit fullscreen mode

Have a look at the JSON document names if you wish.

ls
Enter fullscreen mode Exit fullscreen mode

or

cat add_backend.json
Enter fullscreen mode Exit fullscreen mode

Prints the following as the contents.

{
  "id": "ID_TEST",
  "version": 0,
  "type": "PROXY",
  "data": {
    "type": "ADD_BACKEND",
    "data": {
      "app_id": "xxx",
      "backend_id": "xxx-0",
      "ip_address": "127.0.0.1",
      "port": 8080
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

The problem

Ok so FYI this is test data for a project I am working on.

I need to decide on which properties I need the application to generate. The goal is to reduce them down as much as possible so the user only provides the items which they need to control and the application handles the rest.

At this stage, each of these files can be considered 1 single dataset.

ls -1q *json | wc -l
Enter fullscreen mode Exit fullscreen mode

If we run the above command we should find there are 16 files that match a '.json' ending.

Combine all the documents into a single dataset

First we can print as we did using cat before but using jq instead:

jq . add_backend.json
Enter fullscreen mode Exit fullscreen mode

jq is the application we are invoking. The . represents the first argument jq needs which is a filter, or a thing we want to do to change our input, which right now is nothing. The third item is the filename which is our input.

Exactly the same but prettier:

{
  "id": "ID_TEST",
  "version": 0,
  "type": "PROXY",
  "data": {
    "type": "ADD_BACKEND",
    "data": {
      "app_id": "xxx",
      "backend_id": "xxx-0",
      "ip_address": "127.0.0.1",
      "port": 8080
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Let's combine two files and see what that looks like.

jq . add_backend.json remove_backend.json
Enter fullscreen mode Exit fullscreen mode

Which returns:

{
  "id": "ID_TEST",
  "version": 0,
  "type": "PROXY",
  "data": {
    "type": "ADD_BACKEND",
    "data": {
      "app_id": "xxx",
      "backend_id": "xxx-0",
      "ip_address": "127.0.0.1",
      "port": 8080
    }
  }
}
{
  "id": "ID_TEST",
  "version": 0,
  "type": "PROXY",
  "data": {
    "type": "REMOVE_BACKEND",
    "data": {
      "app_id": "xxx",
      "backend_id": "xxx-0",
      "ip_address": "127.0.0.1",
      "port": 8080
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Ok great but we want to merge all the files together, along with handling nested
structures:

jq -rs 'reduce .[] as $item ({}; . * $item)' ./*.json
Enter fullscreen mode Exit fullscreen mode

Which returns:

{
  "id": "ID_TEST",
  "version": 0,
  "type": "UPGRADE_MASTER",
  "data": {
    "type": "STATUS",
    "data": {
      "app_id": "xxx",
      "hostname": "yyy",
      "path_begin": "xxx",
      "fingerprint": "ab2618b674e15243fd02a5618c66509e4840ba60e7d64cebec84cdbfeceee0c5"
    },
    "path": "./config_dump.json"
  },
  "proxy_id": 0
}
Enter fullscreen mode Exit fullscreen mode

We can output that result to a file to use later if we wish.

jq -rs 'reduce .[] as $item ({}; . * $item)' ./*.json > ../aggregate.json
Enter fullscreen mode Exit fullscreen mode

For now though I only care about keys, so inside the jq filter we can add a pipe
which only returns the keys.

jq -rs 'reduce keys[] as $item ({}; . * $item) | keys[]' ./*.json
Enter fullscreen mode Exit fullscreen mode

Which prints the unique top-level keys:

data
id
proxy_id
type
version
Enter fullscreen mode Exit fullscreen mode

Great, now I might encounter more issues with the JSON request format further into development, but for now a quick look at these keys is exactly what I
needed.

So what?

Ok so, if we were actually writing a bash script this might be helpful, or using a language such as Python we can actually use jq queries inside our code using a package such as found here.

jq is a great way to work with JSON data in the terminal, picking up knowledge of a couple of filters can come in handy. I've enjoyed using it in the past to explore JSON data and it can be handy for helping to make decisions.

You can achieve similar things manually in VSCode, with JSON linting and copy/paste and find/replace but if it's a task you do over and over again it can be worthwhile to use tools like jq.

I'm writing a Javascript application, so I could have done this type of thing inside my unit tests, as Javascript is great when it comes to JSON, but I thought to share this approach today.

References

Top comments (0)