Richard Forshaw

Originally published at developdeploydeliver.com

Turbocharge Your AWS CLI Skills with JSON Parsing

I am a self-professed would-be keyboard wizard, brought up on amazing text-manipulation tools like grep, sed and awk. Text-based data manipulation can still be very fast and powerful and might save you dozens of lines of code. This post shows how a little knowledge of JSON-parsing tools can go a long way.

Tools

The tools used in this article are as follows:

  • AWS CLI
  • JMESPath parser
  • JQ
  • Bash

Because of this combination, some examples below may not map exactly to your particular use case or configuration, so please bear that in mind.

Basic JSON Formatting and Filtering

I looked at basic formatting and filtering in this post about essential AWS CLI skills. As a quick recap, you can use the --query command-line parameter to pass a string specifying a query expression on the JSON results, and these expressions can sometimes get quite complex.

If you want a bit more power in your querying, it is worth looking at the JQ tool, which also lets you process JSON structures. The good thing about using JQ is that it can operate on files as well as STDIN, so you can save your JSON output into a file and run JQ over it again and again. This will definitely be faster when writing your queries and may also save you some AWS processing cost.
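A minimal sketch of that workflow might look like this (stacks.json is just an example file name):

aws cloudformation describe-stacks > stacks.json   # call AWS once and save the JSON
jq '.Stacks | length' stacks.json                  # then iterate on your filters locally
jq '[.Stacks[].StackName]' stacks.json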

As a start, JQ is great for simply pretty-formatting JSON output from something like a lambda function, by just piping it into jq '.'. This isn't necessary with the AWS CLI because it already formats the JSON output.
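For example, assuming a Lambda function called my-function (a made-up name), you could pretty-print its response payload like this:

aws lambda invoke --function-name my-function response.json > /dev/null
jq '.' response.json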

AWS CLI examples

The simplest use of the CLI filter is to print a reduced amount of data so that it is more manageable. This uses JMESPath expressions to process the JSON data in the output.

Here are some basic examples. Note that the query expression is added using the --query argument. In this case we will use CloudFormation output:

aws cloudformation describe-stacks --query "<filter-goes-here>"

Use case: Show only stack ID, name and update time
Query:    "Stacks[*].[StackId, StackName, LastUpdatedTime]"

Use case: Show stack name/ID for stacks whose name contains 'foo'
Query:    "Stacks[?StackId.contains(@, 'foo')].[StackId, StackName]"

Use case: Show stack name/ID for stacks which have outputs exported and have been updated since Nov 2022
Query:    "Stacks[?Outputs && LastUpdatedTime>'2022-11'].[StackId, StackName, LastUpdatedTime]"

(Note that the final example only works because the date format in LastUpdatedTime can be compared as a string - more on that later.)
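Plugged into the full command, the last row looks like this:

aws cloudformation describe-stacks --query "Stacks[?Outputs && LastUpdatedTime>'2022-11'].[StackId, StackName, LastUpdatedTime]"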

JQ

If you want to do serious local processing of AWS (or any other) JSON output, you need to get familiar with JQ, an awesome tool which lets you process JSON structures, not only filtering them like the AWS CLI query does, but also restructuring them.

Here are a few basic ways to use it. Note that you can either pipe input into JQ or provide a filename which contains your JSON.

Use case: Simply pretty-print JSON output
Command:  jq '.'

Use case: Get all the sort keys from a dynamo query response (assuming they are strings)
Command:  jq '.Items[].SortKey.S'

Use case: Put the result into a list
Command:  jq '[.Items[].SortKey.S]'

Use case: Only print keys, not values
Command:  jq 'keys[]'
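As an example of the second form, if a DynamoDB query response has been saved to items.json (a made-up file name), these two commands do the same thing:

cat items.json | jq '[.Items[].SortKey.S]'
jq '[.Items[].SortKey.S]' items.json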

An Important Note

It should be stressed here that QUOTING IS IMPORTANT. You may have noticed above that for the AWS --query I used double-quotes to enclose the whole expression, and single-quotes when quoting literals within the expression. For JQ you use the opposite, as the manual makes clear. At least within the environment I am using (bash), not following this will lead to endless miserable debugging.
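To make the difference concrete, here is the same kind of filter written both ways (the 'dev' literal is just an example):

aws cloudformation describe-stacks --query "Stacks[?contains(StackName, 'dev')].StackName"
aws cloudformation describe-stacks | jq '[.Stacks[].StackName | select(contains("dev"))]'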

Comparing The Two

In general, the --query filters using JMESPath are a little more concise than their JQ alternatives, but in my opinion the sequential nature of JQ using pipes (|) is more readable than JMESPath. However, there are some other important differences to consider:

  • Usage: a --query expression works only with AWS CLI commands; JQ works with any output or file.
  • Output: --query only outputs filtered results; JQ can also restructure the data into new JSON.
  • Types: --query does not handle dates natively; JQ handles date conversions.
  • Scripting: a --query expression must be entered on the command line; a JQ expression can be stored in a file with comments.

Below are some common queries I've used, with the CLI query and the JQ query side-by-side. You should note that all the JQ expressions are wrapped in [], because by default JQ outputs a stream of values rather than a list. The AWS CLI query function does output a list, so the additional [] make the two outputs match. For these examples, I am using DynamoDB output which looks something like this:

{
    "Items": [
        {
            "PartitionKey": { "S": "blog/books/2023-01-drive-daniel-pink/" },
            "SortKey": { "S": "1675071762" },
            "SomeField": { "N": "54" },
            // Other Fields...
        },
        {
            "PartitionKey": { "S": "blog/books/2023-01-drive-daniel-pink/" },
            "SortKey": { "S": "1675071862" },
            "SomeField": { "N": "44" },
            // Other Fields...
        },
        // More items...
    ]
}
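To see why the extra [] matters, compare the two JQ forms against the sample data above (again assuming it has been saved to items.json):

jq '.Items[].SortKey.S' items.json     # a stream of values: "1675071762" "1675071862" ...
jq '[.Items[].SortKey.S]' items.json   # a single JSON list containing those values
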
Example:   Show all sort keys from dynamo output
CLI query: Items[*].SortKey.S
JQ:        [.Items[].SortKey.S]

Example:   Particular attributes from dynamo output
CLI query: Items[*].[SortKey.S, SomeField.N]
JQ:        [.Items[] | [.SortKey.S, .SomeField.N]]

Example:   Filter by field value
CLI query: Items[?SortKey.S>'1674000000'].SomeField.N
JQ:        [.Items[] | select(.SortKey.S>"1674000000").SomeField.N]

Example:   Filter on string prefix
CLI query: Items[?starts_with(SortKey.S, 'TEXT')].SomeField.N
JQ:        [.Items[] | select(.SortKey.S | startswith("TEXT")).SomeField.N]

If you want an example not using the data above, here is one you can run on your CloudFormation stacks right now. Each command (one JMESPath and one JQ) will show the last updated time of only your 'dev'-stage stacks:

bash-5.1$ aws cloudformation describe-stacks --query "Stacks[?contains(Tags[], {Key: 'STAGE', Value: 'dev'})].[StackName,LastUpdatedTime]"
bash-5.1$ aws cloudformation describe-stacks | jq '[.Stacks[] | select(.Tags[] | contains({Key: "STAGE", Value: "dev"})) | [.StackName,.LastUpdatedTime]]'
[
    [
        "my-sls-stack-dev",
        "2023-02-03T07:56:49.848Z"
    ],
    [
        "my-www-stack-dev",
        "2022-12-20T11:07:21.155Z"
    ]
]

Getting more complex

Let's do some sorting. Yes, both tools can do that, and they have many other functions built in!

Example:   Sort output numerically by field values*
CLI query: sort_by(Items[*], &to_number(SomeField.N))[*].[SortKey.S, SomeField.N]
JQ:        [.Items[] | [.SortKey.S, .SomeField.N]] | sort_by(.[1] | tonumber)

Example:   Sum fields (e.g. get the total page access time)
CLI query: sum(map(&to_number(ServiceTime.N), Items[*]))
JQ:        [.Items[].ServiceTime.N | tonumber] | add

Example:   Perform counting, e.g. count the pages accessed by Mozilla
CLI query: length(Items[?AgentString && starts_with(AgentString.S, 'Mozilla')])
JQ:        [.Items[] | select(.AgentString and (.AgentString.S | startswith("Mozilla")))] | length

* Note the expressions convert the fields to numbers here so as to sort numerically rather than textually
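As a full command, the sort example would look something like this (again using the hypothetical items.json file from earlier):

jq '[.Items[] | [.SortKey.S, .SomeField.N]] | sort_by(.[1] | tonumber)' items.json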

Wrapping Up

This post shows that it is possible to perform some quite complicated transformations on JSON output data. The commands above, which are almost one-liners, can replace a whole JavaScript or Python function, letting you perform complicated ad-hoc (and maybe even regular) tasks with much less development overhead.

In fact, because JQ can read your filter expression from a file (which can also contain comments), complex filters can turn into 1-liners, with JQ as your script interpreter. This also means that you can version-control and track your JQ scripts.
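For example, a filter like the 'dev' stacks query above could be saved to a file (dev-stacks.jq is a made-up name) and run with JQ's -f/--from-file option:

# dev-stacks.jq - show the name and last-update time of stacks tagged STAGE=dev
[ .Stacks[]
  | select(.Tags[] | contains({Key: "STAGE", Value: "dev"}))
  | [.StackName, .LastUpdatedTime] ]

aws cloudformation describe-stacks | jq -f dev-stacks.jq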

Both query languages are quite similar, and implementations are available in library form in JavaScript, Python, Go and many other languages.

More Resources

Filtering output from the AWS CLI

This post was adapted from a larger post on my blog. See the full post here
