Mike Graff for AWS Community Builders

Posted on • Originally published at cloudyadvice.com

Easily Query your Cloud Inventory with Steampipe

Cloud Inventory is a Challenge

Anyone who has worked with AWS for a while will be quite familiar with the difficulties of getting a complete picture of the inventory of resources in your accounts. While AWS has made some limited strides in this area with tools like the EC2 Global View and the ability to get a count of all VPCs in the VPC dashboard, the bottom line is that AWS does not provide a comprehensive inventory solution for an AWS account or group of accounts.

While there are many excellent third-party commercial tools you can use to solve this problem, not everyone (including me!) has the budget to pay for them. So for the past month or two I've been looking at open source tools to add to my team's toolchain to help us with the inventory problem. One of those tools is an interesting solution called Steampipe. In this post I'm going to dive into this tool and share some of the things I've learned about it.

Steampipe - Query your cloud inventory like it's 1992

Steampipe.io is an open source tool that is maintained by the commercial tool company Turbot. In a nutshell, the software allows you to query your favorite cloud services with SQL. By providing a consistent command line interface that works across multiple IaaS, PaaS and SaaS services, Steampipe aims to reduce the time wasted on context switching between different cloud providers' native interfaces. As the end user, you write your queries in a consistent SQL syntax, and Steampipe translates each query into native API calls that are executed in real time against the cloud service's API.

Interaction with the various cloud services' APIs is handled via Steampipe plugins for each service. As of this writing there are 86 different plugins available on the Steampipe Hub, including plugins for all the major cloud players (AWS, Alibaba, Azure, Cloudflare, GCP, Digital Ocean, IBM, Oracle, Heroku), SaaS services (Datadog, PagerDuty, Salesforce, Stripe, Twilio, Zendesk, Zoom), as well as some more intriguing options like VMware vSphere, Terraform, Reddit, and Slack.

Getting Started

The easiest way to get started playing with Steampipe is to install it on your local workstation. The Steampipe Downloads page provides step-by-step instructions for installation on macOS, Linux, and Windows.

On a Mac with the Homebrew package manager installed, the process could not be any simpler:

  1. Tap Turbot's Cask: ~$ brew tap turbot/tap
  2. Install Steampipe: ~$ brew install steampipe
  3. Validate the installation is working by checking the installed version of Steampipe: ~$ steampipe -v

Once you have the core Steampipe engine installed, you need to install plugins for whichever cloud services you want to query. First, let's install the Steampipe plugin which will allow us to query Steampipe components, such as the available plugins in the Steampipe hub.

~$ steampipe plugin install steampipe

steampipe            [====================================================================] Done                

Installed plugin: steampipe@latest v0.5.0
Documentation:    https://hub.steampipe.io/plugins/turbot/steampipe

Next, let's install the AWS plugin so we can start querying our AWS environments. Use the following command:

~$ steampipe plugin install aws

aws                  [====================================================================] Done                

Installed plugin: aws@latest v0.78.0
Documentation:    https://hub.steampipe.io/plugins/turbot/aws

Now that we have our AWS plugin installed, we need to configure some credentials for the plugin so we can start querying our AWS accounts.

Configure Credentials

Basic Setup

When you install the Steampipe AWS plugin, the tool will automatically create a configuration file for you located at the following path: ~/.steampipe/config/aws.spc

Within this configuration file you can set up one or more AWS accounts to query with Steampipe. You can set up credentials directly in the aws.spc file if you want, but even more handy is the fact that Steampipe can refer to credential profiles you've already set up for the AWS CLI. Best of all, you can even use AWS SSO profiles with Steampipe!

The connection entries in my aws.spc file look like this:

connection "itprod" {
  plugin = "aws"
  profile = "sso-itprod"
  regions = ["*"]
}
connection "itshared" {
  plugin = "aws"
  profile = "sso-itshared"
  regions = ["*"]
}

As you can see, each connection entry in my aws.spc starts with a name definition, followed by the plugin being used, the profile name from my ~/.aws/config file, and finally a region specification. You can list individual regions here or use * to query all regions simultaneously.

Advanced Configuration

If you have a multi-account setup, and you've configured multiple connection entries in your aws.spc file, you can also create what is called an "aggregator connection", which allows you to query multiple connections from a plugin in a single query.

connection "aws_all" {
  type = "aggregator"
  plugin = "aws"
  connections = ["itprod", "itshared"]
}

You can even use wildcards in your aggregator specification, like so:

connection "aws_it_all" {
  type = "aggregator"
  plugin = "aws"
  connections = ["it*"]
}

Another handy capability comes into play if you are running Steampipe on an EC2 instance inside your AWS account. If you have an IAM instance profile attached to the instance with appropriate permissions, Steampipe will automatically use that IAM role with no other credentials being configured.

Now that we have some credentials configured for Steampipe, let's see what we can do with this tool!

Basic Queries

The Steampipe AWS plugin organizes the many AWS services into discrete tables that you can then write queries against. As of this writing there are 344 discrete tables defined in the AWS plugin, all of which are documented on the plugin's page at Steampipe Hub. You can run queries in two different modes with Steampipe:

  1. Enter an interactive query console by entering the command steampipe query
  2. Run a specific query directly by entering it all in a single command: steampipe query "select * from cloud"

I believe it is easier to get started using the interactive query console as it provides you with hints and autocomplete as you write your queries. Let's enter the console and see what commands are available:

➜  steampipe query
Welcome to Steampipe v0.15.4
For more information, type .help
> .help
Welcome to Steampipe shell.

To start, simply enter your SQL query at the prompt:

  select * from aws_iam_user

Common commands:

.help     Show steampipe help                           
.inspect  View connections, tables & column information 
.exit     Exit from steampipe terminal                  

Advanced commands:

.cache [mode]        Enable, disable or clear the query cache                                                     
.clear               Clear the console                                                                            
.connections         List active connections                                                                      
.header on|off       Enable or disable column headers                                                             
.multi on|off        Enable or disable multiline mode                                                             
.output [mode]       Set output format: csv, json, table or line                                                  
.quit                Exit from steampipe terminal                                                                 
.search_path         Display the current search path, or set the search-path by passing in a comma-separated list 
.search_path_prefix  Set a prefix to the current search-path                                                      
.separator           Set csv output separator                                                                     
.tables              List or describe tables                                                                      
.timing on|off       Enable or disable query execution timing                                                     

As the help file suggests, you can use the .tables command to get a list of all the tables available in Steampipe:

Steampipe .tables command

You can use the .inspect command to view details on connections and tables. Inspect is particularly powerful as you can use it to drill into a specific table to see what fields are available. For example, here's a screenshot showing the .inspect aws_vpc command result:

Steampipe .inspect aws_vpc command

Once you've seen the connections and tables, you can get an idea of what types of queries you can run. As the help file suggests, you can start by getting a list of all the IAM users in all your configured accounts with the command select * from aws_iam_user which returns a table like this:

Steampipe query for all IAM users

If you have an aggregator configuration set up, you will get results for all configured AWS accounts by default. If you want to limit your scope to a specific AWS account, simply prefix the table name with the connection name, e.g. select * from itshared.aws_ec2_instance:

Steampipe query for EC2 instances in a specific account
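
The same prefixing works with an aggregator connection, which makes it easy to see how resources are spread across accounts. The query below is a minimal sketch, assuming the aws_all aggregator from the configuration section and the account_id column that the AWS plugin adds to its tables; swap in your own connection name and confirm the columns with .inspect aws_ec2_instance.

```sql
-- Count EC2 instances per account through the aggregator connection.
-- "aws_all" is the aggregator connection name from aws.spc; adjust to your own.
select
  account_id,
  count(*) as instance_count
from
  aws_all.aws_ec2_instance
group by
  account_id
order by
  instance_count desc;
```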

Advanced Queries

If you review the table definitions published on the AWS plugin page at Steampipe Hub, you will see lots of more advanced examples of what you can do with Steampipe. For example, let's say you want to get a count of IAM access keys that are between 45 and 90 days old:

Steampipe query of access keys between 45 and 90 days old
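
A query along the following lines would produce that count. This is a sketch, not necessarily the exact query in the screenshot; it assumes the aws_iam_access_key table and its create_date column, which you can confirm with .inspect aws_iam_access_key.

```sql
-- Count IAM access keys created between 45 and 90 days ago.
-- Assumes create_date is a timestamp column on aws_iam_access_key.
select
  count(*) as keys_45_to_90_days_old
from
  aws_iam_access_key
where
  create_date <= now() - interval '45 days'
  and create_date > now() - interval '90 days';
```

Replacing count(*) with user_name, access_key_id and create_date would list the individual keys instead of counting them.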

Or perhaps a list of your S3 buckets that are not encrypted:
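
One way to write this is sketched below. It assumes the aws_s3_bucket table exposes the bucket's default encryption settings in a server_side_encryption_configuration column, and treats a null value as "no default encryption configured"; check the column list with .inspect aws_s3_bucket before relying on it.

```sql
-- List buckets that have no default server-side encryption configuration.
-- Assumes a null server_side_encryption_configuration means encryption is not set.
select
  name,
  region,
  account_id
from
  aws_s3_bucket
where
  server_side_encryption_configuration is null
order by
  name;
```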

Maybe you'd like to get a quick count of EC2 instances grouped by region across all your accounts?

Steampipe query counting number of EC2 instances by region
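
A sketch of such a query, assuming the region column on aws_ec2_instance and an aggregator connection that the unqualified table name resolves to:

```sql
-- Count EC2 instances per region across every connection in scope.
select
  region,
  count(*) as instance_count
from
  aws_ec2_instance
group by
  region
order by
  instance_count desc;
```

Adding account_id to both the select list and the group by clause would break the counts down per account as well.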

Lastly, let's get really fancy and create an age table of all our EC2 instances:

Steampipe EC2 instance age table
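
An age table can be built in several ways. The sketch below uses PostgreSQL's filter clause to bucket instances by launch_time. The bucket boundaries and column aliases are illustrative choices of mine, not taken from the screenshot, and it assumes launch_time is a timestamp column on aws_ec2_instance.

```sql
-- Bucket EC2 instances by age, measured from launch_time.
-- Bucket boundaries (30, 90, 365 days) are arbitrary; adjust to taste.
select
  count(*) filter (
    where launch_time > now() - interval '30 days'
  ) as under_30_days,
  count(*) filter (
    where launch_time <= now() - interval '30 days'
      and launch_time > now() - interval '90 days'
  ) as days_30_to_90,
  count(*) filter (
    where launch_time <= now() - interval '90 days'
      and launch_time > now() - interval '365 days'
  ) as days_90_to_365,
  count(*) filter (
    where launch_time <= now() - interval '365 days'
  ) as over_365_days
from
  aws_ec2_instance;
```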

Wrap-up

As you can hopefully see from this brief introduction, there are a ton of things you can do with this tool, especially when leveraging the power of SQL. While I personally do not have a lot of SQL experience, I was still able to get up and running pretty quickly by following the excellent documentation and examples provided by the Steampipe project.

In the future I hope to do a follow-up post on some of the more advanced capabilities of Steampipe, including the ability to serve custom dashboards based on the Steampipe engine. In the meantime, if you've tried this tool yourself, please share your experience with me in the comments section.

Happy querying!
