Newsletter #89. This week we have another selection of great new projects for you to take a look at. Kicking things off is the latest open source project from Airbnb, ottr, a Public Key Infrastructure framework that handles end-to-end certificate rotations. The other projects include cloudkey, clock-bound, aws-recon, cdk-dia and more. Make sure you check these out.
As always, we have a wide selection of new blog posts from the AWS and Community bloggers covering topics from AlphaFold, BayerCLAW, and Babelfish to OpenSearch, AWS CDK, FFmpeg, Amazon Corretto, Spring Boot, Bottlerocket, Snyk, MariaDB and GitHub Actions.
To finish things off we have a new video covering Suricata on AWS, as well as a new event coming up later this week which you still have time to sign up for.
The articles posted in this series are only possible thanks to contributors and project maintainers and so I would like to shout out and thank those folks who really do power open source and enable us all to build on top of what they have created.
So thank you to the following open source heroes: Qi Wang, Tom Roshko, Christos Matskas, Vadivelu Murali Pranavan, Kenneth Yang, Danny Gitelman, Daniel Begimher, Afza Wajid, Sudhir Reddy Maddulapally, Alexey Vorovich, Jesse Butler, Damien Martins, Masahiro Imai, Hidenori Koizumi, Jorge Lanzarotti, Ramesh Kumar Venkatraman, Dave Currie, Frank Dallezotte, Maxwell Moon, Jack Tabaska, Ian Davis, Jani Muuriaisniemi, Jose Juhala, Vacha Shah, Sarat Vemulapalli, Irshad Buchh and Yang Xiao.
Make sure you find and follow these builders and keep up to date with their open source projects and contributions.
Great news from Vadivelu Murali Pranavan last week, where he shared the following update:
I'm happy to share with you that myself along with my peers Sanjay Thiyagarajan, Naresh Kumar, Jayanth Vikash S, Xavier Emmanuel and Sri Varmaa won the first place in Amazon Web Services (AWS) Graviton Hackathon 2021 in Migration track. Check out the project they created, Genie
ottr this is the latest open source project from Airbnb engineering. Ottr is a serverless Public Key Infrastructure framework that handles end-to-end certificate rotations without the use of an agent. Check out the super detailed blog post, Meet Ottr: A Serverless Public Key Infrastructure Framework, in which Kenneth Yang provides an overview of Ottr with details of the architecture, the logical and network flows, and how to deploy it.
cloudkey this project from Aidan Steele is perfect if you have a YubiKey and want to use it to assume IAM roles to interact with AWS. As Aidan says:
"I could create certificates on the Yubikey, enrol them into AWS IoT (for free) and assume roles in AWS with no IAM secret access keys stored on disk."
Worth checking out Aidan's thread on Twitter, here, for more context.
clock-bound this new project provides you with a consistent, trusted time service that allows you to compare timestamps to determine order and consistency for events and transactions, independent from the instances’ respective geographic locations.
aws-recon this project from Darkbit is a multi-threaded AWS security-focused inventory collection tool written in Ruby, and was created to facilitate efficient collection of a large amount of AWS resource attributes and metadata. It aims to collect nearly everything that is relevant to the security configuration and posture of an AWS environment. It is being used by some interesting customers, so well worth checking this out.
cdk-dia this project from Tom Roshko looks super neat. It diagrams your CDK provisioned infrastructure using the Graphviz dot language. After getting Graphviz running on my MacBook (thanks MacPorts) I tried it on one of my projects, and here is the output. What do you think? A great start, so it will be interesting to see how this project evolves and develops. Nice work Tom!
aws-cdk-github-oidc is a set of CDK constructs for using OpenID Connect to authenticate your GitHub Actions workflow with AWS IAM. These constructs allow you to harden your AWS deployment security by removing the need to create long-term access keys for GitHub Actions; the workflow authenticates with AWS IAM through OpenID Connect instead.
You can check out last week's newsletter, where Richard Boyd shows you how to use this new capability of GitHub Actions.
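To make the idea concrete, here is a minimal sketch of the kind of IAM trust policy that sits behind this approach. It is not the API of aws-cdk-github-oidc (which is a CDK construct library); the function name, account ID and repository names are made up for illustration, and it assumes an IAM OIDC provider for token.actions.githubusercontent.com already exists in the account.

```python
import json


def github_oidc_trust_policy(account_id: str, owner: str, repo: str) -> dict:
    """Build a trust policy letting workflows in one GitHub repo assume a role.

    Hypothetical helper for illustration only.
    """
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The role trusts the GitHub OIDC provider registered in IAM.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                # No long-term keys: the workflow swaps its OIDC token for
                # short-lived credentials.
                "Action": "sts:AssumeRoleWithWebIdentity",
                # Restrict the role to tokens issued for this repository.
                "Condition": {
                    "StringLike": {f"{provider}:sub": f"repo:{owner}/{repo}:*"}
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = github_oidc_trust_policy("111122223333", "example-org", "example-repo")
    print(json.dumps(policy, indent=2))
```

The condition on the sub claim is what stops workflows in other repositories from assuming the role, so it is worth scoping as tightly as your setup allows.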
fiware-orion-on-aws FIWARE is a curated framework of open source platform components to help with the development of smart applications and solutions. This repository is a reference implementation of one of those components, the Orion Context Broker, which uses another component in that project, Cygnus. To help you get started, check out the blog post How to build smart cities with FIWARE Orion Context Broker and Cygnus on AWS from Masahiro Imai, Hidenori Koizumi, and Jorge Lanzarotti.
Thanks to Corey Quinn for highlighting this tool I had missed.
aws-key-disabler this open source project is a small Lambda script that will disable access keys older than a given number of days. Small but perfectly formed, I think this is a great solution if you find yourself needing to automate the vending of your keys.
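The core check is simple enough to sketch. What follows is not the project's code, just a minimal illustration of the age test, assuming key records shaped like the entries IAM's ListAccessKeys returns (AccessKeyId, Status, CreateDate). The function name and the sample key IDs are made up.

```python
from datetime import datetime, timedelta, timezone


def keys_to_disable(access_keys, max_age_days, now=None):
    """Return the IDs of active keys created more than max_age_days ago.

    Hypothetical helper; access_keys is a list of dicts with AccessKeyId,
    Status and a timezone-aware CreateDate.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in access_keys
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2021, 11, 8, tzinfo=timezone.utc)
    sample = [
        {"AccessKeyId": "AKIAOLDEXAMPLE", "Status": "Active",
         "CreateDate": datetime(2021, 1, 1, tzinfo=timezone.utc)},
        {"AccessKeyId": "AKIANEWEXAMPLE", "Status": "Active",
         "CreateDate": datetime(2021, 11, 1, tzinfo=timezone.utc)},
    ]
    # In a real Lambda each returned ID would then be passed to boto3's
    # iam.update_access_key(UserName=..., AccessKeyId=..., Status="Inactive").
    print(keys_to_disable(sample, max_age_days=90, now=now))
```

Keeping the selection logic separate from the AWS calls makes it easy to dry-run against a list of keys before anything is actually disabled.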
This tweet from Christos Matskas (@christosmatskas) came up on my timeline last week, where he shared how he was able to use the open sourced Node.js library for verifying JWTs that I shared in the last episode to verify access tokens from Azure Active Directory (AAD). He also shared the code, which you can check out here. Christos also put this post together, Open Standards, Security, Azure AD and AWS, which shows you the end to end story. Nice!
Building Software as a Service (SaaS) is an increasingly popular approach for open source projects to provide customers with immediate access to their capabilities. There are several approaches you can take, but to do this well and ensure a good experience during onboarding, you need reliable, fast, and multi-region capable provisioning and software lifecycle management. In the post, Parallel and dynamic SaaS deployments with AWS CDK Pipelines, Jani Muuriaisniemi and Jose Juhala describe a deployment system for achieving this using AWS CDK and AWS CDK Pipelines. [hands on]
I shared details of BayerCLAW in a previous newsletter (#86). BayerCLAW is a workflow orchestration system for AWS, targeted at bioinformatics pipelines. Jack Tabaska and Ian Davis from the Bayer Crop Sciences team have put together this blog post, BayerCLAW – Open-Source, Serverless Orchestrator for Scientific Workflows on AWS, that takes a look at the motivations and technical implementation of BayerCLAW.
AlphaFold is an artificial intelligence program developed by Alphabet's/Google's DeepMind which performs predictions of protein structure. In this post, Run AlphaFold v2.0 on Amazon EC2, Qi Wang provides a step-by-step guide on how to install AlphaFold on an EC2 instance with an Nvidia GPU.
In the post Migrate from SQL Server to Amazon Aurora using Babelfish, Ramesh Kumar Venkatraman provides an overview of how you can migrate from SQL Server to Babelfish for Aurora PostgreSQL. [hands on]
Dave Currie shares details of the Amazon Corretto support roadmap in his post, Announcing Amazon Corretto 17 support roadmap. Make sure you read this short post and understand what this means for any workloads you have running Amazon Corretto 8 or 11.
Build and deploy a Spring Boot application to AWS App Runner with a CI/CD pipeline using Terraform is the perfect post if you want to learn how to set up a really nice automated deployment pipeline for your Spring Boot applications on AWS. Irshad Buchh and Yang Xiao walk you through setting up a pipeline for automatic build and deployment onto AWS App Runner. Read on to find out more. [hands on]
Jesse Butler opens this post up with the question “Does the OS even matter anymore?” - intrigued? Have your own opinion? Well find out what he thinks in the excellent post, Bottlerocket, A Year in the Life - (and I totally agree, for anyone interested!) [hands on]
Danny Gitelman and Daniel Begimher share how to use tools like Snyk in combination with an automated workflow to reduce the risk of downloading new packages from public repositories. Read more in their post, How to automate your software-composition analysis on AWS [hands on]
SkySQL is a database as a service (DBaaS) solution on AWS that makes it easy for customers to start using MariaDB Enterprise in the cloud. In the post, MariaDB Collaborates with AWS to Deliver SkySQL on AWS, Afza Wajid and Sudhir Reddy Maddulapally speak with Alexey Vorovich, VP of Engineering for SkySQL at MariaDB Corporation, about the recent SkySQL launch.
Damien Martins shares a how-to guide that describes the steps to invoke an automatic extraction of media asset metadata through ffprobe (part of the FFmpeg project) in his post, Analyzing media files using ffprobe in AWS Lambda [hands on]
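As a rough sketch of what extracting metadata through ffprobe can look like from Python, the snippet below builds an ffprobe command that emits JSON and picks out a few fields. It is not taken from the post: the function names are made up, and how the ffprobe binary is packaged for Lambda and how the media file is fetched are deliberately left out.

```python
import json
import subprocess


def ffprobe_command(media_path, ffprobe_bin="ffprobe"):
    """Build an ffprobe call that prints container and stream metadata as JSON."""
    return [
        ffprobe_bin,
        "-v", "quiet",            # suppress the banner and log output
        "-print_format", "json",  # machine-readable output
        "-show_format",           # container-level metadata
        "-show_streams",          # per-stream metadata
        media_path,
    ]


def summarize(ffprobe_json):
    """Pick a few commonly used fields out of ffprobe's JSON output."""
    data = json.loads(ffprobe_json)
    fmt = data.get("format", {})
    return {
        "format": fmt.get("format_name"),
        "duration": fmt.get("duration"),
        "codecs": [s.get("codec_name") for s in data.get("streams", [])],
    }


def probe(media_path, ffprobe_bin="ffprobe"):
    """Run ffprobe on a file path or URL and return the summary dict."""
    result = subprocess.run(
        ffprobe_command(media_path, ffprobe_bin),
        capture_output=True, text=True, check=True,
    )
    return summarize(result.stdout)
```

Inside a Lambda handler you would call probe() with a path or URL the function can read, assuming an ffprobe binary is available in the execution environment.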
Frank Dallezotte and Maxwell Moon have collaborated on this post, Building ARM64 applications on AWS Graviton2 using the AWS CDK and Self-Hosted Runners for GitHub Actions, where they show how to configure a self-hosted GitHub Runner on an EC2 instance with a Graviton2 processor, the required network resources, and a workflow that will run on the Runner on each repository push or pull request for the example application. This will allow you to start creating multi-architecture builds so that you can start leveraging Arm-based AWS Graviton2 instances and their improved price/performance as well as power characteristics. [hands on]
In the post, Backwards Compatibility Testing for OpenSearch, Vacha Shah and Sarat Vemulapalli show you how backwards compatibility testing works within OpenSearch, something that is used to test and determine the safe upgrade paths from a supported version to the current version.
Amazon Time Sync Service
Amazon Time Sync Service now allows you to easily generate and compare timestamps from Amazon EC2 instances with ClockBound, an open source daemon and library. This information is valuable to determine order and consistency for events and transactions across EC2 instances, independent from the instances’ respective geographic locations. ClockBound calculates your Amazon EC2 instance’s clock error bound to measure its clock accuracy and allows you to check if a given timestamp is in the past or future with respect to your instance’s current clock. On every call, ClockBound simultaneously returns two pieces of information: the current time and the associated absolute error range. This means that the actual time of a ClockBound timestamp is within a set range.
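To illustrate why an error range matters when ordering events, here is a minimal sketch. It does not use the ClockBound library or its API; the BoundedTimestamp type and the function names are made up, and it simply assumes each reading comes with a time and an absolute error bound in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundedTimestamp:
    """A clock reading plus its absolute error bound, both in seconds."""
    time: float
    error: float

    @property
    def earliest(self) -> float:
        return self.time - self.error

    @property
    def latest(self) -> float:
        return self.time + self.error


def definitely_before(a: BoundedTimestamp, b: BoundedTimestamp) -> bool:
    """True only if a's whole range ends before b's range begins."""
    return a.latest < b.earliest


def order(a: BoundedTimestamp, b: BoundedTimestamp) -> str:
    """Order two events, or admit that the clocks cannot tell them apart."""
    if definitely_before(a, b):
        return "a before b"
    if definitely_before(b, a):
        return "b before a"
    return "indeterminate"


if __name__ == "__main__":
    first = BoundedTimestamp(time=100.000, error=0.001)
    second = BoundedTimestamp(time=100.005, error=0.001)
    overlapping = BoundedTimestamp(time=100.0015, error=0.001)
    print(order(first, second))
    print(order(first, overlapping))
```

When two ranges overlap, the honest answer is that the order cannot be determined from the clocks alone, which is exactly the information a bare timestamp hides.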
To get started, first make sure you are using Chrony. Then install the ClockBound daemon and library, or build your own library to integrate ClockBound into your application. For the best clock accuracy, we also recommend using the Amazon Time Sync Service. The Amazon Time Sync Service and Chrony are configured by default on Amazon Linux 2 instances.
Check out the code repo at the top of this post.
Nick Coval & Adam Palmer presented "Building an Open Source IDS/IPS Service on AWS with Suricata" at SuriCon, where they talk about how they built a quick-start on AWS that creates a Suricata-based solution powered by the AWS Gateway Load Balancer service (GWLB), enabling centralized and distributed deployment models.
MLOps: End-to-End Hugging Face Transformers with the Hub & SageMaker Pipelines
November 10th 2021 - 6:00 PM (GMT)
Later this week, we have this workshop where you will learn how to build an End-to-End MLOps Pipeline for Hugging Face Transformers from training to production using Amazon SageMaker. Join the always amazing Julien Simon, together with Matteu Desve and Philipp Schmid for this webinar. Read more and register here.
Databricks | AWS Lakehouse Dev Day Live Workshop
November 16th 9:00 AM PT
Delta Lake is an open source storage layer that provides ACID transactions and scalable metadata handling, and unifies streaming and batch data processing. You can use Delta Lake on top of your existing data lake. During this workshop you will learn how to:
- Make your existing Amazon S3 data lakes into a lakehouse with Delta Lake.
- Provide an easy-to-use platform for analysts to directly query data on your data lake using SQL Analytics
- Simplify and automate data pipelines for streaming and batch data to lower costs and boost productivity for your data teams