Daniel Quackenbush

Posted on May 14, 2020 • Edited on May 16, 2020

Getting Started with AWS Batch

#aws #containers #batch #terraform

AWS Batch is a managed job scheduler and executor built around a job queue. At my $job, we were evaluating it for dynamic workloads, where a container service is needed to execute work pulled from a queue. The general workflow looks a little like this:

The external worker submits a job -> the job is scheduled on a spot instance -> ECS takes over and executes the task -> results are logged.

Given that most applications communicate with different external systems, there would be a wide variety of IAM configurations and container scripts. For simplicity, and to provide a generic system, this batch job will copy /etc/motd to a parameterized S3 bucket.

Configure Foundation

VPC

First, define the VPC and private subnets. In my use case, the VPC is chosen ahead of time through a variable (vpc_id), and the subnets are then looked up dynamically through tags with the key/value of subnet/private.
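A minimal Terraform sketch of that lookup, assuming a recent AWS provider where the `aws_subnets` data source is available (older provider releases used `aws_subnet_ids` instead). The resource labels `selected` and `private` are illustrative names, not taken from the original configuration.

```hcl
# ID of the pre-existing VPC, passed in by the caller.
variable "vpc_id" {
  type        = string
  description = "ID of the VPC that the Batch compute environment runs in"
}

# Look up the VPC so its attributes (for example the CIDR block) are available.
data "aws_vpc" "selected" {
  id = var.vpc_id
}

# Find every subnet in that VPC tagged subnet = private.
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.selected.id]
  }

  tags = {
    subnet = "private"
  }
}
```

The matching subnet IDs are then available as `data.aws_subnets.private.ids`.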

Security Groups

For this topology, I am utilizing VPC endpoints so that my containers remain locked down on available egress. AWS's Setting Up with AWS Batch guide, however, suggests you can simply allow open traffic.
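One way the locked-down variant might look in Terraform. This is a sketch under assumptions: it is written as a standalone fragment, so its inputs are redeclared as variables, and the names `vpc_cidr` and `s3_prefix_list_id` are hypothetical. It assumes interface endpoints live inside the VPC CIDR and that S3 is reached through a gateway endpoint, which is addressed by its prefix list.

```hcl
variable "vpc_id" {
  type = string
}

# Hypothetical input: CIDR block of the VPC hosting the interface endpoints.
variable "vpc_cidr" {
  type = string
}

# Hypothetical input: prefix list ID of the S3 gateway endpoint.
variable "s3_prefix_list_id" {
  type = string
}

resource "aws_security_group" "batch" {
  name_prefix = "batch-compute-"
  vpc_id      = var.vpc_id

  # HTTPS to interface VPC endpoints (ECS, ECR, CloudWatch Logs, ...).
  egress {
    description = "HTTPS to interface endpoints inside the VPC"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }

  # HTTPS to S3 through the gateway endpoint.
  egress {
    description     = "HTTPS to S3 via the gateway endpoint"
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    prefix_list_ids = [var.s3_prefix_list_id]
  }
}
```

The open-traffic alternative would replace both rules with a single egress rule allowing all protocols to 0.0.0.0/0.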

IAM

Following the principle of least privilege is important; however, for simplicity, I am largely utilizing AWS managed policies. Batch with ECS requires two roles. The first is the Batch role, which allows the service to create EC2 instances, create and modify the auto-scaling group, etc. The second is the ECS service role, which serves two purposes: it is both the task execution role (the permissions needed for your container) and the service role. It is worth noting that during my research, I did not see a breakdown of task execution role vs. task role, a feature that the ECS service itself provides.

Batch Instances/Service Roles:
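A minimal sketch of the two roles in Terraform, using the AWS managed policies `AWSBatchServiceRole` and `AmazonEC2ContainerServiceforEC2Role`. The role and profile names are illustrative assumptions, not the original configuration.

```hcl
# Trust policy letting the Batch service assume its service role.
data "aws_iam_policy_document" "batch_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["batch.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "batch_service" {
  name               = "batch-service-role"
  assume_role_policy = data.aws_iam_policy_document.batch_assume.json
}

resource "aws_iam_role_policy_attachment" "batch_service" {
  role       = aws_iam_role.batch_service.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSBatchServiceRole"
}

# Trust policy letting EC2 instances assume the ECS instance role.
data "aws_iam_policy_document" "ec2_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "ecs_instance" {
  name               = "batch-ecs-instance-role"
  assume_role_policy = data.aws_iam_policy_document.ec2_assume.json
}

resource "aws_iam_role_policy_attachment" "ecs_instance" {
  role       = aws_iam_role.ecs_instance.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

# Batch attaches the instance role to EC2 hosts through an instance profile.
resource "aws_iam_instance_profile" "ecs_instance" {
  name = "batch-ecs-instance-profile"
  role = aws_iam_role.ecs_instance.name
}
```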

Additional policies for uploading to S3:
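A sketch of an inline policy that grants only object uploads to the target bucket. It assumes the containers pick up their permissions from the instance role; `bucket_name` and `instance_role_name` are hypothetical inputs introduced for illustration.

```hcl
# Hypothetical input: bucket name without the s3:// prefix.
variable "bucket_name" {
  type = string
}

# Hypothetical input: name of the role whose credentials the job uses.
variable "instance_role_name" {
  type = string
}

# Allow uploads into the bucket and nothing else.
data "aws_iam_policy_document" "s3_upload" {
  statement {
    actions   = ["s3:PutObject"]
    resources = ["arn:aws:s3:::${var.bucket_name}/*"]
  }
}

resource "aws_iam_role_policy" "s3_upload" {
  name   = "batch-s3-upload"
  role   = var.instance_role_name
  policy = data.aws_iam_policy_document.s3_upload.json
}
```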

Configure Batch

Batch is composed of several pieces:

Compute Environment
Queue
Job Definition

Compute Environment

Batch lets you use whatever mix of EC2 instance types you want. For this proof of concept, I went strictly with spot instances of the optimal instance type; for production workloads, however, it’s likely the environment won’t be as ephemeral and some on-demand instances might be required.

To keep the environment as secure as possible, I had to create a launch template that accomplishes two main goals:

Ensure the base volume is encrypted. Requesting a bare spot instance won’t encrypt your volumes at rest. To do so, you must define block_device_mappings and set encrypted to true.

Utilize Amazon Linux 2 over 1, to get the latest patches. By default, the ECS-optimized spot instances ship with Amazon Linux 1, which was last updated in March of 2018. Through the Parameter Store, AWS provides the ability to dynamically look up the image ID under /aws/service/ecs/optimized-ami/amazon-linux-2.
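The two goals above could be sketched in Terraform roughly as follows. Everything here is an assumption rather than the original configuration: the fragment stands alone, so the role ARNs, subnets, and security group arrive as hypothetical variables; the root device name `/dev/xvda` and the 30 GiB size assume the ECS-optimized Amazon Linux 2 AMI; and `SPOT_CAPACITY_OPTIMIZED` is chosen so that no separate spot fleet role is needed. The `image_id` argument is marked deprecated in newer provider releases in favor of `ec2_configuration`.

```hcl
variable "batch_service_role_arn" { type = string }
variable "instance_profile_arn" { type = string }
variable "subnet_ids" { type = list(string) }
variable "security_group_id" { type = string }

# Resolve the current ECS-optimized Amazon Linux 2 AMI from Parameter Store.
data "aws_ssm_parameter" "ecs_al2_ami" {
  name = "/aws/service/ecs/optimized-ami/amazon-linux-2/recommended/image_id"
}

# Launch template whose only job is to encrypt the root volume.
resource "aws_launch_template" "batch" {
  name_prefix = "batch-"

  block_device_mappings {
    device_name = "/dev/xvda"

    ebs {
      encrypted   = true
      volume_size = 30
      volume_type = "gp2"
    }
  }
}

resource "aws_batch_compute_environment" "spot" {
  compute_environment_name = "batch-spot"
  type                     = "MANAGED"
  service_role             = var.batch_service_role_arn

  compute_resources {
    type                = "SPOT"
    allocation_strategy = "SPOT_CAPACITY_OPTIMIZED"
    bid_percentage      = 100
    instance_role       = var.instance_profile_arn
    instance_type       = ["optimal"]
    image_id            = data.aws_ssm_parameter.ecs_al2_ami.value
    min_vcpus           = 0
    max_vcpus           = 16
    subnets             = var.subnet_ids
    security_group_ids  = [var.security_group_id]

    launch_template {
      launch_template_id = aws_launch_template.batch.id
      version            = "$Latest"
    }
  }
}
```

Setting `min_vcpus` to 0 lets the environment scale down to no instances when the queue is empty, which suits an ephemeral spot setup.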

Queue

The queue is where jobs, and therefore their definitions, are associated with compute environments. In a production design, a single queue can combine different types of compute fleet.
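A minimal queue sketch, assuming a recent AWS provider that uses `compute_environment_order` blocks (older releases took a `compute_environments` list instead). The queue name matches the one used at submission time; `compute_environment_arn` is a hypothetical input.

```hcl
# Hypothetical input: ARN of the compute environment that serves this queue.
variable "compute_environment_arn" {
  type = string
}

resource "aws_batch_job_queue" "queue" {
  name     = "queue"
  state    = "ENABLED"
  priority = 1

  # Additional blocks with higher order values could add fallback fleets,
  # for example an on-demand environment behind the spot one.
  compute_environment_order {
    order               = 1
    compute_environment = var.compute_environment_arn
  }
}
```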

Job Definition

This is where the meat of the operation happens. When you submit a job to the queue, you specify the definition you want to use, which then acts like a cookbook/playbook for the instance. This is important because the definition is where the compute requirements, entry point command, etc. are set. Job definitions also allow for parameterization, so that you can create dynamic workloads.

Below I break down properties of the container, which highlights the parameterized implementation:
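A sketch of what such a parameterized definition might look like in Terraform. The image, command, and resource sizes are assumptions for illustration: it uses the public `amazon/aws-cli` image, whose entry point is `aws`, and assumes the image contains /etc/motd. The `Ref::BUCKET_NAME` placeholder is replaced at run time with the value passed through `--parameters`.

```hcl
resource "aws_batch_job_definition" "copy_motd" {
  name = "batch-job-definition"
  type = "container"

  # Default used when a submission does not override the parameter
  # (placeholder bucket, assumed for illustration).
  parameters = {
    BUCKET_NAME = "s3://example-bucket"
  }

  container_properties = jsonencode({
    image = "amazon/aws-cli:latest"

    # Runs: aws s3 cp /etc/motd <BUCKET_NAME>
    command = ["s3", "cp", "/etc/motd", "Ref::BUCKET_NAME"]

    vcpus  = 1
    memory = 512
  })
}
```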

Execution

To tie it all together, simply submit a job to the queue and watch the magic happen.

aws batch submit-job --job-name test --job-queue queue --job-definition batch-job-definition --parameters BUCKET_NAME=s3://quack-batch-testing
