Erik Lundevall Zara

Originally published at cloudgnosis.org

How to become an infrastructure-as-code ninja, using AWS CDK - part 6

In this article, we are continuing with building our AWS infrastructure, which is a container-based solution running in AWS Elastic Container Service (AWS ECS).

Back in part 5, we set up an Apache web server-based solution running in a container, in a container cluster using AWS ECS, using Typescript as a language to describe the infrastructure. Building the solution, we divided the solution into two separate files:

  • bin/my-container-infrastructure.ts - this is the main program for describing our infrastructure
  • lib/containers/container-management.ts - this contains support functions to describe container-based infrastructure

Before adding new features to our solution, we are going to take a step back and look at what we can do to organise and test the building blocks in our infrastructure-as-code solution. In particular, testing is something we ideally want to keep in mind right from the start.

We will look at how to incorporate testing into what we already have and then continue to add new infrastructure and testing as we move ahead.

Warning! The built-in testing support libraries in AWS CDK expect you to know AWS CloudFormation. Some familiarity with CloudFormation is recommended if you use these.


Update:

This article series uses Typescript as an example language. However, there are repositories with example code for multiple languages.

The repositories contain all the code examples from the article series, implemented in different languages with the AWS CDK. You can view the code for a specific part and stage in the series by checking out the code tagged with that part and step. See the README file in each repository for more details.


Different testing

What do we mean by testing? There are various aspects to consider, which include:

  • That we get the infrastructure we expect to have
  • That the solution and its infrastructure adhere to any security and compliance policies in place
  • That the solution itself works as expected when deployed with the infrastructure

The testing aspect we will focus on in this article is the first one: that we get the infrastructure we expect to have. This includes checking which resources will be provisioned, that these resources have the expected properties, and that the relations between resources are as expected. It also includes checking that we do not introduce unexpected changes.

We will essentially treat these tests as unit tests: they run in our (local) development environment and take seconds (or minutes) to complete.

Who writes the tests?

It depends a bit on how you have the infrastructure-as-code work organised, but the people that build and maintain (re-usable) infrastructure building blocks should write tests for those building blocks.

That may mean every developer, or specific platform developers, or other groups of people. For YAML/JSON-based infrastructure using CloudFormation, there is limited support for testing and validation of the infrastructure. Frankly, the need may sometimes be limited as well: since you just declare what you want to have in those cases, there may not be that much logic to test. However, when you get enough logic and conditions included with CloudFormation YAML/JSON, it can get quite messy.

If you use programming languages and AWS CDK, you get a more imperative layer to generate the declarative model. This can make it easier to express what is intended, but it can also make it harder to understand exactly what infrastructure you will get.
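To make that concrete, here is a minimal sketch in plain TypeScript, without the AWS CDK, of imperative code producing a declarative model. The Resource type, the buildResources function and the resource shapes are made up for illustration; real CloudFormation resources have many more properties.

```typescript
// Hypothetical, heavily simplified stand-in for a declarative template entry.
interface Resource {
    Type: string;
    Properties: Record<string, unknown>;
}

// Imperative code (a loop and a condition) decides which resources end up
// in the declarative result.
function buildResources(serviceNames: string[], exposePublicly: boolean): Record<string, Resource> {
    const resources: Record<string, Resource> = {};
    for (const name of serviceNames) {
        resources[`${name}Service`] = {
            Type: 'AWS::ECS::Service',
            Properties: { ServiceName: name },
        };
        if (exposePublicly) {
            resources[`${name}SecurityGroup`] = {
                Type: 'AWS::EC2::SecurityGroup',
                Properties: { GroupDescription: `Security group for service ${name}` },
            };
        }
    }
    return resources;
}

const resources = buildResources(['web', 'api'], true);
const resourceTypes = Object.keys(resources).map((key) => resources[key].Type);
```

The number and kind of resources depend on the arguments passed in, which is the sort of thing a unit test can pin down.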

Get started with writing tests

Enough preparation talk now, let us get into practical work! If you have followed along in this article series to set up the project using the cdk init command for a Typescript project, you will already have some test tooling in place. AWS CDK includes the Jest test framework by default in the installation.

You can use whichever testing framework you want with the testing support provided in AWS CDK. The examples we build here will use Jest, since that is what is set up by default in an AWS CDK project.

If you have set up the AWS CDK project as described in part 4 and part 5 of this series, you will have a directory in your project folder named test. This is where the tests we write will live. Since we have cheated and not practiced test-driven development (TDD) right from the start, we will build some tests for the existing infrastructure we have defined, before moving further with new infrastructure.

Infrastructure recap

Let us first recap what we have built so far in the two source files in our project:

bin/my-container-infrastructure.ts

import { App, Stack } from 'aws-cdk-lib';
import { Vpc } from 'aws-cdk-lib/aws-ec2';
import { 
  addCluster, 
  addService,
  addTaskDefinitionWithContainer, 
  ContainerConfig, 
  TaskConfig 
} from '../lib/containers/container-management';

const app = new App();
const stack = new Stack(app, 'my-container-infrastructure', {
  env: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION,
  },
});

const vpc = Vpc.fromLookup(stack, 'vpc', {
  isDefault: true,
});

const id = 'my-test-cluster';
const cluster = addCluster(stack, id, vpc);

const taskConfig: TaskConfig = { cpu: 512, memoryLimitMB: 1024, family: 'webserver' };
const containerConfig: ContainerConfig = { dockerHubImage: 'httpd' };
const taskdef = addTaskDefinitionWithContainer(stack, `taskdef-${taskConfig.family}`, taskConfig, containerConfig);
addService(stack, `service-${taskConfig.family}`, cluster, taskdef, 80, 2, true);

lib/containers/container-management.ts

import { IVpc, Peer, Port, SecurityGroup } from 'aws-cdk-lib/aws-ec2';
import { Cluster, ContainerImage, FargateService, FargateTaskDefinition, LogDriver, TaskDefinition } from 'aws-cdk-lib/aws-ecs';
import { RetentionDays } from 'aws-cdk-lib/aws-logs';
import { Construct } from 'constructs';

export const addCluster = function(scope: Construct, id: string, vpc: IVpc): Cluster {
    return new Cluster(scope, id, {
        vpc,
    });
}

export interface TaskConfig {
    readonly cpu: 256 | 512 | 1024 | 2048 | 4096;
    readonly memoryLimitMB: number;
    readonly family: string;
}

export interface ContainerConfig {
    readonly dockerHubImage: string;
}

export const addTaskDefinitionWithContainer = 
function(scope: Construct, id: string, taskConfig: TaskConfig, containerConfig: ContainerConfig): TaskDefinition {
    const taskdef = new FargateTaskDefinition(scope, id, {
        cpu: taskConfig.cpu,
        memoryLimitMiB: taskConfig.memoryLimitMB,
        family: taskConfig.family,
    });

    const image = ContainerImage.fromRegistry(containerConfig.dockerHubImage);
    const logdriver = LogDriver.awsLogs({ 
        streamPrefix: taskConfig.family,
        logRetention: RetentionDays.ONE_DAY,
    });
    taskdef.addContainer(`container-${containerConfig.dockerHubImage}`, { image, logging: logdriver });

    return taskdef;
};

export const addService = 
function(scope: Construct, 
         id: string, 
         cluster: Cluster, 
         taskDef: FargateTaskDefinition, 
         port: number, 
         desiredCount: number, 
         assignPublicIp?: boolean,
         serviceName?: string): FargateService {
    const sg = new SecurityGroup(scope, `${id}-security-group`, {
        description: `Security group for service ${serviceName ?? ''}`,
        vpc: cluster.vpc,
    });
    sg.addIngressRule(Peer.anyIpv4(), Port.tcp(port));

    const service = new FargateService(scope, id, {
        cluster,
        taskDefinition: taskDef,
        desiredCount,
        serviceName,
        securityGroups: [sg],
        circuitBreaker: {
            rollback: true,
        },
        assignPublicIp,
    });

    return service;
};

We will start with the functions we have defined in lib/containers/container-management.ts: addCluster, addService, and addTaskDefinitionWithContainer. We will use the assertions module provided with AWS CDK for our testing, and use Jest to define the tests.

Let us write the first test

To build our first tests, let us create a new file in the test directory, called containers/container-management.test.ts, and add our test code there. We will start with a single test for the addCluster function and look at how that is built up:

import { Stack } from 'aws-cdk-lib';
import { Vpc } from 'aws-cdk-lib/aws-ec2';
import { Template } from 'aws-cdk-lib/assertions';
import { addCluster } from '../../lib/containers/container-management';

test('ECS cluster is defined with existing vpc', () => {
    // Test setup
    const stack = new Stack();
    const vpc = new Vpc(stack, 'vpc');

    // Test code
    const cluster = addCluster(stack, 'test-cluster', vpc);

    // Check result
    const template = Template.fromStack(stack);

    template.resourceCountIs('AWS::ECS::Cluster', 1);

    expect(cluster.vpc).toEqual(vpc);
});

We include the assertions sub-module from AWS CDK, which has features to generate CloudFormation templates from different sources, and then perform tests on these templates.

To use this, we need to create a stack, so we import that as well. Since our addCluster function requires some kind of construct, an identifier and a reference to a Vpc construct, we will create and provide those. A stack can be created without an AWS CDK App, or even an identifier, so we will just create an empty stack.

The actual test code is to call the addCluster function and pick up the resulting cluster object. What are we expecting the result will be?

We expect that an ECS cluster has been added to the stack we supply, and the provided VPC parameter is included with the cluster.

So testing this, we check two things:

  • There is a CloudFormation AWS::ECS::Cluster resource in the stack
  • The returned cluster object contains a reference to the provided Vpc object.

Note here that in CloudFormation, the ECS cluster (AWS::ECS::Cluster) does not have a reference to a VPC. We can see this in the AWS CloudFormation documentation for the AWS::ECS::Cluster resource. The VPC reference is purely something that the AWS CDK itself has added, for use later with other constructs. Besides checking that there is an AWS::ECS::Cluster in the stack, we currently do not care about any further details. So for us it suffices to check that the cluster resource is in the stack, and that there is exactly one of it.

The AWS CDK Cluster object should have a reference to a Vpc, though, and it should be the one we provide to the addCluster function. We use the Jest expect() function to test this.

We can run the test with the command npm test and see what we get:

❯ npm test

> my-container-infrastructure@0.1.0 test
> jest

 PASS  test/containers/container-management.test.ts (13.649 s)
  ✓ ECS cluster is defined with existing vpc (125 ms)

Test Suites: 1 passed, 1 total
Tests:       1 passed, 1 total
Snapshots:   0 total
Time:        13.739 s, estimated 15 s
Ran all test suites.

Success! If this had been a proper test-driven development cycle, we would have gone through a different feedback loop here. For now, we are mainly catching up a bit.

Our first test case involves both a check that goes down into the generated CloudFormation, and another test which checks the state of a higher level construct. These are both valid types of tests to do, and to what extent you do explicit CloudFormation tests depends on the use case. If you build your own high-level construct from direct CloudFormation resources, it makes sense to do a lot of lower-level testing. If you are combining higher-level constructs, then there might not be the same need.

Task definition testing

Our next test target is the addTaskDefinitionWithContainer function, which should create an ECS Fargate task definition and associate a container from DockerHub with it. From that description, we can think about what we could test.

  • The function signature says it returns a TaskDefinition. We can look for ways that the returned task definition says it will be used with Fargate in its interface.
  • The task definition should have the family, cpu and memory settings we have provided
  • We can also check that the underlying CloudFormation AWS::ECS::TaskDefinition has been created and has the expected properties.
  • We also need to check that the container we provide has been added to the task definition. We can see whether the returned TaskDefinition object lets us check that.
  • We can also check the underlying CloudFormation for an appropriate container definition. This is not a separate resource, but an entry in the ContainerDefinitions property of AWS::ECS::TaskDefinition.

We can do testing on the higher level constructs the AWS CDK provides, or we can do more low-level testing on the generated CloudFormation.

When I first saw the assertions support functions provided for the AWS CDK, my mind was very much set on testing a lot of CloudFormation details. But I have changed my mind there. If I am building higher-level constructs from other high-level constructs in AWS CDK, the need to explicitly check the generated CloudFormation is more limited. If you build your own constructs from resources that map directly to CloudFormation resources, then it is very useful to check the generated CloudFormation. In other cases, that is only partially true. Look at what you can check from the construct interfaces first, and if that is not sufficient, then go to the CloudFormation-oriented tests.

I want to consider CloudFormation an implementation detail of the AWS CDK, and preferably not think about it if I can. It cannot be avoided currently though, and in practice you will have to deal with it sometimes.

Test for Fargate Task Definition

Let us take a first stab at the tests for addTaskDefinitionWithContainer and check that we have a task definition that is Fargate compatible.

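// Also needs, at the top of the test file:
// import { addTaskDefinitionWithContainer, ContainerConfig, TaskConfig } from '../../lib/containers/container-management';
test('ECS Fargate task definition defined', () => {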
    // Test setup
    const stack = new Stack();
    const cpuval = 512;
    const memval = 1024;
    const familyval = 'test';
    const taskCfg: TaskConfig = { cpu: cpuval, memoryLimitMB: memval, family: familyval };
    const imageName = 'httpd';
    const containerCfg: ContainerConfig = { dockerHubImage: imageName };

    // Test code
    const taskdef = addTaskDefinitionWithContainer(stack, 'test-taskdef', taskCfg, containerCfg);

    // Check result
    const template = Template.fromStack(stack);

    expect(taskdef.isFargateCompatible).toBeTruthy();
    expect(stack.node.children.includes(taskdef)).toBeTruthy();

    template.resourceCountIs('AWS::ECS::TaskDefinition', 1);
    template.hasResourceProperties('AWS::ECS::TaskDefinition', {
        RequiresCompatibilities: [ 'FARGATE' ],
        Cpu: cpuval.toString(),
        Memory: memval.toString(),
        Family: familyval,
    });

});

Again, in this test, we use both higher-level tests and some low-level CloudFormation tests. We can check directly that the returned task definition is Fargate compatible and we can check that it has been added to the stack without resorting to checking the CloudFormation.

As before, we can also check at CloudFormation level that one task definition has been added. The TaskDefinition interface does not allow us to check the cpu, memory limit and family values, though, so in this case we need to dive into the actual CloudFormation. The Template.hasResourceProperties() function is quite useful for that. We specify only the properties we expect to find in the resource and are interested in. We do not need to care about the other properties.
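The idea of partial matching can be sketched in plain TypeScript. This is a simplified model and not the AWS CDK implementation: the matchesPartially function and the sample properties are made up for illustration, and it assumes that objects are matched partially while arrays must match in full.

```typescript
// Simplified model of partial matching; NOT the AWS CDK implementation.
// Objects: every key in `expected` must match in `actual`; extra keys are ignored.
// Arrays: same length, compared element by element.
// Everything else: strict equality.
function matchesPartially(actual: unknown, expected: unknown): boolean {
    if (Array.isArray(expected)) {
        if (!Array.isArray(actual) || actual.length !== expected.length) {
            return false;
        }
        const actualItems: unknown[] = actual;
        const expectedItems: unknown[] = expected;
        return expectedItems.every((item, i) => matchesPartially(actualItems[i], item));
    }
    if (typeof expected === 'object' && expected !== null) {
        if (typeof actual !== 'object' || actual === null || Array.isArray(actual)) {
            return false;
        }
        const actualRecord = actual as Record<string, unknown>;
        const expectedRecord = expected as Record<string, unknown>;
        return Object.keys(expectedRecord).every(
            (key) => key in actualRecord && matchesPartially(actualRecord[key], expectedRecord[key]),
        );
    }
    return actual === expected;
}

// Trimmed-down, made-up properties of a task definition resource.
const properties = {
    Cpu: '512',
    Memory: '1024',
    Family: 'test',
    NetworkMode: 'awsvpc',
    RequiresCompatibilities: ['FARGATE'],
};

const listedOnly = matchesPartially(properties, { Cpu: '512', Family: 'test' }); // true
const wrongValue = matchesPartially(properties, { Cpu: '256' }); // false
```

Only the listed keys take part in the comparison, so properties such as NetworkMode can change without breaking the check.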

So we added a check for the cpu, memory and family settings to verify that those are in place.

Note: If you look at the test code, you see that the Cpu and Memory values are converted to strings. In the CloudFormation documentation examples, these values are numbers. However, according to the CloudFormation specification, the values are strings. The AWS CDK generates the direct CloudFormation resources from the specification. So if there is a discrepancy, the AWS CDK is likely handling it correctly.
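The conversion matters if the comparison is sensitive to types. A minimal sketch in plain TypeScript, assuming a strict equality check stands in for the comparison made by the assertions module (the isSameValue helper is made up for illustration):

```typescript
// Hypothetical helper: strict equality, which distinguishes 512 from '512'.
function isSameValue(actual: unknown, expected: unknown): boolean {
    return actual === expected;
}

const cpuval = 512;
const cpuInTemplate = '512'; // the generated template holds Cpu as a string

const matchesAsString = isSameValue(cpuInTemplate, cpuval.toString()); // true
const matchesAsNumber = isSameValue(cpuInTemplate, cpuval); // false
```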

Test for container definition

Let us add another test to check that the container definition is added to the task definition. Our function creates a task definition with a single container. The TaskDefinition construct can provide a reference to the default container definition, so it makes sense to check that this is in place - there is at least some container definition in place.

However, the container definition provided by the AWS CDK does not allow us (easily) to check what the image reference is. So in this case, we might complement this with a CloudFormation-oriented test. In this case, we will still check the AWS::ECS::TaskDefinition, but we will look at a different part of the structure.

Update 2022-02-10: From version 2.11 of aws-cdk-lib, the image name is available from the container definition. A check for this is added in the test, without removing the CloudFormation check, as it illustrates nested matching as well.

In the previous test case, we could just enter the properties we wanted to match with. Here, we will go deeper into the CloudFormation resource, so we have to be a bit more explicit about the type of matching to do.

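// Also needs Match, imported at the top of the test file:
// import { Match, Template } from 'aws-cdk-lib/assertions';
test('Container definition added to task definition', () => {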
    // Test setup
    const stack = new Stack();
    const cpuval = 512;
    const memval = 1024;
    const familyval = 'test';
    const taskCfg: TaskConfig = { cpu: cpuval, memoryLimitMB: memval, family: familyval };
    const imageName = 'httpd';
    const containerCfg: ContainerConfig = { dockerHubImage: imageName };

    // Test code
    const taskdef = addTaskDefinitionWithContainer(stack, 'test-taskdef', taskCfg, containerCfg);

    // Check result
    const template = Template.fromStack(stack);
    const containerDef = taskdef.defaultContainer;

    expect(taskdef.defaultContainer).toBeDefined();
    expect(containerDef?.imageName).toEqual(imageName); // Works from v2.11 of aws-cdk-lib
    template.hasResourceProperties('AWS::ECS::TaskDefinition', {
        ContainerDefinitions: Match.arrayWith([
            Match.objectLike({
                Image: imageName,
            }),
        ]),
    });
});

The functions in the Match class provide different features to use. Match.objectLike() is the same as we did implicitly at the top level in the previous test. Match.arrayWith() allows us to check that there is an element in an array that matches what we are looking for. These functions help us check that there is a container definition inside the task definition, and it refers to the DockerHub image we provided.
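How such matchers compose can be sketched in plain TypeScript. This is a simplified model and not the Match class itself: here a matcher is just a predicate, objectLike compares values shallowly, and arrayWith is assumed to look for the given element matchers in order while allowing other elements in between. The sample container definitions are made up.

```typescript
// A matcher is modelled here as a predicate; the real Match class is richer.
type Matcher = (actual: unknown) => boolean;

// Every key in the pattern must be present with an equal value; other keys are ignored.
function objectLike(pattern: Record<string, unknown>): Matcher {
    return (actual) => {
        if (typeof actual !== 'object' || actual === null || Array.isArray(actual)) {
            return false;
        }
        const record = actual as Record<string, unknown>;
        return Object.keys(pattern).every((key) => record[key] === pattern[key]);
    };
}

// Each element matcher must be satisfied by some element of the array,
// in the same relative order; additional elements are allowed.
function arrayWith(elementMatchers: Matcher[]): Matcher {
    return (actual) => {
        if (!Array.isArray(actual)) {
            return false;
        }
        const items: unknown[] = actual;
        let next = 0; // index of the next element matcher to satisfy
        for (const item of items) {
            if (next < elementMatchers.length && elementMatchers[next](item)) {
                next += 1;
            }
        }
        return next === elementMatchers.length;
    };
}

// Made-up container definitions, with an extra entry to show that it is ignored.
const containerDefinitions = [
    { Name: 'sidecar', Image: 'made-up/sidecar' },
    { Name: 'container-httpd', Image: 'httpd', Essential: true },
];

const hasHttpd = arrayWith([objectLike({ Image: 'httpd' })])(containerDefinitions); // true
const hasNginx = arrayWith([objectLike({ Image: 'nginx' })])(containerDefinitions); // false
```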

Test the service

The last function to test here is addService(). This is a function that ties our previously defined resources together and adds something we will spin up in a cluster and actually run. We provide a port that should be available to access our service on, and we provide a desired count for the container to run, plus tie all the pieces together.

Based on this information and what we have implemented, we can create a test like this:

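// Also needs Capture from 'aws-cdk-lib/assertions' and addService from
// '../../lib/containers/container-management', imported at the top of the test file.
test('Fargate service created, with provided mandatory properties only', () => {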
    // Test setup
    const stack = new Stack();
    const vpc = new Vpc(stack, 'vpc');
    const cluster = addCluster(stack, 'test-cluster', vpc);

    const cpuval = 512;
    const memval = 1024;
    const familyval = 'test';
    const taskCfg: TaskConfig = { cpu: cpuval, memoryLimitMB: memval, family: familyval };
    const imageName = 'httpd';
    const containerCfg: ContainerConfig = { dockerHubImage: imageName };
    const taskdef = addTaskDefinitionWithContainer(stack, 'test-taskdef', taskCfg, containerCfg);

    const port = 80;
    const desiredCount = 1;

    // Test code
    const service = addService(stack, 'test-service', cluster, taskdef, port, desiredCount);

    // Check result
    const sgCapture = new Capture();
    const template = Template.fromStack(stack);

    expect(service.cluster).toEqual(cluster);
    expect(service.taskDefinition).toEqual(taskdef);

    template.resourceCountIs('AWS::ECS::Service', 1);
    template.hasResourceProperties('AWS::ECS::Service', {
        DesiredCount: desiredCount,
        LaunchType: 'FARGATE',
        NetworkConfiguration: Match.objectLike({
            AwsvpcConfiguration: Match.objectLike({
                AssignPublicIp: 'DISABLED',
                SecurityGroups: Match.arrayWith([sgCapture]),
            }),
        }),
    });

    template.resourceCountIs('AWS::EC2::SecurityGroup', 1);
    template.hasResourceProperties('AWS::EC2::SecurityGroup', {
        SecurityGroupIngress: Match.arrayWith([
            Match.objectLike({
                CidrIp: '0.0.0.0/0',
                FromPort: port,
                IpProtocol: 'tcp',
            }),
        ]),
    });
});

A new feature added here is the ability to capture values from the generated CloudFormation. This will literally be whatever is at the location where the capture object has been placed. We will just use that as a placeholder for now.
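What capturing means can be sketched in plain TypeScript. This is a simplified model and not the AWS CDK Capture class: the ValueCapture class, its members and the sample security group reference (including the logical ID) are made up for illustration.

```typescript
// Simplified model of a capture; NOT the AWS CDK Capture class.
class ValueCapture {
    private captured: unknown[] = [];

    // Accepts any value and remembers it, so it can sit anywhere in a pattern.
    readonly matcher = (actual: unknown): boolean => {
        this.captured.push(actual);
        return true;
    };

    // Returns the most recently captured value, or undefined if nothing was captured.
    get value(): unknown {
        return this.captured[this.captured.length - 1];
    }
}

// Made-up fragment of a service resource; the logical ID is hypothetical.
const securityGroups: unknown[] = [{ 'Fn::GetAtt': ['testservicesecuritygroup', 'GroupId'] }];

const sgCapture = new ValueCapture();
const matched = securityGroups.some(sgCapture.matcher);
const capturedRef = sgCapture.value;
```

After matching, capturedRef holds the reference object found at that position. The real Capture class offers typed accessors such as asObject() to read the captured value, which a later assertion could compare against the security group resource.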

If you look at the test built here, you may spot some concerns and issues with our infrastructure design. While the test passes, there are some design issues here.

Does our design suck? Wrapping up

There are several issues one may spot when testing our infrastructure design, some of which include:

  • Running things on Fargate is implicit in the interface. Should it be?
  • Specifying a port number for accessing the service through addService() only works well for a single container instance right now. For multiple containers (desiredCount > 1) there would need to be a load balancer.
  • The design opens for traffic from everywhere, regardless of whether it uses public or private IP addresses
  • Do we have the right abstraction level for this? If our test cases become too complicated, maybe we need to find a different approach.
  • No configuration or tweaking of container setup

We skipped some complexities by setting up a container-based environment in a cluster when we did the initial solution. This was a conscious choice then. It is easy to forget about some of these decisions later. Adding tests is a way to validate both that we get what we want and that our design choices for how we build our infrastructure are sound for our use cases.

We will work with the tests and also change the design somewhat, based on what our end goals are.

If you enjoyed this material, you can check out the other parts of this series and other articles at Tidy Cloud AWS. Send comments, questions, and suggestions!
