TLDR
I don't like using the cdk.context.json file. I use a config/<env>.yaml file that holds environment specific values and omegaconf to parse it.
Say you have dev and prod VPCs that already exist and you want to use the same stack to deploy to both.
Files:
config/dev.yaml
vpc:
  vpc_id: vpc-1234567890
config/prod.yaml
vpc:
  vpc_id: vpc-9876543210
In the stack:
from omegaconf import OmegaConf

deploy_env = "dev"  # <-- This could/should be set by your CICD pipeline
conf = OmegaConf.load("config/{0}.yaml".format(deploy_env))

# load pre-existing vpc into variable
vpc = ec2.Vpc.from_lookup(self, conf.vpc.vpc_id, vpc_id=conf.vpc.vpc_id)
In typical yaml fashion everything is a dictionary or a list, so you can navigate and loop. (details below)
About Me
My name is Jakob and I am a DevOps Engineer. I used to be a lot of other things as well (Dish Washer, Retail Employee, Camp Counselor, Army Medic, Infectious Disease Researcher), but now I am a DevOps Engineer. I received no formal CS education but I'm not self taught, because I had thousands of instructors who taught me through their tutorials and blog posts. The culture of information sharing within the software engineering community is vital to everyone, especially those like me who didn't have other options. So, as I learn new things I will be documenting them through the eyes of someone learning for the first time, because those are the people most in need of a guide. Happy Learning! And don't be a stranger.
Note: I am NOT going to be sourcing and fact checking everything here. This is not an O'Reilly book. The language and descriptions are intended to allow beginners to understand. If you want to be pedantic about details or critique my framing of what something is or how it works feel free to write your own post about it. These posts are intended to be enough to get you started so that you can begin breaking things in new ways on your own!
The Problem
One thing that I found some difficulty with when I started using the AWS-CDK was how to handle deploying into multiple pre-existing environments. Of course, the CDK makes it easy to create a stand-alone stack with a new VPC, subnets, buckets, and certificates, but sometimes we have a pre-existing environment we need to deploy into. Or perhaps we want to use variations on the same stack to deploy into multiple environments, e.g. smaller instances for a development environment.
In Terraform, we might pass in a dev.tfvars or prod.tfvars file. With the CDK you can use the recommended cdk.context.json file and pass context dependent parameters into the stack. But after scanning over the documentation for how to add values to the context file, I decided it was too annoying and I wanted a better way.
My Solution
I have settled on using a yaml parsing library called omegaconf to make my own .tfvars-like parameter file. Let me show you how it works.
The Setup
First, create yourself some configuration files. I made a folder at the root of the project called /config where I created yaml files named for each environment, e.g. dev.yaml, uat.yaml, prod.yaml. Let's also add a couple things to dev.yaml:
aws:
  account: "12345678910"
  region: us-east-1
env: Dev
Now, throw omegaconf into your requirements.txt file and pip install it. (You are using a virtual environment, right?)
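For reference, the requirements file just needs the omegaconf entry added alongside the CDK libraries; something like this (versions left unpinned here, pin them however you normally would):

# requirements.txt
aws-cdk-lib
constructs
omegaconf

Then run pip install -r requirements.txt inside your virtual environment.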
Import it into both the app.py file at the root of your project and any stack files in which you plan on using the parameters.
from omegaconf import OmegaConf
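In the stack file, the simplest approach is to load the same config again at the top of the module (you could also pass conf in from app.py as a constructor argument). A minimal sketch, assuming the UberForCatsStack used below and that the deploy environment is hard-coded the same way:

# uber_for_cats/uber_for_cats_stack.py -- a sketch, not the exact stack from this post
from aws_cdk import Stack, aws_ec2 as ec2
from constructs import Construct
from omegaconf import OmegaConf

deploy_env = "dev"  # keep in sync with app.py (or read it from an environment variable)
conf = OmegaConf.load("config/{0}.yaml".format(deploy_env))


class UberForCatsStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # conf.vpc.vpc_id, conf.aws.region, etc. are now available inside the stack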
Implementation
Depending on how you are going to manage deploying to your environments, you are going to load a different config file.
For example, if you are just going to deploy from your local computer, you can set a variable for the deploy environment at the top of your file (so that it is visible and easy to change) and then use that variable to load your config.
1 import aws_cdk as cdk
2 from uber_for_cats.uber_for_cats_stack import UberForCatsStack
3 from omegaconf import OmegaConf
4
5 deploy_env = "dev"
6
7 conf = OmegaConf.load("config/{0}.yaml".format(deploy_env))
8
9 app = cdk.App()
10 UberForCatsStack(app, "UberForCats{0}".format(conf.env),
11     env=cdk.Environment(account=conf.aws.account, region=conf.aws.region),
12 )
13
14 app.synth()
On lines 5-7 the environment is set to dev and the config/dev.yaml file is loaded into conf.
If you were using a CICD pipeline to automatically deploy, deploy_env could be set based on a pipeline variable and line 5 could look like this.
5 deploy_env = os.getenv("DEPLOY_ENV")
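If you go that route, remember to add import os near the top of the file. A small sketch (the DEPLOY_ENV variable name and the "dev" fallback are assumptions about your pipeline setup):

import os

deploy_env = os.getenv("DEPLOY_ENV", "dev")  # pipeline sets DEPLOY_ENV; fall back to dev for local runs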
But the result will be the same: the values set in the config/dev.yaml file will be used, and the file will be read as if those strings were written in directly.
1 import aws_cdk as cdk
2 from uber_for_cats.uber_for_cats_stack import UberForCatsStack
3 from omegaconf import OmegaConf
4
5 deploy_env = "dev"
6
7 conf = OmegaConf.load("config/dev.yaml")
8
9 app = cdk.App()
10 UberForCatsStack(app, "UberForCatsDev",
11     env=cdk.Environment(account="12345678910", region="us-east-1"),
12 )
13
14 app.synth()
This allows you to set other values in config/uat.yaml such as a different account, different sized instances/EBS volumes, autoscaling rules, etc. depending on what is required in each environment.
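For instance, a config/uat.yaml might look something like this (the account, region, and vpc_id values are made-up placeholders):

aws:
  account: "10987654321"
  region: us-east-1
env: Uat
vpc:
  vpc_id: vpc-00aabbccdd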
Something to Remember
Yaml is basically a nested dictionary that can also contain lists. When you see a - before something it is a list and therefore iterable. Take, for example, the following.
aws:
  account: "12345678910"
  region: us-east-1
vpc:
  vpc_id: vpc-aabbccdd
  subnet:
    private:
      - subnet-65asdf651sadf65
      - subnet-c65as1df65f56sa
      - subnet-afas65df1a6sdf5
Those private subnet IDs are a list and you can iterate over them.
private_subnets = []
for i, subnet in enumerate(conf.vpc.subnet.private):
    private_subnets.append(
        ec2.Subnet.from_subnet_id(self, "pri{0}".format(i), subnet_id=subnet)
    )
This would create a list of subnet objects (ISubnet) that you can use for the placement of an autoscaling group or EKS cluster. Neat!
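To make that concrete, here is a rough sketch of the autoscaling group case (the construct id, instance type, and AMI choice are illustrative, and vpc is assumed to be the Vpc.from_lookup() object from the TLDR; the EKS case shows up further down):

from aws_cdk import aws_autoscaling as autoscaling

asg = autoscaling.AutoScalingGroup(
    self, "DevAsg",
    vpc=vpc,
    vpc_subnets=ec2.SubnetSelection(subnets=private_subnets),  # <-- place it in those subnets
    instance_type=ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.LARGE),
    machine_image=ec2.AmazonLinuxImage(generation=ec2.AmazonLinuxGeneration.AMAZON_LINUX_2),
)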
Congrats, those are the basics! You should be able to get started.
Some Additional Tricks
While using a yaml file for configuration settings, I ran into some situations that might be worth sharing.
Sometimes strings require extra steps
I am using the CDK to make some EKS clusters. Part of this process is creating node-groups (think autoscaling groups for Kubernetes). I wanted to leverage some spot instances for a portion of our development cluster, and creating these nodegroups in the CDK means specifying the compute class and size. Because this could be different between environments, I put it in the config file.
node_group:
  spot:
    min: 1
    max: 5
    type:
      - i_class: BURSTABLE3
        i_size: LARGE
      - i_class: BURSTABLE3
        i_size: XLARGE
      - i_class: COMPUTE6_INTEL
        i_size: LARGE
      - i_class: COMPUTE6_INTEL
        i_size: XLARGE
But that won't work for setting instance types:
# not valid -- InstanceClass/InstanceSize have no attribute named after the yaml string
instance_type = ec2.InstanceType.of(
    ec2.InstanceClass.conf.node_group.spot.type[0].i_class,
    ec2.InstanceSize.conf.node_group.spot.type[0].i_size,
)
And, honestly, it isn't very readable either. My workaround was making dictionaries of instance classes and sizes, then using the value in the yaml as the key to the appropriate class/size.
node_group:
  spot:
    min: 1
    max: 5
    type:
      - i_class: t3
        i_size: large
      - i_class: t3
        i_size: xl
      - i_class: c6i
        i_size: large
      - i_class: c6i
        i_size: xl
ec2_class = {
    "t3": ec2.InstanceClass.BURSTABLE3,  # max 2xl
    "c6i": ec2.InstanceClass.COMPUTE6_INTEL,  # min large
}
ec2_size = {
    "large": ec2.InstanceSize.LARGE,
    "xl": ec2.InstanceSize.XLARGE,
}
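To make the mapping concrete, a single lookup turns the yaml strings into the enum members the CDK expects (the literal "t3"/"large" here stand in for values read from the config):

# "t3" + "large" from the yaml become a t3.large InstanceType
instance_type = ec2.InstanceType.of(ec2_class["t3"], ec2_size["large"])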
With the combination of the above, we can now use the string in the yaml as the key to pull in the ec2 objects in the format that the CDK requires. Below I make a list of the specified instance combinations and pass it into the EKS cluster as nodegroup capacity, but you could just as easily specify a list of classes and types and use a python library like itertools to make ALL of the combinations in one line. That might sacrifice readability though. So actually, don't do that. But you could...
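For the curious, that one-liner might look something like this (assuming the yaml held separate classes and sizes lists instead of the explicit pairs used above):

import itertools

# every class/size combination -- compact, but you lose control over the exact pairs
spot_instance_types = [
    ec2.InstanceType.of(ec2_class[c], ec2_size[s])
    for c, s in itertools.product(conf.node_group.spot.classes, conf.node_group.spot.sizes)
]

The explicit pair list below is what I actually use, because you can see exactly which combinations end up in the nodegroup.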
spot_instance_types = []
for instance_type in conf.node_group.spot.type:
    this_type = ec2.InstanceType.of(
        ec2_class[instance_type.i_class],
        ec2_size[instance_type.i_size],
    )
    spot_instance_types.append(this_type)

cluster.add_nodegroup_capacity(
    "{0}-spot-nodegroup".format(conf.env),
    nodegroup_name="{0}-spot-ng".format(conf.env),
    capacity_type=eks.CapacityType.SPOT,
    min_size=conf.node_group.spot.min,
    max_size=conf.node_group.spot.max,
    instance_types=spot_instance_types,  # <-- list of instance types
    disk_size=250,
    subnets=ec2.SubnetSelection(subnets=private_subnets),  # <-- those subnets from before!
)
Booleans are Useful
So we are promoting this project out to production and someone doesn't think spot instances are a good idea even though you have diversified your spot pools. Throw a toggle into the config.
node_group:
  spot:
    enabled: False
    min: 1
    max: 5
Then you can run your node creation based on it!
if conf.node_group.spot.enabled:
    spot_instance_types = []
    for instance_type in conf.node_group.spot.type:
        this_type = ec2.InstanceType.of(
            ec2_class[instance_type.i_class],
            ec2_size[instance_type.i_size],
        )
        spot_instance_types.append(this_type)

    cluster.add_nodegroup_capacity(
        "{0}-spot-nodegroup".format(conf.env),
        nodegroup_name="{0}-spot-ng".format(conf.env),
        capacity_type=eks.CapacityType.SPOT,
        min_size=conf.node_group.spot.min,
        max_size=conf.node_group.spot.max,
        instance_types=spot_instance_types,  # <-- list of instance types
        disk_size=250,
        subnets=ec2.SubnetSelection(subnets=private_subnets),  # <-- those subnets from before!
    )
Beautiful!