Have you ever been in the AWS console and been completely baffled by all the concepts and jargon? You’ve got Security Groups, inbound rules, VPCs, subnets, Internet Gateways, NAT and ENIs, and all of them are related to networking somehow. Put simply: there’s a lot to AWS networking. So if you’re going to break into it, you need to know what to focus on: the fundamentals.
Today we’re going to be going through the main networking components you should be familiar with in AWS. We’ll talk you through why you’d need the component, what it is and how you’d use it. Throughout the article we’ll be building up an example of running a web server in a public subnet as part of our own VPC.
By the end of this article you’ll understand the main networking concepts: private IPs, Virtual Private Clouds (VPCs), Classless Inter-Domain Routing (CIDR), subnets, internet gateways and security groups. You’ll also be able to use them to implement a basic network design.
Maybe right now you’re thinking:
But, I’m a software engineer? Do I really need to know about networking? Isn’t that the job of operations or something? I already have a lot on my plate, why would I take on one more thing?
But you’d be right to ask the question!
If that’s you, and you’re not yet convinced about why you should learn AWS networking fundamentals then let me give you two big reasons…
1. AWS networking is unavoidable.
Whether you’re launching a simple EC2 instance or even a Lambda, these networking topics (VPCs, security groups and so on) will arise. And they’ll pose a constant distraction where you’re saying “Oh, I’ll just ignore that for now”. And you ignore it, and you ignore it.
But over time, not understanding these fundamentals will start to waste your time as you wrestle with features you don’t really understand. Even if you’re a bog-standard application developer, at some point you’re going to come into contact with the AWS networking fundamentals we’re covering today, and knowing them is going to make your life a lot easier.
2. Understanding AWS networking fundamentals allows you to implement fundamentally better solutions.
If you’ve got a hammer — everything’s a nail.
If you’re not aware of what’s possible at a network level you’ll perpetually implement sub-optimal solutions at the application level. I’m guilty of doing this in the past, implementing things like IP whitelisting inside the application when ideally it should live at the network level.
When you know what is possible at the network level (even if you can’t necessarily implement it) it will allow you to challenge your own architectures and designs to come up with fundamentally better software.
Working in cloud native environments is becoming more and more common. Engineers are now required more than ever to step out of the realms of pure application development and to actually understand infrastructure concerns, like networking. More is demanded of you.
Okay so that concludes why we’d want to learn these networking concepts — let’s talk about how we’re actually going to do it.
My preferred way of learning in the cloud is via infrastructure-as-code.
Why? Because it allows me to write the infrastructure as code in my own time, prepare the changes and execute them when I’m ready.
Not only does infrastructure-as-code enable a nice workflow, but undoing mistakes and keeping a history in version control simply makes life that bit less stressful.
So as we go through the networking concepts today I’ll also give you snippets of Terraform infrastructure-as-code. Don’t worry if you’re not familiar with Terraform: the snippets are small, so you shouldn’t need any prior knowledge. The snippets are mainly there to show you what types of arguments (or properties) you’d need to pass when creating those resources.
And of course, practical learning is the best. So when we’re done with reading the article, you can find, clone and run the full code example from this repo and experiment with the infrastructure as you please.
Rightio — that’s enough of the intro, let’s get to it!
First up, we need to go through a bit of theory about IP and IP addresses, but stick with me, it’s worth it. Remember, this post is about fundamentals, so resist the urge to skip ahead!
IP addresses are a series of unique numbers that are assigned to a computer to make it accessible within a given network. IP addresses look like: 126.96.36.199. Typically when we talk about IP addresses we’re talking in the context of the public internet. The public internet is an open network available around the world.
But, on the internet we don’t come into contact with these raw IP’s all that often. And that’s because DNS (the Domain Name System) maps these machine relevant IP’s to friendlier names that we use more often.
If you’re interested, try putting 188.8.131.52 into your search bar. That’s the current elastic IP address of this website. Or to be more precise, it’s an IP owned by Amazon that is currently routing to an EC2 machine that is running WordPress.
My website is an example of something that’s public on the internet, and I want it to be! However, not every machine needs to be on the internet. Take back-office business functions like accounting: these machines need to be accessed by someone, but not by anyone on the internet.
And to enable this, we use private address space.
When the internet was coming of age, a decision was taken to reserve the following address spaces for private use:
- 10.0.0.0/8
- 172.16.0.0/12
- 192.168.0.0/16
You’ll recognise the IP address format, but you’ll notice the addresses are followed by another number, such as /12. That number is known as a netmask, and it defines a range of IP addresses. But we’ll cover that in more detail later. All you need to know for now is that there are a (large!) number of IP addresses that are reserved for private use, i.e. not on the public internet. And we can use these private address spaces to our advantage.
And that neatly brings us to what we’ll be building today: a VPC network that contains a private network (using the above address spaces) and is broken down into three smaller subnetworks. One subnetwork will be granted internet access, and we’ll deploy a web server into it. The other two subnets will be private, and could be used for things like internal business functions as we said before.
Today’s reference architecture.
We’ll go through the components we’d need to build this type of architecture component-by-component.
So let’s go ahead and start with the most important topic: Virtual Private Clouds.
A VPC, or Virtual Private Cloud, is a way to logically separate resources when you’re working in AWS. AWS own lots of machines, so a VPC is basically a way to lay claim to the resources that belong to you so that no-one else can access them. By default, the resources contained within a VPC can only communicate with other resources in the same VPC. Well, unless we do some special tricks to connect VPCs, but we’ll not be covering that today.
VPCs are therefore simply a way to ring-fence a business, or even sub-sections of a business. We could even use VPCs to implement different environments, such as demo, test, staging and production, since production environments don’t need access to our test environments, and vice versa. There are many different use cases for the VPC.
A VPC, being a network has an allocated address space. When we create resources in our VPC they have to sit within our VPC’s dedicated address space. Remember the private address ranges we talked about before?
Well, a VPC in AWS can be as big as 65,536 unique IPs (a /16 range) right down to as small as 16 (a /28 range).
But what determines the size? Well, many factors including our prediction of future growth, but let’s not worry too much about that now.
A final thing to consider about VPCs, or any private network, is address space collisions.
What do I mean?
Well, if we want to connect two VPCs together and they are using the same private address space, it’s going to cause issues. Which means that we’ll likely want to create all our private networks with different IP address ranges if we can.
Let’s take a look at what creating a VPC in Terraform looks like…
Creating a VPC with Terraform
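For reference, here is a minimal sketch of what such a VPC definition might look like with the Terraform AWS provider. The resource name main, the 10.0.0.0/16 range and the Name tag are illustrative assumptions rather than values taken from the repo, and the region is assumed to come from your provider configuration or environment.

```hcl
# Minimal VPC sketch. The name "main" and the address range below are
# illustrative assumptions, not values taken from the article's repo.
resource "aws_vpc" "main" {
  # The private address range (CIDR block) that the VPC will own.
  cidr_block = "10.0.0.0/16"

  # Optional: a Name tag makes the VPC easier to spot in the console.
  tags = {
    Name = "example-vpc"
  }
}
```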
As you can see from the above, all that’s required to create a new VPC is the address range, which is a CIDR block.
If you are wondering what that weird-looking IP is, don’t worry: we’ll cover it soon.
All you need to know is that a VPC is created when we define a range of private IPs to allocate to it. Simple.
But before we talk about those IP address ranges, CIDR blocks, let’s introduce subnets as they are closely related to VPC’s.
A subnet is (one of the few intuitively named fundamentals that we’ll cover today!) very much what it sounds like. A subnet is simply a smaller piece of a larger network. A network can be chopped up into smaller pieces so that different networking rules can be applied to each of them.
Let’s take a look at the Terraform:
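A minimal sketch of a VPC with one subnet might look something like the following. The resource names, the two CIDR ranges and the choice of the first available zone are assumptions made for illustration; the availability zone is looked up through a data source so the sketch does not depend on any particular region.

```hcl
# Look up the availability zones of whichever region the provider is using.
data "aws_availability_zones" "available" {
  state = "available"
}

# The VPC, as before (illustrative range).
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# A subnet: a smaller slice of the VPC's address range.
resource "aws_subnet" "public" {
  # The VPC the subnet belongs to.
  vpc_id = aws_vpc.main.id

  # The subnet's size and position; it must sit inside the VPC's range.
  cidr_block = "10.0.1.0/24"

  # A subnet lives in exactly one availability zone.
  availability_zone = data.aws_availability_zones.available.names[0]
}
```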
Here you can see we’re creating the VPC (as before). But now we’re also creating a subnet. In order to create the subnet we need to define its availability zone (a subnet can only exist in one), the VPC it belongs to, and its size.
But, there’s that strange looking IP address thingy again (the CIDR block)…
I’ve stalled on talking about it for long enough, so let’s cover the Netmask: what it is and how it lets us define ranges for VPCs and subnets.
Netmasks can seem a little daunting at first — they’re probably the most confusing AWS networking fundamental we’ll cover today.
But when the fog clears you’ll be much better for knowing them, they’re useful and they come up a lot.
A CIDR block with a Netmask looks like this: 10.0.0.1/24
You’ll see this CIDR block format in AWS all the time, and it might have you wondering what exactly it is.
The first part of the CIDR block is an IP address: four numbers separated by dots, here 10.0.0.1, each of which ranges from 0 to 255. Why 256 possible values each? Because each chunk is 8 bits, and 8 bits can represent 2^8 = 256 numbers in total.
That’s the anatomy of an IP address.
Well, in reality an IP address looks like this: 00001010.00000000.00000000.00000001
That’s 4 blocks of 8 bits (so 32 in total), where each block can hold one of 256 values.
So why don’t we just use the long binary representation? Because the base 10 equivalent (e.g. 10.0.0.1) is shorter to write and easier to read (eventually!). However, do keep in mind the underlying binary equivalent, as it’s important when you’re making CIDR calculations.
The second part of the CIDR block, /24 in the previous example, is what’s called a Netmask. The Netmask is the number of bits in the IP (out of 32), counted from the left, that are fixed and identify the network itself. Whatever is left (out of the original 32) is free to vary, and those bits are what give us the addresses inside our network. So the IP and the Netmask together define a range: every address that shares those leading bits, from the one where the remaining bits are all 0 to the one where they are all 1. For 10.0.0.1/24 that is 10.0.0.0 through 10.0.0.255.
To understand this better, let’s look at an example…
For instance, a Netmask of /24 gives up 24 bits to the network, which leaves us with 8 bits because 32 - 24 = 8. And 8 bits is equal to 2^8 total addresses, which is 256.
Why is it 2^8, you might be wondering? Because that’s how binary works. Every additional bit doubles the number of possible values. The binary 0 has two possible values, 0 and 1, which is equal to 2^1 = 2. The binary 00 has four possible values, 00, 01, 10 and 11, which is equal to 2^2 = 4.
So to recap…
To calculate the size of a range from a netmask, take the number after the slash (say /22), subtract it from 32 and use the result as the power of 2. For example, 32 - 22 = 10, therefore a netmask of /22 gives our network a potential 2^10 = 1,024 addresses.
As we said before, there is a limit to the size of a network in AWS: 65,536 addresses, which is a /16 Netmask. Whereas the smallest range that you can have is 16 addresses, which means that you have four bits left over: a /28 Netmask.
Remember: a Netmask simply defines a range of IPs, starting at the first address of the block and ending at the endpoint determined by the Netmask.
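If you would rather let Terraform do this arithmetic, its built-in functions cidrsubnet, cidrhost, cidrnetmask and pow can be combined as in the sketch below. The 10.0.0.0/16 range and all of the local and output names are assumptions chosen for illustration.

```hcl
locals {
  # Illustrative VPC range: 16 network bits, 16 bits left for addresses.
  vpc_cidr = "10.0.0.0/16"

  # Add 8 bits to the /16 prefix to carve out /24 subnets.
  # The last argument picks which /24 we want.
  public_cidr    = cidrsubnet(local.vpc_cidr, 8, 1) # "10.0.1.0/24"
  private_a_cidr = cidrsubnet(local.vpc_cidr, 8, 2) # "10.0.2.0/24"
}

# First address in the /24 block.
output "public_first_address" {
  value = cidrhost(local.public_cidr, 0) # "10.0.1.0"
}

# Last address in the /24 block.
output "public_last_address" {
  value = cidrhost(local.public_cidr, 255) # "10.0.1.255"
}

# The same /24 written as a dotted mask.
output "public_dotted_mask" {
  value = cidrnetmask(local.public_cidr) # "255.255.255.0"
}

# 32 - 24 = 8 bits left over, so 2^8 addresses.
output "addresses_in_a_24" {
  value = pow(2, 32 - 24) # 256
}
```

One thing to keep in mind when sizing real subnets: AWS reserves five addresses in every subnet, so a /24 gives you 251 usable addresses rather than the full 256.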
If you’re interested to play around more with CIDR address ranges a subnet calculator would be a smart choice.
A subnet calculator cidr.xyz
A good exercise to do with the calculator is to imagine you’re implementing a VPC and its associated child networks. Have a go at putting in different networks, guess their size, and then check your answers against the calculator.
Next up on our list of fundamentals for AWS networking is a route table.
A route table is closely related to a subnet, as a route table decides where traffic leaving a subnet is allowed to go. Do you need to move traffic from your public to your private network? Route tables need to be set up in order to define what your services can reach.
Example creation of 2 route tables (private and public)
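As a rough sketch, a private and a public route table could be declared along the following lines. All resource names and address ranges here are illustrative assumptions, and the VPC, subnet and internet gateway are repeated only so that the snippet is complete on its own.

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

# Private route table: no extra routes, so traffic stays inside the VPC.
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
}

# Public route table; its internet route is declared separately below.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
}

# Send everything that is not local to the VPC out through the internet gateway.
resource "aws_route" "public_internet" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.main.id
}

# Bind the public route table to the public subnet.
resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
```

Every route table also carries an implicit local route for the VPC’s own range, which is why the private table needs no routes of its own for resources inside the VPC to reach each other.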
The above code is broken down quite granularly, but you can see we’re creating two route tables: a private one with no additional routes (truly private) and a public one that is connected to the VPC’s internet gateway. We can re-use our private route table with both of our private subnets.
Without route tables we’d have chunks of network with no rules about who can talk to who and how. The final record in the above merely binds together the two parts: our subnet and our route table (public).
Using this code we’re now allowing traffic to flow to and from our public subnet! Well, nearly. We need one more component…
An internet gateway is an AWS component that when attached to a VPC, gives the VPC public internet access.
But… let’s take it back a step first! Remember, a subnet can be public or private. A public network is simply a network that has internet access.
However, it’s worth pointing out that currently you can’t create a private or public subnet in AWS directly as such.
What do I mean?
Well, to create a public subnet we need to create a regular subnet, but update its route table to point to the internet, and ensure that our VPC is set up with an internet gateway.
The topic of internet gateways gets a little more confusing when we start thinking about internet access. Why? Because some services need internet coming in (like a static website). Whereas other services need traffic that flows out, but not in — such as an internal private micro-service that needs to pull in dependencies from other internet based services.
Setting up an internet gateway is simple: merely create the resource and attach it to a VPC.
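In Terraform that can be as small as the sketch below. The resource names and the address range are illustrative assumptions, and the VPC is repeated only so the snippet is complete on its own.

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Creating the gateway with a vpc_id attaches it to that VPC.
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}
```

Attaching the gateway does not make anything public by itself; a subnet only becomes public once its route table has a route pointing at the gateway.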
By this point if you were following along with the code we have a VPC and some subnets, we’ve made one of our subnets public by attaching a route table and an internet gateway. Now it’s time to place a resource, our EC2 instance (a web server) into our public subnet so that it can be accessed by the world. Yay!
Well… actually not so fast. Because we’ve got one more issue. Now traffic can come from the internet, through the internet gateway, into our public subnet and knock on the door of our instance, but it can’t get in. Why? Because instances have firewalls. And these firewalls are called security groups. In order to allow our instance to be accessed we need to enable public traffic through our security group.
A security group is a set of networking rules that are applied to a resource. A security group is responsible for defining what traffic (based on port and protocol) can enter or leave certain resources. A single resource can reference many different security groups to aggregate different types of access. For instance, we might want to have a security group that allows HTTP and HTTPS traffic into our website. We might also want SSH access to our server, in which case we’d want to limit that access to specific IPs rather than the whole internet.
Let’s see what a security group looks like in Terraform…
An example of a security group.
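A minimal sketch of a security group for a web server, allowing HTTP in and everything out, might look like this. The names, descriptions and address range are assumptions for illustration, and the VPC is repeated only so the snippet is complete on its own.

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_security_group" "web" {
  name        = "example-web"
  description = "Allow inbound HTTP and all outbound traffic"
  vpc_id      = aws_vpc.main.id

  # Inbound: TCP on port 80 from anywhere on the internet.
  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Outbound: protocol "-1" means all protocols, on all ports.
  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

If you wanted the IP-limited SSH access mentioned earlier, one option would be a second ingress block on port 22 with a much narrower cidr_blocks list.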
Here we can see that we’ve got an ingress rule, which governs the traffic flowing in through our firewall. Ours allows TCP traffic on port 80, which is enough for basic web requests. We’ve also got an egress rule, which allows all traffic. The egress rule allows the instance to make calls out to the internet however it wants. In our case we’re allowing that because installing our web server requires access to the internet.
And with our security group in place, traffic can now get to our public instance: it comes in via our internet gateway, uses the route table entry to flow into our public subnet, and the security group allows it to hit the instance. Voila! We have a working web server running inside a public subnet in a custom VPC. Awesome!
We covered a lot of AWS networking fundamentals today, so I want to quickly re-cap everything:
- IP: An address used to route requests to machines. Can be public or private.
- VPC: A logically isolated slice of the AWS cloud infrastructure.
- Subnet: A portion of a larger network, usually a subnetwork of a VPC.
- Netmask: A way of denoting a range of IPs, used to slice up a network into subnets.
- Route Table: A set of rules assigned to a subnet which define where its traffic can go.
- Internet Gateway: An AWS resource that, when attached to a VPC, gives its public subnets access to the public internet.
- Security group: Essentially a firewall that dictates which traffic (via protocols and ports) can access a resource.
And that’s all we have time for today. We covered a lot, I know. But I wanted to just touch the surface of these important concepts. If you focus your energy on learning anything related to AWS networking, focus on these aspects first.
And as always, remember that the best way to learn concepts is to get your hands dirty. So take the code examples we discussed, try to break them and generally have fun with them. Just make sure to set up your AWS account correctly first. The repo has the Terraform to create your VPC, subnets, internet gateway, route table entries, a web server instance and a security group. Try using it as a reference and re-building your very own VPC piece-by-piece.
I hope the fog cleared a little more for you today. Stick at it, keep reading and you’ll get there. Don’t forget to sign up to the newsletter, where every two weeks you’ll get articles on the fundamentals of cloud native and the latest news to help you stay up-to-date.
What were the hardest networking concepts you’ve come into contact with so far?
The post AWS networking fundamentals: A simple guide for software engineers. appeared first on The Dev Coach.