AWS EKS is a managed container service to run and scale Kubernetes applications in the cloud or on-premises.
HashiCorp Terraform is an Infrastructure as Code (IaC) tool that lets us define both cloud and on-prem resources in human-readable configuration files that we can version, reuse, and share.
Amazon EKS Cluster using Terraform:
This repository holds the infrastructure configuration for the EKS cluster.
Prerequisites
- AWS Account
- Terraform
- AWS CLI
To configure the AWS CLI, we need to enter the AWS Access Key ID, Secret Access Key, region, and output format. Please note that proper privileges are required to create EKS cluster resources.
$ aws configure
AWS Access Key ID [None]: AWS_ACCESS_KEY_ID
AWS Secret Access Key [None]: AWS_SECRET_ACCESS_KEY
Default region name [None]: AWS_REGION
Default output format [None]: json
Terraform Initial Setup Configuration
First, we need to declare the AWS provider. It allows Terraform to interact with AWS resources such as VPC, EKS, S3, EC2, and many others.
providers.tf
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region     = var.region
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
  # other options for authentication
}
Terraform State Setup
Now, we need to configure a Terraform backend to specify the location of the Terraform state file on S3. Remote state keeps the state file remotely, rather than on the local filesystem.
backend.tf
terraform {
  backend "s3" {
    bucket  = "mondev-terraform-states"
    key     = "terraform-aws-eks-mondev.tfstate"
    region  = "ap-southeast-1"
    encrypt = true
  }
}
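The S3 bucket referenced above must already exist before running terraform init; it is not created by this configuration. A minimal sketch of provisioning it separately (bucket name and settings here are assumptions, adjust as needed):

# Created in a separate configuration (or via the console/CLI) before
# initializing this project. Versioning helps recover earlier state files.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "mondev-terraform-states"
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}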
Network Infrastructure Setup
Setting up the VPC, Subnets, Security Groups, etc.
Amazon EKS requires subnets in at least two different availability zones. We will create:
- AWS VPC (Virtual Private Cloud).
- Two public and two private subnets in different availability zones.
- Internet Gateway to provide internet access for services within the VPC.
- NAT Gateway in a public subnet, so that services in the private subnets can reach the internet.
- Route Tables, associated with the subnets, with the required routes.
- Security Groups with the required ingress and egress rules.
vpc.tf
# VPC
resource "aws_vpc" "mondev" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name                                           = "${var.project}-vpc",
    "kubernetes.io/cluster/${var.project}-cluster" = "shared"
  }
}
# Public Subnets
resource "aws_subnet" "public" {
  count = var.availability_zones_count

  vpc_id            = aws_vpc.mondev.id
  cidr_block        = cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  map_public_ip_on_launch = true

  tags = {
    Name                                           = "${var.project}-public-subnet"
    "kubernetes.io/cluster/${var.project}-cluster" = "shared"
    "kubernetes.io/role/elb"                       = 1
  }
}

# Private Subnets
resource "aws_subnet" "private" {
  count = var.availability_zones_count

  vpc_id            = aws_vpc.mondev.id
  cidr_block        = cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, count.index + var.availability_zones_count)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name                                           = "${var.project}-private-subnet"
    "kubernetes.io/cluster/${var.project}-cluster" = "shared"
    "kubernetes.io/role/internal-elb"              = 1
  }
}
# Internet Gateway
resource "aws_internet_gateway" "mondev" {
  vpc_id = aws_vpc.mondev.id

  tags = {
    "Name" = "${var.project}-igw"
  }

  depends_on = [aws_vpc.mondev]
}

# Route Table(s)
# Route the public subnet traffic through the IGW
resource "aws_route_table" "main" {
  vpc_id = aws_vpc.mondev.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.mondev.id
  }

  tags = {
    Name = "${var.project}-Default-rt"
  }
}

# Route table and subnet associations
resource "aws_route_table_association" "internet_access" {
  count = var.availability_zones_count

  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.main.id
}

# NAT Elastic IP
resource "aws_eip" "main" {
  vpc = true

  tags = {
    Name = "${var.project}-ngw-ip"
  }
}

# NAT Gateway
resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.main.id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.project}-ngw"
  }
}

# Add route to route table
resource "aws_route" "main" {
  route_table_id         = aws_vpc.mondev.default_route_table_id
  nat_gateway_id         = aws_nat_gateway.main.id
  destination_cidr_block = "0.0.0.0/0"
}
# Security group for public subnet
resource "aws_security_group" "public_sg" {
  name   = "${var.project}-Public-sg"
  vpc_id = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-Public-sg"
  }
}

# Security group traffic rules
resource "aws_security_group_rule" "sg_ingress_public_443" {
  security_group_id = aws_security_group.public_sg.id
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "sg_ingress_public_80" {
  security_group_id = aws_security_group.public_sg.id
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "sg_egress_public" {
  security_group_id = aws_security_group.public_sg.id
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}

# Security group for data plane
resource "aws_security_group" "data_plane_sg" {
  name   = "${var.project}-Worker-sg"
  vpc_id = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-Worker-sg"
  }
}

# Security group traffic rules
resource "aws_security_group_rule" "nodes" {
  description       = "Allow nodes to communicate with each other"
  security_group_id = aws_security_group.data_plane_sg.id
  type              = "ingress"
  from_port         = 0
  to_port           = 65535
  protocol          = "-1"
  cidr_blocks = flatten([
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 0),
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 1),
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 2),
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 3)
  ])
}

resource "aws_security_group_rule" "nodes_inbound" {
  description       = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  security_group_id = aws_security_group.data_plane_sg.id
  type              = "ingress"
  from_port         = 1025
  to_port           = 65535
  protocol          = "tcp"
  cidr_blocks = flatten([
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 2),
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 3)
  ])
  # cidr_blocks = flatten([var.private_subnet_cidr_blocks])
}

resource "aws_security_group_rule" "node_outbound" {
  security_group_id = aws_security_group.data_plane_sg.id
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}

# Security group for control plane
resource "aws_security_group" "control_plane_sg" {
  name   = "${var.project}-ControlPlane-sg"
  vpc_id = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-ControlPlane-sg"
  }
}

# Security group traffic rules
resource "aws_security_group_rule" "control_plane_inbound" {
  security_group_id = aws_security_group.control_plane_sg.id
  type              = "ingress"
  from_port         = 0
  to_port           = 65535
  protocol          = "tcp"
  cidr_blocks = flatten([
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 0),
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 1),
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 2),
    cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, 3)
  ])
  # cidr_blocks = flatten([var.private_subnet_cidr_blocks, var.public_subnet_cidr_blocks])
}

resource "aws_security_group_rule" "control_plane_outbound" {
  security_group_id = aws_security_group.control_plane_sg.id
  type              = "egress"
  from_port         = 0
  to_port           = 65535
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}
EKS Cluster Setup
Creating the EKS cluster. Kubernetes clusters managed by Amazon EKS make calls to other AWS services on your behalf to manage the resources that we use with the service. For example, EKS will create an Auto Scaling group for each instance group if we use managed nodes.
Setting up the IAM Roles and Policies for EKS: EKS requires a few IAM Roles with relevant Policies to be pre-defined to operate correctly.
IAM Role: Create a role with the permissions that Amazon EKS needs to create AWS resources for Kubernetes clusters and to interact with AWS APIs. The role's trust policy allows Amazon EKS to assume and use it.
IAM Policy: Attach the AWS managed policy (AmazonEKSClusterPolicy) to the role.
eks-cluster.tf
# EKS Cluster
resource "aws_eks_cluster" "mondev" {
  name     = "${var.project}-cluster"
  role_arn = aws_iam_role.cluster.arn
  version  = "1.22"

  vpc_config {
    # security_group_ids = [aws_security_group.eks_cluster.id, aws_security_group.eks_nodes.id] # already applied to subnet
    subnet_ids              = flatten([aws_subnet.public[*].id, aws_subnet.private[*].id])
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = ["0.0.0.0/0"]
  }

  tags = merge(
    var.tags
  )

  depends_on = [
    aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy
  ]
}

# EKS Cluster IAM Role
resource "aws_iam_role" "cluster" {
  name = "${var.project}-Cluster-Role"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.cluster.name
}

# EKS Cluster Security Group
resource "aws_security_group" "eks_cluster" {
  name        = "${var.project}-cluster-sg"
  description = "Cluster communication with worker nodes"
  vpc_id      = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-cluster-sg"
  }
}

resource "aws_security_group_rule" "cluster_inbound" {
  description              = "Allow worker nodes to communicate with the cluster API Server"
  from_port                = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_cluster.id
  source_security_group_id = aws_security_group.eks_nodes.id
  to_port                  = 443
  type                     = "ingress"
}
resource "aws_security_group_rule" "cluster_outbound" {
description = "Allow cluster API Server to communicate with the worker nodes"
from_port = 1024
protocol = "tcp"
security_group_id = aws_security_group.eks_cluster.id
source_security_group_id = aws_security_group.eks_nodes.id
to_port = 65535
type = "egress"
Node Groups (Managed) Setup
Creating the node group(s) that will run the application workloads.
IAM Role: Similar to the EKS cluster, before we create the worker node group we must create an IAM role with the permissions the node group needs to communicate with other AWS services. The role's trust policy allows Amazon EC2 to assume and use it.
IAM Policy: Attach the AWS managed policies AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, and AmazonEC2ContainerRegistryReadOnly to the role.
node-groups.tf
# EKS Node Groups
resource "aws_eks_node_group" "mondev" {
  cluster_name    = aws_eks_cluster.mondev.name
  node_group_name = var.project
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = aws_subnet.private[*].id

  scaling_config {
    desired_size = 2
    max_size     = 5
    min_size     = 1
  }

  ami_type       = "AL2_x86_64" # AL2_x86_64, AL2_x86_64_GPU, AL2_ARM_64, CUSTOM
  capacity_type  = "ON_DEMAND"  # ON_DEMAND, SPOT
  disk_size      = 20
  instance_types = ["t2.medium"]

  tags = merge(
    var.tags
  )

  depends_on = [
    aws_iam_role_policy_attachment.node_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.node_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.node_AmazonEC2ContainerRegistryReadOnly,
  ]
}

# EKS Node IAM Role
resource "aws_iam_role" "node" {
  name = "${var.project}-Worker-Role"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "node_AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.node.name
}

resource "aws_iam_role_policy_attachment" "node_AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.node.name
}

resource "aws_iam_role_policy_attachment" "node_AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.node.name
}

# EKS Node Security Group
resource "aws_security_group" "eks_nodes" {
  name        = "${var.project}-node-sg"
  description = "Security group for all nodes in the cluster"
  vpc_id      = aws_vpc.mondev.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name                                           = "${var.project}-node-sg"
    "kubernetes.io/cluster/${var.project}-cluster" = "owned"
  }
}

resource "aws_security_group_rule" "nodes_internal" {
  description              = "Allow nodes to communicate with each other"
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_nodes.id
  to_port                  = 65535
  type                     = "ingress"
}

resource "aws_security_group_rule" "nodes_cluster_inbound" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  from_port                = 1025
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_cluster.id
  to_port                  = 65535
  type                     = "ingress"
}
Terraform Variables
Create an IAM user with administrator access to the AWS account, and obtain its access key and secret key for authentication.
variables.tf
variable "aws_access_key" {
description = "AWS access key"
type = string
}
variable "aws_secret_key" {
description = "AWS secret key"
type = string
}
variable "region" {
description = "The aws region. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html"
type = string
default = "ap-southeast-1"
}
variable "availability_zones_count" {
description = "The number of AZs."
type = number
default = 2
}
variable "project" {
description = "MonirulProject"
type = string
}
variable "vpc_cidr" {
description = "The CIDR block for the VPC. Default value is a valid CIDR, but not acceptable by AWS and should be overridden"
type = string
default = "10.0.0.0/16"
}
variable "subnet_cidr_bits" {
description = "The number of subnet bits for the CIDR. For example, specifying a value 8 for this parameter will create a CIDR with a mask of /24."
type = number
default = 8
}
variable "tags" {
description = "A map of tags to add to all resources"
type = map(string)
default = {
"Project" = "MonirulProject"
"Environment" = "Development"
"Owner" = "Monirul"
}
}
Set the Terraform variable values as per your requirements.
terraform.tfvars
aws_access_key = "aaaaaaaaaaaaaa"
aws_secret_key = "bbbbbbbbbbbbbbbbbbbbb"
region = "ap-southeast-1"
availability_zones_count = 2
project = "MonirulProject"
vpc_cidr = "10.0.0.0/16"
subnet_cidr_bits = 8
We also define a Terraform data source to look up the available availability zones.
data-sources.tf
data "aws_availability_zones" "available" {
state = "available"
}
Launch EKS Infrastructure
Once we have finished declaring the resources, we can deploy them.
$ terraform init
$ terraform plan
$ terraform apply
Output
$ terraform output
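The project structure below lists an outputs.tf file whose contents are not shown here. As a minimal sketch (the output names are illustrative assumptions), it could expose the values most often needed after apply:

# outputs.tf (illustrative sketch)
output "cluster_name" {
  description = "Name of the EKS cluster"
  value       = aws_eks_cluster.mondev.name
}

output "cluster_endpoint" {
  description = "Endpoint of the EKS cluster API server"
  value       = aws_eks_cluster.mondev.endpoint
}

output "cluster_ca_certificate" {
  description = "Base64 encoded certificate authority data for the cluster"
  value       = aws_eks_cluster.mondev.certificate_authority[0].data
}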
Project Structure
Cluster
|-- README.md
|-- backend.tf
|-- data-sources.tf
|-- eks-cluster.tf
|-- node-groups.tf
|-- outputs.tf
|-- providers.tf
|-- terraform.tfvars
|-- variables.tf
|-- vpc.tf
Access the cluster and create different namespaces if required
aws eks --region ap-southeast-1 update-kubeconfig --name MonirulProject-cluster
kubectl create ns dev && kubectl create ns stg && kubectl create ns prd
Availability zone (Pod Topology)
Clean up workspace
$ terraform destroy
Workspaces for multiple environments
Workspaces let us manage multiple distinct sets of infrastructure resources/environments. Instead of creating a new directory for each environment, we just create a workspace per environment and switch between them, as sketched after the commands below.
$ terraform workspace new dev
$ terraform workspace new stg
$ terraform workspace new prd
$ terraform workspace list
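The workspace name can then be referenced inside the configuration so each environment gets distinct resource names and tags. A purely illustrative sketch (the locals block and the example resource are assumptions, not part of the configuration above):

# Illustrative only: pick up the current workspace ("dev", "stg", or "prd")
# and fold it into names and tags.
locals {
  environment = terraform.workspace
}

resource "aws_vpc" "example" {
  cidr_block = var.vpc_cidr

  tags = {
    Name        = "${var.project}-${local.environment}-vpc"
    Environment = local.environment
  }
}

With the S3 backend, each workspace's state is stored under a separate env:/ key prefix in the same bucket, so the environments do not overwrite each other's state.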