M.M.Monirul Islam

Provisioning Amazon Elastic Kubernetes Service (Amazon EKS) with Terraform

Amazon EKS (Elastic Kubernetes Service) is a managed Kubernetes service to run and scale Kubernetes applications in the cloud or on-premises.

HashiCorp Terraform is an Infrastructure as Code (IaC) tool that lets us define both cloud and on-prem resources in human-readable configuration files that we can version, reuse, and share.

Amazon EKS Cluster using Terraform:

This repository keeps the infrastructure configuration for the EKS cluster.

Prerequisite

  • AWS Account
  • Terraform
  • AWS CLI

To configure the AWS CLI, we need to enter the AWS Access Key ID, Secret Access Key, region, and output format. Note that the IAM user needs sufficient privileges to create the EKS cluster resources.

$ aws configure
AWS Access Key ID [None]: AWS_ACCESS_KEY_ID
AWS Secret Access Key [None]: AWS_SECRET_ACCESS_KEY
Default region name [None]: AWS_REGION
Default output format [None]: json

Terraform Initial Setup Configuration

First, we configure the AWS provider. It lets Terraform interact with AWS resources such as VPC, EKS, S3, EC2, and many others.

providers.tf

terraform {
  required_version = ">= 1.1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region = var.region

  access_key = var.aws_access_key
  secret_key = var.aws_secret_key

  # other options for authentication
}

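
Because the AWS CLI is already configured with aws configure, one of the other authentication options is to let the provider pick up credentials on its own instead of passing keys as Terraform variables. A minimal sketch of that alternative follows; it would replace the provider block above rather than sit alongside it.

```hcl
provider "aws" {
  region = var.region

  # No access_key / secret_key here: the provider falls back to its default
  # credential chain (environment variables such as AWS_ACCESS_KEY_ID, or the
  # shared credentials file written by `aws configure`).
}
```

With this variant, the aws_access_key and aws_secret_key variables are no longer needed, so secrets stay out of the Terraform files.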

Terraform State Setup

Next, we configure a Terraform backend that stores the state file in an S3 bucket. With remote state, the state file is kept remotely rather than on the local filesystem.

backend.tf

terraform {
  backend "s3" {
    bucket  = "mondev-terraform-states"
    key     = "terraform-aws-eks-mondev.tfstate"
    region  = "ap-southeast-1"
    encrypt = true
  }
}
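
The S3 bucket named in the backend has to exist before terraform init can use it. One possible way to create it is a small bootstrap configuration that is applied once and keeps its own local state. The sketch below assumes that approach; the file location and the resource label terraform_state are hypothetical names, and enabling versioning is an added assumption.

```hcl
# bootstrap/main.tf (hypothetical location, applied once before the main project)
provider "aws" {
  region = "ap-southeast-1"
}

# Bucket that the s3 backend in backend.tf points at
resource "aws_s3_bucket" "terraform_state" {
  bucket = "mondev-terraform-states"
}

# Keep previous versions of the state file so a bad write can be rolled back
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```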

Network Infrastructure Setup

We set up the VPC, subnets, security groups, and related resources.
Amazon EKS requires subnets in at least two different availability zones.

  • AWS VPC (Virtual Private Cloud).
  • Two public and two private Subnets in different availability zones.
  • Internet Gateway to provide internet access for services within VPC.
  • NAT Gateway in a public subnet, so that services in the private subnets can connect to the internet.
  • Route Tables associated with the subnets, with the required routing rules.
  • Security Groups with the required inbound and outbound traffic rules.

vpc.tf

# VPC
resource "aws_vpc" "mondev" {
  cidr_block = var.vpc_cidr

  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name                                           = "${var.project}-vpc"
    "kubernetes.io/cluster/${var.project}-cluster" = "shared"
  }
}

# Public Subnets
resource "aws_subnet" "public" {
  count = var.availability_zones_count

  vpc_id            = aws_vpc.mondev.id
  cidr_block        = cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name                                           = "${var.project}-public-sg"
    "kubernetes.io/cluster/${var.project}-cluster" = "shared"
    "kubernetes.io/role/elb"                       = 1
  }

  map_public_ip_on_launch = true
}

# Private Subnets
resource "aws_subnet" "private" {
  count = var.availability_zones_count

  vpc_id            = aws_vpc.mondev.id
  cidr_block        = cidrsubnet(var.vpc_cidr, var.subnet_cidr_bits, count.index + var.availability_zones_count)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name                                           = "${var.project}-private-sg"
    "kubernetes.io/cluster/${var.project}-cluster" = "shared"
    "kubernetes.io/role/internal-elb"              = 1
  }
}

# Internet Gateway
resource "aws_internet_gateway" "mondev" {
  vpc_id = aws_vpc.mondev.id

  tags = {
    "Name" = "${var.project}-igw"
  }

  depends_on = [aws_vpc.mondev]
}

# Route Table(s)
# Route the public subnet traffic through the IGW
resource "aws_route_table" "main" {
  vpc_id = aws_vpc.mondev.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.mondev.id
  }

  tags = {
    Name = "${var.project}-Default-rt"
  }
}

# Route table and subnet associations
resource "aws_route_table_association" "internet_access" {
  count = var.availability_zones_count

  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.main.id
}

# NAT Elastic IP
resource "aws_eip" "main" {
  vpc = true

  tags = {
    Name = "${var.project}-ngw-ip"
  }
}

# NAT Gateway
resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.main.id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.project}-ngw"
  }
}

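# Route private subnet traffic through the NAT gateway
# (the private subnets have no explicit association, so they use the VPC default route table)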
resource "aws_route" "main" {
  route_table_id         = aws_vpc.mondev.default_route_table_id
  nat_gateway_id         = aws_nat_gateway.main.id
  destination_cidr_block = "0.0.0.0/0"
}

# Security group for public subnet
resource "aws_security_group" "public_sg" {
  name   = "${var.project}-Public-sg"
  vpc_id = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-Public-sg"
  }
}

# Security group traffic rules
resource "aws_security_group_rule" "sg_ingress_public_443" {
  security_group_id = aws_security_group.public_sg.id
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "sg_ingress_public_80" {
  security_group_id = aws_security_group.public_sg.id
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "sg_egress_public" {
  security_group_id = aws_security_group.public_sg.id
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}

# Security group for data plane
resource "aws_security_group" "data_plane_sg" {
  name   = "${var.project}-Worker-sg"
  vpc_id = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-Worker-sg"
  }
}

# Security group traffic rules
resource "aws_security_group_rule" "nodes" {
  description       = "Allow nodes to communicate with each other"
  security_group_id = aws_security_group.data_plane_sg.id
  type              = "ingress"
  # With protocol "-1" (all protocols), both ports must be 0
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = concat(aws_subnet.public[*].cidr_block, aws_subnet.private[*].cidr_block)
}

resource "aws_security_group_rule" "nodes_inbound" {
  description       = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  security_group_id = aws_security_group.data_plane_sg.id
  type              = "ingress"
  from_port         = 1025
  to_port           = 65535
  protocol          = "tcp"
  cidr_blocks       = aws_subnet.private[*].cidr_block
}

resource "aws_security_group_rule" "node_outbound" {
  security_group_id = aws_security_group.data_plane_sg.id
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}

# Security group for control plane
resource "aws_security_group" "control_plane_sg" {
  name   = "${var.project}-ControlPlane-sg"
  vpc_id = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-ControlPlane-sg"
  }
}

# Security group traffic rules
resource "aws_security_group_rule" "control_plane_inbound" {
  security_group_id = aws_security_group.control_plane_sg.id
  type              = "ingress"
  from_port         = 0
  to_port           = 65535
  protocol          = "tcp"
  cidr_blocks       = concat(aws_subnet.public[*].cidr_block, aws_subnet.private[*].cidr_block)
}

resource "aws_security_group_rule" "control_plane_outbound" {
  security_group_id = aws_security_group.control_plane_sg.id
  type              = "egress"
  from_port         = 0
  to_port           = 0 # must be 0 when protocol is "-1"
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}
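
The subnet ranges in vpc.tf come from the cidrsubnet function. As a worked example, assuming the default vpc_cidr of 10.0.0.0/16, a subnet_cidr_bits value of 8, and two availability zones, the sketch below repeats the same calculation in a locals block. The local names and the output name are hypothetical and exist only for illustration.

```hcl
locals {
  example_vpc_cidr = "10.0.0.0/16"
  example_new_bits = 8
  example_az_count = 2

  # Same expression as aws_subnet.public: network numbers 0 and 1
  example_public_cidrs = [
    for i in range(local.example_az_count) :
    cidrsubnet(local.example_vpc_cidr, local.example_new_bits, i)
  ]

  # Same expression as aws_subnet.private: network numbers 2 and 3
  example_private_cidrs = [
    for i in range(local.example_az_count) :
    cidrsubnet(local.example_vpc_cidr, local.example_new_bits, i + local.example_az_count)
  ]
}

output "example_subnet_cidrs" {
  value = {
    public  = local.example_public_cidrs  # ["10.0.0.0/24", "10.0.1.0/24"]
    private = local.example_private_cidrs # ["10.0.2.0/24", "10.0.3.0/24"]
  }
}
```

Adding 8 bits to the /16 prefix gives /24 subnets, and offsetting the private index by the AZ count keeps the public and private ranges from overlapping.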

EKS Cluster Setup

Now we create the EKS cluster. Kubernetes clusters managed by Amazon EKS make calls to other AWS services on our behalf to manage the resources used with the service. For example, EKS creates an Auto Scaling group for each managed node group.

Setting up the IAM Roles and Policies for EKS: EKS requires a few IAM Roles with relevant Policies to be pre-defined to operate correctly.

IAM Role: Create a role with the permissions that Amazon EKS needs to create AWS resources for Kubernetes clusters and to interact with AWS APIs. The role's trust policy allows the Amazon EKS service (eks.amazonaws.com) to assume it.

IAM Policy: Attach the AWS managed policy AmazonEKSClusterPolicy to the role. It grants the permissions that Amazon EKS uses once it has assumed the role.

eks-cluster.tf

# EKS Cluster
resource "aws_eks_cluster" "mondev" {
  name     = "${var.project}-cluster"
  role_arn = aws_iam_role.cluster.arn
  version  = "1.22" # Kubernetes version; pick one that EKS currently supports

  vpc_config {
    # security_group_ids      = [aws_security_group.eks_cluster.id, aws_security_group.eks_nodes.id] # optional: extra security groups for the control plane network interfaces
    subnet_ids              = flatten([aws_subnet.public[*].id, aws_subnet.private[*].id])
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = ["0.0.0.0/0"]
  }

  tags = var.tags

  depends_on = [
    aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy
  ]
}


# EKS Cluster IAM Role
resource "aws_iam_role" "cluster" {
  name = "${var.project}-Cluster-Role"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.cluster.name
}


# EKS Cluster Security Group
resource "aws_security_group" "eks_cluster" {
  name        = "${var.project}-cluster-sg"
  description = "Cluster communication with worker nodes"
  vpc_id      = aws_vpc.mondev.id

  tags = {
    Name = "${var.project}-cluster-sg"
  }
}

resource "aws_security_group_rule" "cluster_inbound" {
  description              = "Allow worker nodes to communicate with the cluster API Server"
  from_port                = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_cluster.id
  source_security_group_id = aws_security_group.eks_nodes.id
  to_port                  = 443
  type                     = "ingress"
}

resource "aws_security_group_rule" "cluster_outbound" {
  description              = "Allow cluster API Server to communicate with the worker nodes"
  from_port                = 1025
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_cluster.id
  source_security_group_id = aws_security_group.eks_nodes.id
  to_port                  = 65535
  type                     = "egress"
}
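
The trust policy above is written as an inline JSON heredoc. As an alternative sketch, the same trust relationship can be expressed with the aws_iam_policy_document data source, which lets Terraform check the structure instead of treating it as an opaque string. The data source label cluster_assume_role is a hypothetical name.

```hcl
data "aws_iam_policy_document" "cluster_assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    # Trust policy: lets the EKS service assume the role
    principals {
      type        = "Service"
      identifiers = ["eks.amazonaws.com"]
    }
  }
}
```

The role would then set assume_role_policy = data.aws_iam_policy_document.cluster_assume_role.json in place of the heredoc. The AmazonEKSClusterPolicy attachment stays unchanged, because it is the permissions policy and not the trust policy.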

Node Groups (Managed) Setup

We create one or more node groups to run the application workload.

IAM Role: Similar to the EKS cluster, before we create the worker node group, we must create an IAM role with the permissions the nodes need to communicate with other AWS services. The role's trust policy allows Amazon EC2 (ec2.amazonaws.com) to assume it.

IAM Policy: Attach the AWS managed permission policies AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, and AmazonEC2ContainerRegistryReadOnly to the role.

node-groups.tf

# EKS Node Groups
resource "aws_eks_node_group" "mondev" {
  cluster_name    = aws_eks_cluster.mondev.name
  node_group_name = var.project
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = aws_subnet.private[*].id

  scaling_config {
    desired_size = 2
    max_size     = 5
    min_size     = 1
  }

  ami_type       = "AL2_x86_64" # AL2_x86_64, AL2_x86_64_GPU, AL2_ARM_64, CUSTOM
  capacity_type  = "ON_DEMAND"  # ON_DEMAND, SPOT
  disk_size      = 20
  instance_types = ["t2.medium"]

  tags = var.tags

  depends_on = [
    aws_iam_role_policy_attachment.node_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.node_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.node_AmazonEC2ContainerRegistryReadOnly,
  ]
}


# EKS Node IAM Role
resource "aws_iam_role" "node" {
  name = "${var.project}-Worker-Role"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "node_AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.node.name
}

resource "aws_iam_role_policy_attachment" "node_AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.node.name
}

resource "aws_iam_role_policy_attachment" "node_AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.node.name
}


# EKS Node Security Group
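resource "aws_security_group" "eks_nodes" {
  name        = "${var.project}-node-sg"
  description = "Security group for all nodes in the cluster"
  vpc_id      = aws_vpc.mondev.id

  tags = {
    Name                                           = "${var.project}-node-sg"
    "kubernetes.io/cluster/${var.project}-cluster" = "owned"
  }
}

# Egress is declared as a standalone rule because inline rules and
# aws_security_group_rule resources should not be mixed on one security group
resource "aws_security_group_rule" "nodes_egress" {
  description       = "Allow all outbound traffic from the nodes"
  security_group_id = aws_security_group.eks_nodes.id
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}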

resource "aws_security_group_rule" "nodes_internal" {
  description              = "Allow nodes to communicate with each other"
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_nodes.id
  to_port                  = 0 # must be 0 when protocol is "-1"
  type                     = "ingress"
}

resource "aws_security_group_rule" "nodes_cluster_inbound" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  from_port                = 1025
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_cluster.id
  to_port                  = 65535
  type                     = "ingress"
}
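
The capacity_type comment in the node group mentions SPOT as the other option. As an illustration only, a second node group on Spot capacity could look like the sketch below. The resource label mondev_spot, the node group name, the scaling sizes, and the instance types are all assumptions and not part of the original setup.

```hcl
resource "aws_eks_node_group" "mondev_spot" {
  cluster_name    = aws_eks_cluster.mondev.name
  node_group_name = "${var.project}-spot" # hypothetical name
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = aws_subnet.private[*].id

  scaling_config {
    desired_size = 1
    max_size     = 3
    min_size     = 1
  }

  ami_type      = "AL2_x86_64"
  capacity_type = "SPOT"

  # Listing more than one instance type gives Spot more pools to draw from
  instance_types = ["t3.medium", "t3a.medium"]

  tags = var.tags

  depends_on = [
    aws_iam_role_policy_attachment.node_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.node_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.node_AmazonEC2ContainerRegistryReadOnly,
  ]
}
```

It reuses the same IAM role and private subnets as the on-demand group, so only the capacity settings differ.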

Terraform Variables

Create an IAM user with administrator access to the AWS account, and generate an access key and secret key for authentication.

variables.tf

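variable "aws_access_key" {
  description = "AWS access key"
  type        = string
  sensitive   = true # keep the value out of plan and apply output
}

variable "aws_secret_key" {
  description = "AWS secret key"
  type        = string
  sensitive   = true # keep the value out of plan and apply output
}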

variable "region" {
  description = "The aws region. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html"
  type        = string
  default     = "ap-southeast-1"
}

variable "availability_zones_count" {
  description = "The number of AZs."
  type        = number
  default     = 2
}

variable "project" {
  description = "Project name, used as a prefix for resource names"
  type        = string
}

variable "vpc_cidr" {
  description = "The CIDR block for the VPC."
  type        = string
  default     = "10.0.0.0/16"
}

variable "subnet_cidr_bits" {
  description = "The number of subnet bits added to the VPC CIDR prefix. For example, with a /16 VPC CIDR, a value of 8 creates subnets with a mask of /24."
  type        = number
  default     = 8
}

variable "tags" {
  description = "A map of tags to add to all resources"
  type        = map(string)
  default = {
    "Project"     = "MonirulProject"
    "Environment" = "Development"
    "Owner"       = "Monirul"
  }
}
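
Since Amazon EKS requires subnets in at least two availability zones, the availability_zones_count variable is a natural place for an input check. A minimal sketch, assuming we want Terraform to reject smaller values at plan time; this block would replace the existing declaration of the variable, not be added next to it.

```hcl
variable "availability_zones_count" {
  description = "The number of AZs."
  type        = number
  default     = 2

  # Fail early instead of letting the EKS API reject the cluster later
  validation {
    condition     = var.availability_zones_count >= 2
    error_message = "EKS needs subnets in at least two availability zones."
  }
}
```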

Set the Terraform variable values as required.

terraform.tfvars

aws_access_key = "aaaaaaaaaaaaaa"
aws_secret_key = "bbbbbbbbbbbbbbbbbbbbb"

region                   = "ap-southeast-1"
availability_zones_count = 2

project = "MonirulProject"

vpc_cidr         = "10.0.0.0/16"
subnet_cidr_bits = 8

We also need a Terraform data source that lists the availability zones in the selected region.

data-sources.tf

data "aws_availability_zones" "available" {
  state = "available"
}

Launch EKS Infrastructure

Once we have finished declaring the resources, we can deploy all resources.

$ terraform init
$ terraform plan
$ terraform apply

Output

$ terraform output
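
The project structure lists an outputs.tf file, but its contents are not shown in this post. A minimal sketch of what it could contain, assuming we want the values needed to connect to the cluster; the output names are hypothetical.

```hcl
# outputs.tf
output "cluster_name" {
  description = "Name of the EKS cluster"
  value       = aws_eks_cluster.mondev.name
}

output "cluster_endpoint" {
  description = "Endpoint of the Kubernetes API server"
  value       = aws_eks_cluster.mondev.endpoint
}

output "cluster_ca_certificate" {
  description = "Base64 encoded certificate data for the cluster"
  value       = aws_eks_cluster.mondev.certificate_authority[0].data
}
```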

Project Structure

Cluster
|-- README.md
|-- backend.tf
|-- data-sources.tf
|-- eks-cluster.tf
|-- node-groups.tf
|-- outputs.tf
|-- providers.tf
|-- terraform.tfvars
|-- variables.tf
|-- vpc.tf

Access the cluster and create separate namespaces if required

aws eks --region ap-southeast-1 update-kubeconfig --name MonirulProject-cluster
kubectl create ns dev && kubectl create ns stg && kubectl create ns prd

Availability zone (Pod Topology)

Clean up workspace

$ terraform destroy

Workspaces for multiple environments

Workspaces let us manage multiple distinct sets of infrastructure resources or environments from the same configuration.
Instead of creating a new directory for each environment, we create one workspace per environment and switch between them.

$ terraform workspace new dev
$ terraform workspace new stg
$ terraform workspace new prd
$ terraform workspace list
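
All workspaces share the same configuration, so names built only from var.project (for example the IAM role names) would collide when several environments live in one AWS account. One way to avoid that, sketched under the assumption that each workspace should get its own name prefix, is to include terraform.workspace in the names. The local name_prefix and the output are hypothetical.

```hcl
locals {
  # Evaluates to "MonirulProject-dev" when the dev workspace is selected
  name_prefix = "${var.project}-${terraform.workspace}"
}

# Shows the prefix for the currently selected workspace
output "name_prefix" {
  value = local.name_prefix
}
```

Resources would then use the prefix in their names, for example name = "${local.name_prefix}-cluster" in the aws_eks_cluster resource.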
