Grant Kubernetes Pods Access to AWS Services Using OpenID Connect
Learn how to establish a trust relationship between a Kubernetes cluster and AWS IAM to grant cluster generated Service Account tokens access to AWS services using OIDC & without storing long-lived credentials.
Originally published at https://developer-friendly.blog on April 22, 2024.
Introduction
In our previous post, we discussed what OpenID Connect (OIDC) is and how to use it to authenticate identities from one system to another.
We covered why it is crucial to avoid storing long-lived credentials and the benefits of employing OIDC for the task of authentication.
If you haven't read that one already, here's a recap:
- OIDC is an authentication protocol that allows the identities in one system to authenticate to another system.
- It is based on OAuth 2.0 and JSON Web Tokens (JWT).
- Storing long-lived credentials is risky and should be avoided at all costs if possible.
- OIDC provides a secure way to authenticate identities without storing long-lived credentials.
- It is widely used in modern applications and systems.
- The hard requirement is that both the Service Provider and the Identity Provider must be OIDC compliant.
- With OIDC, you keep the identities and their credentials in only one system and authenticate them to another system without storing any long-lived credentials. The former is called the Identity Provider and the latter the Service Provider.
We also covered a practical example of authenticating GitHub runners to AWS IAM by establishing a trust relationship between GitHub and AWS using OIDC.
In this post, we will take it one step further and provide a way for the pods of our Kubernetes cluster to authenticate to AWS services using OIDC.
This post will provide a walkthrough of granting such access to a bare-metal Kubernetes cluster (k3s) using only the power of the OpenID Connect protocol. In a later post, we'll show you how easy it is to achieve the same with a managed Kubernetes cluster like Azure Kubernetes Service (AKS). But first, let's understand the fundamentals by trying it on a bare-metal cluster.
We will not store any credentials in our pods and as such, won't ever have to worry about other security concerns such as secret rotations!
With that intro out of the way, let's dive in!
Prerequisites
Make sure you have the following prerequisites in place before proceeding:
- A Kubernetes cluster that can be exposed to the internet.
- An AWS account to create an OIDC provider and IAM roles.
- A verified root domain name that YOU own. Skip this if you're using a managed Kubernetes cluster.
- OpenTofu v1.6
- Ansible v2.16
Roadmap
Let's see what we are trying to achieve in this guide.
Our end goal is to create an Identity Provider (IdP) in AWS. After doing so, we will be able to create an IAM Role with a trust relationship to the IdP.
Ultimately, the pods in our Kubernetes cluster that have the desired Service Account(s) will be able to talk to the AWS services.
To achieve this, and as per the OIDC specification, the following endpoints must be exposed through an HTTPS endpoint with a verified TLS certificate:
- /.well-known/openid-configuration: This is a MUST for OIDC compliance.
- /openid/v1/jwks: This is configurable through the first endpoint as you'll see later.
These endpoints provide the information of the OIDC provider and the public keys used to sign the JWT tokens, respectively. The former will be used by the service provider to validate the OIDC provider and the latter will be used to validate the JWT access tokens provided by the entities that want to talk to the Service Provider.
Service Provider refers to the host that provides the service. In our example, AWS is the service provider.
Exposing such endpoints will make our OIDC provider compliant with the OIDC specification. In that regard, any OIDC compliant service provider will be able to trust our OIDC provider.
For an OIDC provider and a Service Provider to trust each other, they must both be OIDC compliant. This means that the OIDC provider must expose certain endpoints and the Service Provider must be able to validate the OIDC provider through those endpoints.
In practice, we will need the following two absolute URLs to be accessible publicly through internet with a verified TLS certificate signed by a trusted Certificate Authority (CA):
https://mydomain.com/.well-known/openid-configuration
https://mydomain.com/openid/v1/jwks
Again, and just to reiterate: as per the OIDC specification, HTTPS is a must and the TLS certificate has to be signed by a trusted Certificate Authority (CA).
When all this is set up, we shall be able to add https://mydomain.com to AWS as an OIDC provider.
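Once the whole setup is complete, a quick sanity check of this requirement is to fetch both documents over HTTPS; mydomain.com below is a placeholder for your own domain, and jq is only used for pretty-printing:
# Both must succeed with a trusted certificate and return JSON
curl -fsSL https://mydomain.com/.well-known/openid-configuration | jq .
curl -fsSL https://mydomain.com/openid/v1/jwks | jq .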
Step 0: Directory Structure
There is a lot of code to cover in this post. It is good to know what to expect. Here's the layout of the directories we will be working with:
.
├── ansible.cfg
├── app/
├── configure-oidc/
├── inventory/
├── k8s/
├── playbook.yml
├── provision-k8s/
├── requirements.yml
└── vars/
Step 1: Dedicated Domain Name
As mentioned, we need to assign a dedicated domain name to the OIDC provider. This will be the address we will add to the AWS IAM as an Identity Provider.
Any DNS provider will do, but for our example, we're using Cloudflare.
variable "hetzner_api_token" {
type = string
nullable = false
sensitive = true
}
variable "cloudflare_api_token" {
type = string
nullable = false
sensitive = true
}
variable "stack_name" {
type = string
default = "k3s-cluster"
}
variable "primary_ip_datacenter" {
type = string
default = "nbg1-dc3"
}
variable "root_domain" {
type = string
default = "developer-friendly.blog"
}
terraform {
required_providers {
hcloud = {
source = "hetznercloud/hcloud"
version = "~> 1.46"
}
cloudflare = {
source = "cloudflare/cloudflare"
version = "~> 4.30"
}
random = {
source = "hashicorp/random"
version = "~> 3.6"
}
}
}
provider "hcloud" {
token = var.hetzner_api_token
}
provider "cloudflare" {
api_token = var.cloudflare_api_token
}
resource "hcloud_primary_ip" "this" {
for_each = toset(["ipv4", "ipv6"])
name = "${var.stack_name}-${each.key}"
datacenter = var.primary_ip_datacenter
type = each.key
assignee_type = "server"
auto_delete = false
}
data "cloudflare_zone" "this" {
name = var.root_domain
}
resource "random_uuid" "this" {}
resource "cloudflare_record" "this" {
zone_id = data.cloudflare_zone.this.id
name = "${random_uuid.this.id}.${var.root_domain}"
proxied = false
ttl = 60
type = "A"
value = hcloud_primary_ip.this["ipv4"].ip_address
}
resource "cloudflare_record" "this_v6" {
zone_id = data.cloudflare_zone.this.id
name = "${random_uuid.this.id}.${var.root_domain}"
proxied = false
ttl = 60
type = "AAAA"
value = hcloud_primary_ip.this["ipv6"].ip_address
}
output "public_ip" {
value = hcloud_primary_ip.this["ipv4"].ip_address
}
output "public_ipv6" {
value = hcloud_primary_ip.this["ipv6"].ip_address
}
To apply the stack, you will need a Cloudflare API token and a Hetzner API token, both of which you can get from their respective account settings.
export TF_VAR_cloudflare_api_token="PLACEHOLDER"
export TF_VAR_hetzner_api_token="PLACEHOLDER"
tofu plan -out tfplan
tofu apply tfplan
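Before moving on, it's worth confirming the new records actually resolve. A small sketch using dig; it assumes you either read the record name from the Cloudflare dashboard or expose it as a Terraform output (the full listing in Step 2 exposes it as oidc_provider_url):
DOMAIN=$(tofu output -raw oidc_provider_url)
dig +short A "$DOMAIN"     # should print the Hetzner IPv4 address
dig +short AAAA "$DOMAIN"  # should print the Hetzner IPv6 address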
Step 2: A Live Kubernetes Cluster
At this point, we should have a live Kubernetes cluster. We've already covered how to set up a lightweight Kubernetes cluster on an Ubuntu 22.04 machine before, so we won't go too deep into that.
But for the sake of completeness, we'll resurface the code one more time, with some minor tweaks here and there.
variable "hetzner_api_token" {
type = string
nullable = false
sensitive = true
}
variable "cloudflare_api_token" {
type = string
nullable = false
sensitive = true
}
variable "stack_name" {
type = string
default = "k3s-cluster"
}
variable "primary_ip_datacenter" {
type = string
default = "nbg1-dc3"
}
variable "root_domain" {
type = string
default = "developer-friendly.blog"
}
variable "server_datacenter" {
type = string
default = "nbg1"
}
variable "username" {
type = string
default = "k8s"
}
terraform {
required_providers {
hcloud = {
source = "hetznercloud/hcloud"
version = "~> 1.46"
}
cloudflare = {
source = "cloudflare/cloudflare"
version = "~> 4.30"
}
random = {
source = "hashicorp/random"
version = "~> 3.6"
}
http = {
source = "hashicorp/http"
version = "~> 3.4"
}
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
}
}
provider "hcloud" {
token = var.hetzner_api_token
}
provider "cloudflare" {
api_token = var.cloudflare_api_token
}
resource "tls_private_key" "this" {
algorithm = "ECDSA"
ecdsa_curve = "P384"
}
resource "hcloud_ssh_key" "this" {
name = var.stack_name
public_key = tls_private_key.this.public_key_openssh
}
resource "hcloud_server" "this" {
name = var.stack_name
server_type = "cax11"
image = "ubuntu-22.04"
location = "nbg1"
ssh_keys = [
hcloud_ssh_key.this.id,
]
public_net {
ipv4 = hcloud_primary_ip.this["ipv4"].id
ipv6 = hcloud_primary_ip.this["ipv6"].id
}
user_data = <<-EOF
#cloud-config
users:
- name: ${var.username}
groups: users, admin, adm
sudo: ALL=(ALL) NOPASSWD:ALL
shell: /bin/bash
ssh_authorized_keys:
- ${tls_private_key.this.public_key_openssh}
packages:
- certbot
package_update: true
package_upgrade: true
runcmd:
- sed -i -e '/^\(#\|\)PermitRootLogin/s/^.*$/PermitRootLogin no/' /etc/ssh/sshd_config
- sed -i -e '/^\(#\|\)PasswordAuthentication/s/^.*$/PasswordAuthentication no/' /etc/ssh/sshd_config
- sed -i '$a AllowUsers ${var.username}' /etc/ssh/sshd_config
- |
curl https://get.k3s.io | \
INSTALL_K3S_VERSION="v1.29.3+k3s1" \
INSTALL_K3S_EXEC="--disable traefik
--kube-apiserver-arg=service-account-jwks-uri=https://${cloudflare_record.this.name}/openid/v1/jwks
--kube-apiserver-arg=service-account-issuer=https://${cloudflare_record.this.name}
--disable-network-policy
--flannel-backend none
--write-kubeconfig /home/${var.username}/.kube/config
--secrets-encryption" \
sh -
- chown -R ${var.username}:${var.username} /home/${var.username}/.kube/
- |
CILIUM_CLI_VERSION=v0.16.4
CLI_ARCH=arm64
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/$CILIUM_CLI_VERSION/cilium-linux-$CLI_ARCH.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-$CLI_ARCH.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-$CLI_ARCH.tar.gz /usr/local/bin
- kubectl completion bash | tee /etc/bash_completion.d/kubectl
- k3s completion bash | tee /etc/bash_completion.d/k3s
- |
cat << 'EOF2' >> /home/${var.username}/.bashrc
alias k=kubectl
complete -F __start_kubectl k
EOF2
- reboot
EOF
}
data "http" "this" {
url = "https://checkip.amazonaws.com"
}
resource "hcloud_firewall" "this" {
name = var.stack_name
rule {
direction = "in"
protocol = "tcp"
port = 22
source_ips = [format("%s/32", trimspace(data.http.this.response_body))]
description = "Allow SSH access from my IP"
}
rule {
direction = "in"
protocol = "tcp"
port = 80
source_ips = [
"0.0.0.0/0",
"::/0",
]
description = "Allow HTTP access from everywhere"
}
rule {
direction = "in"
protocol = "tcp"
port = 443
source_ips = [
"0.0.0.0/0",
"::/0",
]
description = "Allow HTTPS access from everywhere"
}
depends_on = [
hcloud_server.this,
]
}
resource "hcloud_firewall_attachment" "this" {
firewall_id = hcloud_firewall.this.id
server_ids = [hcloud_server.this.id]
}
output "public_ip" {
value = hcloud_primary_ip.this["ipv4"].ip_address
}
output "public_ipv6" {
value = hcloud_primary_ip.this["ipv6"].ip_address
}
output "ssh_private_key" {
value = tls_private_key.this.private_key_pem
sensitive = true
}
output "ansible_inventory_yaml" {
value = <<-EOF
k8s:
hosts:
${var.stack_name}:
ansible_host: ${hcloud_server.this.ipv4_address}
ansible_user: ${var.username}
ansible_ssh_private_key_file: ~/.ssh/k3s-cluster
ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o PasswordAuthentication=no'
EOF
}
output "ansible_vars" {
value = <<-EOF
domain_name: ${cloudflare_record.this.name}
EOF
}
output "oidc_provider_url" {
value = cloudflare_record.this.name
}
Notice the lines where we specify the OIDC issuer URL & JWKs URL for the Kubernetes API server to be a publicly accessible address and pass them as arguments to the k3s server.
--kube-apiserver-arg=service-account-jwks-uri=https://${cloudflare_record.this.name}/openid/v1/jwks
--kube-apiserver-arg=service-account-issuer=https://${cloudflare_record.this.name}
If not specified, the rest of this guide won't work and additional configuration is required. In summary, these are the URLs that will be used by the Service Provider when trying to verify the OIDC provider & the access tokens of the Service Accounts.
Business as usual, we apply the stack as below.
tofu plan -out tfplan
tofu apply tfplan
And for connecting to the machine:
tofu output -raw ssh_private_key > ~/.ssh/k3s-cluster
chmod 600 ~/.ssh/k3s-cluster
IP_ADDRESS=$(tofu output -raw public_ip)
ssh -i ~/.ssh/k3s-cluster k8s@$IP_ADDRESS
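While you're on the node, you can confirm the two kube-apiserver flags took effect by asking the API server for its own discovery document; a quick sketch (jq is optional and only used for filtering):
# On the k3s node: issuer and jwks_uri should both point at the public domain
sudo k3s kubectl get --raw /.well-known/openid-configuration | jq '{issuer, jwks_uri}'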
To be able to use the Ansible playbook in the next steps, we shall write the inventory files where Ansible expects them.
# ansible.cfg
[defaults]
become = false
cache_timeout = 3600
fact_caching = ansible.builtin.jsonfile
fact_caching_connection = /tmp/ansible_facts
gather_facts = false
interpreter_python = auto_silent
inventory = ./inventory
log_path = /tmp/ansible.log
roles_path = ~/.ansible/roles:./roles
ssh_common_args = -o ConnectTimeout=5
verbosity = 2
[inventory]
cache = true
cache_connection = /tmp/ansible_inventory
enable_plugins = 'host_list', 'script', 'auto', 'yaml', 'ini', 'toml', 'azure_rm', 'aws_ec2', 'auto'
mkdir -p ../inventory/group_vars
tofu output -raw ansible_inventory_yaml > ../inventory/k3s-cluster.yml
tofu output -raw ansible_vars > ../inventory/group_vars/all.yml
The result of ansible-inventory --list:
{
"_meta": {
"hostvars": {
"k3s-cluster": {
"ansible_host": "XX.XX.XX.XX",
"ansible_ssh_common_args": "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o PasswordAuthentication=no",
"ansible_ssh_private_key_file": "~/.ssh/k3s-cluster",
"ansible_user": "k8s",
"discovered_interpreter_python": {
"__ansible_unsafe": "/usr/bin/python3"
}
}
}
},
"all": {
"children": [
"ungrouped",
"k8s"
]
},
"k8s": {
"hosts": [
"k3s-cluster"
]
}
}
At this stage we're ready to move on to the next step.
Step 3: Bootstrap the Cluster
At this point we have installed the Cilium binary on our host machine, yet we haven't installed the CNI plugin in our Kubernetes cluster.
Let's create an Ansible role and a playbook to take care of all the Day 1 operations.
ansible-galaxy init k8s
touch playbook.yml
The first step is to install the Cilium CNI.
# k8s/defaults/main.yml
---
cilium_version: 1.15.4
# k8s/tasks/cilium.yml
---
- name: Install cilium
ansible.builtin.command:
cmd: cilium install --set kubeProxyReplacement=true --wait --version {{ cilium_version }}
register: cilium_install
changed_when: false
ignore_errors: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
# k8s/tasks/main.yml
---
- name: Cilium
include_tasks: cilium.yml
tags: cilium
# playbook.yml
---
- name: Bootstrap k8s node
hosts: k3s-cluster
gather_facts: false
become: true
roles:
- k8s
To run the playbook:
ansible-playbook playbook.yml
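To verify the CNI actually came up before moving on, you could check Cilium's status from your machine over SSH; a minimal sketch reusing the SSH key and IP address from the previous step:
ssh -i ~/.ssh/k3s-cluster k8s@$IP_ADDRESS "cilium status --wait && kubectl get nodes -o wide"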
Step 4: Fetch the TLS Certificate
At this point, we need a CA verified TLS certificate for the domain name we created in the first step.
We will carry out our tasks with Ansible throughout the entire Day 1 to Day n operations.
# k8s/defaults/main.yml
---
cilium_version: 1.15.4
acme_home: /var/www/html
# k8s/templates/wellknown-server.service.j2
[Unit]
Description=Wellknown Server
[Service]
ExecStartPre=/bin/mkdir -p {{ acme_home }}/.well-known/acme-challenge
ExecStart=/usr/bin/python3 -m http.server -d {{ acme_home }} 80
Restart=on-failure
WorkingDirectory={{ acme_home }}
User=acme
Group=acme
RestartSec=5
AmbientCapabilities=cap_net_bind_service
[Install]
WantedBy=multi-user.target
# k8s/handlers/main.yml
---
- name: Restart wellknown-server
ansible.builtin.systemd:
name: wellknown-server
state: restarted
daemon_reload: true
# k8s/tasks/certbot.yml
---
- name: Create acme group
ansible.builtin.group:
name: acme
state: present
system: true
- name: Create acme user
ansible.builtin.user:
name: acme
state: present
group: acme
shell: /bin/false
system: true
create_home: false
- name: Create working dir for acme user
ansible.builtin.file:
path: "{{ acme_home }}"
state: directory
owner: acme
group: acme
mode: "0755"
- name: Create a standalone server to respond to challenges
ansible.builtin.template:
src: wellknown-server.service.j2
dest: /etc/systemd/system/wellknown-server.service
owner: root
group: root
mode: "0644"
notify: Restart wellknown-server
- name: Start the wellknown-server
ansible.builtin.systemd:
name: wellknown-server
state: started
enabled: true
daemon_reload: true
- name: Use certbot to fetch TLS certificate for {{ domain_name }}
ansible.builtin.command:
cmd: >-
certbot certonly
--webroot
-w {{ acme_home }}
--non-interactive
--agree-tos
--email {{ domain_email }}
--domains {{ domain_name }}
args:
creates: /etc/letsencrypt/live/{{ domain_name }}/fullchain.pem
# k8s/tasks/main.yml
---
- name: Cilium
ansible.builtin.import_tasks: cilium.yml
tags:
- cilium
- name: Certbot
ansible.builtin.import_tasks: certbot.yml
tags:
- certbot
# playbook.yml
---
- name: Bootstrap k8s node
hosts: k3s-cluster
gather_facts: false
become: true
vars:
domain_email: admin@developer-friendly.blog
roles:
- k8s
Certificate Renewal
Although not required, one of the benefits of using certbot for TLS certificates is the ease of renewal.
After your initial certbot command, you will find the following two systemd files in your system.
# /lib/systemd/system/certbot.service
[Unit]
Description=Certbot
Documentation=file:///usr/share/doc/python-certbot-doc/html/index.html
Documentation=https://certbot.eff.org/docs
[Service]
Type=oneshot
ExecStart=/usr/bin/certbot -q renew
PrivateTmp=true
# /lib/systemd/system/certbot.timer
[Unit]
Description=Run certbot twice daily
[Timer]
OnCalendar=*-*-* 00,12:00:00
RandomizedDelaySec=43200
Persistent=true
[Install]
WantedBy=timers.target
Also on the same host, you will find a crontab entry for the certbot package, as you see below:
# /etc/cron.d/certbot: crontab entries for the certbot package
#
# Upstream recommends attempting renewal twice a day
#
# Eventually, this will be an opportunity to validate certificates
# haven't been revoked, etc. Renewal will only occur if expiration
# is within 30 days.
#
# Important Note! This cronjob will NOT be executed if you are
# running systemd as your init system. If you are running systemd,
# the cronjob.timer function takes precedence over this cronjob. For
# more details, see the systemd.timer manpage, or use systemctl show
# certbot.timer.
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
0 */12 * * * root test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew
All of these files are created by the certbot binary during the initial run. You are free to modify and customize them, although it's unlikely that you will need to.
After adding another task to our Ansible role, we can run the new tasks with the following command:
ansible-playbook playbook.yml --tags certbot
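To double-check the certificate landed where the next step expects it, you can ask certbot what it manages on the host; a quick sketch over SSH:
ssh -i ~/.ssh/k3s-cluster k8s@$IP_ADDRESS \
  "sudo certbot certificates && sudo ls /etc/letsencrypt/live/"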
Step 5: Expose OIDC Configuration to the Internet
All the work we've done so far has been preparation for this next step.
Here, we will fetch the OIDC configuration from the Kubernetes API server and expose it to the internet over HTTPS, using the newly acquired TLS certificate and a static web server.
# k8s/defaults/main.yml
---
cilium_version: 1.15.4
acme_home: /var/www/html
static_web_server_home: /var/www/static-web-server
kubeconfig: /etc/rancher/k3s/k3s.yaml
# k8s/handlers/main.yml
---
- name: Restart wellknown-server
ansible.builtin.systemd:
name: wellknown-server
state: restarted
daemon_reload: true
- name: Restart static-web-server-prepare
ansible.builtin.systemd:
name: static-web-server-prepare
state: restarted
daemon_reload: true
- name: Restart static-web-server
ansible.builtin.systemd:
name: static-web-server
state: restarted
daemon_reload: true
# k8s/templates/static-web-server.service.j2
[Unit]
Description=Static Web Server
[Service]
ExecStartPre=/usr/bin/test -s cert.pem
ExecStartPre=/usr/bin/test -s key.pem
ExecStartPre=/usr/bin/test -s .well-known/openid-configuration
ExecStartPre=/usr/bin/test -s openid/v1/jwks
ExecStart=/usr/local/bin/static-web-server \
--host 0.0.0.0 \
--port 443 \
--root . \
--log-level info \
--http2 \
--http2-tls-cert cert.pem \
--http2-tls-key key.pem \
--compression \
--health \
--experimental-metrics
Restart=on-failure
WorkingDirectory={{ static_web_server_home }}
User=static-web-server
Group=static-web-server
RestartSec=5
AmbientCapabilities=cap_net_bind_service
[Install]
WantedBy=multi-user.target
# k8s/tasks/static-server.yml
- name: Create static-web-server group
ansible.builtin.group:
name: static-web-server
state: present
system: true
- name: Create static-web-server user
ansible.builtin.user:
name: static-web-server
state: present
group: static-web-server
shell: /bin/false
system: true
create_home: false
- name: Create working dir for static-web-server user
ansible.builtin.file:
path: "{{ static_web_server_home }}"
state: directory
owner: static-web-server
group: static-web-server
mode: "0755"
- name: Download static web server binary
ansible.builtin.get_url:
url: "{{ static_web_server_download_url }}"
dest: "/tmp/{{ static_web_server_download_url | basename }}"
checksum: "sha256:{{ static_web_server_checksum }}"
owner: root
group: root
mode: "0644"
register: download_static_web_server
- name: Extract static web server binary
ansible.builtin.unarchive:
src: "{{ download_static_web_server.dest }}"
dest: /usr/local/bin/
owner: root
group: root
mode: "0755"
remote_src: true
extra_opts:
- --strip-components=1
- --wildcards
- "**/static-web-server"
notify: Restart static-web-server
- name: Create static-web-server-prepare script
ansible.builtin.template:
src: static-web-server-prepare.sh.j2
dest: /usr/local/bin/static-web-server-prepare
owner: root
group: root
mode: "0755"
notify: Restart static-web-server-prepare
- name: Create static-web-server-prepare service
ansible.builtin.template:
src: static-web-server-prepare.service.j2
dest: /etc/systemd/system/static-web-server-prepare.service
owner: root
group: root
mode: "0644"
notify: Restart static-web-server-prepare
- name: Create static-web-server-prepare timer
ansible.builtin.template:
src: static-web-server-prepare.timer.j2
dest: /etc/systemd/system/static-web-server-prepare.timer
owner: root
group: root
mode: "0644"
notify: Restart static-web-server-prepare
- name: Start static-web-server-prepare
ansible.builtin.systemd:
name: static-web-server-prepare.timer
state: started
enabled: true
daemon_reload: true
- name: Create static-web-server service
ansible.builtin.template:
src: static-web-server.service.j2
dest: /etc/systemd/system/static-web-server.service
owner: root
group: root
mode: "0644"
notify: Restart static-web-server
- name: Start static-web-server service
ansible.builtin.systemd:
name: static-web-server
state: started
enabled: true
daemon_reload: true
# k8s/tasks/main.yml
---
- name: Cilium
ansible.builtin.import_tasks: cilium.yml
tags:
- cilium
- name: Certbot
ansible.builtin.import_tasks: certbot.yml
tags:
- certbot
- name: Static web server
ansible.builtin.import_tasks: static-server.yml
tags:
- static-web-server
# vars/aarch64.yml
---
static_web_server_download_url: https://github.com/static-web-server/static-web-server/releases/download/v2.28.0/static-web-server-v2.28.0-armv7-unknown-linux-musleabihf.tar.gz
static_web_server_checksum: 492dda3749af5083e5387d47573b43278083ce62de09b2699902e1ba40bf1e45
# playbook.yml
---
- name: Bootstrap k8s node
hosts: k3s-cluster
gather_facts: true
become: true
vars:
domain_email: admin@developer-friendly.blog
vars_files:
- vars/{{ ansible_architecture }}.yml
roles:
- k8s
tags:
- provision
Running this goes as follows:
ansible-playbook playbook.yml --tags static-web-server
You may notice that we have turned on fact gathering in this step. This is because we want to include host-specific variables, as you can see with the vars_files entry.
From the above tasks, there are references to a couple of important files. One is static-web-server-prepare, which has both a service file as well as a timer file.
This gives us the flexibility to define oneshot services which only run to completion on every tick of the timer. Effectively, we are able to separate the executable task from the scheduling of the task.
The definitions for those files are as follows:
# k8s/templates/static-web-server-prepare.sh.j2
#!/usr/bin/env sh
# This script will run as root to prepare the files for the static web server
set -eu
mkdir -p {{ static_web_server_home }}/.well-known \
{{ static_web_server_home }}/openid/v1
kubectl get --raw /.well-known/openid-configuration > \
{{ static_web_server_home }}/.well-known/openid-configuration
kubectl get --raw /openid/v1/jwks > \
{{ static_web_server_home }}/openid/v1/jwks
cp /etc/letsencrypt/live/{{ domain_name }}/fullchain.pem \
{{ static_web_server_home }}/cert.pem
cp /etc/letsencrypt/live/{{ domain_name }}/privkey.pem \
{{ static_web_server_home }}/key.pem
chown -R static-web-server:static-web-server {{ static_web_server_home }}
Notice how we are manually fetching the OIDC configuration from the Kubernetes API server as well as the TLS certificate. This is because either of these files may change over time:
- Firstly, the Kubernetes API server might rotate its Service Account issuer key pair, and with that, the JWKs URL will have a different output.
- Secondly, the TLS certificate will be renewed by certbot in the background and we have to keep up with that.
Now, let's take a look at our preparation service and timer definition.
# k8s/templates/static-web-server-prepare.service.j2
[Unit]
Description=Update TLS & K8s OIDC Config
[Service]
Environment=KUBECONFIG={{ kubeconfig }}
ExecStartPre=/bin/mkdir -p .well-known openid/v1
ExecStart=/usr/local/bin/static-web-server-prepare
Restart=on-failure
Type=oneshot
WorkingDirectory={{ static_web_server_home }}
[Install]
WantedBy=multi-user.target
# k8s/templates/static-web-server-prepare.timer.j2
[Unit]
Description=Update TLS & K8s OIDC Config Every Minute
[Timer]
OnCalendar=*-*-* *:*:00
[Install]
WantedBy=multi-user.target
Notice that the service file specifies the working directory for the script, which means the static-web-server-prepare shell script will be executed in that directory.
Also, watch out for the oneshot systemd service type. These services are not long-running processes in an infinite loop. Instead, they run to completion, and systemd will not report their state as Active as it would with simple services.
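If you want to see the timer in action on the host, systemd can report both the schedule and the outcome of the last oneshot run; for example:
systemctl list-timers static-web-server-prepare.timer --no-pager
systemctl status static-web-server-prepare.service --no-pager
journalctl -u static-web-server-prepare.service -n 20 --no-pager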
Step 6: Add the OIDC Provider to AWS
That's it. We have done all the hard work. Anything after this will be a breeze compared to what we've done so far as you shall see shortly.
Now, we have a domain name that is publishing its OIDC configuration and JWKs over the HTTPS endpoint and is ready to be used as a trusted OIDC provider.
All we need right now is a couple of TF resources in the AWS account. After that, we can test the setup using a sample Job that references a Service Account in its definition and uses its token to talk to AWS.
Note that we're starting a new TF module below.
terraform {
required_providers {
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
aws = {
source = "hashicorp/aws"
version = "~> 5.46"
}
}
}
data "terraform_remote_state" "k8s" {
backend = "local"
config = {
path = "../provision-k8s/terraform.tfstate"
}
}
data "tls_certificate" "this" {
url = "https://${data.terraform_remote_state.k8s.outputs.oidc_provider_url}"
}
resource "aws_iam_openid_connect_provider" "this" {
url = "https://${data.terraform_remote_state.k8s.outputs.oidc_provider_url}"
# JWT token audience (aud)
client_id_list = [
"sts.amazonaws.com"
]
thumbprint_list = [
data.tls_certificate.this.certificates[0].sha1_fingerprint
]
}
Let's apply this stack:
export AWS_PROFILE="PLACEHOLDER"
tofu plan -out tfplan
tofu apply tfplan
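You can confirm the provider now exists in the AWS account before creating any roles; the ARN in the second command is a placeholder you'd copy from the first command's output:
aws iam list-open-id-connect-providers
aws iam get-open-id-connect-provider \
  --open-id-connect-provider-arn "arn:aws:iam::ACCOUNT_ID:oidc-provider/YOUR_DOMAIN"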
Believe it or not, after all this effort, it is finally done.
Now it is time for the test.
In order to be able to assume a role from inside a pod of our cluster, we will create a sample IAM Role with a trust relationship to the OIDC provider we just created.
data "aws_iam_policy_document" "this" {
statement {
actions = [
"sts:AssumeRoleWithWebIdentity"
]
effect = "Allow"
principals {
type = "Federated"
identifiers = [
aws_iam_openid_connect_provider.this.arn
]
}
condition {
test = "StringEquals"
variable = "${aws_iam_openid_connect_provider.this.url}:aud"
values = [
"sts.amazonaws.com"
]
}
condition {
test = "StringEquals"
variable = "${aws_iam_openid_connect_provider.this.url}:sub"
values = [
"system:serviceaccount:${var.service_account_namespace}:${var.service_account_name}"
]
}
}
}
resource "aws_iam_role" "this" {
name = "k3s-demo-app"
assume_role_policy = data.aws_iam_policy_document.this.json
managed_policy_arns = [
"arn:aws:iam::aws:policy/AmazonSSMReadOnlyAccess"
]
}
output "iam_role_arn" {
value = aws_iam_role.this.arn
}
output "service_account_namespace" {
value = var.service_account_namespace
}
output "service_account_name" {
value = var.service_account_name
}
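The role and outputs above reference two variables that aren't shown in the listing. A minimal declaration might look like the following; the defaults are my assumption, chosen to match the trust policy shown right after:
variable "service_account_namespace" {
  type    = string
  default = "default"
}
variable "service_account_name" {
  type    = string
  default = "demo-service-account"
}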
The AWS IAM Role trust relationship will look something like this:
{
"Statement": [
{
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"4f0fce7c-9efa-9ee3-5fe0-467d95d2584c.developer-friendly.blog:aud": "sts.amazonaws.com",
"4f0fce7c-9efa-9ee3-5fe0-467d95d2584c.developer-friendly.blog:sub": "system:serviceaccount:default:demo-service-account"
}
},
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::XXXXXXXXXXXX:oidc-provider/4f0fce7c-9efa-9ee3-5fe0-467d95d2584c.developer-friendly.blog"
}
}
],
"Version": "2012-10-17"
}
This, of course, shouldn't come as a surprise. We have already seen this in the TF definition above.
Step 7: Test the Setup
We have created the IAM Role with the trust relationship to the OIDC provider of the cluster. With the conditions in the AWS IAM Role you see in the previous step, only Service Accounts with the specified audience, in the default namespace, and with the Service Account name demo-service-account will be able to assume the role.
That said, let's create another Ansible role to create a Kubernetes Job.
ansible-galaxy init app
We will need the Kubernetes core Ansible collection, so let's install that.
# requirements.yml
collections:
  - name: kubernetes.core
    version: 2.4.1
ansible-galaxy collection install -r requirements.yml
# app/defaults/main.yml
---
aws_region: eu-central-1
role_session_name: k3s-cluster
# app/templates/manifest.yml
apiVersion: v1
kind: ServiceAccount
metadata:
name: demo-service-account
namespace: default
---
apiVersion: batch/v1
kind: Job
metadata:
name: demo-app
namespace: default
spec:
selector:
matchLabels:
job-name: demo-app
template:
metadata:
labels:
job-name: demo-app
spec:
restartPolicy: Never
containers:
- image: amazon/aws-cli:2.15.40
name: demo-app
command:
- sh
- -c
- |
aws sts get-caller-identity
aws ssm get-parameters-by-path \
--path / --recursive \
--with-decryption \
--query "Parameters[*].[Name]" \
--output text
env:
- name: AWS_REGION
value: "{{ aws_region }}"
- name: AWS_ROLE_ARN
value: "{{ role_arn }}"
- name: AWS_ROLE_SESSION_NAME
value: "{{ role_session_name }}"
- name: AWS_WEB_IDENTITY_TOKEN_FILE
value: /var/run/secrets/tokens/token
securityContext:
readOnlyRootFilesystem: true
volumeMounts:
- name: token
mountPath: /var/run/secrets/tokens
readOnly: true
- name: aws-config
mountPath: /root/.aws
serviceAccountName: demo-service-account
volumes:
- name: token
projected:
sources:
- serviceAccountToken:
path: token
audience: sts.amazonaws.com
- name: aws-config
emptyDir: {}
# app/tasks/main.yml
---
- name: Apply the app job
kubernetes.core.k8s:
template: manifest.yml
state: present
force: true
wait: true
# playbook.yml
---
- name: Bootstrap k8s node
hosts: k3s-cluster
gather_facts: true
become: true
vars:
domain_email: admin@developer-friendly.blog
vars_files:
- vars/{{ ansible_architecture }}.yml
roles:
- k8s
tags:
- provision
- name: Test the AWS Access
hosts: k3s-cluster
gather_facts: false
become: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
pre_tasks:
- name: Install pip3
ansible.builtin.package:
name: python3-pip
state: present
- name: Install kubernetes library
ansible.builtin.pip:
name: kubernetes<30
state: present
- name: Read Tofu output from ./configure-oidc
ansible.builtin.command:
cmd: tofu output -raw iam_role_arn
chdir: "{{ playbook_dir }}/configure-oidc"
delegate_to: localhost
become: false
changed_when: false
register: configure_oidc
- name: Set the AWS role arn
ansible.builtin.set_fact:
role_arn: "{{ configure_oidc.stdout }}"
roles:
- app
tags:
- test
- never
A few important notes are worth mentioning here:
- The second playbook is tagged with never. That is because there is a dependency on the second TF module, which we have to resolve manually before being able to run it. As soon as the dependency is resolved, we can run the second playbook with the --tags test flag.
- There is a fact-gathering step in the pre_tasks of the second playbook. That is, again, because of the dependency on the TF module. We grab the output of the TF module and pass it to our next role; notice that the role_arn variable in the Jinja template is initialized by this step.
- In that fact-gathering step, there is an Ansible delegation happening. This ensures the task runs on our own machine and not the target machine, because the TF module and its TF state file live on our local machine. We also do not need become, and as such it is turned off.
- You will notice that the job manifest is using the AWS CLI Docker image. By specifying some of the expected environment variables, we are able to use the AWS CLI without the need for a manual aws configure.
This playbook can be run after the second TF module with the following command:
ansible-playbook playbook.yml --tags test
When checking the logs of the deployed Kubernetes Job, we can see that it has been successful.
kubectl logs job/demo-app
There is no AWS SSM Parameter in the target AWS account and as such, the AWS CLI will not return an empty list; it will return nothing!
{
"Account": "XXXXXXXXXXXX",
"Arn": "arn:aws:sts::XXXXXXXXXXXX:assumed-role/k3s-demo-app/k3s-cluster",
"UserId": "AROAYTLV5GLUYXO2EKETN:k3s-cluster"
}
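If you'd like to see the SSM call return something, you can create a throwaway parameter and re-run the job; the parameter name here is just an example:
aws ssm put-parameter --name /demo/hello --value "world" --type String
# re-create the Job, check its logs again (it should now print /demo/hello), then clean up
aws ssm delete-parameter --name /demo/hello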
Lastly, to test whether the Service Account and the IAM Role trust policy play any role in all of this, we can remove the serviceAccountToken and try to recreate the job.
The output is as expected:
An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
That's all folks! We can now wrap this up.
Bonus: JWKs URL
Remember at the beginning of this guide when we mentioned that the JWKs URL is configurable through the OIDC configuration endpoint?
Let's see it in action.
DOMAIN=$(grep domain_name inventory/group_vars/all.yml | awk '{print $2}')
curl https://$DOMAIN/.well-known/openid-configuration | jq -r .jwks_uri
This means that you can host your JWKs on a different server than the OIDC server, although I wouldn't recommend it because of the maintenance overhead.
That said, if your JWKs URL is on a different server or hosted on a different endpoint, all you have to do is pass the value to the kube-apiserver as you see below:
kube-apiserver --service-account-jwks-uri=https://mydomain.com/some/different/endpoint
Conclusion
OpenID Connect is one of the most powerful protocols that power the internet. Yet, it is underestimated and easily overlooked. If you look closely enough at any system around you, you will see a lot of practical applications of OIDC.
One of the cues to look for when trying to identify the applicability of OIDC is the need to authenticate an identity from one system to another. You will almost never need to create another identity in the target system, nor do you need to pass any credentials around. All that's needed is to establish a trust relationship between the two systems and you're good to go.
This gives you a lot of flexibility and enhances your security posture. You will also remove the overhead of secret rotations from your workload.
In this post, we have seen how to establish a trust relationship between a bare-metal Kubernetes cluster and AWS IAM to grant cluster-generated Service Account tokens access to AWS services using OIDC.
Having this foundation in place, it's easy to extend this pattern to managed Kubernetes clusters such as Azure Kubernetes Service (AKS) or Google Kubernetes Engine (GKE). All you need from the managed Kubernetes cluster is the OIDC configuration endpoint, which in turn has the JWKs URL. With that, you can create the trust relationship in AWS or any other Service Provider and grant the relevant access to your services as needed.
Hope you've enjoyed reading the post as much as I've enjoyed writing it. I hope you've learned something new and useful from it.
Until next time, ciao 🤠 & happy coding! 🐧 🦀