Last week we covered the theory of high availability in a bare-metal Kubernetes cluster, which means that this week is where the magic happens.
First of all, there are a few dependencies that you need to have installed to initialize a Kubernetes cluster. Since this is not a guide on how to set up Kubernetes, I will assume that you have already done this before, and if not, you can use the same guide that I used when installing Kubernetes for the first time: guide.
Even if you did not follow that guide but have already installed Kubernetes and Docker (or your favorite container runtime), you will also have kubeadm, the Kubernetes tool we will use to initialize the cluster. But first, we need to deal with the problems of high availability that we discussed last week.
The stable control plane IP
As mentioned, we will use a self-hosted solution where we set up a stable IP with HAProxy and Keepalived as pods inside the Kubernetes cluster. To achieve this, we will need to configure a few files for each master node:
- A keepalived configuration.
- A keepalived health check script.
- A manifest file for the keepalived static pod.
- An HAProxy configuration file.
- A manifest file for the HAProxy static pod.
Keepalived:
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}
vrrp_instance VI_1 {
    state ${STATE}
    interface ${INTERFACE}
    virtual_router_id ${ROUTER_ID}
    priority ${PRIORITY}
    authentication {
        auth_type PASS
        auth_pass ${AUTH_PASS}
    }
    virtual_ipaddress {
        ${APISERVER_VIP}
    }
    track_script {
        check_apiserver
    }
}
We have some placeholders in bash syntax that we need to fill out, either manually or through scripting (see the sketch after this list):
- STATE: Will be MASTER for the node initializing the cluster, because it will also be the first one to host the virtual IP address of the control plane.
- INTERFACE: The network interface of the network where the nodes communicate. For Ethernet connections this is often eth0, and it can be found with the command ifconfig on most Linux operating systems.
- ROUTER_ID: Needs to be the same for all the hosts. Often set to 51.
- PRIORITY: A unique number that decides which node should host the virtual IP of the control plane in case the first MASTER node goes down. Often set to 100 for the node initializing the cluster, and then decreasing values for the rest.
- AUTH_PASS: Should be the same for all nodes. Often set to 42.
- APISERVER_VIP: The virtual IP for the control plane. This will be created.
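If you prefer scripting it, here is a minimal sketch, assuming you keep the file above as a template (the name keepalived.conf.template is just an example) and have envsubst from the gettext package installed:
# Hypothetical example values for the first master node; adjust per node.
export STATE=MASTER
export INTERFACE=eth0
export ROUTER_ID=51
export PRIORITY=100
export AUTH_PASS=42
export APISERVER_VIP=192.168.1.50

# Render the template into the real configuration file.
envsubst < keepalived.conf.template | sudo tee /etc/keepalived/keepalived.conf > /dev/null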
For the health check script we have the following:
#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
    curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi
We see the APISERVER_VIP placeholder again, which means the same as before. Variables that repeat keep their earlier meaning, so I will not repeat the explanation. That leaves one new variable: APISERVER_DEST_PORT, the front-end port on the virtual IP for the API server. This can be any unused port, e.g. 4200.
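Two quick checks are worth doing here. The snippet below is a sketch that assumes 4200 as the chosen port and that the script is saved at the path listed in the Files section further down:
# Verify that the chosen front-end port is not already in use.
ss -tln | grep -w 4200 || echo "port 4200 looks free"

# Make the health check script executable so keepalived can run it.
sudo chmod +x /etc/keepalived/check_apiserver.sh
Note that the check itself will only pass once the API server is actually running on the node.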
Last, the manifest file for Keepalived:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  name: keepalived
  namespace: kube-system
spec:
  containers:
  - image: osixia/keepalived:1.3.5-1
    name: keepalived
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
        - NET_BROADCAST
        - NET_RAW
    volumeMounts:
    - mountPath: /usr/local/etc/keepalived/keepalived.conf
      name: config
    - mountPath: /etc/keepalived/check_apiserver.sh
      name: check
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/keepalived/keepalived.conf
    name: config
  - hostPath:
      path: /etc/keepalived/check_apiserver.sh
    name: check
status: {}
This manifest creates a static pod that mounts the two configuration files from the host.
HAProxy
We have one configuration file for HAProxy:
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 1
    timeout http-request 10s
    timeout queue 20s
    timeout connect 5s
    timeout client 20s
    timeout server 20s
    timeout http-keep-alive 10s
    timeout check 10s

#---------------------------------------------------------------------
# apiserver frontend which proxies to the masters
#---------------------------------------------------------------------
frontend apiserver
    bind *:${APISERVER_DEST_PORT}
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance roundrobin
    server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
    server ${HOST2_ID} ${HOST2_ADDRESS}:${APISERVER_SRC_PORT} check
    server ${HOST3_ID} ${HOST3_ADDRESS}:${APISERVER_SRC_PORT} check
Here, we plug in the control plane IPs. Assuming a 3-node cluster, we input a symbolic HOST_ID (just a unique name) for each node, as well as its HOST_ADDRESS. APISERVER_SRC_PORT is by default 6443, the port where the API server listens for traffic.
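As with Keepalived, these placeholders can be filled in by hand or rendered from a template. Here is a minimal sketch with hypothetical host names and addresses, assuming a haproxy.cfg.template copy of the file above:
# Hypothetical host IDs and addresses; replace with your own machines.
export HOST1_ID=master-1 HOST1_ADDRESS=192.168.1.140
export HOST2_ID=master-2 HOST2_ADDRESS=192.168.1.141
export HOST3_ID=master-3 HOST3_ADDRESS=192.168.1.142
export APISERVER_SRC_PORT=6443 APISERVER_DEST_PORT=4200

# Render the template into the real configuration file.
envsubst < haproxy.cfg.template | sudo tee /etc/haproxy/haproxy.cfg > /dev/null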
The last file is the HAProxy manifest file:
apiVersion: v1
kind: Pod
metadata:
  name: haproxy
  namespace: kube-system
spec:
  containers:
  - image: haproxy:2.1.4
    name: haproxy
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: localhost
        path: /healthz
        port: ${APISERVER_DEST_PORT}
        scheme: HTTPS
    volumeMounts:
    - mountPath: /usr/local/etc/haproxy/haproxy.cfg
      name: haproxyconf
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/haproxy/haproxy.cfg
      type: FileOrCreate
    name: haproxyconf
status: {}
This is all we actually need to configure to get a cluster up and running. Some of these values are constants that must be the same on all three master nodes, some must vary between nodes, some you simply look up, and for some you have to make a decision.
Values sanity check
Let us do a quick sanity check of the variables and what they should be for each node.
Constants
ROUTER_ID=51
AUTH_PASS=42
APISERVER_SRC_PORT=6443
Variables to input
- STATE: MASTER for the node that initializes the cluster, BACKUP for the two others.
- PRIORITY: 100 for the node that initializes the cluster, 99 and 98 for the two others.
Variables to retrieve
- APISERVER_VIP: An IP within your network subnet. If your node has IP 192.168.1.140, this could be 192.168.1.50.
- APISERVER_DEST_PORT: A port of your choosing. Must not conflict with other service ports.
- INTERFACE: The network interface. Use ifconfig to find it (see the note after this list).
- HOSTX_ID: Any unique name for each of the 3 master nodes.
- HOSTX_ADDRESS: The IP addresses of your machines. Can also be found with ifconfig on each machine.
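A small note on retrieving these: ifconfig comes from the net-tools package, which some newer distributions no longer install by default. If it is missing, the ip tool (iproute2) gives the same information:
# Lists every interface with its addresses, covering both INTERFACE
# and HOSTX_ADDRESS in one command.
ip -br addr show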
Files
Now that the files are configured, they should be put in the right destinations so that kubeadm can find them when the cluster initializes.
The absolute file paths are:
/etc/keepalived/check_apiserver.sh
/etc/keepalived/keepalived.conf
/etc/haproxy/haproxy.cfg
/etc/kubernetes/manifests/keepalived.yaml
/etc/kubernetes/manifests/haproxy.yaml
Putting the manifest files into /etc/kubernetes/manifests/ is what does the magic here. Everything in this folder is applied when the cluster initializes; even the control plane pods generated by kubeadm are placed here before the cluster comes up.
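Assuming you wrote the five files locally under the same base names as the destination paths above (just how I refer to them here), copying them into place could look like this:
# Create the destination directories and copy the files into place.
sudo mkdir -p /etc/keepalived /etc/haproxy /etc/kubernetes/manifests
sudo cp keepalived.conf check_apiserver.sh /etc/keepalived/
sudo cp haproxy.cfg /etc/haproxy/
sudo cp keepalived.yaml haproxy.yaml /etc/kubernetes/manifests/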
Initializing the cluster
When the files are in place, initializing the cluster is as simple as running the kubeadm init command with a few extra pieces of information.
kubeadm init --control-plane-endpoint ${APISERVER_VIP}:${APISERVER_DEST_PORT} --upload-certs
This will do the trick. The extra arguments tell the cluster that the control plane should not be contacted on an individual node's IP, but on the virtual IP address. When the other nodes join, this is what makes the cluster highly available: if the node currently hosting the virtual IP goes down, the virtual IP simply jumps to another available master node.
Last, join the other two nodes to the cluster with the join command output by kubeadm init.
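Once kubeadm init finishes, a quick way to confirm that everything works: the first three commands below are the standard ones kubeadm prints at the end of init, followed by a check that the virtual IP actually landed on the node.
# Set up kubectl access as suggested by kubeadm init.
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# The keepalived and haproxy static pods should show up here.
kubectl get pods -n kube-system

# The MASTER node should now hold the virtual IP on its interface.
ip addr show eth0   # eth0 is an example; use your own INTERFACE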
If this even piqued your interest a little bit, you are in for a treat. The whole manual process is being eliminated in an open-source project right here. It is still a work in progress, but feel free to drop in and join the discussion.