Description

This article explains the steps needed to deploy a high-availability Kubernetes cluster in your environment. The cluster will consist of 3 control plane nodes and 3 worker nodes. We will not dive into the concepts behind the differences between control plane nodes and worker nodes, but these are some of the responsibilities of each:

  • Control node modules
    • kube-apiserver – the interface through which the cluster is controlled
    • etcd – data store for the cluster state
      • stacked – collocated with the control plane nodes
      • external – hosted on dedicated machines outside the control plane nodes
    • scheduler – assigns newly created pods to nodes
    • controller-manager – runs specific controllers (control loops) for checking nodes, executing tasks, etc.
  • Worker node modules
    • dns addon – dns resolution for pods and services
  • Common modules
    • kube-proxy – maintains network rules
    • kubelet – makes sure the containers are running and healthy inside their pods
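
Once the cluster built in this guide is up, most of these components are visible as pods in the kube-system namespace (the kubelet itself runs as a host service, not a pod). A quick, illustrative way to list them is:

Bash
# list the control plane components, kube-proxy, the dns addon, etc. running as pods
k8suser@master-1:~$ kubectl get pods -n kube-system -o wide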

Getting started

The following image shows the connection diagram between nodes and the interconnections between modules. We will have to add some parts, but most of the heavy lifting is done by the Kubernetes tool kubeadm.

[Image: k8s cluster topology]

Hardware:

  • 6 hosts with Ubuntu 22.04
    • 3 control hosts – 2 cores, 2 GB RAM (at least)
    • 3 worker hosts – 2 cores, 2 GB RAM (at least)

Steps:

  • Prepare systems
    • update package list, install needed packages
    • disable swap (k8s prerequisite)
  • Configure high availability for master nodes
    • configure failover with keepalived
    • configure http balancing
  • Configure kubernetes for nodes
    • configure master nodes
    • configure worker nodes
  • Test

Prepare systems

This stage should be performed on each of the 6 cluster nodes (masters and workers). You can speed up the process by using a VM template or a disk image of an already configured system.

Bash
# set hostname !! replace hostname on each node
k8suser@master-1:~$ sudo hostnamectl set-hostname "master-1"

# start new bash
k8suser@master-1:~$ exec bash

# update package list
k8suser@master-1:~$ sudo apt update

# upgrade installed packages
k8suser@master-1:~$ sudo apt upgrade

# disable swap
k8suser@master-1:~$ sudo swapoff -a

# after this /proc/swaps should be empty
k8suser@master-1:~$ cat /proc/swaps 
Filename				Type		Size		Used		Priority

Open /etc/fstab and make sure that no line with filesystem type “swap” exists:

Plaintext
UUID=30507c9c-18b0-41c2-a331-95f8af945a27 none swap sw 0 0 # <--- delete or comment
/swap.img     none    swap    sw      0       0 # <---- delete or comment
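
If you prefer to do this from the shell, a minimal sketch (assuming the swap entries use "swap" as the filesystem type, as above) is to comment them out with sed:

Bash
# comment out any swap entries in /etc/fstab (keeps a .bak backup of the original)
k8suser@master-1:~$ sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab

# nothing should be printed if no active swap entries remain
k8suser@master-1:~$ grep -v '^#' /etc/fstab | grep swap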

Enable ipv4 forwarding and bridged traffic:

Bash
# forwarding IPv4 and letting iptables see bridged traffic
k8suser@master-1:~$ cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

k8suser@master-1:~$ sudo modprobe overlay
k8suser@master-1:~$ sudo modprobe br_netfilter

# Persist parameters across reboots
k8suser@master-1:~$ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# Apply sysctl params without reboot
k8suser@master-1:~$ sudo sysctl --system
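
Optionally, verify that the modules are loaded and the parameters took effect:

Bash
# both modules should be listed
k8suser@master-1:~$ lsmod | grep -E 'overlay|br_netfilter'

# all three values should be 1
k8suser@master-1:~$ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward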

Install the containerd runtime:

Bash
# make sure the keyrings directory exists, then trust the Docker repository key
k8suser@master-1:~$ sudo install -m 0755 -d /etc/apt/keyrings
k8suser@master-1:~$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

k8suser@master-1:~$ echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

k8suser@master-1:~$ sudo apt update
k8suser@master-1:~$ sudo apt install containerd.io

# generate a default config and write it to the default location
k8suser@master-1:~$ containerd config default | sudo tee /etc/containerd/config.toml >/dev/null 2>&1

# use systemd as the cgroup manager
k8suser@master-1:~$ sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml

# use the same pause (sandbox) image version that the kubernetes tools expect
k8suser@master-1:~$ sudo sed -i 's/sandbox_image = "registry.k8s.io\/pause:3.6"/sandbox_image = "registry.k8s.io\/pause:3.9"/g' /etc/containerd/config.toml

# restart containerd
k8suser@master-1:~$ sudo systemctl restart containerd
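
A quick sanity check that containerd is running and that the cgroup driver change was picked up:

Bash
# should print "active"
k8suser@master-1:~$ systemctl is-active containerd

# SystemdCgroup should now be set to true
k8suser@master-1:~$ grep SystemdCgroup /etc/containerd/config.toml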

Install kubernetes tools:

Bash
# trust the kubernetes repo key; we'll use the xenial repo, a newer one may be used
k8suser@master-1:~$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/kubernetes-xenial.gpg

# add repo
k8suser@master-1:~$ sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"

k8suser@master-1:~$ sudo apt install kubeadm kubelet kubectl kubernetes-cni
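
Optionally, pin the packages so an unattended upgrade does not move the cluster to a new version, and check what was installed:

Bash
# prevent accidental upgrades of the kubernetes packages
k8suser@master-1:~$ sudo apt-mark hold kubeadm kubelet kubectl

# print the installed kubeadm version
k8suser@master-1:~$ kubeadm version -o short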

Configure high availability for master nodes

By completing this stage you make sure that failover and load balancing are configured for the control plane. The next steps should be performed only on the master nodes, that is 3 times for our example.

If we analyse the proposed topology again, we see that the cluster state is held and proxied by the master nodes, which raises the question of what happens if one of them disappears. We will mitigate this issue with keepalived. One of the nodes starts as MASTER, takes the VIP (a designated virtual IP address) and assigns it to its interface. If that node goes down for some reason, the BACKUP node with the highest priority becomes the new MASTER and sets the VIP address on its own interface.

Bash
k8suser@master-1:~$ sudo mkdir /etc/keepalived

k8suser@master-1:~$ sudo vim /etc/keepalived/keepalived.conf
Plaintext
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
  router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}
vrrp_instance VI_1 {
  state ${STATE}
  interface ${INTERFACE}
  virtual_router_id ${ROUTER_ID}
  priority ${PRIORITY}
  authentication {
    auth_type PASS
    auth_pass ${AUTH_PASS}
  }
  virtual_ipaddress {
    ${APISERVER_VIP}
  }
  track_script {
    check_apiserver
  }
}

Replace placeholders with:

  • ${STATE} – MASTER on the default master and BACKUP on the other 2
  • ${INTERFACE} – enp0s3 or any other interface on the same network segment as the other nodes
  • ${ROUTER_ID} – any number, must be the same on all nodes
  • ${PRIORITY} – any number, determines the selection priority for the next master (a higher value wins)
  • ${AUTH_PASS} – a common password for all nodes
  • ${APISERVER_VIP} – the VIP address that will be used to access the API Server through kubectl
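
As an illustration only (the router id, priority and password below are example values; the interface and VIP match the ones used elsewhere in this guide and must be adapted to your network), the file on master-1 could be written with the same heredoc pattern used earlier:

Bash
# example values for master-1 -- adjust to your own network
k8suser@master-1:~$ cat <<EOF | sudo tee /etc/keepalived/keepalived.conf
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
  router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}
vrrp_instance VI_1 {
  state MASTER
  interface enp0s3
  virtual_router_id 51
  priority 101
  authentication {
    auth_type PASS
    auth_pass 42
  }
  virtual_ipaddress {
    192.168.0.60
  }
  track_script {
    check_apiserver
  }
}
EOF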

Add the health status script used by the configuration.

Bash
k8suser@master-1:~$ sudo vim /etc/keepalived/check_apiserver.sh
Bash
#!/bin/sh
# !!! change the values below to match your setup !!!
APISERVER_DEST_PORT="8443"
APISERVER_VIP="192.168.0.60"
errorExit() {
  echo "*** $*" 1>&2
  exit 1
}
curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
  curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi
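
keepalived executes this check as an external script, so make sure it is executable:

Bash
k8suser@master-1:~$ sudo chmod +x /etc/keepalived/check_apiserver.sh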

Next, we will load balance across the API servers in a round-robin fashion with haproxy.

Bash
k8suser@master-1:~$ sudo mkdir /etc/haproxy
k8suser@master-1:~$ sudo vim /etc/haproxy/haproxy.cfg
Plaintext
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s
#---------------------------------------------------------------------
# apiserver frontend which proxies to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:${APISERVER_DEST_PORT}
    mode tcp
    option tcplog
    default_backend apiserver
#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
        server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
        # modify the configuration as you add more nodes
        # [...]

Replace placeholders with:

  • ${APISERVER_DEST_PORT} – the VIP port – ex.: 8443
  • ${APISERVER_SRC_PORT} – API Server port – ex.: 6443
  • ${HOST1_ID} – a node id for the server – ex: master-1
  • ${HOST1_ADDRESS} – the real address of the server
  • should look like:
    • server master-1 192.168.0.50:6443 check
    • server master-2 192.168.0.51:6443 check
    • server master-3 192.168.0.52:6443 check

The keepalived and haproxy configurations are now in place and we need to start using them. To run these daemons we will rely on the kubelet: it manages the lifecycle of static pods defined in the /etc/kubernetes/manifests folder and will start them for us.

Bash
k8suser@master-1:~$ sudo vim  /etc/kubernetes/manifests/keepalived.yaml
YAML
---
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  name: keepalived
  namespace: kube-system
spec:
  containers:
    - image: osixia/keepalived:2.0.17
      name: keepalived
      resources: {}
      securityContext:
        capabilities:
          add:
            - NET_ADMIN
            - NET_BROADCAST
            - NET_RAW
      volumeMounts:
        - mountPath: /usr/local/etc/keepalived/keepalived.conf
          name: config
        - mountPath: /etc/keepalived/check_apiserver.sh
          name: check
  hostNetwork: true
  volumes:
    - hostPath:
        path: /etc/keepalived/keepalived.conf
      name: config
    - hostPath:
        path: /etc/keepalived/check_apiserver.sh
      name: check
status: {}
Bash
k8suser@master-1:~$ sudo vim /etc/kubernetes/manifests/haproxy.yaml
YAML
---
apiVersion: v1
kind: Pod
metadata:
  name: haproxy
  namespace: kube-system
spec:
  containers:
    - image: haproxy:2.1.4
      name: haproxy
      livenessProbe:
        failureThreshold: 8
        httpGet:
          host: localhost
          path: /healthz
          port: 8443
          scheme: HTTPS
      volumeMounts:
        - mountPath: /usr/local/etc/haproxy/haproxy.cfg
          name: haproxyconf
          readOnly: true
  hostNetwork: true
  volumes:
    - hostPath:
        path: /etc/haproxy/haproxy.cfg
        type: FileOrCreate
      name: haproxyconf
status: {}
Bash
# set an FQDN for our VIP !! add this entry to /etc/hosts on every node
k8suser@master-1:~$ sudo vim /etc/hosts
192.168.0.60	cluster-1.flobinsa.com

# on master-1 !!!
k8suser@master-1:~$ sudo kubeadm init --control-plane-endpoint cluster-1.flobinsa.com:8443 --pod-network-cidr 10.244.0.0/16 --upload-certs

k8suser@master-1:~$ mkdir -p $HOME/.kube

k8suser@master-1:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

k8suser@master-1:~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

# configure network
k8suser@master-1:~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# check node status
k8suser@master-1:~$ kubectl get nodes
NAME       STATUS   ROLES           AGE    VERSION
master-1   Ready    control-plane   6m4s   v1.27.1

# on master-2, master-3 !!! Parameters are to be taken from the output of the previous command
k8suser@master-2:~$ sudo kubeadm join cluster-1.flobinsa.com:8443 --token ${token} \
	--discovery-token-ca-cert-hash ${cert-hash} \
	--control-plane --certificate-key ${cert-key}
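
Once master-2 and master-3 have joined, a quick way to confirm that the load-balancing static pods came up on every master (run from any node holding the admin kubeconfig) is:

Bash
# keepalived and haproxy should appear once per master node
k8suser@master-1:~$ kubectl get pods -n kube-system -o wide | grep -E 'keepalived|haproxy'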

To join the worker nodes, follow all the steps from the preparation stage on each of them and then issue the join command.

Bash
# on worker-1, worker-2, worker-3 !!! Parameters are to be taken from the output of the master initialization
k8suser@worker-1:~$ sudo kubeadm join cluster-1.flobinsa.com:8443 --token ${token} \
	--discovery-token-ca-cert-hash ${cert-hash}

Test

Bash
k8suser@master-1:~$ kubectl get nodes
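
All 6 nodes should report a Ready status. As a further smoke test you can schedule a throwaway workload and confirm it is spread across the workers (the deployment name and image here are just an example):

Bash
k8suser@master-1:~$ kubectl create deployment test-nginx --image=nginx --replicas=3
k8suser@master-1:~$ kubectl get pods -o wide

# clean up
k8suser@master-1:~$ kubectl delete deployment test-nginx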
