
Making HPA more responsive for resource-based scaling in OpenFaaS

Itzik Kotler

I've got a Raspberry Pi 4 cluster running Kubernetes. Thanks to k3sup and arkade I got it up and running in a jiffy. Here are a few tutorials that I followed to get there:

Having said that, k3s is a little bit different, and this post is about one of those differences: how to change the Kubernetes Horizontal Pod Autoscaler "cool down" period (aka --horizontal-pod-autoscaler-downscale-stabilization) in k3s.

Recently I started playing with HPAv2 (aka Kubernetes Horizontal Pod Autoscaling) and OpenFaaS. The reason is that my OpenFaaS functions tend to be CPU- and memory-intensive (as opposed to seeing high API hit rates, which is what the built-in OpenFaaS autoscaler/alertmanager focuses on).

OpenFaaS and HPAv2 play nicely together, and I started by following the guide here:

https://docs.openfaas.com/tutorials/kubernetes-hpa/
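As a rough sketch of what that setup boils down to, the helper below only prints a `kubectl autoscale` command for a function's deployment so it can be reviewed before running. The helper name `hpa_command`, the function name `nodeinfo`, and the default thresholds are assumptions for illustration; `openfaas-fn` is the namespace OpenFaaS uses for functions by default.

```shell
#!/bin/sh
# hpa_command FUNCTION [CPU_PERCENT] [MIN] [MAX]
# Hypothetical helper: prints (does not run) a kubectl command that would
# attach a CPU-based HPA to an OpenFaaS function's deployment.
hpa_command() {
    fn=$1
    cpu=${2:-50}   # target average CPU utilization, in percent (assumed default)
    min=${3:-1}    # minimum replicas (assumed default)
    max=${4:-10}   # maximum replicas (assumed default)
    echo "kubectl autoscale deployment $fn -n openfaas-fn --cpu-percent=$cpu --min=$min --max=$max"
}
```

Calling `hpa_command nodeinfo 50 1 10` prints the command; piping the output to `sh` would run it against the cluster.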

However, as that guide correctly states, HPAv2 scale-down is a slow process (the default wait is 5 minutes). The reason is explained here:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

And to be more specific:

When managing the scale of a group of replicas using the Horizontal Pod Autoscaler, it is possible that the number of replicas keeps fluctuating frequently due to the dynamic nature of the metrics evaluated. This is sometimes referred to as thrashing.

Starting from v1.6, a cluster operator can mitigate this problem by tuning the global HPA settings exposed as flags for the kube-controller-manager component:

Starting from v1.12, a new algorithmic update removes the need for the upscale delay.

--horizontal-pod-autoscaler-downscale-stabilization: The value for this option is a duration that specifies how long the autoscaler has to wait before another downscale operation can be performed after the current one has completed. The default value is 5 minutes (5m0s).

Now, I'm using k3s (v1.17.2+k3s1), and in order to change this setting I need to pass a new value to kube-controller-manager. k3s is awesome, but it doesn't include the usual manifest that you'd find on the master node at:

/etc/kubernetes/manifests/kube-controller-manager.yaml

So I did some poking around and found that it's possible to pass this flag and its value to k3s at startup. In other words: we need to shut down k3s on the master node, edit the k3s systemd unit to add the flag, and start it again.

Here are the steps:

  1. SSH to your master node

  2. Run: sudo systemctl stop k3s.service

  3. Edit /etc/systemd/system/k3s.service with your favorite editor (e.g., sudo vim /etc/systemd/system/k3s.service)

  4. Go to the ExecStart section and append the following to the command-line options (remember to add a trailing backslash to what was previously the last line):

'--kube-controller-manager-arg' \
'horizontal-pod-autoscaler-downscale-stabilization=1m'

Here's an example of my ExecStart entry:

ExecStart=/usr/local/bin/k3s \
    server \
    '--tls-san' \
    '192.168.86.180' \
    '--no-deploy' \
    'servicelb' \
    '--no-deploy' \
    'traefik' \
    '--kube-controller-manager-arg' \
    'horizontal-pod-autoscaler-downscale-stabilization=1m'

(Note: in my setup I've also disabled the built-in LB and the traefik v1 ingress; this was done for me when I passed --no-extras to k3sup.)

  5. Run: sudo systemctl daemon-reload

  6. Run: sudo systemctl start k3s.service
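If you'd rather script the edit in step 4 than do it by hand, here is a minimal sketch. It assumes GNU sed (for `-i`) and that the ExecStart block is the very last thing in the unit file with no trailing blank line, as in my example; check your own file before relying on it. The function name `append_hpa_flag` is made up for this post.

```shell
#!/bin/sh
# append_hpa_flag UNIT_FILE [DURATION]
# Hypothetical helper: appends the kube-controller-manager flag to a k3s unit
# file. ASSUMPTION: the ExecStart entry is the last thing in the file.
append_hpa_flag() {
    unit_file=$1
    duration=${2:-1m}

    # Do nothing if the flag is already present, so re-running is safe.
    if grep -q 'horizontal-pod-autoscaler-downscale-stabilization' "$unit_file"; then
        return 0
    fi

    # Add a line continuation to what is currently the last line...
    sed -i '$ s/$/ \\/' "$unit_file"

    # ...then append the new option and its value as two more lines.
    printf "    '--kube-controller-manager-arg' \\\\\n    'horizontal-pod-autoscaler-downscale-stabilization=%s'\n" \
        "$duration" >> "$unit_file"
}
```

Try it on a copy first, e.g. `cp /etc/systemd/system/k3s.service /tmp/k3s.service && append_hpa_flag /tmp/k3s.service 1m`, and compare the result with the original before replacing anything. The stop, daemon-reload and start steps stay the same.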

That's it!

We have successfully changed the "cool down" period to one minute (i.e., 1m) in k3s. Now it's time to go back to our OpenFaaS functions and enjoy more responsive resource-based scaling.
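As a quick sanity check, the sketch below reads the configured value back out of a unit file. It only confirms what is written in the file, not what the running controller manager is actually using, and the function name `configured_stabilization` is made up for this post.

```shell
#!/bin/sh
# configured_stabilization UNIT_FILE
# Hypothetical helper: prints the downscale stabilization value found in the
# given unit file, or nothing if the flag is not set there.
configured_stabilization() {
    # Capture everything after "=" up to the closing quote or a space.
    sed -n "s/.*horizontal-pod-autoscaler-downscale-stabilization=\([^' ]*\).*/\1/p" "$1" | head -n 1
}
```

On my setup I'd run `configured_stabilization /etc/systemd/system/k3s.service`. To see the effect itself, watching an HPA with `kubectl get hpa -n openfaas-fn --watch` while load drops off should show the replica count shrinking after roughly a minute instead of five.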
