DEV Community

Cover image for Horizontal Pod Autoscaler On EKS Cluster
🚀 Vu Dao 🚀 for AWS Community ASEAN

Posted on

Horizontal Pod Autoscaler On EKS Cluster

Abstract

This post introduce you Horizontal Pod Autoscaler (HPA) which is the best combination with cluster autoscaler to provide HA for your applications.

Table Of Contents


🚀 What is Horizontal Pod Autoscaler (HPA)

The Kubernetes Horizontal Pod Autoscaler automatically scales the number of pods in a deployment, replication controller, or replica set based on that resource's CPU utilization

🚀 Install metric-server

  • Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines. These metrics will drive the scaling behavior of the deployments.

flow

  • Without metric server you will get <unknown> metric when trying to add HPA ```

$ kubectl get hpa

NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
app Deployment/app /85% 1 2 0 6s


- Describe the HPA and see it is not able to collect metrics
Enter fullscreen mode Exit fullscreen mode

$ kubectl describe hpa app

Events:

Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedComputeMetricsReplicas 4m5s (x12 over 6m54s) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource
cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Warning FailedGetResourceMetric 111s (x21 over 6m54s) horizontal-pod-autoscaler unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the
requested resource (get pods.metrics.k8s.io)


- There's no metric installed yet
Enter fullscreen mode Exit fullscreen mode

$ kubectl get apiservice|grep metric


- Now we deploy the Metrics Server
Enter fullscreen mode Exit fullscreen mode

$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml


- Check metric-server and apiService
Enter fullscreen mode Exit fullscreen mode

$ kubectl get deployment metrics-server -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE

metrics-server 1/1 1 0 6s

$ kubectl get apiservice|grep metric
v1beta1.metrics.k8s.io kube-system/metrics-server True 92s


- After install metric-server we can apply HPA and use following commands
Enter fullscreen mode Exit fullscreen mode

$ kubectl top nodes
$ kubectl top pods


## 🚀 **Create HPA yaml** <a name="Create-HPA-yaml"></a>
- Use CDK8S to create k8s yaml files as code. [Read more](https://dev.to/vumdao/cdk8s-example-2glk)

- An exmaple of HPA

Enter fullscreen mode Exit fullscreen mode

from constructs import Construct
from cdk8s import Chart
from imports import k8s
from cdk8s import Chart, App

class AppHpa(Chart):
def init(self, scope: Construct, id: str, name_space):
super().init(scope, id, namespace=name_space)

    app_name = 'app'
    app_label = {'dev': app_name}
    k8s.KubeHorizontalPodAutoscalerV2Beta2(
        self, 'AppHpa',
        metadata=k8s.ObjectMeta(labels=app_label, name=app_name),
        spec=k8s.HorizontalPodAutoscalerSpec(
            max_replicas=2,
            min_replicas=1,
            scale_target_ref=k8s.CrossVersionObjectReference(
                kind="Deployment",
                name=app_name,
                api_version='apps/v1'
            ),
            metrics=[
                k8s.MetricSpec(
                    type='Resource',
                    resource=k8s.ResourceMetricSource(
                        name='cpu',
                        target=k8s.MetricTarget(type='Utilization', average_utilization=85)
                    )
                )
            ]
        )
    )
Enter fullscreen mode Exit fullscreen mode

app = App()
AppHpa(app, "app-hpa")
app.synth()


- Output after running `cdk8s sync`
Enter fullscreen mode Exit fullscreen mode

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
labels:
dev: app
name: app
namespace: dev8
spec:
maxReplicas: 2
metrics:
- resource:
name: cpu
target:
averageUtilization: 85
type: Utilization
type: Resource
minReplicas: 1
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: app


## 🚀 **Test HPA with cluster autoscaler** <a name="Test-HPA-with-cluster-autoscaler"></a>
- Checkout [Kubernetes Cluster Autoscaler With IRSA](https://dev.to/awscommunity-asean/kubernetes-cluster-autoscaler-with-irsa-3bg5)

- Assume we set the `targetCPUUtilizationPercentage` to **10%** and use the service then we see it auto scaling a new node (due to resource request MEM of app is high 1000Mi) to serve the new pod
Enter fullscreen mode Exit fullscreen mode

NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE

app Deployment/app 334%/10% 1 2 2 4m32s

$ kubectl get pod |grep app

app-656ff5fcc8-8x875 1/1 Running 0 19h
app-656ff5fcc8-n5htl 0/1 Pending 0 49s

$ kubectl get node
NAME STATUS ROLES AGE VERSION
ip-10-3-162-16.ap-northeast-2.compute.internal Ready 33h v1.19.6-eks-49a6c0
ip-10-3-245-152.ap-northeast-2.compute.internal Ready 2d7h v1.19.6-eks-49a6c0
ip-10-3-249-203.ap-northeast-2.compute.internal Ready 68s v1.19.6-eks-49a6c0

$ kubectl get pod |grep app
app-656ff5fcc8-8x875 1/1 Running 0 19h
app-656ff5fcc8-svkjl 1/1 Running 0 2m9s


- Now we change the `targetCPUUtilizationPercentage` to **85%** and see if the HPA scaledown the number of `app`
Enter fullscreen mode Exit fullscreen mode

$ kubectl get hpa app
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
app Deployment/app 4%/85% 1 2 2 11m

$ kubectl get pod |grep app
app-656ff5fcc8-8x875 1/1 Running 0 19h

$ kubectl get hpa app
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
app Deployment/app 5%/85% 1 2 1 12m


## 🚀 **Troubleshoot** <a name="Troubleshoot"></a>
- When applying HPA I got the error `failed to get cpu utilization: missing request for cpu`
Enter fullscreen mode Exit fullscreen mode

Events:
Type Reason Age From Message


Warning FailedComputeMetricsReplicas 10m (x12 over 12m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
Warning FailedGetResourceMetric 2m52s (x41 over 12m) horizontal-pod-autoscaler missing request for cpu

$ kubectl get hpa css

NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
css Deployment/css /85% 1 2 1 13m


- Failing with the metrics is because the POD is not 100% ready... We need to check its `readinessProb` and either resource request (in my case, I just need to add the resource request to it)
Enter fullscreen mode Exit fullscreen mode
resources:
  requests:
    cpu: 50m
    memory: 100Mi
Enter fullscreen mode Exit fullscreen mode

- The first update, it recreate the pod and got high request
Enter fullscreen mode Exit fullscreen mode

$ kubectl get hpa css

NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
css Deployment/css 1467%/85% 1 2 2 4m46s


- It scaleout one more pod and ater the target reduce
Enter fullscreen mode Exit fullscreen mode

$ kubectl get hpa css
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
css Deployment/css 43%/85% 1 2 2 6m22s

Conditions:
Type Status Reason Message


AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)


- Later it removes the pod to meet the expected
Enter fullscreen mode Exit fullscreen mode

$ kubectl get pod -l app=css
NAME READY STATUS RESTARTS AGE
css-5645cb85fd-mtxd6 1/1 Running 0 10m

Top comments (0)