DEV Community

Cover image for Horizontal Pod Autoscaler On EKS Cluster

Horizontal Pod Autoscaler On EKS Cluster

Abstract

This post introduce you Horizontal Pod Autoscaler (HPA) which is the best combination with cluster autoscaler to provide HA for your applications.

Table Of Contents


🚀 What is Horizontal Pod Autoscaler (HPA)

The Kubernetes Horizontal Pod Autoscaler automatically scales the number of pods in a deployment, replication controller, or replica set based on that resource's CPU utilization

🚀 Install metric-server

  • Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines. These metrics will drive the scaling behavior of the deployments.

flow

  • Without metric server you will get <unknown> metric when trying to add HPA
$ kubectl get hpa            
NAME   REFERENCE        TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
app    Deployment/app   <unknown>/85%   1         2         0          6s
Enter fullscreen mode Exit fullscreen mode
  • Describe the HPA and see it is not able to collect metrics
$ kubectl describe hpa app                                                                                                                                                                 

Events:                                                                                                  
    Type     Reason                        Age                    From                       Message
    ----     ------                        ----                   ----                       -------
    Warning  FailedComputeMetricsReplicas  4m5s (x12 over 6m54s)  horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource
    cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
    Warning  FailedGetResourceMetric       111s (x21 over 6m54s)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the
    requested resource (get pods.metrics.k8s.io)
Enter fullscreen mode Exit fullscreen mode
  • There's no metric installed yet
$ kubectl get apiservice|grep metric
Enter fullscreen mode Exit fullscreen mode
  • Now we deploy the Metrics Server
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Enter fullscreen mode Exit fullscreen mode
  • Check metric-server and apiService
$ kubectl get deployment metrics-server -n kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE                                                                                                                                                              
metrics-server   1/1     1            0           6s

$ kubectl get apiservice|grep metric
v1beta1.metrics.k8s.io                 kube-system/metrics-server   True        92s
Enter fullscreen mode Exit fullscreen mode
  • After install metric-server we can apply HPA and use following commands
$ kubectl top nodes
$ kubectl top pods
Enter fullscreen mode Exit fullscreen mode

🚀 Create HPA yaml

  • Use CDK8S to create k8s yaml files as code. Read more

  • An exmaple of HPA

from constructs import Construct
from cdk8s import Chart
from imports import k8s
from cdk8s import Chart, App


class AppHpa(Chart):
    def __init__(self, scope: Construct, id: str, name_space):
        super().__init__(scope, id, namespace=name_space)

        app_name = 'app'
        app_label = {'dev': app_name}
        k8s.KubeHorizontalPodAutoscalerV2Beta2(
            self, 'AppHpa',
            metadata=k8s.ObjectMeta(labels=app_label, name=app_name),
            spec=k8s.HorizontalPodAutoscalerSpec(
                max_replicas=2,
                min_replicas=1,
                scale_target_ref=k8s.CrossVersionObjectReference(
                    kind="Deployment",
                    name=app_name,
                    api_version='apps/v1'
                ),
                metrics=[
                    k8s.MetricSpec(
                        type='Resource',
                        resource=k8s.ResourceMetricSource(
                            name='cpu',
                            target=k8s.MetricTarget(type='Utilization', average_utilization=85)
                        )
                    )
                ]
            )
        )


app = App()
AppHpa(app, "app-hpa")
app.synth()
Enter fullscreen mode Exit fullscreen mode
  • Output after running cdk8s sync
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  labels:
    dev: app
  name: app
  namespace: dev8
spec:
  maxReplicas: 2
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 85
          type: Utilization
      type: Resource
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
Enter fullscreen mode Exit fullscreen mode

🚀 Test HPA with cluster autoscaler

  • Checkout Kubernetes Cluster Autoscaler With IRSA

  • Assume we set the targetCPUUtilizationPercentage to 10% and use the service then we see it auto scaling a new node (due to resource request MEM of app is high 1000Mi) to serve the new pod

NAME   REFERENCE        TARGETS    MINPODS   MAXPODS   REPLICAS   AGE                                                                                                                                              
app    Deployment/app   334%/10%   1         2         2          4m32s                                                                                                                                            

$ kubectl get pod |grep app                                                                                                                                                                
app-656ff5fcc8-8x875                     1/1     Running   0          19h
app-656ff5fcc8-n5htl                     0/1     Pending   0          49s

$ kubectl get node
NAME                                              STATUS   ROLES    AGE    VERSION
ip-10-3-162-16.ap-northeast-2.compute.internal    Ready    <none>   33h    v1.19.6-eks-49a6c0
ip-10-3-245-152.ap-northeast-2.compute.internal   Ready    <none>   2d7h   v1.19.6-eks-49a6c0
ip-10-3-249-203.ap-northeast-2.compute.internal   Ready    <none>   68s    v1.19.6-eks-49a6c0

$ kubectl get pod |grep app
app-656ff5fcc8-8x875                     1/1     Running   0          19h
app-656ff5fcc8-svkjl                     1/1     Running   0          2m9s
Enter fullscreen mode Exit fullscreen mode
  • Now we change the targetCPUUtilizationPercentage to 85% and see if the HPA scaledown the number of app
$ kubectl get hpa app
NAME   REFERENCE        TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
app    Deployment/app   4%/85%    1         2         2          11m

$ kubectl get pod |grep app
app-656ff5fcc8-8x875                     1/1     Running   0          19h

$ kubectl get hpa app
NAME   REFERENCE        TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
app    Deployment/app   5%/85%    1         2         1          12m
Enter fullscreen mode Exit fullscreen mode

🚀 Troubleshoot

  • When applying HPA I got the error failed to get cpu utilization: missing request for cpu
Events:
  Type     Reason                        Age                   From                       Message
  ----     ------                        ----                  ----                       -------
  Warning  FailedComputeMetricsReplicas  10m (x12 over 12m)    horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
  Warning  FailedGetResourceMetric       2m52s (x41 over 12m)  horizontal-pod-autoscaler  missing request for cpu

$ kubectl get hpa css                                                           
NAME   REFERENCE        TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
css    Deployment/css   <unknown>/85%   1         2         1          13m
Enter fullscreen mode Exit fullscreen mode
  • Failing with the metrics is because the POD is not 100% ready... We need to check its readinessProb and either resource request (in my case, I just need to add the resource request to it)
    resources:
      requests:
        cpu: 50m
        memory: 100Mi
Enter fullscreen mode Exit fullscreen mode
  • The first update, it recreate the pod and got high request
$ kubectl get hpa css                                                           
NAME   REFERENCE        TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
css    Deployment/css   1467%/85%   1         2         2          4m46s
Enter fullscreen mode Exit fullscreen mode
  • It scaleout one more pod and ater the target reduce
$ kubectl get hpa css
NAME   REFERENCE        TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
css    Deployment/css   43%/85%   1         2         2          6m22s

Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
Enter fullscreen mode Exit fullscreen mode
  • Later it removes the pod to meet the expected
$ kubectl get pod -l app=css
NAME                   READY   STATUS    RESTARTS   AGE
css-5645cb85fd-mtxd6   1/1     Running   0          10m
Enter fullscreen mode Exit fullscreen mode

vumdao image

Discussion (0)