
Kubernetes: HorizontalPodAutoscaler — an overview with examples

Arseny Zinchenko · Originally published at itnext.io · 19 min read

Kubernetes HorizontalPodAutoscaler automatically scales Kubernetes Pods under ReplicationController, Deployment, or ReplicaSet controllers based on their CPU, memory, or other metrics.

It was briefly discussed in the Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler post; now let's go deeper and check all the options available for scaling.

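For HPA you can use three API types:

  • metrics.k8s.io: resource metrics (CPU and memory), usually provided by metrics-server
  • custom.metrics.k8s.io: custom metrics served by an adapter, for example the Prometheus Adapter
  • external.metrics.k8s.io: external metrics that come from outside the cluster

Documentation: Support for metrics APIs, and Custom and external metrics for autoscaling workloads.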

Besides the HorizontalPodAutoscaler (HPA), you can also use Vertical Pod Autoscaling (VPA). They can be used together, although with some limitations, see Horizontal Pod Autoscaling Limitations.

Create HorizontalPodAutoscaler

Let's start with a simple HPA which will scale pods based on CPU usage:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 10

Here:

  • apiVersion: autoscaling/v1: the autoscaling API group. Pay attention to the API version: at the time of writing, v1 supports scaling by CPU metrics only, so memory and custom metrics require the v2beta2 API (you can still use them with v1 via annotations), see API Object.
  • spec.scaleTargetRef: tells the HPA which controller to scale (ReplicationController, Deployment, or ReplicaSet). In this case, the HPA will look for the Deployment object called deployment-example.
  • spec.minReplicas, spec.maxReplicas: the minimum and maximum number of pods this HPA may run.
  • targetCPUUtilizationPercentage: the target average CPU usage, as a percentage of the pods' CPU requests, above or below which the HPA adds or removes pods.
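To make the percentage concrete: utilization is measured against the CPU request of the pod, not against the node's capacity. Below is a minimal Python sketch of that arithmetic. The function name and the sample numbers are illustrative assumptions, not something Kubernetes exposes.

```python
def cpu_utilization_percent(usage_millicores: int, request_millicores: int) -> int:
    """Return CPU usage as a whole percentage of the requested CPU."""
    if request_millicores <= 0:
        raise ValueError("a CPU request is needed to compute utilization")
    return usage_millicores * 100 // request_millicores


# Illustrative values: a pod that requests 100m of CPU and currently uses 49m.
print(cpu_utilization_percent(49, 100))  # 49
```

With the target set to 10, a pod that requests 100m starts asking for more replicas once its average usage climbs noticeably above 10m.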

Create it:

$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example created

Check:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example <unknown>/10% 1 5 0 89s

Currently, its TARGETS has the <unknown> value as there are no pods created yet, but metrics are already available:

$ kubectl get --raw "/apis/metrics.k8s.io/" | jq
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "metrics.k8s.io",
  "versions": [
    {
      "groupVersion": "metrics.k8s.io/v1beta1",
      "version": "v1beta1"
    }
  ],
  "preferredVersion": {
    "groupVersion": "metrics.k8s.io/v1beta1",
    "version": "v1beta1"
  }
}

Add the deployment-example Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-example
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      application: deployment-example
  template:
    metadata:
      labels:
        application: deployment-example
    spec: 
      containers:
      - name: deployment-example-pod
        image: nginx
        ports:
          - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 100Mi

Here we defined a Deployment which will spin up one pod with NGINX, with requests for 100 millicores of CPU and 100 mebibytes of memory, see Kubernetes best practices: Resource requests and limits.

Create it:

$ kubectl apply -f hpa-deployment-example.yaml
deployment.apps/deployment-example created

Check the HPA now:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 0%/10% 1 5 1 14m

Our HPA found the deployment and started checking its pods' metrics.

Let's check those metrics. First, find a pod:

$ kubectl get pod | grep example | cut -d " " -f 1
deployment-example-86c47f5897-2mzjd

And run the following API request:

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd | jq
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "deployment-example-86c47f5897-2mzjd",
    "namespace": "default",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd",
    "creationTimestamp": "2020-08-07T10:41:21Z"
  },
  "timestamp": "2020-08-07T10:40:39Z",
  "window": "30s",
  "containers": [
    {
      "name": "deployment-example-pod",
      "usage": {
        "cpu": "0",
        "memory": "2496Ki"
      }
    }
  ]
}

CPU usage is zero and memory is about 2 MiB. Let's confirm with kubectl top:

$ kubectl top pod deployment-example-86c47f5897-2mzjd
NAME CPU(cores) MEMORY(bytes)
deployment-example-86c47f5897-2mzjd 0m 2Mi

“Alright, these guys!” (с)

Okay, we've got our metrics and we've created the HPA and the Deployment. Let's see how the scaling works.

Load testing HorizontalPodAutoscaler scaling

For load testing, we can use the loadimpact/loadgentest-wrk utility image.

Now, set up port forwarding from the local workstation to the pod with NGINX, as we didn't add any LoadBalancer (see Kubernetes: ClusterIP vs NodePort vs LoadBalancer, Services, and Ingress - an overview with examples):

$ kubectl port-forward deployment-example-86c47f5897-2mzjd 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80

Check resources once again:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 0%/10% 1 5 1 33m

0% CPU is used, one pod is running (REPLICAS 1).

Run the test:

$ docker run --rm --net=host loadimpact/loadgentest-wrk -c 100 -t 100 -d 5m http://127.0.0.1:8080
Running 5m test @ http://127.0.0.1:8080

Here:

  • open 100 connections using 100 threads
  • run the test for 5 minutes

Check the pod:

$ kubectl top pod deployment-example-86c47f5897-2mzjd
NAME CPU(cores) MEMORY(bytes)
deployment-example-86c47f5897-2mzjd 49m 2Mi

CPU usage is now 49m (millicores). The pod requests 100m and the HPA target is 10% of that, i.e. 10m, so check the HPA:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 49%/10% 1 5 4 42m

TARGETS shows 49% against the 10% target, so our HPA started new pods and REPLICAS is now 4:

$ kubectl get pod | grep example
deployment-example-86c47f5897-2mzjd 1/1 Running 0 31m
deployment-example-86c47f5897-4ntd4 1/1 Running 0 24s
deployment-example-86c47f5897-p7tc7 1/1 Running 0 8s
deployment-example-86c47f5897-q49gk 1/1 Running 0 24s
deployment-example-86c47f5897-zvdvz 1/1 Running 0 24s
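The number of pods follows from the formula in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped between minReplicas and maxReplicas. Here is a minimal Python sketch of that calculation. The function and parameter names are mine, and the 0.1 tolerance is the controller's documented default, which your cluster may override.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int, max_replicas: int, tolerance: float = 0.1) -> int:
    """Replica count proposed by the documented HPA formula, clamped to min/max."""
    ratio = current_value / target_value
    # Within the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        proposed = current_replicas
    else:
        proposed = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, proposed))


# One pod at 49% utilization against a 10% target, with maxReplicas: 5.
print(desired_replicas(1, 49, 10, min_replicas=1, max_replicas=5))  # 5
```

This matches the five pods in the list above. The sketch ignores stabilization windows, pods that are not ready yet, and missing metrics, all of which the real controller takes into account.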

Multi-metrics scaling

Okay, we were able to scale by the CPU usage, but what if you want to scale by both CPU and memory usage?

Add another Resource metric, with the memory target set to the same 10%:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 10
  - type: Resource
    resource:
      name: memory 
      targetAverageUtilization: 10

autoscaling API group versions

Let’s go back to the API versions.

In the first manifest for our HPA, we used the autoscaling/v1 API, which has only the targetCPUUtilizationPercentage parameter.

Check autoscaling/v2beta1: it adds the metrics field, a MetricSpec array which can hold four new fields: external, object, pods, and resource.

In its turn, resource holds a ResourceMetricSource with two fields, targetAverageUtilization and targetAverageValue, which are now used in metrics instead of targetCPUUtilizationPercentage.

Apply the HPA update:

$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example configured

Check it:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 3%/10%, 0%/10% 1 5 2 126m

TARGETS now displays two metrics: CPU and memory.

It's hard to make NGINX consume a lot of memory, so let's see how much it uses now with the following kubectl command:

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-c4d6f96db-jv6nm | jq '.containers[].usage.memory'
"2824Ki"

About 2.8 MiB.

Let's update our HPA once again and set a new target, but this time we'll use a raw value instead of a percentage: 1024Ki (1 MiB), using targetAverageValue instead of the previously used targetAverageUtilization:

...
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 10
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 1024Ki

Apply and check:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 2551808/1Mi, 0%/10% 1 5 3 2m8s

REPLICAS == 3 now, the pods were scaled. Check the value from TARGETS by converting it from bytes to kibibytes:

$ echo 2551808/1024 | bc
2492

And check real memory usage:

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-c4d6f96db-fldl2 | jq '.containers[].usage.memory'
"2496Ki"

2492 ~= 2496Ki.

Okay, so now we are able to scale the Deployment by both CPU and memory usage.
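When an HPA has several metrics, the controller computes a replica proposal for each metric separately and uses the largest one. The Python sketch below illustrates this with the TARGETS values shown earlier (memory 2551808 bytes against 1Mi, CPU 0% against 10%). The function names are mine, and starting from a single replica is an assumption, as the post does not show the count at the moment of the calculation.

```python
import math


def replicas_for_metric(current_replicas: int, current: float, target: float) -> int:
    """Proposal for one metric: ceil(currentReplicas * current / target)."""
    return math.ceil(current_replicas * current / target)


def replicas_for_all(current_replicas: int, metrics, max_replicas: int) -> int:
    """Take the largest per-metric proposal and cap it at max_replicas."""
    proposals = [replicas_for_metric(current_replicas, cur, tgt) for cur, tgt in metrics]
    return min(max_replicas, max(proposals))


# (current, target) pairs: CPU utilization in percent, memory in bytes.
metrics = [(0, 10), (2551808, 1024 * 1024)]
print(replicas_for_all(1, metrics, max_replicas=5))  # 3
```

So an idle CPU does not hold the Deployment back: the memory metric alone is enough to ask for three replicas.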

Custom Metrics

Memory metrics scaling

Apart from metrics provided by the API server and cAdvisor, we can use any other metrics, for example metrics collected by Prometheus.

These can be metrics collected by a CloudWatch exporter, Prometheus' node_exporter, or metrics from an application.

Documentation is here>>>.

As we are using Prometheus (see Kubernetes: monitoring with Prometheus - exporters, a Service Discovery, and its roles and Kubernetes: a cluster's monitoring with the Prometheus Operator), let's add its adapter.

If you try to access the external or custom API endpoints now, you'll get an error:

$ kubectl get --raw /apis/custom.metrics.k8s.io/
Error from server (NotFound): the server could not find the requested resource

$ kubectl get --raw /apis/external.metrics.k8s.io/
Error from server (NotFound): the server could not find the requested resource

Install the adapter from the Helm chart:

$ helm install prometheus-adapter stable/prometheus-adapter
NAME: prometheus-adapter
LAST DEPLOYED: Sat Aug 8 13:27:36 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.

In a few minutes you should be able to list metrics using the following command(s):

$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta

Wait for a minute or two and check API again:

$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": []
}

Well, but why is "resources": [] empty?

Check the adapter’s pod logs:

$ kubectl logs -f prometheus-adapter-b8945f4d8-q5t6x
I0808 10:45:47.706771 1 adapter.go:94] successfully using in-cluster auth
E0808 10:45:47.752737 1 provider.go:209] unable to update list of all metrics: unable to fetch metrics for query "{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}": Get http://prometheus.default.svc:9090/api/v1/series?match%5B%5D=%7B__name__%3D~%22%5Econtainer_.%2A%22%2Ccontainer%21%3D%22POD%22%2Cnamespace%21%3D%22%22%2Cpod%21%3D%22%22%7D&start=1596882347.736: dial tcp: lookup prometheus.default.svc on 172.20.0.10:53: no such host
I0808 10:45:48.032873 1 serving.go:306] Generated self-signed cert (/tmp/cert/apiserver.crt, /tmp/cert/apiserver.key)
…

Here is the error:

dial tcp: lookup prometheus.default.svc on 172.20.0.10:53: no such host

Let's try to access our Prometheus Operator from the pod by its Service DNS name:

$ kubectl exec -ti deployment-example-c4d6f96db-fldl2 -- curl prometheus-prometheus-oper-prometheus.monitoring.svc.cluster.local:9090/metrics | head -5
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.0078e-05
go_gc_duration_seconds{quantile="0.25"} 3.8669e-05
go_gc_duration_seconds{quantile="0.5"} 6.956e-05

Okay, we are able to reach it using the prometheus-prometheus-oper-prometheus.monitoring.svc.cluster.local:9090 URL.

Edit the adapter’s Deployment:

$ kubectl edit deploy prometheus-adapter

Update its prometheus-url:

...
    spec:
      affinity: {}
      containers:
      - args:
        - /adapter
        - --secure-port=6443
        - --cert-dir=/tmp/cert
        - --logtostderr=true
        - --prometheus-url=http://prometheus-prometheus-oper-prometheus.monitoring.svc.cluster.local:9090
...

Apply changes and check again:

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . | grep "pods/" | head -5
"name": "pods/node_load15",
"name": "pods/go_memstats_next_gc_bytes",
"name": "pods/coredns_forward_request_duration_seconds_count",
"name": "pods/rest_client_requests",
"name": "pods/node_ipvs_incoming_bytes",

Nice, we've got our metrics and can use them now in the HPA.

Check the API server for the memory_usage_bytes metric:

$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/memory_usage_bytes" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/memory_usage_bytes"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "deployment-example-c4d6f96db-8tfnw",
        "apiVersion": "/v1"
      },
      "metricName": "memory_usage_bytes",
      "timestamp": "2020-08-08T11:18:53Z",
      "value": "11886592",
      "selector": null
    },
…

Update the HPA's manifest:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: memory_usage_bytes
      targetAverageValue: 1024000

Check the HPA’s values now:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 4694016/1Mi, 0%/10% 1 5 5 69m

Apply the latest changes:

$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example configured

Check again:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 11853824/1024k 1 5 1 16s

We still have 1 replica; check the events:

…
43s Normal ScalingReplicaSet Deployment Scaled up replica set deployment-example-c4d6f96db to 1
16s Normal ScalingReplicaSet Deployment Scaled up replica set deployment-example-c4d6f96db to 4
1s Normal ScalingReplicaSet Deployment Scaled up replica set deployment-example-c4d6f96db to 5
16s Normal SuccessfulRescale HorizontalPodAutoscaler New size: 4; reason: pods metric memory_usage_bytes above target
1s Normal SuccessfulRescale HorizontalPodAutoscaler New size: 5; reason: pods metric memory_usage_bytes above target
…
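The events show the Deployment going to 4 replicas first and to 5 only afterwards, although the metric is more than ten times over the target. This is consistent with the controller limiting how far one scale-up step can go. The Python sketch below assumes a per-step limit of max(2 * currentReplicas, 4). That limit is my understanding of the controller defaults at the time of writing, not something shown in this post, and newer API versions let you tune it. The function name is hypothetical.

```python
import math


def next_replicas(current_replicas: int, average_value: float,
                  target_value: float, max_replicas: int) -> int:
    """One scale-up step: the formula result, capped by a step limit and maxReplicas."""
    wanted = math.ceil(current_replicas * average_value / target_value)
    # Assumption: a single step may grow to at most max(2 * current, 4) replicas.
    step_limit = max(2 * current_replicas, 4)
    return min(wanted, step_limit, max_replicas)


# First step: 1 replica, 11853824 bytes on average against a 1024000 target.
print(next_replicas(1, 11853824, 1024000, max_replicas=5))  # 4
# Second step: 4 replicas, still far above the target, capped by maxReplicas.
print(next_replicas(4, 6996787.2, 1024000, max_replicas=5))  # 5
```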

And HPA again:

$ kubectl get hpa hpa-example
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-example Deployment/deployment-example 6996787200m/1024k 1 5 5 104s

Great — “It works!” ©

This works well for metrics which are already present in the cluster, like memory_usage_bytes, which cAdvisor collects by default from all containers in the cluster.
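The TARGETS column mixes several notations: 1Mi, 1024k, and 6996787200m are all Kubernetes quantities, where the m suffix means thousandths, k means thousands, and Ki/Mi are powers of 1024. A minimal Python sketch of a converter follows. It is a hypothetical helper that covers only the suffixes seen in this post, not a full implementation of the Kubernetes quantity format.

```python
from decimal import Decimal

# Multipliers for the subset of quantity suffixes that appear in this post.
_SUFFIXES = {
    "Ki": Decimal(1024),
    "Mi": Decimal(1024) ** 2,
    "Gi": Decimal(1024) ** 3,
    "m": Decimal(1) / Decimal(1000),
    "k": Decimal(1000),
    "M": Decimal(1000) ** 2,
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '2496Ki' or '11m' to a plain number."""
    # Try two-character suffixes first so that 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


print(parse_quantity("6996787200m"))  # 6996787.200
print(parse_quantity("1024k"))        # 1024000
```

So 6996787200m/1024k reads as an average of roughly 7 MB per pod against a target of 1.024 MB.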

Let's try more custom metrics. For example, let's scale a Gorush server by using its own metrics, see Kubernetes: running a push-server with Gorush behind an AWS LoadBalancer.

Application-based metrics scaling

So, we have the Gorush server running in our cluster which is used to send push-notifications to mobile clients.

It has the built-in /metrics endpoint which returns standard time-series metrics that can be used in Prometheus.

To run a test Gorush server we can use the following Service, ConfigMap, and Deployment:

apiVersion: v1
kind: Service
metadata:
  name: gorush
  labels:
    app: gorush
    tier: frontend
spec:
  selector:
    app: gorush
    tier: frontend
  type: ClusterIP
  ports:
  - name: gorush
    protocol: TCP
    port: 80
    targetPort: 8088
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: gorush-config
data:
  stat.engine: memory
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gorush
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gorush
      tier: frontend
  template:
    metadata:
      labels:
        app: gorush
        tier: frontend
    spec:
      containers:
      - image: appleboy/gorush
        name: gorush
        imagePullPolicy: Always
        ports:
        - containerPort: 8088
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8088
          initialDelaySeconds: 3
          periodSeconds: 3
        env:
        - name: GORUSH_STAT_ENGINE
          valueFrom:
            configMapKeyRef:
              name: gorush-config
              key: stat.engine

Create a dedicated namespace:

$ kubectl create ns eks-dev-1-gorush
namespace/eks-dev-1-gorush created

Create the application:

$ kubectl -n eks-dev-1-gorush apply -f my-gorush.yaml
service/gorush created
configmap/gorush-config created
deployment.apps/gorush created

Check its pods:

$ kubectl -n eks-dev-1-gorush get pod
NAME READY STATUS RESTARTS AGE
gorush-5c6775748b-6r54h 1/1 Running 0 83s

Gorush Service:

$ kubectl -n eks-dev-1-gorush get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
gorush ClusterIP 172.20.186.251 <none> 80/TCP 103s

Run the port-forward to its pod:

$ kubectl -n eks-dev-1-gorush port-forward gorush-5c6775748b-6r54h 8088:8088
Forwarding from 127.0.0.1:8088 -> 8088
Forwarding from [::1]:8088 -> 8088

Check metrics:

$ curl -s localhost:8088/metrics | grep gorush | head
# HELP gorush_android_fail Number of android fail count
# TYPE gorush_android_fail gauge
gorush_android_fail 0
# HELP gorush_android_success Number of android success count
# TYPE gorush_android_success gauge
gorush_android_success 0
# HELP gorush_ios_error Number of iOS fail count
# TYPE gorush_ios_error gauge
gorush_ios_error 0
# HELP gorush_ios_success Number of iOS success count

Or another way: by using its Service name, we can reach it directly.

Find the Service:

$ kubectl -n eks-dev-1-gorush get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
gorush ClusterIP 172.20.186.251 <none> 80/TCP 26m

Open proxy to the API-server:

$ kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080

And connect to the Service:

$ curl -sL localhost:8080/api/v1/namespaces/eks-dev-1-gorush/services/gorush:gorush/proxy/metrics | head
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 9.194e-06
go_gc_duration_seconds{quantile="0.25"} 1.2092e-05
go_gc_duration_seconds{quantile="0.5"} 2.1812e-05
go_gc_duration_seconds{quantile="0.75"} 5.1794e-05
go_gc_duration_seconds{quantile="1"} 0.000145631
go_gc_duration_seconds_sum 0.001080551
go_gc_duration_seconds_count 32
# HELP go_goroutines Number of goroutines that currently exist.

Kubernetes ServiceMonitor

The next thing to do is to add a ServiceMonitor to the Kubernetes cluster so that our Prometheus Operator collects those metrics, see Adding Kubernetes ServiceMonitor.

Check metrics now with port-forward:

$ kubectl -n monitoring port-forward prometheus-prometheus-prometheus-oper-prometheus-0 9090:9090
Forwarding from [::1]:9090 -> 9090
Forwarding from 127.0.0.1:9090 -> 9090

Try to access them:

$ curl "localhost:9090/api/v1/series?match[]=gorush_total_push_count&start=1597141864"
{"status":"success","data":[]}

The "data":[] is empty for now, as Prometheus doesn't collect those metrics yet.

Define the ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    serviceapp: gorush-servicemonitor
    release: prometheus
  name: gorush-servicemonitor
  namespace: monitoring
spec:     
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    port: gorush
  namespaceSelector:
    matchNames: 
    - eks-dev-1-gorush
  selector: 
    matchLabels:
      app: gorush

Note : The Prometheus resource includes a field called serviceMonitorSelector, which defines a selection of ServiceMonitors to be used. By default and before the version v0.19.0, ServiceMonitors must be installed in the same namespace as the Prometheus instance. With the Prometheus Operator v0.19.0 and above, ServiceMonitors can be selected outside the Prometheus namespace via the serviceMonitorNamespaceSelector field of the Prometheus resource.

See Prometheus Operator
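The selector.matchLabels block in the ServiceMonitor works like any other Kubernetes label selector: a Service is picked up only if it carries every listed key/value pair, and the endpoint's port: gorush refers to the named port of that Service. A minimal Python sketch of the matching rule follows. The function name is hypothetical and the labels are copied from the manifests above.

```python
def selector_matches(match_labels: dict, labels: dict) -> bool:
    """True when every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Labels of the gorush Service defined earlier in this post.
service_labels = {"app": "gorush", "tier": "frontend"}

print(selector_matches({"app": "gorush"}, service_labels))                  # True
print(selector_matches({"app": "gorush", "tier": "db"}, service_labels))    # False
```

Extra labels on the Service, such as tier: frontend here, do not prevent a match.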

Create this ServiceMonitor:

$ kubectl apply -f ../../gorush-service-monitor.yaml
servicemonitor.monitoring.coreos.com/gorush-servicemonitor created

Check it in the Targets:

UP, good.

And in a couple of minutes check for metrics again:

$ curl "localhost:9090/api/v1/series?match[]=gorush_total_push_count&start=1597141864"
{"status":"success","data":[{"__name__":"gorush_total_push_count","endpoint":"gorush","instance":"10.3.35.14:8088","job":"gorush","namespace":"eks-dev-1-gorush","pod":"gorush-5c6775748b-6r54h","service":"gorush"}]}

Or in this way:

$ curl -s localhost:9090/api/v1/label/__name__/values | jq | grep gorush
"gorush_android_fail",
"gorush_android_success",
"gorush_ios_error",
"gorush_ios_success",
"gorush_queue_usage",
"gorush_total_push_count",

Nice, we’ve got our metrics, let’s go ahead and use them in the HorizontalPodAutoscaler of this Deployment.

Check the metrics groups available here:

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . | grep "gorush"
"name": "services/gorush_android_success",
"name": "pods/gorush_android_fail",
"name": "namespaces/gorush_total_push_count",
"name": "namespaces/gorush_queue_usage",
"name": "pods/gorush_ios_success",
"name": "namespaces/gorush_ios_success",
"name": "jobs.batch/gorush_ios_error",
"name": "services/gorush_total_push_count",
"name": "jobs.batch/gorush_queue_usage",
"name": "pods/gorush_queue_usage",
"name": "jobs.batch/gorush_android_fail",
"name": "services/gorush_queue_usage",
"name": "services/gorush_ios_success",
"name": "jobs.batch/gorush_android_success",
"name": "jobs.batch/gorush_total_push_count",
"name": "pods/gorush_ios_error",
"name": "pods/gorush_total_push_count",
"name": "pods/gorush_android_success",
"name": "namespaces/gorush_android_success",
"name": "namespaces/gorush_android_fail",
"name": "namespaces/gorush_ios_error",
"name": "jobs.batch/gorush_ios_success",
"name": "services/gorush_ios_error",
"name": "services/gorush_android_fail",

Add a new manifest with the HPA which will use gorush_total_push_count from the Pods group:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: gorush-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gorush
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: gorush_total_push_count
      targetAverageValue: 2

With such settings, the HPA has to scale pods once the average gorush_total_push_count value across the pods is over 2.

Create it:

$ kubectl -n eks-dev-1-gorush apply -f my-gorush.yaml
service/gorush unchanged
configmap/gorush-config unchanged
deployment.apps/gorush unchanged
horizontalpodautoscaler.autoscaling/gorush-hpa created

Check its value now:

$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_total_push_count" | jq '.items[].value'
"0"

Check the HPA:

$ kubectl -n eks-dev-1-gorush get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
gorush-hpa Deployment/gorush 0/1 1 5 1 17s

TARGETS 0/1, okay.

Send a push:

$ curl -X POST a6095d18859c849889531cf08baa6bcf-531932299.us-east-2.elb.amazonaws.com/api/push -d '{"notifications":[{"tokens":["990004543798742"],"platform":2,"message":"Hello Android"}]}'
{"counts":1,"logs":[],"success":"ok"}

Check the gorush_total_push_count metric again:

$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_total_push_count" | jq '.items[].value'
"1"

One push was sent.

Check the HPA one more time:

$ kubectl -n eks-dev-1-gorush get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
gorush-hpa Deployment/gorush 1/2 1 5 1 9m42s

TARGETS is 1/2 and REPLICAS is still 1. Send another push and check the events:

$ kubectl -n eks-dev-1-gorush get events --watch
LAST SEEN TYPE REASON KIND MESSAGE
18s Normal Scheduled Pod Successfully assigned eks-dev-1-gorush/gorush-5c6775748b-x8fjs to ip-10-3-49-200.us-east-2.compute.internal
17s Normal Pulling Pod Pulling image "appleboy/gorush"
17s Normal Pulled Pod Successfully pulled image "appleboy/gorush"
17s Normal Created Pod Created container gorush
17s Normal Started Pod Started container gorush
18s Normal SuccessfulCreate ReplicaSet Created pod: gorush-5c6775748b-x8fjs
18s Normal SuccessfulRescale HorizontalPodAutoscaler New size: 2; reason: pods metric gorush_total_push_count above target
18s Normal ScalingReplicaSet Deployment Scaled up replica set gorush-5c6775748b to 2

And the HPA:

$ kubectl -n eks-dev-1-gorush get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
gorush-hpa Deployment/gorush 3/2 1 5 2 10m

Great! So, scaling by the gorush_total_push_count is working.

But here is a trap: gorush_total_push_count is a cumulative metric, a counter that only ever grows, so on a Production graph it is a steadily rising line.

In such a case our HPA will keep scaling pods up until it hits maxReplicas, and will never scale them back down.

Prometheus Adapter ConfigMap - seriesQuery and metricsQuery

To mitigate this, let's add our own metric.

The Prometheus Adapter has its own ConfigMap:

$ kubectl get cm prometheus-adapter
NAME DATA AGE
prometheus-adapter 1 46h

Which contains the config.yaml, see its example here>>>.

Create a PromQL query which will return pushes count per second:

rate(gorush_total_push_count{instance="push.server.com:80",job="push-server"}[5m])
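To see why a rate solves the problem of an ever-growing counter, here is a simplified Python sketch of a per-second rate over a window of samples. The sample data is made up. Prometheus' real rate() does more: it also handles counter resets and extrapolates to the window boundaries.

```python
def simple_rate(samples):
    """Per-second increase between the first and last (timestamp, value) samples."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("need at least two samples with increasing timestamps")
    return (v_last - v_first) / (t_last - t_first)


# Hypothetical counter samples over a 5-minute window: (seconds, total pushes).
busy_window = [(0, 100), (150, 130), (300, 160)]
idle_window = [(0, 160), (150, 160), (300, 160)]

print(simple_rate(busy_window))  # 0.2
print(simple_rate(idle_window))  # 0.0
```

The counter itself stays at 160 once traffic stops, but the rate drops back to zero, which gives the HPA a reason to scale down.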

Update the ConfigMap and add new query there:

apiVersion: v1
data:
  config.yaml: |
    rules:
    - seriesQuery: '{__name__=~"gorush_total_push_count"}'
      seriesFilters: []
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: ""
        as: "gorush_push_per_second"
      metricsQuery: rate(<<.Series>>{<<.LabelMatchers>>}[5m])
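The metricsQuery is a template: the adapter fills in <<.Series>> with the series name and <<.LabelMatchers>> with matchers for the objects being queried. The Python sketch below imitates that substitution with plain string replacement to show what kind of PromQL comes out. The real adapter uses Go templates, and the exact matchers it generates (for example a regex over several pod names) may differ, so treat the output as an illustration only.

```python
def expand_metrics_query(template: str, series: str, label_matchers: dict) -> str:
    """Fill the <<.Series>> and <<.LabelMatchers>> placeholders of a metricsQuery."""
    matchers = ",".join(f'{name}="{value}"' for name, value in label_matchers.items())
    return template.replace("<<.Series>>", series).replace("<<.LabelMatchers>>", matchers)


template = "rate(<<.Series>>{<<.LabelMatchers>>}[5m])"
print(expand_metrics_query(
    template,
    "gorush_total_push_count",
    {"namespace": "eks-dev-1-gorush", "pod": "gorush-5c6775748b-6r54h"},
))
# rate(gorush_total_push_count{namespace="eks-dev-1-gorush",pod="gorush-5c6775748b-6r54h"}[5m])
```

The namespace and pod labels used here are the ones mapped to Kubernetes resources in the resources.overrides section of the rule.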

Save and exit, check it:

$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_push_per_second" | jq
Error from server (NotFound): the server could not find the metric gorush_push_per_second for pods

Re-create the pod so it will apply the changes (see Kubernetes: ConfigMap and Secrets — data auto-reload in pods):

$ kubectl delete pod prometheus-adapter-7c56787c5c-kllq6
pod "prometheus-adapter-7c56787c5c-kllq6" deleted

Check it:

$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/*/gorush_push_per_second" | jq
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/eks-dev-1-gorush/pods/%2A/gorush_push_per_second"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "eks-dev-1-gorush",
        "name": "gorush-5c6775748b-6r54h",
        "apiVersion": "/v1"
      },
      "metricName": "gorush_push_per_second",
      "timestamp": "2020-08-11T12:28:03Z",
      "value": "0",
      "selector": null
    },
…

Update the HPA to use gorush_push_per_second:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: gorush-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gorush
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: gorush_push_per_second
      targetAverageValue: 1m

Check it:

$ kubectl -n eks-dev-1-gorush get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
gorush-hpa Deployment/gorush 0/1m 1 5 1 68m

Events:

…
0s Normal SuccessfulRescale HorizontalPodAutoscaler New size: 4; reason: pods metric gorush_push_per_second above target
0s Normal ScalingReplicaSet Deployment Scaled up replica set gorush-5c6775748b to 4
…

And the HPA now:

$ kubectl -n eks-dev-1-gorush get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
gorush-hpa Deployment/gorush 11m/1m 1 5 5 70m

Done.

Originally published at RTFM: Linux, DevOps, and system administration.

