Autoscaling K8s HPA with Google HTTP/S Load-Balancer RPS EXTERNAL Stackdriver Metrics

Most of the time, we scale our Kubernetes deployments based on metrics such as CPU or memory consumption, but sometimes we need to scale based on external metrics. In this post, I’ll guide you through setting up Horizontal Pod Autoscaler (HPA) autoscaling using any Stackdriver metric; specifically, we’ll use the requests per second (RPS) reported by a Google Cloud HTTP/S Load Balancer.

[Figure: Autoscaling Kubernetes Horizontal Pod Autoscaler with Stackdriver Metrics]

Let’s Go!

First let’s create a new Google Kubernetes Engine (GKE) cluster:

gcloud beta container clusters create "hpa-with-stackdriver-metrics" --zone "us-central1-a" \
--username "admin" \
--cluster-version "1.10.7-gke.6" \
--machine-type "n1-standard-1" \
--image-type "COS" \
--disk-type "pd-standard" \
--disk-size "100" \
--scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
--num-nodes "3" \
--enable-cloud-logging \
--enable-cloud-monitoring \
--addons HorizontalPodAutoscaling,HttpLoadBalancing \
--enable-autoupgrade --enable-autorepair

Note the `--enable-cloud-monitoring` flag, which will allow us to read the Stackdriver Monitoring metrics.

Deploy Custom Metrics Stackdriver Adapter

The custom metrics adapter is responsible for importing Stackdriver metrics into the Kubernetes API, which enables the HPA to consume these metrics and act upon them. You can see more details about that in the troubleshooting section below.

To grant GKE objects access to metrics stored in Stackdriver, you need to deploy the Custom Metrics Stackdriver Adapter in your cluster.

In order to run Custom Metrics Adapter you must grant your user the ability to create required authorization roles by running the following command:

kubectl create clusterrolebinding cluster-admin-binding \
--clusterrole cluster-admin \
--user "$(gcloud config get-value account)"

And now let’s deploy the actual adapter that will enable us to read metrics from Stackdriver:

kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml

Create a Deployment

Now, let’s deploy a simple nginx application that we will later scale based on the RPS measured by the HTTP/S Load Balancer.

Create this file: deployment.yaml

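apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.8
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: nginx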

Now let’s deploy it:

kubectl apply -f deployment.yaml

Create LoadBalancer Ingress

Create the ingress file: ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  backend:
    serviceName: nginx
    servicePort: 80

And apply the ingress:

kubectl apply -f ingress.yaml

Create HorizontalPodAutoscaler object

This is where the magic happens. We use an external metric, with metricName:

loadbalancing.googleapis.com|https|request_count

Note: You can find the list of all Stackdriver metrics here or you can use the Metrics Explorer.

We should also use a metricSelector, to make sure we are using only the metrics of our specific load balancer.

Let’s find our LB forwarding rule:

$ kubectl describe ingress basic-ingress
Name:             basic-ingress
Namespace:        default
Default backend:  nginx:80 (
  Host  Path  Backends
  ----  ----  --------
  *     *     nginx:80 (
  backends:         {"k8s-be-32432--ffd629d77b6630de":"HEALTHY"}
  forwarding-rule:  k8s-fw-default-basic-ingress--ffd629d77b6630de
  target-proxy:     k8s-tp-default-basic-ingress--ffd629d77b6630de
  url-map:          k8s-um-default-basic-ingress--ffd629d77b6630de

Now we can add the label match to our config (notice the label: “forwarding_rule_name”):

          resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de
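To make the matchLabels semantics concrete, here is a minimal Python sketch. It only illustrates the matching rule (every listed label must be present with an equal value); it is not how the adapter actually queries Stackdriver. The helper names and the second forwarding rule name are hypothetical, invented for illustration.

```python
from typing import Dict, List


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Return True when every selector label is present with an equal value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_series(selector: Dict[str, str], series: List[dict]) -> List[dict]:
    """Keep only the time series whose labels satisfy the selector."""
    return [item for item in series if matches(selector, item["labels"])]


# Sample data: the first rule name comes from the ingress above,
# the second one is made up to show a series that gets filtered out.
series = [
    {
        "labels": {
            "resource.labels.forwarding_rule_name": "k8s-fw-default-basic-ingress--ffd629d77b6630de"
        },
        "value": 2.433,
    },
    {
        "labels": {
            "resource.labels.forwarding_rule_name": "k8s-fw-default-other-ingress--0000000000000000"
        },
        "value": 7.0,
    },
]

selector = {
    "resource.labels.forwarding_rule_name": "k8s-fw-default-basic-ingress--ffd629d77b6630de"
}

selected = select_series(selector, series)
print(selected)
```

Without the selector, requests from every load balancer in the project would count toward the same metric, which is why the label match matters.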

The final file will look like this: hpa.yaml

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - external:
      metricName: loadbalancing.googleapis.com|https|request_count
      metricSelector:
        matchLabels:
          resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de
      targetAverageValue: "1"
    type: External
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
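If you manage several ingresses, you may prefer to generate this object rather than edit it by hand. The following is a sketch only: build_hpa and its parameters are hypothetical names, and the field layout simply mirrors the autoscaling/v2beta1 fields used in this post.

```python
import json


def build_hpa(name, deployment, forwarding_rule, target_rps,
              min_replicas=1, max_replicas=5):
    """Build an HPA object (as a dict) that scales on load balancer RPS."""
    return {
        "apiVersion": "autoscaling/v2beta1",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "External",
                    "external": {
                        "metricName": "loadbalancing.googleapis.com|https|request_count",
                        "metricSelector": {
                            "matchLabels": {
                                "resource.labels.forwarding_rule_name": forwarding_rule
                            }
                        },
                        # Quantities are passed as strings, as in the YAML.
                        "targetAverageValue": str(target_rps),
                    },
                }
            ],
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
        },
    }


manifest = build_hpa(
    name="nginx-hpa",
    deployment="nginx",
    forwarding_rule="k8s-fw-default-basic-ingress--ffd629d77b6630de",
    target_rps=1,
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to kubectl apply -f, which accepts JSON as well as YAML.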

Apply it with `kubectl apply -f hpa.yaml`. Notice that we have used targetAverageValue, which specifies how much of the metric’s total value each replica can handle. This is useful for metrics that describe work or a resource that can be divided between replicas; in our case, each replica can handle a single (i.e. 1) RPS. You should, of course, change this according to your needs.
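As a rough illustration of what targetAverageValue means for the replica count, the sketch below divides the total metric value by the per-replica target, rounds up, and clamps the result to the min/max bounds. This is a simplification under stated assumptions: the real HPA controller also applies a tolerance and other checks before rescaling, and desired_replicas is a hypothetical helper, not part of any Kubernetes API.

```python
import math


def desired_replicas(total_rps, target_per_replica, min_replicas, max_replicas):
    """Simplified replica count: ceil(total / per-replica target), clamped."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(total_rps / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Same bounds as hpa.yaml: 1 RPS per replica, between 1 and 5 replicas.
for rps in (0.5, 2.433, 12.0):
    print(rps, "RPS ->", desired_replicas(rps, 1.0, 1, 5), "replicas")
```

Raising targetAverageValue to, say, "100" would keep a single replica until the load balancer reports more than 100 RPS.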

Let’s test everything

Let’s start by driving traffic to our load balancer.
The Ingress public IP address appears in the Address field of the output of the command we ran above:

kubectl describe ingress basic-ingress

Now let’s start hitting that endpoint 🥊 with some requests, replacing <INGRESS_URL> with the URL of that address:

while true ; do curl -Ss -k --write-out '%{http_code}\n' --output /dev/null "<INGRESS_URL>" ; done

Now let’s see if our HorizontalPodAutoscaler is affected:

kubectl describe hpa nginx-hpa

At this point you might see some warnings, since the metric is not populated yet, but after a few minutes the Metrics section is populated:

Name:                                                                         nginx-hpa
Namespace:                                                                    default
Labels:                                                                       <none>
Annotations:                                                                  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-hpa","namespace":"default"},"spec":{"ma...
CreationTimestamp:                                                            Wed, 31 Oct 2018 18:18:28 +0200
Reference:                                                                    Deployment/nginx
Metrics:                                                                      ( current / target )
"loadbalancing.googleapis.com|https|request_count" (target average value):  1034m / 1
Min replicas:                                                                 1
Max replicas:                                                                5
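The current value 1034m in the output above uses Kubernetes quantity notation, where the m suffix means thousandths, so 1034m is 1.034 RPS per replica. Here is a minimal sketch of that conversion; parse_quantity is a hypothetical helper that only handles plain numbers and the m suffix, not the full quantity grammar.

```python
from decimal import Decimal


def parse_quantity(text: str) -> Decimal:
    """Convert a plain or milli-suffixed quantity string to a Decimal."""
    text = text.strip()
    if text.endswith("m"):
        # "1034m" means 1034 thousandths.
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)


current = parse_quantity("1034m")
target = parse_quantity("1")
print("current:", current, "target:", target, "above target:", current > target)
```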

And in the “Events” section we can see:

Type     Reason                        Age                   From                       Message
Normal   SuccessfulRescale             2m                   horizontal-pod-autoscaler  New size: 2; reason: external metric loadbalancing.googleapis.com|https|request_count(&LabelSelector{MatchLabels:map[string]string{resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de,},MatchExpressions:[],}) above target

We have a liftoff! 🚀


Troubleshooting

1. An easy way to see whether the metric is being imported into the Kubernetes external metrics API is to browse the API manually. It will also help you check whether you have used the metricSelector correctly.

First, we run the Kubernetes proxy:

kubectl proxy --port=8080

And then we can query the external metrics API from our localhost, using the path shown in the selfLink field of the result below.
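One way to assemble the request URL is sketched below. It assumes the proxy is listening on localhost:8080 as started above, and it reuses the path layout visible in the selfLink field of the result; external_metric_url is a hypothetical helper. The pipe characters in the metric name are percent-encoded as %7C.

```python
from urllib.parse import quote


def external_metric_url(namespace, metric_name, base="http://localhost:8080"):
    """Build the external metrics API URL for one metric in one namespace."""
    path = "/apis/external.metrics.k8s.io/v1beta1/namespaces/{}/{}".format(
        quote(namespace, safe=""),
        quote(metric_name, safe=""),  # "|" becomes "%7C"
    )
    return base + path


url = external_metric_url(
    "default", "loadbalancing.googleapis.com|https|request_count"
)
print(url)
```

You can open the printed URL in a browser or fetch it with curl while the proxy is running.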


And this is an excerpt of the result:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com%7Chttps%7Crequest_count"
  },
  "items": [
    {
      "metricName": "loadbalancing.googleapis.com|https|request_count",
      "metricLabels": {
        "metric.labels.cache_result": "DISABLED",
        "resource.labels.backend_target_type": "BACKEND_SERVICE",
        "resource.labels.backend_name": "k8s-ig--ffd629d77b6630de",
        "resource.labels.forwarding_rule_name": "k8s-fw-default-basic-ingress--ffd629d77b6630de",
        ...
      },
      "timestamp": "2018-11-01T08:41:30Z",
      "value": "2433m"
    }
  ]
}


2. A way to see how the adapter behaves is to watch its logs. First, let’s list the custom-metrics pods:

$ kubectl get pods -n custom-metrics
NAME                                                 READY
custom-metrics-stackdriver-adapter-c4d98dc54-2n4jz   1/1

Finally, let’s watch the logs:

$ kubectl logs custom-metrics-stackdriver-adapter-c4d98dc54-2n4jz -n custom-metrics
I1104 06:42:11.125627       1 trace.go:76] Trace[1192308782]: "List /apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com|https|request_count" (started: 2018-11-04 06:42:08.155209905 +0000 UTC m=+311951.293335726) (total time: 2.970372027s):
Trace[1192308782]: [2.97027864s] [2.970185564s] Listing from storage done
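If you want to spot slow requests in a long log, the trace lines can be parsed mechanically. The sketch below assumes the line format shown above and only handles durations printed in seconds (for example 2.970372027s); parse_trace is a hypothetical helper.

```python
import re

# Matches: Trace[<id>]: "<operation>" ... (total time: <seconds>s)
TRACE_RE = re.compile(
    r'Trace\[\d+\]: "(?P<operation>[^"]+)" .*'
    r'\(total time: (?P<seconds>\d+(?:\.\d+)?)s\)'
)


def parse_trace(line):
    """Return (operation, seconds) for a trace line, or None if it does not match."""
    match = TRACE_RE.search(line)
    if match is None:
        return None
    return match.group("operation"), float(match.group("seconds"))


line = (
    'I1104 06:42:11.125627       1 trace.go:76] Trace[1192308782]: '
    '"List /apis/external.metrics.k8s.io/v1beta1/namespaces/default/'
    'loadbalancing.googleapis.com|https|request_count" '
    '(started: 2018-11-04 06:42:08.155209905 +0000 UTC m=+311951.293335726) '
    '(total time: 2.970372027s):'
)

result = parse_trace(line)
print(result)
```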

You can get a lot of valuable information from those logs, especially if there are any error messages.


Using external metrics such as those collected and stored by Stackdriver is pretty straightforward. In a similar way, you can also use your own custom metrics published to Stackdriver through the Monitoring API.

I have created a GitHub repo with the resources I have used for this post here.

Want more stories? Check our blog, or follow Eran on Twitter.
