AutoScaling

Autoscaling can refer to two things. The first is the scaling of application workloads by Kubernetes, which deploys additional pods as demand grows. This is controlled by the Horizontal Pod Autoscaler: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

The Horizontal Pod Autoscaler can only scale within the confines of the underlying compute resources available.

The second is autoscaling of the underlying infrastructure, which customers on AWS can combine with pod autoscaling. This is managed by the Cluster Autoscaler. Worker nodes are added to or removed from the cluster based on pods failing to schedule due to insufficient resources, which changes the total amount of compute available to the cluster.

We will focus on the second as it relates to ROSA.

Cluster autoscaling can be enabled at deployment time using the --enable-autoscaling flag.
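
As a rough sketch of what this might look like at deployment time, the helper below prints (but does not run) a create command with autoscaling turned on. The function name autoscaling_create_cmd, the cluster name and the replica bounds are all made up for illustration, and the sketch assumes --min-replicas and --max-replicas accompany --enable-autoscaling the same way they do for machine pools. Check rosa create cluster --help for the options your CLI version accepts.

```shell
# Hypothetical helper: prints a `rosa create cluster` command with
# autoscaling enabled so it can be reviewed before running it by hand.
autoscaling_create_cmd() {
  name=$1
  min=$2
  max=$3
  # The lower bound must not exceed the upper bound.
  if [ "$min" -gt "$max" ]; then
    echo "min-replicas must not exceed max-replicas" >&2
    return 1
  fi
  echo "rosa create cluster --cluster-name $name --enable-autoscaling --min-replicas $min --max-replicas $max"
}

# Example values only; substitute your own cluster name and bounds.
autoscaling_create_cmd my-rosa-cluster 2 4
```

Printing the command rather than executing it keeps the sketch safe to try on a machine that is not logged in to ROSA.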

Setting up cluster autoscaling

Step 1

List available machine pools

rosa list machinepools -c <cluster-name>
You will see a response like:
ID         AUTOSCALING  REPLICAS  INSTANCE TYPE  LABELS     TAINTS    AVAILABILITY ZONES
default    No           2         m5.xlarge                           us-east-1a

You will note that the above pool does not have autoscaling enabled.

Step 2

Enable autoscaling

rosa edit machinepool -c <cluster-name> --enable-autoscaling <machinepool-name> --min-replicas=<num> --max-replicas=<num>
For example:
$ rosa edit machinepool -c my-rosa-cluster --enable-autoscaling default --min-replicas=2 --max-replicas=4
This will create an autoscaler for the worker nodes that scales the machine pool between 2 and 4 nodes depending on resource demand.

The cluster autoscaler increases the size of the cluster when pods fail to schedule on any of the current nodes due to insufficient resources, or when another node is necessary to meet deployment needs. It does not increase the cluster resources beyond the limits that you specify.

The cluster autoscaler decreases the size of the cluster when some nodes are consistently not needed for a significant period, such as when a node has low resource use and all of its important pods can fit on other nodes.

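
The behaviour described above can be pictured with a toy model. The function below is not the cluster autoscaler's actual algorithm; next_node_count and its inputs are hypothetical, and it moves one node at a time purely to show how the minimum and maximum bounds cap the result.

```shell
# Toy model only: the real cluster autoscaler is considerably more involved.
# Arguments: current nodes, unschedulable pods, idle nodes, min, max.
next_node_count() {
  current=$1
  unschedulable_pods=$2
  idle_nodes=$3
  min=$4
  max=$5
  desired=$current
  if [ "$unschedulable_pods" -gt 0 ]; then
    # Pods could not be scheduled: ask for one more node.
    desired=$((current + 1))
  elif [ "$idle_nodes" -gt 0 ]; then
    # A node is not needed: release one node.
    desired=$((current - 1))
  fi
  # Never go outside the limits set with --min-replicas and --max-replicas.
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  echo "$desired"
}

# With bounds 2-4, a full pool of 4 nodes stays at 4 even if pods are pending.
next_node_count 4 3 0 2 4
```
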
Step 3

Confirm autoscaling

----
rosa list machinepools -c <cluster-name>
----
You will see a response like:
ID         AUTOSCALING  REPLICAS  INSTANCE TYPE  LABELS     TAINTS    AVAILABILITY ZONES
default    Yes          2-4       m5.xlarge                           us-east-1a
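
If you would rather check the result in a script than read the table by eye, a small filter along these lines may help. It assumes the column layout shown in the sample output above (ID first, then AUTOSCALING, then REPLICAS); the function name autoscaling_status is made up for this sketch.

```shell
# Hypothetical helper: print the AUTOSCALING and REPLICAS columns for one
# machine pool, reading the `rosa list machinepools` table on standard input.
autoscaling_status() {
  # Skip the header row, then match the pool ID in the first column.
  awk -v id="$1" 'NR > 1 && $1 == id { print $2, $3 }'
}

# Usage (requires the rosa CLI and a logged-in session):
#   rosa list machinepools -c <cluster-name> | autoscaling_status default
```

For the sample output above this would print the autoscaling flag followed by the replica range for the default pool.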