exploring and configuring machinesets

Scaling an OpenShift 4 Cluster

With OpenShift 4.0+, we now have the ability to dynamically scale the cluster size through OpenShift itself.

In this lab you will learn how to scale OpenShift 4 cluster both manually and automatically. Applying autoscaling to an OpenShift Container Platform cluster involves deploying a ClusterAutoscaler and then deploying MachineAutoscalers for each Machine type in your cluster.

Manual Cluster Scale Up/Down (Option 1)

In this exercise we’re going to manually add worker nodes to our cluster:

  • Go to the OpenShift web console and login with admin username that is provided

  • Browse to Compute on the left-hand side-bar, and click Machine Sets.

  • Select openshift-machine-api from the Project dropdown and you should see the list of machine sets available for the platform:

image
  • Select one of the machine sets in the list by clicking on the name, e.g. cluster-3e5f-kqr2b-worker-us-east-2a (yours will be slightly different)

Note
Depending on the AWS region you chose, you may have several worker machine sets that can be scaled, some of which are at a scale of 0. It does not matter which set you choose for this example.
  • In the Actions pull down menu (on the right hand side), select Edit Count

  • Enter 3 and click Save

image
  • Click the Machines tab and see the allocated machines.

  • Click Overview tab, it will let you know when when the machines become ready.

  • Click Machine Sets under Machines on the left-hand side again, you will also see the status of the machines in the set:

image
  • You may need to wait a few minutes before the machines become ready.

In the background additional EC2 instances are being provisioned and then registered and configured to participate in the cluster, so yours may still show 1/3.

You can also view this in the CLI:

[~] $ oc get machinesets -n openshift-machine-api
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
cluster-3e5f-kqr2b-worker-us-east-2a   3         3         3       3           24h
cluster-3e5f-kqr2b-worker-us-east-2b   1         1         1       1           24h
cluster-3e5f-kqr2b-worker-us-east-2c   1         1         1       1           24h


[~] $ oc get nodes
NAME                                         STATUS   ROLES          AGE     VERSION
ip-10-0-128-248.us-east-2.compute.internal   Ready    worker         3m56s   v1.13.4+c3617b99f
ip-10-0-137-97.us-east-2.compute.internal    Ready    master         24h     v1.13.4+c3617b99f
ip-10-0-138-117.us-east-2.compute.internal   Ready    worker         24h     v1.13.4+c3617b99f
ip-10-0-141-208.us-east-2.compute.internal   Ready    worker         3m52s   v1.13.4+c3617b99f
ip-10-0-148-106.us-east-2.compute.internal   Ready    worker         24h     v1.13.4+c3617b99f
ip-10-0-156-129.us-east-2.compute.internal   Ready    master         24h     v1.13.4+c3617b99f
ip-10-0-167-6.us-east-2.compute.internal     Ready    master         24h     v1.13.4+c3617b99f
ip-10-0-170-241.us-east-2.compute.internal   Ready    worker         24h     v1.13.4+c3617b99f

You’ll note that some of these systems 'age' will be much newer than some of the others, these are the ones that will have just been added in. Before continuing, scale back down by editing the count to whatever it was previously for the Machine Set, i.e. return it to '1' node.

Manual Cluster Scale Up/Down via the CLI (Option 2)

  • You can alter the Machine Set count in several ways in the web UI console, but you can also perform the same operation via the CLI by using the oc edit command on the machineset in the openshift-machine-api project.

[~] $ oc get machineset -n openshift-machine-api
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
cluster-4c7b-lkw4d-worker-us-east-2a   1         1         1       1           6h17m
cluster-4c7b-lkw4d-worker-us-east-2b   1         1         1       1           6h17m
cluster-4c7b-lkw4d-worker-us-east-2c   1         1         1       1           6h17m
[~] $ oc edit machineset cluster-4c7b-lkw4d-worker-us-east-2a -n openshift-machine-api
(opens in vi)
  • You will see the following text:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  annotations:
    autoscaling.openshift.io/machineautoscaler: openshift-machine-api/autoscale-us-east-2a-ts7rr
    machine.openshift.io/cluster-api-autoscaler-node-group-max-size: "4"
    machine.openshift.io/cluster-api-autoscaler-node-group-min-size: "1"
  creationTimestamp: "2019-05-13T20:34:26Z"
  generation: 9
  labels:
    machine.openshift.io/cluster-api-cluster: cluster-3e5f-kqr2b
  name: cluster-3e5f-kqr2b-worker-us-east-2a
  namespace: openshift-machine-api
  resourceVersion: "446823"
  selfLink: /apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machinesets/cluster-3e5f-kqr2b-worker-us-east-2a
  uid: 80644a16-75be-11e9-bb7c-02f7ee4a116e
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: cluster-3e5f-kqr2b
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
      machine.openshift.io/cluster-api-machineset: cluster-3e5f-kqr2b-worker-us-east-2a
  template:
(...)
Note
If you’re uncomfortable with vi(m) you can use your favorite editor by specifying EDITOR=<your choice> before the oc command.
  • The above is just an example of the entire machine set configuration. Update replicas from 1 to 2

  • Save your changes simply save and quit from your editor. OpenShift will now patch the configuration. You should see that your modified machine set (depending on which one you edited) will be confirmed:

  • Once that has been changed, you can view the outcome here:

[~] $ oc get machinesets -n openshift-machine-api
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
cluster-3e5f-kqr2b-worker-us-east-2a   3         3         3       3           24h
cluster-3e5f-kqr2b-worker-us-east-2b   1         1         1       1           24h
cluster-3e5f-kqr2b-worker-us-east-2c   1         1         1       1           24h

Again, before you move forward, return the replica count back to 1, using the same method as above.

Automatic Cluster Scale Up

OpenShift can automatically scale the infrastructure based on workload provided there is a configuration specified to do so. Before we begin, ensure that your cluster is back to having three nodes running:

[~] $ oc get machinesets -n openshift-machine-api
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
cluster-3e5f-kqr2b-worker-us-east-2a   1         1         1       1           25h
cluster-3e5f-kqr2b-worker-us-east-2b   1         1         1       1           25h
cluster-3e5f-kqr2b-worker-us-east-2c   1         1         1       1           25h

Define a MachineAutoScaler

  • Configure a MachineAutoScaler - you’ll need to fetch the following YAML file:

[~] $ wget https://raw.githubusercontent.com/RedHatWorkshops/openshiftv4-workshop/master/solutions/machine-autoscale-example.yaml

The file has the following contents:

kind: List
metadata: {}
apiVersion: v1
items:
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-<aws-region-az>-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: <clusterid>-worker-<aws-region-az>
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-<aws-region-az>-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: <clusterid>-worker-<aws-region-az>
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-<aws-region-az>-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: <clusterid>-worker-<aws-region-az>
  • Check the MachineSets with the CLI, you noticed that they all had the format of:

<clusterid>-worker-<aws-region-az>
  • MachineAutoscaler resources must be defined for each region-AZ that you want to autoscale.

  • Using the example output and MachineSets above, and selecting "us-east-2a" as the region we’re going to autoscale into, you would need to modify the YAML file to look like the following:

  • To ensure you make no mistakes, here is the command you can use to update the yaml

$ export CLUSTER_NAME=$(oc get machinesets -n openshift-machine-api | awk -F'-worker-' 'NR>1{print $1;exit;}')
$ export REGION_NAME=us-east-2a

$ sed -i s/\<aws-region-az\>/$REGION_NAME/g machine-autoscale-example.yaml
$ sed -i s/\<clusterid\>/$CLUSTER_NAME/g machine-autoscale-example.yaml
  • Here is the working sample of an MachineAutoScaler:

[~] $ cat machine-autoscale-example.yaml
kind: List
metadata: {}
apiVersion: v1
items:
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-us-east-2a-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: cluster-4c7b-lkw4d-worker-us-east-2a
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-us-east-2a-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: cluster-4c7b-lkw4d-worker-us-east-2a
- apiVersion: "autoscaling.openshift.io/v1beta1"
  kind: "MachineAutoscaler"
  metadata:
    generateName: autoscale-us-east-2a-
    namespace: "openshift-machine-api"
  spec:
    minReplicas: 1
    maxReplicas: 4
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: cluster-4c7b-lkw4d-worker-us-east-2a
Note
If you aren’t deployed into this region, or don’t want to use us-east-2a, adapt the instructions to suit.
  • Make sure that you properly modify both generateName and name.

    • Note which one has the <clusterid> and which one does not.

    • Note that generateName has a trailing hyphen.

    • You can specify the minimum and maximum quantity of nodes that are allowed to be created by adjusting the minReplicas and maxReplicas.

  • You do not have to define a MachineAutoScaler for each MachineSet. But remember that each MachineSet corresponds to an AWS region/AZ. So, without having multiple MachineAutoScalers, you could end up with a cluster fully scaled out in a single AZ. If that’s what you’re after, it’s fine. However if AWS has a problem in that AZ, you run the risk of losing a large portion of your cluster.

Note
You should probably choose a small-ish number for maxReplicas. The next lab will autoscale the cluster up to that maximum. You’re paying for the EC2 instances.
  • Once the file has been modified appropriately, you can now create the autoscaler:

$ oc create -f machine-autoscale-example.yaml -n openshift-machine-api

Define a ClusterAutoscaler

  • Define a ClusterAutoscaler, this configures some boundaries and behaviors for how the cluster will autoscale. An example definition file can be found at:

https://raw.githubusercontent.com/RedHatWorkshops/openshiftv4-workshop/master/solutions/cluster-autoscaler.yaml
  • This definition is set for a maximum of 20 workers, but we need to reduce that with our labs to minimize the cost. Let’s first download that file:

[~] $ wget https://raw.githubusercontent.com/RedHatWorkshops/openshiftv4-workshop/master/solutions/cluster-autoscaler.yaml
  • Modify the max number of replicas:

$ sed -i s/20/10/g cluster-autoscaler.yaml
  • Here is an example of ClusterAutoscaler yaml.

[~] $ cat machine-autoscale-example.yaml
apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  resourceLimits:
    maxNodesTotal: 10
  scaleDown:
    enabled: true
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
  • Create the ClusterAutoscaler with the following command:

$ oc create -f cluster-autoscaler.yaml
clusterautoscaler.autoscaling.openshift.io/default created
Note
The ClusterAutoscaler is not a namespaced resource — it exists at the cluster scope.

Define a Job

The following example YAML file defines a Job:

It will produce a massive load that the cluster cannot handle, and will force the autoscaler to take action (up to the maxReplicas defined in your ClusterAutoscaler YAML).

Note
If you did not scale down your machines earlier, you may have too much capacity to trigger an autoscaling event. Make sure you have no more than 3 total workers before continuing.
  • Create a project to hold the resources for the Job, and switch into it:

$ oc adm new-project autoscale-example && oc project autoscale-example
Created project autoscale-example
Now using project "autoscale-example" on server "{{API_URL}}".

Open Grafana

  • Go to OpenShift web console

  • Click Monitoring -→ Dashboards

  • This will open a new browser tab for Grafana. You will also get a certificate error similar to the first time you logged in.

  • Click Advance when you see the SSL certificate error.

  • Click Process to …​ link

  • Click Log in with OpenShift

  • Click htpasswd

  • Enter your provided admin username and password and click login

  • CLick Allow selected permissions

  • you will see the Grafana homepage. Grafana is configured to use an OpenShift user and inherits permissions of that user for accessing cluster information.

  • Click the dropdown on Home and choose Kubernetes / Compute Resources / Cluster. Leave this browser window open while you start the Job so that you can observe the CPU utilization of the cluster rise:

image

Force an Autoscaling Event

  • Go to web terminal, create the Job:

$ oc create -n autoscale-example -f https://raw.githubusercontent.com/openshift/training/master/assets/job-work-queue.yaml
job.batch/work-queue-qncs2 created
  • Check status of the Job. It will create a lot of Pods:

$ oc get pod -n autoscale-example
NAME                     READY     STATUS    RESTARTS   AGE
work-queue-qncs2-26x9c   0/1       Pending   0          33s
work-queue-qncs2-28h6r   0/1       Pending   0          33s
work-queue-qncs2-2tdz9   0/1       Pending   0          33s
work-queue-qncs2-526hl   0/1       Pending   0          33s
work-queue-qncs2-55nr7   0/1       Pending   0          33s
work-queue-qncs2-5d98k   0/1       Pending   0          33s
work-queue-qncs2-7pd5p   0/1       Pending   0          31s
work-queue-qncs2-8k76z   0/1       Pending   0          32s
(...)
  • After a few moments, look at the list of Machines:

$ oc get machines -n openshift-machine-api
NAME                                          INSTANCE              STATE     TYPE        REGION      ZONE         AGE
beta-190305-1-79tf5-master-0                  i-080dea906d9750737   running   m4.xlarge   us-east-2   us-east-2a   26h
beta-190305-1-79tf5-master-1                  i-0bf5ad242be0e2ea1   running   m4.xlarge   us-east-2   us-east-2b   26h
beta-190305-1-79tf5-master-2                  i-00f13148743c13144   running   m4.xlarge   us-east-2   us-east-2c   26h
beta-190305-1-79tf5-worker-us-east-2a-8dvwq   i-06ea8662cf76c7591   running   m4.large    us-east-2   us-east-2a   2m7s  <--------
beta-190305-1-79tf5-worker-us-east-2a-9pzvg   i-0bf01b89256e7f39f   running   m4.large    us-east-2   us-east-2a   2m7s  <--------
beta-190305-1-79tf5-worker-us-east-2a-vvddp   i-0e649089d42751521   running   m4.large    us-east-2   us-east-2a   2m7s  <--------
beta-190305-1-79tf5-worker-us-east-2a-xx282   i-07b2111dff3c7bbdb   running   m4.large    us-east-2   us-east-2a   26h
beta-190305-1-79tf5-worker-us-east-2b-hjv9c   i-0562517168aadffe7   running   m4.large    us-east-2   us-east-2b   26h
beta-190305-1-79tf5-worker-us-east-2c-cdhth   i-09fbcd1c536f2a218   running   m4.large    us-east-2   us-east-2c   26h
  • You should see a scaled-up cluster with three new additions as worker nodes in us-east-2a, you can see the ones that have been auto-scaled from their age.

  • Depending on when you run the command, your list may show all running workers, or some pending. After the Job completes, which could take anywhere from a few minutes to ten or more (depending on your ClusterAutoscaler size and your MachineAutoScaler sizes), the cluster should scale down to the original count of worker nodes. You can watch the output with the following (runs every 10s)-

$ watch -n10 'oc get machines -n openshift-machine-api'
  • In Grafana, be sure to click the autoscale-example project in the graphs

image

Congratulations!! You now know how OpenShift 4 cluster scaling works!! For more information, see https://docs.openshift.com/container-platform/4.1/machine_management/applying-autoscaling.html for details.