Autoscaling
We use our own way to autoscale the amount of replicas using k8s-autoscale. It looks at the pending queue and adjusts the amount of replicas depending on average task duration, SLA and the maximum amount of replicas we want to run.
The configs can be found in the configs directory. Some important config variables are below:
worker_type
: corresponds to Taskcluster’sworkerType
deployment_namespace
: corresponds to deployment’s namespace in Kubernetes and used in the Kubernetes API queries.deployment_name
: corresponds to deployment’s name in Kubernetes and used in the Kubernetes API queries.max_replicas
andmin_replicas
: set the max and min amount of replicasavg_task_duration
: average task duration. Used in calculations and affects the amount of replicas we spin up.slo_seconds
: (service level objective) how many seconds we tolerate waiting until we start a pending task. For example, with 1 running instance,slo_seconds
set to 240 andavg_task_duration
set to 60, we don’t spin up new instances until we have more than 4 pending tasks.capacity_ratio
: a value between 0 and 1, which tells what portion of the pending pool this entry can handle. Used in case we want to use multiple entries for the same worker type in different clusters.
After a change is merged to the master
branch, it’s immediately deployed to
the dev GCP cluster. In order to deploy the changes to production, you need to
merge from master
to the production
branch. Moreover, in order for the change
to have effect in the desired scriptworker(s), a new image for the latter needs
to be pushed out to Docker.