DataOps Kubernetes Runner Scaling
To manage the number of concurrent jobs executing as part of a pipeline, you can increase both the number of runners deployed to the cluster and the number of jobs a single runner executes:
concurrent = 8
To deploy more than one runner pod to your Kubernetes cluster, set the replicas key in your
The concurrency of jobs for a runner is set to 8 by default. You can increase it by setting the concurrent key in your values.yml.
The maximum number of jobs that can run concurrently on the cluster is the number of replicas multiplied by the
As a general rule, the number of replicas drives high availability and should exceed 3 only in rare cases. Match the number of replicas to the number of cloud vendor availability zones your cluster is deployed on. To scale out your workloads and fit them to your cluster's capacity, increase your concurrent limit instead, e.g. to 100.
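As an illustration of the sizing guidance above, a values.yml fragment might look like the following. This is a minimal sketch, not a definitive configuration. It assumes that both keys sit at the top level of values.yml, that the cluster spans three availability zones, and that the concurrent limit applies to each runner individually. Check the values.yml shipped with your chart for the actual layout.

```yaml
# values.yml (sketch; top-level placement of both keys is an assumption)
replicas: 3      # one runner pod per availability zone (three zones assumed)
concurrent: 100  # jobs a single runner may execute at once (default: 8)

# Upper bound on simultaneously running jobs, assuming a per-runner limit:
#   3 replicas x 100 concurrent = 300 jobs
```

Whether the cluster can actually sustain that many jobs depends on the capacity of its nodes, so raise the concurrent value in steps and watch resource usage.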
The runner does not try to balance the jobs being scheduled across nodes. Instead, jobs will be spread according to the configured behavior of the cluster by the cluster itself.
You can customize the scheduling behavior of pods through the tolerations keys in your values.yml file.
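For example, a tolerations entry could look roughly like the sketch below. The field names (key, operator, value, effect) follow the standard Kubernetes toleration syntax. The taint key and value are hypothetical, and the top-level placement of the tolerations key is an assumption. Which pods the setting applies to depends on your chart.

```yaml
# values.yml (sketch; top-level placement of the key is an assumption)
tolerations:
  - key: "dedicated"         # hypothetical taint key set on the target nodes
    operator: "Equal"
    value: "dataops-runner"  # hypothetical taint value
    effect: "NoSchedule"     # tolerate nodes that otherwise repel pods
```

A toleration only permits scheduling onto tainted nodes. It does not force pods onto them.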
Having more than one runner helps ensure that jobs do not spend time waiting to be picked up and gives you some resilience against failures such as a node going down. We recommend matching the number of runners for a given group of projects to the number of availability zones of your cloud vendor.
Runners and jobs are isolated, so a runner going down does not affect any jobs that have already been scheduled. In this scenario, the deployment should schedule a new runner pod.