Runner Overview

The DataOps Runner is a long-running container that runs within a customer's infrastructure (on-premises or private cloud). It typically runs inside the on-premises or private network, for security among other reasons, so that the jobs in a DataOps pipeline can reach resources that would otherwise be inaccessible.

The DataOps Runner regularly polls the DataOps application to ask whether there is any work for it to do, as seen in the gif below:

DataOps Runner polls DataOps application for work
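
The polling behavior can be pictured with a minimal Python sketch. It is an illustration only, not the runner's actual implementation: the function name `poll_for_work`, the callbacks `fetch_job` and `run_job`, and the polling interval are all hypothetical, and an in-memory queue stands in for the DataOps application.

```python
import time
from collections import deque
from typing import Callable, Optional


def poll_for_work(
    fetch_job: Callable[[], Optional[str]],
    run_job: Callable[[str], None],
    max_polls: int,
    interval_seconds: float = 3.0,  # hypothetical interval, not a documented default
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Ask for work up to max_polls times and return how many jobs were run."""
    executed = 0
    for _ in range(max_polls):
        job = fetch_job()  # the runner initiates the request, not the application
        if job is None:
            sleep(interval_seconds)  # nothing to do, so wait before asking again
            continue
        run_job(job)
        executed += 1
    return executed


# Simulated application: a queue of pending job names (hypothetical data).
pending = deque(["job-a", "job-b"])
completed = []

count = poll_for_work(
    fetch_job=lambda: pending.popleft() if pending else None,
    run_job=completed.append,
    max_polls=5,
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
```

The point of the sketch is that every request originates from the runner. A real runner would loop indefinitely rather than stop after a fixed number of polls.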

Follow the steps in this section to install and configure a DataOps Runner for your compute environment. As part of the installation, the runner is associated with your group (preferred) or project in the DataOps application.

You can have multiple DataOps Runners in many locations, with each job executed by a specific runner. For instance, if you have a tool that needs orchestrating in Singapore and London, but no direct connectivity is allowed between these locations, the solution is to install a separate DataOps Runner at each location.

This raises the question of how each runner picks up only the jobs intended for it.

To do this, tag each job with an identifying tag. For instance, assume the runners are named and tagged dataops-runner-singapore and dataops-runner-london, respectively. Tag a job with dataops-runner-singapore to indicate that the Singapore runner must run it, and with dataops-runner-london to indicate that the London runner must run it.
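
The tag matching described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the DataOps scheduler itself: it assumes a job is eligible for a runner when every tag on the job is also present on the runner, and that untagged jobs are skipped. The `Job` class, the `jobs_for_runner` helper, and the job names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Job:
    name: str
    tags: FrozenSet[str]


def jobs_for_runner(runner_tags: Iterable[str], jobs: Iterable[Job]) -> List[str]:
    """Return the names of the jobs whose tags are all carried by the runner."""
    available = set(runner_tags)
    # Assumption: untagged jobs are not picked up; every job tag must match.
    return [job.name for job in jobs if job.tags and job.tags <= available]


# Hypothetical jobs, tagged with the runner names used in the text.
jobs = [
    Job("ingest-singapore", frozenset({"dataops-runner-singapore"})),
    Job("ingest-london", frozenset({"dataops-runner-london"})),
]

singapore_jobs = jobs_for_runner({"dataops-runner-singapore"}, jobs)
london_jobs = jobs_for_runner({"dataops-runner-london"}, jobs)
```

In this model the Singapore runner receives only `ingest-singapore` and the London runner only `ingest-london`, so neither location ever needs to reach the other.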