How to Configure a Production-Only Runner

warning

This how-to guide has been superseded by the end-to-end walkthrough in segregating environments. The steps below still work, but the newer article is more comprehensive, so prefer it.

Many DataOps architectures require physical separation between environments, which is best achieved by using a separate runner to execute pipelines for the production environment. You can set this up by adding a wrapper -ci.yml pipeline file for production. The same concept extends to other environments as needed.

This guide assumes that two runners are available to the project: prod-runner and non-prod-runner.

Production-only runner for a single pipeline

First, configure the project as usual by setting the tag in the .agent_tag base job to the non-production runner:

pipelines/includes/config/agent_tag.yml
.agent_tag:
  tags:
    - non-prod-runner

Create a new pipeline file for production. This guide uses production-ci.yml as the name, but you can follow your own naming convention:

production-ci.yml
## Include the main pipeline file
include:
  - /full-ci.yml

## Override the agent tag base job
.agent_tag:
  tags:
    - prod-runner

## Any other production overrides like variables
variables:
  SECRETS_AWS_USE_ROLE: 1

The production pipeline file has the following content:

1. Include the main pipeline file (in this case, full-ci.yml, but it can be any pipeline file in the project). Including full-ci.yml ensures that the same pipeline jobs run in each environment.
2. Adjust the tags to use the production runner. This is achieved by simply overriding the .agent_tag base job.
3. Override any variables to values suitable for the production environment.

To follow DataOps best practices and encapsulate the production configuration into separate files, the second and third parts above can be included as shown:

production-better-ci.yml
include:
  ## Include the main pipeline file
  - /full-ci.yml

  ## Override the agent tag base job
  - /pipelines/includes/config/agent_tag_production.yml

  ## Any other production overrides like variables
  - /pipelines/includes/config/variables_production.yml

In this case, the file agent_tag_production.yml is simply a copy of agent_tag.yml with the tag changed, and variables_production.yml contains only the variables that differ in the production environment (the main variables.yml file is still included).
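As a minimal sketch, assuming SECRETS_AWS_USE_ROLE from the earlier example is the only production-specific variable, the two included files could look like this (shown as two YAML documents separated by `---`; in the project they are separate files):

```yaml
## pipelines/includes/config/agent_tag_production.yml
## Copy of agent_tag.yml with the tag switched to the production runner
.agent_tag:
  tags:
    - prod-runner
---
## pipelines/includes/config/variables_production.yml
## Only the variables that differ in production (assumed set, for illustration)
variables:
  SECRETS_AWS_USE_ROLE: 1
```

Keeping the overrides in their own files means the wrapper pipeline file contains nothing but include entries.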

Production-only runner for multiple pipelines

Often, a branch can contain multiple -ci.yml pipeline files. A slight modification of the above setup is needed to set a production-only runner for all of them. By introducing a new variable $PIPELINE_NAME, we can control which pipeline file will be triggered with the production-only runner at runtime. An example setup would be like this:

production-ci.yml
include:
  ## Include the main pipeline file
  - $PIPELINE_NAME

  ## Override the agent tag base job
  - /pipelines/includes/config/agent_tag_production.yml

  ## Any other production overrides like variables
  - $PIPELINE_VARIABLES_NAME

We also introduced an optional variable $PIPELINE_VARIABLES_NAME, which would point to a variables.yml file specifically for the pipeline. Note that the variable names are not fixed, and you can set them to anything.
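To make the substitution concrete, here is a sketch of what the include section would effectively resolve to, assuming $PIPELINE_NAME is set to /full-ci.yml and $PIPELINE_VARIABLES_NAME is set to /pipelines/includes/config/variables_production.yml. Both values are illustrative assumptions, not required settings:

```yaml
## Effective content of production-ci.yml after variable substitution
## (assumed values: PIPELINE_NAME=/full-ci.yml,
##  PIPELINE_VARIABLES_NAME=/pipelines/includes/config/variables_production.yml)
include:
  ## Main pipeline file selected at runtime
  - /full-ci.yml

  ## Production runner tag override
  - /pipelines/includes/config/agent_tag_production.yml

  ## Pipeline-specific production variables
  - /pipelines/includes/config/variables_production.yml
```

With these values, the result matches the single-pipeline setup shown earlier, which is why one wrapper file can serve every pipeline in the branch.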

Running production-ci.yml with $PIPELINE_NAME set to the path of the desired pipeline file achieves the goal of having a production-only runner for multiple pipelines. Afterward, set up schedules that run production-ci.yml, each with the appropriate variable settings.
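As an illustration only, the variable settings for two such schedules might be recorded as follows. The schedule labels, the second pipeline file name, and its variables file are hypothetical, and the YAML below is just a notation for the key/value pairs entered on each schedule, not a file that is read by the platform:

```yaml
## Hypothetical schedule 1: run the main pipeline on the production runner
schedule_full:
  pipeline_file: production-ci.yml
  variables:
    PIPELINE_NAME: /full-ci.yml
    PIPELINE_VARIABLES_NAME: /pipelines/includes/config/variables_production.yml

## Hypothetical schedule 2: run another pipeline file from the same branch
schedule_other:
  pipeline_file: production-ci.yml
  variables:
    PIPELINE_NAME: /other-ci.yml  # hypothetical pipeline file name
    PIPELINE_VARIABLES_NAME: /pipelines/includes/config/variables_other_production.yml  # hypothetical
```

Both schedules point at the same wrapper file; only the variable values differ.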
