SOLE Pipeline Jobs

So far, we have let the data product platform take care of executing our SOLE configuration, using the standard Set Up Snowflake job from the DataOps Reference Project.

Reminder

The DataOps Reference Project contains common content that is included in every DataOps project automatically at run time. This includes standard pipeline jobs, particularly the Set Up Snowflake job that runs the SOLE execution.

Opening up the Set Up Snowflake job

This is the job definition for Set Up Snowflake:

pipelines/includes/default/snowflake_lifecycle.yml

```yaml
Set Up Snowflake:
  extends: .agent_tag
  image: $DATAOPS_SNOWFLAKEOBJECTLIFECYCLE_RUNNER_IMAGE
  variables:
    LIFECYCLE_ACTION: AGGREGATE
    ARTIFACT_DIRECTORY: $CI_PROJECT_DIR/snowflake-artifacts
    CONFIGURATION_DIR: $CI_PROJECT_DIR/dataops/snowflake
  resource_group: $CI_JOB_NAME
  stage: Snowflake Setup
  script:
    - /dataops
  artifacts:
    when: always
    paths:
      - $ARTIFACT_DIRECTORY
  icon: ${SNOWFLAKEOBJECTLIFECYCLE_ICON}
```

Most parts of this configuration should be familiar, but there are some specific variables and a few features you may not have used before:

  • LIFECYCLE_ACTION - this sets the action performed by the SOLE orchestrator, much like the TRANSFORM_ACTION in the MATE orchestrator. In this case, we're running the AGGREGATE action, which executes the full SOLE process (render, compile, plan, apply).
  • resource_group - a DataOps feature that ensures no two jobs with the same resource_group value can run concurrently. This ensures that two branches cannot be making changes via SOLE at the same time.
  • artifacts - this will package the specified directory and provide it as part of the job output for future reference. You can find artifacts on the right-hand side of the job output screen.
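To see the resource_group behavior in isolation, here is a minimal, hypothetical GitLab CI sketch; the job name and group value are invented for illustration and are not part of the reference project:

```yaml
# Hypothetical example: any two jobs sharing this resource_group
# value - even in pipelines running on different branches - are
# queued and executed one at a time, never concurrently.
Deploy To Snowflake:
  stage: deploy
  resource_group: snowflake-demo   # invented group name
  script:
    - echo "holding the snowflake-demo resource"
```

In the Set Up Snowflake job above, the group is keyed on $CI_JOB_NAME, so every pipeline's Set Up Snowflake job competes for the same lock regardless of branch.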

Other SOLE jobs

You may have noticed that the snowflake_lifecycle.yml file in the reference project also contains another job. This is Tear Down Snowflake, which only appears in non-production branches, and was used in the previous section's exercises.

This job uses the AGGREGATE-DESTROY action, which performs the reverse of AGGREGATE, removing all the objects defined in the SOLE configuration (apart from those managed externally, of course). You'll see this job also has a rules section, which determines when it will appear in pipelines. In this case, the rule specifies that the job should never appear in pipelines triggered from the main or qa branches.
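A branch restriction like this can be expressed with standard GitLab CI rules. The following is only a hedged sketch of the pattern; the exact expression used in the reference project may differ:

```yaml
# Sketch of a rules section that hides a job on main and qa;
# the reference project's actual condition may be written differently.
Tear Down Snowflake:
  rules:
    - if: $CI_COMMIT_REF_NAME == "main"
      when: never
    - if: $CI_COMMIT_REF_NAME == "qa"
      when: never
    - when: on_success
```

Rules are evaluated top to bottom: the first two entries suppress the job on the protected branches, and the final catch-all lets it run everywhere else.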

Did you know?

It's also possible to create separate jobs for the different phases of the SOLE process, creating more flexible and powerful pipelines. This will be covered in a future user guide.