How to Create a Custom Reference Project

Once your DataOps account has more than one project, you might start thinking about how to avoid needless repetition of content. This is where a custom reference project comes in.

Create the reference project

First, identify or create a suitable group to hold shared content. A good approach is to create a group at the top level of your account called something like Reference and Templates.

Please note

Grant all your users Reporter-level access to this group so that they can access the reference project content in pipelines.

Once you have a suitable group, go ahead and create an empty project in there, called something like %CUSTOMER% Reference Project.

Set up bootstrapping

All standard DataOps projects bootstrap from their reference project (by default, the DataOps Reference Project) by including the reference project's base_bootstrap.yml file in their local bootstrap.yml file.

Our new custom reference project will contain a base_bootstrap.yml file that will include the content from the standard DataOps reference project (to retain all the usual DataOps goodness).

In your new, blank reference project, create the following file:

pipelines/includes/base_bootstrap.yml

include:
  - project: reference-template-projects/dataops-template/dataops-reference
    ref: 5-stable
    file: /pipelines/includes/base_bootstrap.yml

Set variables

DATAOPS_EXTRA_REFERENCE_PROJECTS

The first job in any pipeline initialises the pipeline environment, starting by cloning the reference project into the pipeline's workspace. Because we are creating a new, custom reference project, it also needs to be cloned into each pipeline workspace. Fortunately, DataOps makes this easy for us by providing a variable, DATAOPS_EXTRA_REFERENCE_PROJECTS, that enables this clone operation.

Create the following file and set this variable as follows:

pipelines/includes/config/variables.yml

variables:
  ## Also clone this CUSTOMER reference project (v1.0.0)
  DATAOPS_EXTRA_REFERENCE_PROJECTS: https://gitlab-ci-token:${CI_JOB_TOKEN}@app.dataops.live/CUSTOMER/reference-and-templates/CUSTOMER-reference-project.git|v1.0.0

You will notice that this variable takes a value of the form URL|REF where URL is the full reference project GIT URL (including the .git extension), and REF is the branch or tag to clone from.
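
To make the URL|REF form concrete, here is a minimal Python sketch that splits such a value into its two parts. It is only an illustration of the format described above, not how DataOps itself parses the variable; the function name split_reference is hypothetical.

```python
from typing import Tuple


def split_reference(value: str) -> Tuple[str, str]:
    """Split a 'URL|REF' string into its URL and REF parts (illustrative only)."""
    # Split on the last '|' so the REF is always the final segment.
    url, sep, ref = value.rpartition("|")
    if not sep or not url or not ref:
        raise ValueError(f"expected a value of the form URL|REF, got: {value!r}")
    # The URL is expected to include the .git extension.
    if not url.endswith(".git"):
        raise ValueError(f"URL should end with .git: {url!r}")
    return url, ref


url, ref = split_reference(
    "https://gitlab-ci-token:${CI_JOB_TOKEN}@app.dataops.live/"
    "CUSTOMER/reference-and-templates/CUSTOMER-reference-project.git|v1.0.0"
)
print(url)
print(ref)
```

A check like this can be handy in a review script to catch a missing REF or a URL without the .git extension before a pipeline runs.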

Please note

It's important to use a branch, or ideally a tag, as the source of your reference content, rather than just using the master branch, as this gives consistency and control over updates and releases.

Decide on a release tag (we've used v1.0.0 in this document) that will be used for all reference project links.

DATAOPS_EXTRA_BEFORE_SCRIPTS

It is also possible to enhance and override some DataOps runtime variables using an additional custom before_script, which is activated using this variable.

As this is an advanced feature of the DataOps platform, please contact your account representative to obtain further assistance with this.

Other variables

Other DataOps configuration variables can be set in your custom reference project's variables.yml file. These will override the standard DataOps defaults and provide customised default values for all your projects that use your new reference project.

Configure stages and jobs

It's possible to create a custom stages.yml file in your new reference project, allowing the use of a specific set of stage definitions across all your projects. To do this, copy the stages.yml file from the DataOps reference project into the same file location in your reference project.

You can also add other job and base job definition files to the reference project, using the standard file locations.

Update the bootstrap

As you have now added new files to your new reference project, these will need to be linked into the base_bootstrap.yml file in order to be available in all your projects.

For each file you have added, include a section such as the following in base_bootstrap.yml:

pipelines/includes/base_bootstrap.yml

include:
  ...

  ## CUSTOMER variable definitions and overrides
  - project: CUSTOMER/reference-and-templates/CUSTOMER-reference-project
    ref: v1.0.0 # Set this to your release tag
    file: /pipelines/includes/config/variables.yml

Important

Don't delete the reference to the base_bootstrap.yml file from the DataOps reference project, or all sorts of important things will stop working!

Once you have linked all your reference project's pipeline configuration files into base_bootstrap.yml it will look something like this:

pipelines/includes/base_bootstrap.yml

include:

  ##### DataOps Core #####

  - project: reference-template-projects/dataops-template/dataops-reference
    ref: 5-stable
    file: /pipelines/includes/base_bootstrap.yml

  ##### CUSTOMER Custom #####

  ## CUSTOMER variable definitions and overrides
  - project: CUSTOMER/reference-and-templates/CUSTOMER-reference-project
    ref: v1.0.0 # Set this to your release tag
    file: /pipelines/includes/config/variables.yml

  ## CUSTOMER stages
  - project: CUSTOMER/reference-and-templates/CUSTOMER-reference-project
    ref: v1.0.0 # Set this to your release tag
    file: /pipelines/includes/config/stages.yml

  ...

Release your reference project

Before you can use your new reference project, you will need to create your release tag (see above) from the current code. For now, this will usually just involve tagging the branch you've been developing, but moving forward it will be better to have a dev/test/MR workflow around making and releasing reference project changes.

Take care

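The tag you release your reference project under must match the ref value of every include in base_bootstrap.yml that points at your reference project. The include for the DataOps reference project keeps its own ref (5-stable in the examples above).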
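
One way to check this before tagging is a small script that scans base_bootstrap.yml for includes of your reference project and reports any whose ref differs from the release tag. The following Python sketch is an assumption-laden illustration rather than a DataOps tool: the function name mismatched_refs is hypothetical, it uses simple line matching instead of a full YAML parser, and the v0.9.0 ref in the sample is a deliberately wrong value added for the demonstration.

```python
import re
from typing import Dict, List, Tuple


def mismatched_refs(bootstrap_text: str, project: str, tag: str) -> List[Tuple[str, str]]:
    """Return (file, ref) pairs for includes of `project` whose ref is not `tag`."""
    entries: List[Dict[str, str]] = []
    current = None  # the include entry currently being read
    for raw in bootstrap_text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^(-\s*)?(project|ref|file):\s*(\S+)$", line)
        if not match:
            continue  # comments, blank lines, 'include:' and so on
        starts_entry, key, value = match.groups()
        if starts_entry:  # a leading '-' begins a new include entry
            current = {}
            entries.append(current)
        if current is not None:
            current[key] = value
    return [
        (entry.get("file", "<missing>"), entry.get("ref", "<missing>"))
        for entry in entries
        if entry.get("project") == project and entry.get("ref") != tag
    ]


sample = """\
include:
  - project: reference-template-projects/dataops-template/dataops-reference
    ref: 5-stable
    file: /pipelines/includes/base_bootstrap.yml
  - project: CUSTOMER/reference-and-templates/CUSTOMER-reference-project
    ref: v1.0.0 # Set this to your release tag
    file: /pipelines/includes/config/variables.yml
  - project: CUSTOMER/reference-and-templates/CUSTOMER-reference-project
    ref: v0.9.0
    file: /pipelines/includes/config/stages.yml
"""

print(mismatched_refs(
    sample,
    "CUSTOMER/reference-and-templates/CUSTOMER-reference-project",
    "v1.0.0",
))
# prints [('/pipelines/includes/config/stages.yml', 'v0.9.0')]
```

The DataOps core include is ignored because its project path differs, so only includes pointing at your own reference project are compared against the tag.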

Update your projects and templates

Now the reference project is released, you can test it using one of your existing projects (or a new project from the standard template).

To switch a project from the DataOps standard reference project to your new custom reference project there are only two steps:

1. Update bootstrap.yml

In the project's bootstrap.yml, update the project and ref to match your new reference project's URL path and release tag:

pipelines/includes/bootstrap.yml BEFORE

include:
  - project: reference-template-projects/dataops-template/dataops-reference
    ref: 5-stable
    file: /pipelines/includes/base_bootstrap.yml

...

pipelines/includes/bootstrap.yml AFTER

include:
  - project: CUSTOMER/reference-and-templates/CUSTOMER-reference-project
    ref: v1.0.0
    file: /pipelines/includes/base_bootstrap.yml

...

2. Remove any local content that's now in the reference project

Since a common reason for creating a custom reference project is to move duplicated content out of DataOps projects, you need to ensure this is actually done in each project. Otherwise, the local configuration will override that from your reference project, which may not be immediately obvious as it will probably be the same code right now.

Test!

Set up and run a pipeline to verify everything from your new configuration is working correctly in the repointed project.

If you have included additional content in your reference project outside of the pipeline configuration files, e.g. dbt libraries for use in MATE jobs, this may need to be explicitly included before it can be used. The DataOps pipeline initialisation will clone your reference project into each pipeline's workspace at the following location:

$CI_PROJECT_DIR/reference-projects/CUSTOMER-reference-project
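
If a job script needs to locate that clone, the path can be built from CI_PROJECT_DIR. The following is a minimal Python sketch under stated assumptions: the helper name reference_project_path is hypothetical, the directory name is passed in explicitly rather than derived, and /builds/demo is an invented workspace path used only for the example.

```python
import os
from pathlib import Path, PurePosixPath
from typing import Optional


def reference_project_path(project_name: str,
                           ci_project_dir: Optional[str] = None) -> PurePosixPath:
    """Build the expected clone location of a reference project in the workspace."""
    # Fall back to the CI_PROJECT_DIR environment variable, then the current directory.
    base = ci_project_dir or os.environ.get("CI_PROJECT_DIR", ".")
    return PurePosixPath(base) / "reference-projects" / project_name


clone = reference_project_path("CUSTOMER-reference-project", "/builds/demo")
print(clone)  # prints /builds/demo/reference-projects/CUSTOMER-reference-project

# Check whether the clone is actually present before relying on its content.
print(Path(clone).is_dir())
```

Inside a real pipeline job you would omit the second argument so that the value of CI_PROJECT_DIR is picked up from the environment.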