How to Host a dbt Package in DataOps

Managing common macros, models, and other modeling and transformation resources as dbt packages supports core DataOps principles: it reduces code duplication and centralizes management. DataOps can host dbt packages directly inside the platform, with simple token-based authentication.

For more information on creating dbt packages, read this dbt blog: So You Want to Build a dbt Package

Procedure

  1. Create a project in DataOps that contains the dbt package. There's no need for the usual DataOps template: start from an empty project and just add the dbt package content.

  2. Once you have content in your package, create a git tag to set the initial version. Use whichever versioning strategy works best for your organization.

  3. Create a deploy token for this project (Settings > Repository > Deploy tokens) with read_repository access. Save the token to your secrets manager so it can be used in DataOps pipelines.

  4. In each DataOps project that will use this package, add an entry to packages.yml:

    dataops/modelling/packages.yml
    packages:
      # existing package dependencies
      ...

      - git: "https://USERNAME:{{ env_var('PACKAGE_TOKEN') }}@app.dataops.live/path/to/package-project.git"
        revision: v0.1.0

    Substitute USERNAME with the username you specified when creating the deploy token, and map the PACKAGE_TOKEN variable to a vault path equivalent to the token's Parameter Store location. The revision (here v0.1.0) is the git tag you created in step 2.

  5. Set the variable TRANSFORM_FORCE_DEPS to 1 in every pipeline job that uses this package (or any other package listed in packages.yml) so that the transform orchestrator runs dbt deps before its main action. The job, or the project's main variables.yml file, must also define the PACKAGE_TOKEN variable, using the DATAOPS_VAULT() syntax to retrieve the token from your secrets manager via the DataOps vault. You can rename the variable if needed, as long as the name matches the one used in packages.yml.
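
As one possible arrangement for step 5, the token can be defined once for the whole project instead of in each job. The fragment below is only a sketch: it assumes the project's main variables file lives at pipelines/includes/config/variables.yml, and it uses the same placeholder vault path, PATH.TO.DATAOPS.DEPLOY_TOKEN, as the example job below. Adjust both to match your project and vault layout.

```yaml
# pipelines/includes/config/variables.yml (assumed location of the main variables file)
variables:
  # ... existing project variables stay as they are ...

  # Placeholder vault path: point it at wherever your deploy token is stored
  PACKAGE_TOKEN: DATAOPS_VAULT(PATH.TO.DATAOPS.DEPLOY_TOKEN)
```

With the token defined at project level, a job that uses the package should only need TRANSFORM_FORCE_DEPS: 1 in its own variables block.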

Example Job

This example job runs a local MATE macro, my_local_macro, which uses a macro from a dbt package hosted in DataOps and declared in packages.yml. The job injects the required token, PACKAGE_TOKEN, from the DataOps vault:

pipelines/includes/local_includes/modelling_and_transformation/local_macro_job.yml
Run My Local Macro:
  extends:
    - .modelling_and_transformation_base
    - .agent_tag
  stage: Demo
  variables:
    TRANSFORM_ACTION: OPERATION
    TRANSFORM_OPERATION_NAME: my_local_macro
    TRANSFORM_FORCE_DEPS: 1
    PACKAGE_TOKEN: DATAOPS_VAULT(PATH.TO.DATAOPS.DEPLOY_TOKEN)
  script:
    - /dataops
  icon: ${DATAOPS_ICON}