
What is Modelling and Transformation?

The DataOps Modelling and Transformation Engine (MATE) is a SQL-based database modelling and testing system built on top of the dbt framework. It extends dbt with DataOps-specific logic, helper macros, tests, and other custom configuration, and it runs under the enterprise-grade orchestration that DataOps provides.

How is MATE Different to Just dbt?

The dbt files are stored in the DataOps project

Each project has a directory at dataops/modelling containing the dbt_project.yml file and operates as a fully functional, self-contained dbt project location.

The Operations are run using the DataOps Modelling and Transformation Orchestrator

When a DataOps pipeline runs, the DataOps Modelling and Transformation Orchestrator will provide the underpinning technology stack needed to run a MATE job.

The Database name is automatically derived by DataOps

Each time a pipeline runs, DataOps derives the working database name from the execution environment and injects it into the dbt configuration, so you never have to set it by hand.

The additional macros are available via the modelling & transformation library

DataOps includes a MATE library that provides a large set of additional macros and tests (create database, conditional execute, etc.).

The dbt Utils package is also available as standard

Pipelines in DataOps can also natively use the dbt-utils package as standard.
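
As a minimal sketch of what using dbt-utils can look like, the fragment below attaches one of the package's generic tests to a model. The file location, the model name, and the column names are hypothetical and chosen only for illustration; the test itself, unique_combination_of_columns, is part of dbt-utils.

```yaml
# models/schema.yml (hypothetical file, model, and column names)
version: 2

models:
  - name: customer_orders          # hypothetical model
    tests:
      # Generic test from dbt-utils: each (customer_id, order_id) pair
      # must appear at most once in the model.
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - customer_id
            - order_id
```

Because the package is available as standard, a fragment like this should not need a separate package installation step in the project, although that is worth confirming against your own pipeline.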

Model Documentation (if generated) is available via the DataOps UI

All pipelines include, by default, a job that automatically builds model documentation using dbt docs.

The dbt Project

Each DataOps project that you base on our DataOps Reference Project contains a complete dbt project, with full access to the dbt_project.yml configuration. Here is a sample of the standard config:

/dataops/modelling/dbt_project.yml
## Project
name: MyTemplate
version: 0.1
config-version: 2
profile: dlxsnowflake

## Sources
source-paths: [models, sources]
analysis-paths: [analysis]
test-paths: [tests]
data-paths: [data]
macro-paths: [macros]
snapshot-paths: [snapshots]

## Target
target-path: target
clean-targets: [target, dbt_modules]

## Models
models:
  +transient: true
  +materialized: table
  MyTemplate:
    snowflake_sample_data:
      schema: SAMPLES

Notice that the models block specifies no database. As mentioned above, the database name is derived automatically. During execution, it is injected into the dbt profiles.yml, which is also generated automatically.
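
To make the injection step more concrete, here is a rough sketch of how a generated profiles.yml could receive the database name. It is an illustration under stated assumptions, not the file DataOps actually produces: the target name and every environment variable name below are hypothetical. Only the profile name dlxsnowflake (taken from the sample config above) and dbt's env_var function are known to exist.

```yaml
# Illustrative sketch only. DataOps generates the real profiles.yml at run time
# and its contents may differ. All environment variable names are hypothetical.
dlxsnowflake:                        # matches `profile:` in dbt_project.yml
  target: default                    # hypothetical target name
  outputs:
    default:
      type: snowflake
      account: "{{ env_var('EXAMPLE_SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('EXAMPLE_SNOWFLAKE_USER') }}"
      password: "{{ env_var('EXAMPLE_SNOWFLAKE_PASSWORD') }}"
      warehouse: "{{ env_var('EXAMPLE_SNOWFLAKE_WAREHOUSE') }}"
      # The database name derived for the current environment would land here,
      # which is why dbt_project.yml does not need to name a database.
      database: "{{ env_var('EXAMPLE_DERIVED_DATABASE') }}"
      schema: PUBLIC
```

The point of this design is that the same dbt_project.yml can run unchanged in every environment, because the only environment-specific value lives in the generated profile.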