What is Modeling and Transformation?
The DataOps Modelling and Transformation Engine (MATE) is a SQL-based database modelling and testing system built on top of the popular dbt framework. However, it is much more than that: it has been heavily modified to include DataOps-specific logic, helper macros, tests, and other custom configuration, with an enterprise-grade orchestration approach built on the full power of DataOps.
How is MATE different to just dbt?
The dbt files are stored in the DataOps project
Each project has a directory at dataops/modelling that contains the dbt_project.yml file and operates as a fully functional, self-contained dbt project.
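For orientation, here is a minimal sketch of that directory. The folder names are taken from the standard dbt_project.yml shown later on this page; the exact contents of your project may differ.

```
dataops/modelling/
├── dbt_project.yml   # dbt project configuration (sample shown below)
├── models/           # model definitions (.sql)
├── sources/          # source definitions
├── macros/           # project-local macros
├── tests/            # custom data tests
├── data/             # seed data files
├── snapshots/        # snapshot definitions
└── analysis/         # analysis queries
```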
The operations are run using the DataOps Modelling and Transformation Orchestrator
When a DataOps pipeline runs, the DataOps Modelling and Transformation Orchestrator provides the underpinning technology stack needed to run a MATE job.
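As a rough illustration only, a MATE job in a pipeline configuration might look something like the sketch below. The base job name, variable name, and stage shown here are assumptions used for illustration, not definitive syntax; treat the job definitions in the DataOps Reference Project as the authoritative source.

```yaml
# Illustrative sketch only - base job names, variable names, and stages are
# assumptions and may differ from the reference project's actual definitions.
"Build all models":
  extends:
    - .modelling_and_transformation_base   # assumed base job providing the MATE stack
  variables:
    TRANSFORM_ACTION: RUN                  # assumed variable telling MATE to run the models
  stage: "Data Transformation"
```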
The database name is automatically derived by DataOps
When each pipeline runs, DataOps dynamically derives the working database name from the execution environment and automatically injects it into the dbt configuration.
Additional macros are available via the Modelling and Transformation library
DataOps includes a rich MATE library that provides a large set of additional macros and tests (create database, conditional execute, and more).
The dbt utils package is also available as standard
Pipelines in DataOps can also natively use the dbt-utils package as standard.
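For example, once the package is available, dbt-utils generic tests can be referenced from a schema.yml file in the usual dbt way; macros and tests from the MATE library are referenced in the same fashion. The model and column names below are placeholders for illustration only.

```yaml
# Illustrative schema.yml snippet - model and column names are placeholders.
version: 2

models:
  - name: orders
    tests:
      # dbt-utils generic test checking that the column combination is unique
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - order_id
            - order_line_number
```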
Model documentation is available via the DataOps UI
By default, all pipelines include a job that automatically builds model documentation based around the dbt docs package.
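Conceptually, this job wraps dbt's standard documentation tooling, so the output is what you would expect from dbt's own docs command (sketch only; the job adds DataOps-specific handling around it):

```
dbt docs generate   # standard dbt command the documentation job is built around
```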
The dbt project
Each DataOps project based on our DataOps Reference Project contains a complete dbt project, including full access to the dbt_project.yml configuration. Here is a sample of the standard config:
```yaml
## Project
name: MyTemplate
version: 0.1
config-version: 2
profile: dlxsnowflake

## Sources
source-paths: [models, sources]
analysis-paths: [analysis]
test-paths: [tests]
data-paths: [data]
macro-paths: [macros]
snapshot-paths: [snapshots]

## Target
target-path: target
clean-targets: [target, dbt_modules]

## Models
models:
  +transient: true
  +materialized: table
  MyTemplate:
    snowflake_sample_data:
      schema: SAMPLES
```
Notice how the models block needs no database specified. The database name is automatically derived, as mentioned above. During execution, the database name is injected into the automatically generated dbt profiles.yml.
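To make this concrete, here is a hedged sketch of what an auto-generated profiles.yml for the dlxsnowflake profile could look like. DataOps generates this file for you at run time; the environment variable names shown here are assumptions for illustration, and authentication settings are omitted. The principle is simply that the database (and the other connection details) are injected per environment rather than hard-coded in the project.

```yaml
# Illustrative sketch only - DataOps generates profiles.yml automatically;
# the environment variable names below are assumptions, not definitive.
dlxsnowflake:
  target: default
  outputs:
    default:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      role: "{{ env_var('SNOWFLAKE_ROLE') }}"
      warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}"
      database: "{{ env_var('DATAOPS_DATABASE') }}"  # derived from the execution environment
      schema: PUBLIC
      threads: 8
```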