Modelling and Transformation Engine
What is modelling and transformation?
The DataOps Modelling and Transformation Engine (MATE) is a SQL-based database modeling and testing engine developed on top of the dbt framework.
However, MATE offers much more than just the dbt transformation framework. We have added DataOps-specific logic, helper macros, and additional tests and custom configurations, all supported by the powerful DataOps orchestration platform.
Modelling and transformation, or MATE, provides the "T" in ELT, while auto-ingestion capabilities provide the "E" and the "L." Therefore, most DataOps projects include a MATE component to turn source data into valuable data marts and other data products.
How is MATE different from just dbt?
The dbt files are stored in the DataOps project
Each project has a directory at dataops/modelling containing the dbt_project.yml file, which operates as a fully functional, self-contained dbt project.
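As an illustration, the layout of this directory typically mirrors the path configuration in dbt_project.yml (the exact contents vary by project; this sketch is not exhaustive):

```
dataops/modelling/
├── dbt_project.yml
├── models/
├── sources/
├── macros/
├── tests/
├── data/
└── snapshots/
```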
The operations are run using the DataOps Modelling and Transformation orchestrator
When a DataOps pipeline runs, the Modelling and Transformation orchestrator provides the underpinning technology stack needed to run a MATE job.
The database name is automatically derived by DataOps.live
DataOps dynamically derives the working database name from the execution environment of each pipeline and injects it automatically into the dbt configuration.
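As a conceptual sketch only (the function and naming convention below are hypothetical, not the actual DataOps.live implementation), per-environment database derivation works along these lines:

```python
# Sketch: derive an environment-specific database name from pipeline context.
# The naming scheme here is illustrative; DataOps.live derives the real name
# automatically and injects it into the dbt configuration for you.

def derive_database_name(project_prefix: str, environment: str) -> str:
    """Build a database name scoped to one pipeline environment."""
    return f"{project_prefix}_{environment}".upper()

# Each environment (prod, dev, feature branch) gets an isolated database,
# so pipelines never collide with each other's data.
print(derive_database_name("myproject", "dev"))   # MYPROJECT_DEV
print(derive_database_name("myproject", "prod"))  # MYPROJECT_PROD
```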
Additional macros are available via the Modelling and Transformation library
DataOps includes a rich MATE library that provides a large set of additional macros and tests, such as creating databases and conditionally executing models.
The dbt-utils package is also available as standard
Pipelines in DataOps.live can natively use the dbt-utils package as standard.
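For illustration, a model can call a dbt-utils macro directly; the model and source names below are placeholders, not part of the Reference Project:

```sql
-- models/customers_enriched.sql (illustrative names)
-- dbt_utils.star expands to the full column list of the referenced model.
select
    {{ dbt_utils.star(from=ref('stg_customers')) }}
from {{ ref('stg_customers') }}
```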
Model documentation is available via the DataOps.live UI
By default, every pipeline includes a job that automatically builds model documentation based on the dbt docs package.
The dbt project
Each DataOps project based on our DataOps Reference Project contains a complete dbt project, including full access to the dbt_project.yml configuration. Here is a sample of the standard config:
```yaml
## Project
name: MyTemplate
version: 0.1
config-version: 2
profile: dlxsnowflake

## Sources
source-paths: [models, sources]
analysis-paths: [analysis]
test-paths: [tests]
data-paths: [data]
macro-paths: [macros]
snapshot-paths: [snapshots]

## Target
target-path: target
clean-targets: [target, dbt_modules]

## Models
models:
  +transient: true
  +materialized: table
  MyTemplate:
    snowflake_sample_data:
      schema: SAMPLES
```
Notice how the models block needs no database specified: the database name is derived automatically, as mentioned above. During execution, it is injected into the dbt profiles.yml, which is also generated automatically.
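Conceptually, the generated configuration is equivalent to a profiles.yml like the following sketch (the account, credential, and database values are placeholders; DataOps generates and manages the real file for you):

```yaml
dlxsnowflake:
  target: default
  outputs:
    default:
      type: snowflake
      account: <your-account>        # placeholder
      user: <pipeline-user>          # placeholder
      role: <pipeline-role>          # placeholder
      warehouse: <warehouse>         # placeholder
      database: MYTEMPLATE_PROD     # injected per environment by DataOps
      schema: SAMPLES
```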
What you'll read in this guide
The topics specifically discussed in this guide include: