The data.world Orchestrator is a pre-set orchestrator that interacts with the data.world data catalog to publish metadata about the data transformed in a DataOps pipeline. In summary, the data.world Orchestrator provides a single-click interface to the data.world service.
stage: Data Catalog
DW_ORG: <org name>
# DW_BLOCK_PROFILE_UPLOAD: 1
The data.world Orchestrator assumes that a DataOps Modelling and Transformation job completed its run in an earlier stage of the DataOps pipeline. It leverages the metadata (data about data) model to provide up-to-date information to the catalog at the end of every pipeline run.
|REQUIRED||The data.world organization where the dataset fits|
|REQUIRED||The data.world dataset to update. The standard value is |
|REQUIRED||The data.world authentication token|
|Optional||If set, it prevents updating the metadata profile during a job run|
|Optional||If set, it will upload the data file using the specified name. Otherwise the data file will be uploaded as dataopslive-catalog.ttl|
|Optional||If set, it will upload the meta file using the specified name. Otherwise the meta file will be uploaded as metadata-profile.ttl|
Most of the configuration happens in the data.world application. When run, the orchestrator uploads a default profile file, which is sufficient to get started. Set the DW_BLOCK_PROFILE_UPLOAD variable to prevent changes made to the profile in the data.world application from being overwritten.
The DATA_WORLD.AUTH key in the DataOps Vault must hold a valid user authentication token, obtained from the data.world settings at https://data.world/settings/advanced.
This example dynamically adjusts the organization being used based on the DataOps context (dev, test, prod). In other words, depending on the context, the default organization changes from
stage: "Data Catalog"
DW_ORG: dataopslivedev #dataopslive, dataopsliveqa,
- if [[ $DATAOPS_DATABASE == *"_PROD" ]]; then export DW_ORG=dataopslive; fi
- if [[ $DATAOPS_DATABASE == *"_QA" ]]; then export DW_ORG=dataopsliveqa; fi
- if [[ $DATAOPS_DATABASE == *"_DEV" ]]; then export DW_ORG=dataopslivedev; fi
- if [[ $DATAOPS_DATABASE == *"_FB_"* ]]; then export DW_ORG=dataopslivedev; fi
- echo "DATABASE = $DATAOPS_DATABASE and DW_ORG = $DW_ORG"
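The chain of `if` statements above can also be collapsed into a single lookup. The following is a minimal sketch, not part of the orchestrator itself: the function name `resolve_dw_org` is hypothetical, and the fallback to `dataopslivedev` for unmatched names is an assumption based on the default `DW_ORG` value in the example. The `_FB_` pattern is listed first because, in the original script, the feature-branch check runs last and therefore overrides any earlier match.

```shell
# Hypothetical helper: map a database name to a data.world organization.
# Mirrors the if-chain in the example; not provided by the orchestrator.
resolve_dw_org() {
  case "$1" in
    *_FB_*) echo "dataopslivedev" ;;  # feature-branch databases (checked first: overrides all others)
    *_PROD) echo "dataopslive" ;;     # production
    *_QA)   echo "dataopsliveqa" ;;   # QA
    *_DEV)  echo "dataopslivedev" ;;  # development
    *)      echo "dataopslivedev" ;;  # assumed fallback: the default DW_ORG from the example
  esac
}
```

In a job script, this could be used as `export DW_ORG="$(resolve_dw_org "$DATAOPS_DATABASE")"`.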
The data.world Orchestrator assumes that MATE has already run in the pipeline. It then leverages the MATE results, specifically table-level lineage, including tags, descriptions, and other metadata.
The orchestrator uses two intermediate files, the catalog and manifest. The files must be located at the following path:
/dataops/modelling/target; a working directory of the standard MATE project found at
The details of these intermediate files are as follows:
catalog.json: this file contains information from your data warehouse about the tables and views produced and defined by the resources in your project.
manifest.json: this file contains a complete representation of your dbt project's resources (models, tests, macros, etc.), including all node configurations and resource properties.
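Because the orchestrator depends on both files being present, it can be useful to verify them before the job runs. The sketch below is an illustration only: the function name `check_mate_artifacts` is hypothetical, and the directory is passed as an argument rather than hard-coded, since the exact location depends on where the MATE project lives in your repository.

```shell
# Hypothetical pre-flight check: confirm that both MATE output files exist
# in the given target directory. Returns 0 if both are present, 1 otherwise.
check_mate_artifacts() {
  target_dir="$1"
  missing=0
  for f in catalog.json manifest.json; do
    if [ ! -f "$target_dir/$f" ]; then
      # Report each missing file on stderr so the job log shows what is absent.
      echo "missing: $target_dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A job could call it as `check_mate_artifacts "<path to the MATE target directory>" || exit 1` to fail early when MATE has not run.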
Host Dependencies (and Resources)
The example configurations use a data.world access token stored in the DataOps vault at