DataOps Orchestration
DataOps.live helps you create data pipelines that build your modern data products and data apps. The data product platform is made up of many apps providing specialized capabilities, ranging from data ingestion, data quality, and data transformation to data observability and governance. Orchestrators are the building blocks you use to coordinate all these applications and create high-value data products.
DataOps pipelines are made up of a directed acyclic graph (DAG) of jobs. All DataOps jobs use an orchestrator image providing the necessary capabilities to interact with other apps.
The most commonly used orchestrators include:
- The Snowflake Object Lifecycle Engine (SOLE) Orchestrator to automate all your Snowflake infrastructure
- The Stage Ingestion Orchestrator to ingest all your data to Snowflake
- The Modeling and Transformation (MATE) Orchestrator to transform, prepare, and curate all your data
- The Secrets Manager Orchestrator for loading sensitive configuration from your enterprise credential management system
- The Utilities Orchestrator for executing scripts and running ad-hoc commands
- The Python Orchestrators for executing Python scripts and apps
Orchestrators overview
Most DataOps orchestrators perform a single main action, configured using job variables. Therefore, the job's script block only needs to call the /dataops entry point, e.g.:
My Job:
  ...
  script:
    - /dataops
You can insert additional scripts before and after the /dataops execution sequence to perform custom setup or teardown actions, respectively.
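As a rough illustration of setup and teardown steps wrapped around the entry point, a job might look like the sketch below. The job name, the image variable `MY_ORCHESTRATOR_IMAGE`, and the job variable `MY_SETTING` are hypothetical placeholders chosen for this example; they are not taken from any specific orchestrator. The sketch also assumes that the steps in a script block run in the order listed.

```yaml
# Sketch of a single-action orchestrator job with custom setup and teardown.
# All names below (job name, image variable, job variable) are hypothetical.
My Job:
  image: $MY_ORCHESTRATOR_IMAGE        # hypothetical variable holding the orchestrator image
  variables:
    MY_SETTING: "example-value"        # hypothetical job variable that configures the main action
  script:
    - echo "Custom setup before the main action"
    - /dataops                         # the orchestrator's single main action
    - echo "Custom teardown after the main action"
```

Keeping /dataops as its own step makes it easy to see where the orchestrator's main action sits relative to any custom steps.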
However, a few orchestrators are more flexible and provide a set of utilities that support whatever actions the job developer requires. Jobs using these flexible orchestrators do not typically call the /dataops entry point (although it is still available if needed), but instead define a sequence of custom script actions, e.g.:
My Job:
  ...
  script:
    - echo "My job starting..."
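A slightly fuller sketch of a job built from a sequence of custom script actions could look like the following. The image variable `MY_FLEXIBLE_ORCHESTRATOR_IMAGE` and the script path `./scripts/my_task.sh` are hypothetical and stand in for whatever image and commands your job actually needs.

```yaml
# Sketch of a job on a flexible orchestrator: no /dataops call, only custom steps.
# The image variable and the script path are hypothetical placeholders.
My Job:
  image: $MY_FLEXIBLE_ORCHESTRATOR_IMAGE   # hypothetical variable holding the orchestrator image
  script:
    - echo "My job starting..."
    - ./scripts/my_task.sh                 # hypothetical custom script shipped with the project
    - echo "My job finished"
```

Because the job defines its own steps, the orchestrator image only needs to supply the tools those steps rely on.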
List of orchestrators
- API Orchestrator
- AWS Orchestrator
- Azure Data Factory Orchestrator
- Azure Orchestrator
- Coalesce Orchestrator
- Collibra Orchestrator
- Data Product Orchestrator
- data.world catalog Orchestrator
- DataPrep Orchestrator
- dbt Cloud Orchestrator
- Fivetran Orchestrator
- Git Orchestrator
- Informatica Cloud Data Governance and Catalog (CDGC) Orchestrator
- Informatica Cloud Taskflow Orchestrator
- Java 8 Orchestrator
- Matillion Orchestrator
- Montecarlo Orchestrator
- Modeling and Transformation Orchestrator
- Python Orchestrators
- R Orchestrator
- Reporting Orchestrator
- Secrets Manager Orchestrator
- Snowflake Orchestrator
- Snowpark (Python) Orchestrator
- Soda Orchestrator
- SOLE (Snowflake Object Lifecycle Engine) Orchestrator
- Stage Ingestion Orchestrator
- Stitch Orchestrator
- Talend (TAC) Orchestrator
- Talend (TMC) Orchestrator
- Utils Orchestrator
- VaultSpeed Orchestrator