DataOps Orchestration
DataOps helps you create the data pipelines that build your modern data platform. The platform is typically made up of many applications providing specialized capabilities, ranging from data ingestion, data quality, and data transformation to data observability and governance. Orchestrators are the building blocks for coordinating all of these applications to create high-value data products.
DataOps pipelines are made up of a directed acyclic graph (DAG) of jobs. All DataOps jobs use an orchestrator image providing the necessary capabilities to interact with other applications.
The most commonly used orchestrators include:
- The Snowflake Object Lifecycle (SOLE) Orchestrator to automate all your Snowflake infrastructure
- The Stage Ingestion Orchestrator to ingest all your data to Snowflake
- The Modeling and Transformation (MATE) Orchestrator to transform, prepare, and curate all your data
- The Secrets Manager Orchestrator for loading sensitive configuration from your enterprise credential management system
- The Utilities Orchestrator for executing scripts and running ad-hoc commands
- The Python3 Orchestrator for executing Python scripts and apps
Orchestration types
DataOps orchestrators fall into two types: pre-set and flexible.
Pre-set orchestrators
Most DataOps orchestrators support a pre-set operation, meaning they perform a single main action,
configured using job variables. Therefore, the job's script block only needs to call the /dataops
entry point, e.g.:
My Job:
  ...
  script:
    - /dataops
You can insert additional scripts before and after the /dataops execution sequence to
perform custom setup or teardown actions, respectively.
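The setup and teardown pattern described above might look like the following minimal sketch. The job name, the variable `EXAMPLE_SETTING`, and the `echo` commands are hypothetical placeholders, not settings of any particular orchestrator; consult the documentation of the orchestrator you use for its actual job variables.

```yaml
My Job:
  # Other job settings (image, stage, and so on) are elided here.
  variables:
    # Hypothetical variable; each pre-set orchestrator defines its own.
    EXAMPLE_SETTING: "value"
  script:
    # Custom setup runs before the orchestrator's main action.
    - echo "Preparing environment..."
    # The pre-set entry point performs the single main action.
    - /dataops
    # Custom teardown runs after the main action.
    - echo "Cleaning up..."
```

Because the script entries are listed in order, the position of each command relative to `/dataops` determines whether it acts as setup or teardown.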
Flexible orchestrators
Some orchestrators do not have a pre-set action and instead provide a set of utilities that
support whatever actions the job developer requires. Jobs using flexible orchestrators do not
typically call the /dataops entry point (although it is still available if needed),
but instead define a sequence of custom script actions, e.g.:
My Job:
  ...
  script:
    - echo "My job starting..."
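As a slightly longer illustration of a flexible job, the sketch below chains several custom script actions. The job name and the script paths `scripts/prepare.sh` and `scripts/load.py` are hypothetical and stand in for whatever your project provides. The sketch also assumes the chosen orchestrator image includes a shell and a `python3` interpreter.

```yaml
My Flexible Job:
  # Other job settings (image, stage, and so on) are elided here.
  script:
    - echo "My job starting..."
    # Hypothetical project scripts; replace with your own commands.
    - ./scripts/prepare.sh
    - python3 scripts/load.py
    - echo "My job finished"
```

Here the job developer fully controls the sequence of actions, and `/dataops` is not called.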
List of orchestrators
- API Orchestrator
- AWS Orchestrator
- Azure Data Factory Orchestrator
- Azure Orchestrator
- Collibra Orchestrator
- data.world Orchestrator
- DataPrep Orchestrator
- Fivetran Orchestrator
- Git Orchestrator
- Informatica Cloud Orchestrator
- Java 8 Orchestrator
- Matillion Orchestrator
- Modeling and Transformation Orchestrator
- Monte Carlo Orchestrator
- Python3 Orchestrator
- R Orchestrator
- Reporting Orchestrator
- Secrets Manager Orchestrator
- Snowflake Orchestrator
- Snowpark (Python) Orchestrator
- Soda Orchestrator
- SOLE (Snowflake Object Lifecycle Engine) Orchestrator
- Stage Ingestion Orchestrator
- Stitch Orchestrator
- Talend (TAC) Orchestrator
- Talend (TMC) Orchestrator
- Utils Orchestrator
- VaultSpeed Orchestrator