Matillion Orchestrator
Enterprise
Image | $DATAOPS_MATILLION_RUNNER_IMAGE |
---|---|
The Matillion orchestrator triggers the start of a Matillion job as part of a DataOps pipeline. This functionality makes it possible to integrate all existing Matillion jobs into a DataOps pipeline.
Usage
The Matillion orchestrator runs Matillion jobs through the Matillion ETL API.
The orchestrator's workflow is as follows:
- It posts a request to the specified Matillion instance
- It then polls the status and fetches progress from the Matillion instance
- Finally, it propagates the Matillion execution status to the DataOps pipeline
My Matillion Job:
  extends:
    - .agent_tag
  stage: My Stage
  image: $DATAOPS_MATILLION_RUNNER_IMAGE
  variables:
    MATILLION_ACTION: START
    MATILLION_URL: XXXX
    MATILLION_USERNAME: DATAOPS_VAULT(XXXX)
    MATILLION_PASSWORD: DATAOPS_VAULT(XXXX)
    MATILLION_GROUP: XXXX
    MATILLION_PROJECT: XXXX
    MATILLION_JOB: XXXX
  script: /dataops
  icon: ${MATILLION_ICON}
We recommend that you configure the DataOps pipeline to continue running only if the Matillion job is successful, ensuring that the pipeline run does not transform any out-of-date data.
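A minimal sketch of that recommendation follows. It assumes the pipeline file accepts the GitLab-CI-style `allow_failure` keyword, in line with the other job keys used on this page. That keyword and the job name are illustrative assumptions, not settings documented here.

```yaml
Start Matillion Job (blocking):
  extends:
    - .agent_tag
  stage: "Bulk Ingestion"
  image: $DATAOPS_MATILLION_RUNNER_IMAGE
  variables:
    MATILLION_ACTION: START
    MATILLION_URL: https://etl1.example.com
    MATILLION_USERNAME: DATAOPS_VAULT(matillion.etl1.username)
    MATILLION_PASSWORD: DATAOPS_VAULT(matillion.etl1.password)
    MATILLION_GROUP: My Group
    MATILLION_PROJECT: My Project
    MATILLION_JOB: my_job
  script: /dataops
  icon: ${MATILLION_ICON}
  # Assumption: GitLab-CI-style keyword. With allow_failure set to false,
  # a failed Matillion run fails this job, so later stages do not start.
  allow_failure: false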
Supported parameters
Parameter | Required/Default | Description |
---|---|---|
MATILLION_USERNAME | REQUIRED | Username to access the Matillion ETL API |
MATILLION_PASSWORD | REQUIRED | Password to access the Matillion ETL API |
MATILLION_ACTION | REQUIRED | Currently only supports START to run a Matillion job |
MATILLION_URL | REQUIRED | The URL of the Matillion ETL instance |
MATILLION_GROUP | REQUIRED | Name of the Matillion group of the job to be executed |
MATILLION_PROJECT | REQUIRED | Name of the Matillion project of the job to be executed |
MATILLION_JOB | REQUIRED | Name of the Matillion job to execute |
MATILLION_VERSION | default | Version for Matillion ETL |
MATILLION_VARIABLES | None | Correctly-escaped block of JSON comprising key-value pairs of job variables (see example below) |
MATILLION_TIMEOUT | 3600 | Matillion task timeout in seconds. If increasing the DataOps job timeout, also set this to an equivalent value |
MATILLION_IGNORE_SSL_CERT | FALSE | Set to TRUE to ignore SSL warnings, e.g. when using self-signed certificates |
SET_MATILLION_KEYS_TO_ENV | None | (Deprecated) If set, the credentials are fetched from the DataOps Vault and exposed in the environment. The preferred method is now to set sensitive parameters using the DATAOPS_VAULT() syntax (see examples) |
MATILLION_USERNAME_VAULT_KEY | MATILLION.DEFAULT.USERNAME | (Deprecated) If set, overrides the default vault path for username |
MATILLION_PASSWORD_VAULT_KEY | MATILLION.DEFAULT.PASSWORD | (Deprecated) If set, overrides the default vault path for password |
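To illustrate the MATILLION_TIMEOUT guidance in the table above, here is a hedged sketch of a long-running job. It assumes the DataOps job timeout is set with a GitLab-CI-style `timeout` keyword. That keyword, the job name, and the two-hour value are assumptions chosen for illustration.

```yaml
Start Long Matillion Job:
  extends:
    - .agent_tag
  stage: "Bulk Ingestion"
  image: $DATAOPS_MATILLION_RUNNER_IMAGE
  # Assumption: job-level timeout keyword in GitLab CI style
  timeout: 2h
  variables:
    MATILLION_ACTION: START
    MATILLION_URL: https://etl1.example.com
    MATILLION_USERNAME: DATAOPS_VAULT(matillion.etl1.username)
    MATILLION_PASSWORD: DATAOPS_VAULT(matillion.etl1.password)
    MATILLION_GROUP: My Group
    MATILLION_PROJECT: My Project
    MATILLION_JOB: my_long_running_job
    # 7200 seconds = 2 hours, matching the job timeout above
    MATILLION_TIMEOUT: 7200
  script: /dataops
  icon: ${MATILLION_ICON}
```

Keeping the two values equivalent means neither timeout cuts the run short before the other applies.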
Example jobs
Basic task execution
This example demonstrates what a typical pipeline job looks like:
Start Matillion Job:
  extends:
    - .agent_tag
  stage: "Bulk Ingestion"
  image: $DATAOPS_MATILLION_RUNNER_IMAGE
  variables:
    MATILLION_ACTION: START
    MATILLION_URL: https://etl1.example.com
    MATILLION_USERNAME: DATAOPS_VAULT(matillion.etl1.username)
    MATILLION_PASSWORD: DATAOPS_VAULT(matillion.etl1.password)
    MATILLION_GROUP: My Group
    MATILLION_PROJECT: My Project
    MATILLION_JOB: my_job
    MATILLION_VERSION: default
  script: /dataops
  icon: ${MATILLION_ICON}
Passing variables to Matillion
This job passes a payload of variables to the Matillion job:
Start Matillion Job with Variables:
  extends:
    - .agent_tag
  stage: "Bulk Ingestion"
  image: $DATAOPS_MATILLION_RUNNER_IMAGE
  variables:
    MATILLION_ACTION: START
    MATILLION_URL: https://etl1.example.com
    MATILLION_USERNAME: DATAOPS_VAULT(matillion.etl1.username)
    MATILLION_PASSWORD: DATAOPS_VAULT(matillion.etl1.password)
    MATILLION_GROUP: My Group
    MATILLION_PROJECT: My Project
    MATILLION_JOB: my_parameterized_job
    MATILLION_VERSION: default
    MATILLION_VARIABLES: >-
      {
        "my_variable_one": "VALUE 1",
        "my_variable_two": "VALUE 2"
      }
  script: /dataops
  icon: ${MATILLION_ICON}
The following example shows how to pass sensitive variables into Matillion using the existing SNOWFLAKE namespace from the DataOps vault:
Start Matillion Job with Sensitive Variables:
  extends:
    - .agent_tag
  stage: "Bulk Ingestion"
  image: $DATAOPS_MATILLION_RUNNER_IMAGE
  variables:
    MATILLION_ACTION: START
    MATILLION_URL: https://etl1.example.com
    MATILLION_USERNAME: DATAOPS_VAULT(matillion.etl1.username)
    MATILLION_PASSWORD: DATAOPS_VAULT(matillion.etl1.password)
    MATILLION_GROUP: My Group
    MATILLION_PROJECT: My Project
    MATILLION_JOB: my_sensitive_job
    MATILLION_VERSION: default
    MATILLION_VARIABLES: >-
      {
        "snowflake_account": "DATAOPS_VAULT(SNOWFLAKE.ACCOUNT)",
        "snowflake_username": "DATAOPS_VAULT(SNOWFLAKE.SOLE.USERNAME)",
        "snowflake_password": "DATAOPS_VAULT(SNOWFLAKE.SOLE.PASSWORD)"
      }
  script: /dataops
  icon: ${MATILLION_ICON}
By default, DataOps.live passes variables into the Matillion API using the scalarVariables block. However, if you also wish to include grid variables, you can provide an entire JSON block with the keys scalarVariables or gridVariables at the top level. The orchestrator automatically detects this JSON and passes it as-is into Matillion.
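As a hedged sketch, a job that passes both scalar and grid variables might look like the following. The job and variable names are hypothetical. The grid shape shown here, a list of rows where each row is a list of column values, is an assumption about the Matillion run payload and should be checked against the Matillion API documentation for your version.

```yaml
Start Matillion Job with Grid Variables:
  extends:
    - .agent_tag
  stage: "Bulk Ingestion"
  image: $DATAOPS_MATILLION_RUNNER_IMAGE
  variables:
    MATILLION_ACTION: START
    MATILLION_URL: https://etl1.example.com
    MATILLION_USERNAME: DATAOPS_VAULT(matillion.etl1.username)
    MATILLION_PASSWORD: DATAOPS_VAULT(matillion.etl1.password)
    MATILLION_GROUP: My Group
    MATILLION_PROJECT: My Project
    MATILLION_JOB: my_grid_job
    MATILLION_VERSION: default
    # Top-level scalarVariables/gridVariables keys, passed to Matillion as-is
    MATILLION_VARIABLES: >-
      {
        "scalarVariables": {
          "my_variable_one": "VALUE 1"
        },
        "gridVariables": {
          "my_grid_variable": [
            ["row 1 col 1", "row 1 col 2"],
            ["row 2 col 1", "row 2 col 2"]
          ]
        }
      }
  script: /dataops
  icon: ${MATILLION_ICON}
```

Because the block already carries the top-level keys, scalar values must sit under scalarVariables here rather than at the root of the JSON.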