Acceldata Orchestrator

Enterprise

Image: $DATAOPS_ACCELDATA_RUNNER_IMAGE

The Acceldata orchestrator enables automated creation and management of data quality (DQ) policies in Acceldata's Data Observability Platform. This orchestrator integrates with your DataOps pipelines to publish data quality rules based on your dbt test definitions.

Usage

pipelines/includes/local_includes/acceldata_jobs/acceldata_dq.yml
"Acceldata DQ Policy Creation":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    SNOWFLAKE_DATABASE: "YOUR_DATABASE"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}

The Acceldata orchestrator assumes that a DataOps modeling and transformation job, including the dbt test execution, has already completed in an earlier stage of the DataOps pipeline. It uses the dbt test definitions to automatically create corresponding data quality policies in Acceldata.
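
To make the input concrete, the fragment below sketches the kind of dbt schema file the orchestrator would read from the models path. It uses standard dbt test syntax; the model and column names are hypothetical, and which dbt tests are translated into which Acceldata policy types is not stated on this page, so treat the mapping as something to confirm in your own environment.

```yaml
# Hypothetical schema file under ${CI_PROJECT_DIR}/dataops/modelling/
version: 2
models:
  - name: customers            # hypothetical model name
    columns:
      - name: customer_id      # hypothetical column
        tests:
          - unique
          - not_null
      - name: status           # hypothetical column
        tests:
          - accepted_values:
              values: ["active", "inactive"]
```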

Supported parameters

Core Configuration Parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| DATAOPS_ACCELDATA_HOST | REQUIRED | Acceldata host URL (without https://), e.g., your-host.acceldata.com |
| DATAOPS_ACCELDATA_ACCESS_KEY | REQUIRED | API access key for authentication |
| DATAOPS_ACCELDATA_SECRET_KEY | REQUIRED | API secret key for authentication |
| DATAOPS_ACCELDATA_ASSEMBLY_NAME | REQUIRED | Spark assembly name in Acceldata |
| SNOWFLAKE_DATABASE | Optional, defaults to OPS_DP_PUB | Snowflake database name |
| DATAOPS_ACCELDATA_TAG_NAME | Optional, auto-detected | Domain tag name for policies/assets |
| DATAOPS_ACCELDATA_NOTIFICATION_CHANNEL_ID | Optional | Notification channel ID for alerts |
| DATAOPS_ACCELDATA_DBT_MODELS_PATH | Optional, defaults to ${CI_PROJECT_DIR}/dataops/modelling/ | Path to dbt schema YAML files |
| ADOC_POLICY_ACTION | Optional | Action: LIST, EXECUTE, DELETE, GET_ASSETS, or CRAWLER |
| ADOC_RULE_IDS | Optional | Comma-separated rule IDs for specific operations |
| DATAOPS_ACCELDATA_DISABLE_SCHEDULING | Optional, defaults to false | Set to true to create unscheduled policies |
| DATAOPS_ACCELDATA_SCHEDULE_START_TIME | Optional, defaults to 2 | Schedule start hour (0-23) |
| DATAOPS_ACCELDATA_SCHEDULE_END_TIME | Optional, defaults to 10 | Schedule end hour (0-23) |
| DATAOPS_ACCELDATA_SCHEDULE_INTERVAL_MINUTES | Optional, defaults to 10 | Minutes between schedule slots |
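
The scheduling parameters can be set together in a job's variables block. The fragment below is a sketch with illustrative values that narrows the window to hours 1 through 5 with 15-minute slots. How the orchestrator distributes policies across those slots is not described on this page, so verify the resulting schedules in Acceldata.

```yaml
variables:
  # Keep scheduling enabled (the documented default)
  DATAOPS_ACCELDATA_DISABLE_SCHEDULING: "false"
  # Illustrative window: start hour 1, end hour 5 (valid range 0-23)
  DATAOPS_ACCELDATA_SCHEDULE_START_TIME: "1"
  DATAOPS_ACCELDATA_SCHEDULE_END_TIME: "5"
  # Illustrative spacing between schedule slots, in minutes
  DATAOPS_ACCELDATA_SCHEDULE_INTERVAL_MINUTES: "15"
```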

Standard Policies and Monitoring Parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| DATAOPS_ACCELDATA_ENABLE_STANDARD_POLICIES | Optional, defaults to false | Enable schema drift, freshness, and anomaly policies |
| DATAOPS_ACCELDATA_PROFILING_SCHEDULE | Optional, defaults to 0 0 0,8,16 * * ? | Cron schedule for profiling |
| DATAOPS_ACCELDATA_ANOMALY_SENSITIVITY | Optional, defaults to MEDIUM | Anomaly sensitivity: LOW, MEDIUM, or HIGH |
| DATAOPS_ACCELDATA_ANOMALY_TRAINING_WINDOW_DAYS | Optional, defaults to 7 | Anomaly training window (days) |
| DATAOPS_ACCELDATA_FRESHNESS_CRON_SCHEDULE | Optional, defaults to 0 12 0/4 * * ? | Cron schedule for freshness checks |
| DATAOPS_ACCELDATA_FRESHNESS_POLICY_TYPES | Optional | Comma-separated: data_freshness, row_count, row_count_drift, asset_size, asset_size_drift. See Freshness Configuration |
| DATAOPS_ACCELDATA_ENABLE_DATA_FRESHNESS | Optional, defaults to true | Enable Data Freshness policy |
| DATAOPS_ACCELDATA_ENABLE_ABSOLUTE_ROW_COUNT | Optional, defaults to true | Enable Absolute Row Count policy |
| DATAOPS_ACCELDATA_ENABLE_ROW_COUNT_DRIFT | Optional, defaults to false | Enable Row Count Drift policy |
| DATAOPS_ACCELDATA_ENABLE_ABSOLUTE_ASSET_SIZE | Optional, defaults to true | Enable Absolute Asset Size policy |
| DATAOPS_ACCELDATA_ENABLE_ASSET_SIZE_DRIFT | Optional, defaults to false | Enable Asset Size Drift policy |
| DATAOPS_ACCELDATA_PROFILING_TYPE | Optional, defaults to FULL | Profiling type: FULL or INCREMENTAL |
| DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_TYPE | Optional | Strategy: id, datetime, or partition. See Incremental Profiling |
| DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_COLUMN | Optional | Column name for incremental strategy |
| DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_FORMAT | Optional, defaults to yyyy-mm-dd | Date/time format |
| DATAOPS_ACCELDATA_TABLE_INCREMENTAL_COLUMNS | Optional | JSON mapping table names to columns |
| ADOC_RUN_CRAWLER | Optional, auto-enabled | Enable/disable metadata crawler |
| ADOC_CRAWLER_START_TIMEOUT | Optional, defaults to 120 | Crawler start timeout (seconds) |
| ADOC_CRAWLER_COMPLETION_TIMEOUT | Optional, defaults to 1800 | Crawler completion timeout (seconds) |
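
As a sketch of how these monitoring parameters combine, the fragment below enables standard policies, opts in to the two drift policies that are off by default, and adjusts profiling and anomaly settings. All values are illustrative. The profiling cron assumes the same six-field format (seconds first, ? for day of week) as the documented default.

```yaml
variables:
  DATAOPS_ACCELDATA_ENABLE_STANDARD_POLICIES: "true"
  # Illustrative cron: profile at 06:00 and 18:00 instead of the default 00:00, 08:00, 16:00
  DATAOPS_ACCELDATA_PROFILING_SCHEDULE: "0 0 6,18 * * ?"
  # Illustrative anomaly settings (defaults are MEDIUM and 7 days)
  DATAOPS_ACCELDATA_ANOMALY_SENSITIVITY: "HIGH"
  DATAOPS_ACCELDATA_ANOMALY_TRAINING_WINDOW_DAYS: "14"
  # Opt in to the drift policies, which default to false
  DATAOPS_ACCELDATA_ENABLE_ROW_COUNT_DRIFT: "true"
  DATAOPS_ACCELDATA_ENABLE_ASSET_SIZE_DRIFT: "true"
  # Illustrative: allow the crawler up to one hour to complete (default 1800 seconds)
  ADOC_CRAWLER_COMPLETION_TIMEOUT: "3600"
```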

Freshness Policy Threshold Parameters

| Parameter | Required/Default | Description |
| --- | --- | --- |
| DATAOPS_ACCELDATA_FRESHNESS_LOOKBACK_WINDOW | Optional, defaults to 24 | Data freshness lookback window |
| DATAOPS_ACCELDATA_FRESHNESS_LOOKBACK_WINDOW_TYPE | Optional, defaults to HOURS | Window type: HOURS or DAYS |
| DATAOPS_ACCELDATA_ROW_COUNT_LOOKBACK_WINDOW | Optional, defaults to 4 | Row count lookback window |
| DATAOPS_ACCELDATA_ROW_COUNT_LOOKBACK_WINDOW_TYPE | Optional, defaults to HOURS | Window type: HOURS or DAYS |
| DATAOPS_ACCELDATA_ROW_COUNT_CHANGE_THRESHOLD | Optional, defaults to 10.0 | Row count change threshold (%) |
| DATAOPS_ACCELDATA_ROW_COUNT_DRIFT_LOOKBACK_WINDOW | Optional, defaults to 4 | Row count drift lookback window |
| DATAOPS_ACCELDATA_ROW_COUNT_DRIFT_LOOKBACK_WINDOW_TYPE | Optional, defaults to HOURS | Window type: HOURS or DAYS |
| DATAOPS_ACCELDATA_ROW_COUNT_DRIFT_THRESHOLD | Optional, defaults to 10.0 | Row count drift threshold (%) |
| DATAOPS_ACCELDATA_ASSET_SIZE_LOOKBACK_WINDOW | Optional, defaults to 24 | Asset size lookback window |
| DATAOPS_ACCELDATA_ASSET_SIZE_LOOKBACK_WINDOW_TYPE | Optional, defaults to HOURS | Window type: HOURS or DAYS |
| DATAOPS_ACCELDATA_ASSET_SIZE_CHANGE_THRESHOLD | Optional, defaults to 10.0 | Asset size change threshold (%) |
| DATAOPS_ACCELDATA_ASSET_SIZE_DRIFT_LOOKBACK_WINDOW | Optional, defaults to 24 | Asset size drift lookback window |
| DATAOPS_ACCELDATA_ASSET_SIZE_DRIFT_LOOKBACK_WINDOW_TYPE | Optional, defaults to HOURS | Window type: HOURS or DAYS |
| DATAOPS_ACCELDATA_ASSET_SIZE_DRIFT_THRESHOLD | Optional, defaults to 10.0 | Asset size drift threshold (%) |
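
Each threshold parameter pairs a lookback window with a window type, and the row count and asset size policies add a percentage threshold. The fragment below is a sketch with illustrative values that overrides the data freshness and row count settings and leaves the asset size settings at their defaults.

```yaml
variables:
  DATAOPS_ACCELDATA_ENABLE_STANDARD_POLICIES: "true"
  DATAOPS_ACCELDATA_FRESHNESS_POLICY_TYPES: "data_freshness,row_count"
  # Data freshness: look back 2 days instead of the default 24 hours
  DATAOPS_ACCELDATA_FRESHNESS_LOOKBACK_WINDOW: "2"
  DATAOPS_ACCELDATA_FRESHNESS_LOOKBACK_WINDOW_TYPE: "DAYS"
  # Row count: 12-hour window with a 25% change threshold (defaults are 4 hours and 10.0)
  DATAOPS_ACCELDATA_ROW_COUNT_LOOKBACK_WINDOW: "12"
  DATAOPS_ACCELDATA_ROW_COUNT_LOOKBACK_WINDOW_TYPE: "HOURS"
  DATAOPS_ACCELDATA_ROW_COUNT_CHANGE_THRESHOLD: "25.0"
```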

Environment-Specific Parameters

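For multi-environment deployments, append _PROD or _QA to any of the following parameter names: DATAOPS_ACCELDATA_HOST, DATAOPS_ACCELDATA_ACCESS_KEY, DATAOPS_ACCELDATA_SECRET_KEY, DATAOPS_ACCELDATA_ASSEMBLY_NAME, and DATAOPS_ACCELDATA_NOTIFICATION_CHANNEL_ID. For example, DATAOPS_ACCELDATA_HOST becomes DATAOPS_ACCELDATA_HOST_PROD.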
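
A minimal sketch of the suffixed variables follows. The host names, assembly names, and vault paths are hypothetical placeholders, and how the orchestrator decides which suffix applies to a given pipeline run is not described on this page.

```yaml
variables:
  # Production settings (hypothetical host, assembly, and vault paths)
  DATAOPS_ACCELDATA_HOST_PROD: "prod-host.acceldata.com"
  DATAOPS_ACCELDATA_ACCESS_KEY_PROD: DATAOPS_VAULT(ACCELDATA.PROD.ACCESS_KEY)
  DATAOPS_ACCELDATA_SECRET_KEY_PROD: DATAOPS_VAULT(ACCELDATA.PROD.SECRET_KEY)
  DATAOPS_ACCELDATA_ASSEMBLY_NAME_PROD: "Prod_Assembly"
  # QA settings (hypothetical host, assembly, and vault paths)
  DATAOPS_ACCELDATA_HOST_QA: "qa-host.acceldata.com"
  DATAOPS_ACCELDATA_ACCESS_KEY_QA: DATAOPS_VAULT(ACCELDATA.QA.ACCESS_KEY)
  DATAOPS_ACCELDATA_SECRET_KEY_QA: DATAOPS_VAULT(ACCELDATA.QA.SECRET_KEY)
  DATAOPS_ACCELDATA_ASSEMBLY_NAME_QA: "QA_Assembly"
```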

Incremental Profiling

Configure incremental profiling strategies for efficient data processing:

ID-Based Strategy:

DATAOPS_ACCELDATA_PROFILING_TYPE: "INCREMENTAL"
DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_TYPE: "id"
DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_COLUMN: "customer_id"

Datetime Strategy:

DATAOPS_ACCELDATA_PROFILING_TYPE: "INCREMENTAL"
DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_TYPE: "datetime"
DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_COLUMN: "created_date"
DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_FORMAT: "yyyy-MM-dd"

Per-Table Configuration:

DATAOPS_ACCELDATA_TABLE_INCREMENTAL_COLUMNS: |
  {
    "CUSTOMERS": "updated_at",
    "ORDERS": "order_date"
  }
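
The parameter table also lists partition as a strategy type, which has no example here. The fragment below is a sketch that assumes the partition strategy is configured the same way as the other two; the column name is hypothetical, and whether a column or format is required for this strategy is not stated on this page.

```yaml
# Partition strategy (assumed to follow the same pattern as id and datetime)
DATAOPS_ACCELDATA_PROFILING_TYPE: "INCREMENTAL"
DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_TYPE: "partition"
# Hypothetical partition column
DATAOPS_ACCELDATA_INCREMENTAL_STRATEGY_COLUMN: "load_date"
```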

Example jobs

Basic DQ Policy Creation

pipelines/includes/local_includes/acceldata_jobs/acceldata_basic.yml
"Acceldata DQ Policies":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    SNOWFLAKE_DATABASE: "YOUR_DATABASE"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}

With Standard Policies

pipelines/includes/local_includes/acceldata_jobs/acceldata_standard.yml
"Acceldata DQ with Monitoring":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    DATAOPS_ACCELDATA_NOTIFICATION_CHANNEL_ID: "11345"
    SNOWFLAKE_DATABASE: "YOUR_DATABASE"
    DATAOPS_ACCELDATA_ENABLE_STANDARD_POLICIES: "true"
    DATAOPS_ACCELDATA_FRESHNESS_POLICY_TYPES: "data_freshness,row_count,asset_size"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}

List Policies

pipelines/includes/local_includes/acceldata_jobs/acceldata_list.yml
"List Acceldata Policies":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    ADOC_POLICY_ACTION: "LIST"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}

Execute Policies

pipelines/includes/local_includes/acceldata_jobs/acceldata_execute.yml
"Execute Acceldata Policies":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    ADOC_POLICY_ACTION: "EXECUTE"
    ADOC_RULE_IDS: "123,456,789"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}

Get Policy Assets

pipelines/includes/local_includes/acceldata_jobs/acceldata_get_assets.yml
"Get Policy Assets":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    ADOC_POLICY_ACTION: "GET_ASSETS"
    ADOC_RULE_IDS: "123,456,789"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}

Delete Policies

pipelines/includes/local_includes/acceldata_jobs/acceldata_delete.yml
"Delete Acceldata Policies":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    ADOC_POLICY_ACTION: "DELETE"
    ADOC_RULE_IDS: "123,456,789"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}
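
The parameter table lists CRAWLER as a fifth value for ADOC_POLICY_ACTION, but none of the jobs above use it. The sketch below assumes it follows the same pattern as the other actions. The job name and file name are hypothetical, and whether this action needs rule IDs or other variables is not documented on this page.

```yaml
# Hypothetical file: pipelines/includes/local_includes/acceldata_jobs/acceldata_crawler.yml
"Run Acceldata Crawler":
  extends:
    - .agent_tag
  stage: "Data Quality"
  image: $DATAOPS_ACCELDATA_RUNNER_IMAGE
  variables:
    DATAOPS_ACCELDATA_HOST: "your-host.acceldata.com"
    DATAOPS_ACCELDATA_ACCESS_KEY: DATAOPS_VAULT(ACCELDATA.ACCESS_KEY)
    DATAOPS_ACCELDATA_SECRET_KEY: DATAOPS_VAULT(ACCELDATA.SECRET_KEY)
    DATAOPS_ACCELDATA_ASSEMBLY_NAME: "Your_Assembly"
    ADOC_POLICY_ACTION: "CRAWLER"
    # Documented default timeouts in seconds, shown explicitly
    ADOC_CRAWLER_START_TIMEOUT: "120"
    ADOC_CRAWLER_COMPLETION_TIMEOUT: "1800"
  script:
    - /dataops
  icon: ${ACCELDATA_ICON}
```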