
Assist Data Pipeline mode for Data Engineers

Data Pipeline mode supports modern data engineering workflows. It helps you:

  • Design robust, scalable data pipelines
  • Optimize pipeline performance across stages
  • Build error handling and recovery mechanisms directly into your pipelines

Building each pipeline component from scratch takes time, and data engineers often have higher-priority tasks. The DataOps.live reference project simplifies this process by generating a standardized project structure from a template. You can then customize the template to fit your specific needs.

However, the reference project only provides scaffolding; it doesn’t tailor pipelines to your requirements. That’s where Data Pipeline mode steps in.

With Data Pipeline mode, you define your requirements, and Assist generates a tailored pipeline for you. For example, the prompt “Create a simple environment pipeline job” creates a new YAML file and generates the following job:

environment job
Simple Environment Info Job:
  extends:
    - .agent_tag
  stage: Additional Configuration
  script:
    - echo "Running simple environment info job"
    - echo "Current date and time: $(date)"
    - echo "Environment: $DATAOPS_ENV_NAME"
    - echo "Database: $DATAOPS_DATABASE"
  tags:
    - dataops-runner
  when: always

This job captures and prints basic environment details during pipeline execution. It runs in the Additional Configuration stage and triggers on every pipeline run, regardless of success or failure in previous steps (when: always).

You can use it to debug, audit, or validate configuration settings before other stages run. It logs:

  • The current date and time
  • The active DataOps environment name ($DATAOPS_ENV_NAME)
  • The target database ($DATAOPS_DATABASE)
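
The generated job can serve as a starting point for further customization. The sketch below is one possible variant that logs two extra values. The job name is hypothetical, and `$CI_PIPELINE_ID` and `$CI_COMMIT_REF_NAME` are assumed to be GitLab-style predefined variables available on your runner; verify both before relying on them.

```yaml
# Hypothetical variant of the generated job; not produced by Assist.
Extended Environment Info Job:        # hypothetical job name
  extends:
    - .agent_tag
  stage: Additional Configuration
  script:
    # Single quotes keep the ": " inside each command from being
    # parsed as a YAML key/value separator.
    - 'echo "Environment: $DATAOPS_ENV_NAME"'
    - 'echo "Database: $DATAOPS_DATABASE"'
    # Assumed predefined variables; check that your runner sets them.
    - 'echo "Pipeline ID: $CI_PIPELINE_ID"'
    - 'echo "Branch: $CI_COMMIT_REF_NAME"'
  tags:
    - dataops-runner
  when: always
```

Keeping `when: always` means the variant still runs even if earlier steps fail, which is what makes it useful for debugging.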

Next, Assist asks for permission to include this job in your full-ci.yml file, the main pipeline configuration. Once you confirm and save, it adds the job to your CI workflow.
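
For orientation, the resulting change to full-ci.yml might look roughly like the sketch below. It assumes the job was saved to its own file and is pulled in through an `include` entry. The file path shown is hypothetical, and the entry Assist actually writes may differ.

```yaml
# full-ci.yml (sketch; the path below is a hypothetical example)
include:
  # Any include entries already in your project stay unchanged above this line.
  - /pipelines/includes/local_includes/simple_environment_info_job.yml
```

Reviewing the diff before you confirm lets you check that the new entry points to the file Assist created.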