Taking this live

What you will learn

In this section, you learn how to take pipelines to production by scheduling them for regular execution. We also show you how to add a second pipeline tailored to your environment.

Set aside 5 minutes to complete the section.

Scheduling

So far, we have only run these pipelines manually, but in operation we usually want them to run on a schedule. Scheduling is simple to set up: navigate to CI / CD > Schedules > New Schedule and complete the form, selecting demo-ci.yml as the pipeline type.

Creating different pipeline types

Until now we have worked with a single -ci.yml file, but it is more common to have a set of them, usually mapping to different frequencies and scopes of data ingestion. For example, you might have "HR data processed every 6 hours" in a file called hr-6hourly-ci.yml, "Sales data processed every 30 mins" in a file called sales-30mins-ci.yml, and so on. Generally, each of these files simply includes a different subset of the jobs.
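As a rough sketch of what "a different subset of the jobs" can mean, a second pipeline file such as hr-6hourly-ci.yml could reuse the include files from this tutorial in a different combination. The selection below is an assumption made purely for illustration: it presumes that say_hello_again.yml only needs base_hello.yml and does not depend on say_hello.yml, and a real HR pipeline would include its own job files instead.

```yaml
# hr-6hourly-ci.yml (illustrative sketch, not part of the tutorial project)
include:
  # Same bootstrap include as the other pipeline files in this tutorial
  - /pipelines/includes/bootstrap.yml
  # Assumed to be self-sufficient apart from base_hello.yml
  - /pipelines/includes/local_includes/say_hello_again.yml
  - /pipelines/includes/local_includes/base_hello.yml
```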

  • Let's create a new pipeline sales-30mins-ci.yml
sales-30mins-ci.yml
include:
- /pipelines/includes/bootstrap.yml
- /pipelines/includes/local_includes/say_hello.yml
#- /pipelines/includes/local_includes/say_hello_again.yml
- /pipelines/includes/local_includes/base_hello.yml

We've kept it similar to the previous pipeline definition but removed some of the jobs by commenting out the say_hello_again.yml include. We can now run this pipeline and, once we are happy that it works, schedule it to run every 30 minutes.
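If the New Schedule form takes a cron-style interval pattern (an assumption here, since this section does not show the form's fields), the two frequencies used as examples in this section could be expressed as follows. The mapping is only a note written in YAML syntax for readability; it is not a file that DataOps reads.

```yaml
# Hypothetical reference notes: pipeline file -> standard five-field cron pattern
sales-30mins-ci.yml: "*/30 * * * *"   # at minute 0 and minute 30 of every hour
hr-6hourly-ci.yml: "0 */6 * * *"      # at 00:00, 06:00, 12:00 and 18:00
```

The fields are, in order, minute, hour, day of month, month, and day of week, so a `*/30` in the first position means "every 30 minutes".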

Shortcut

Did you notice that the new -ci.yml file was not available to select for a run right after we created it? DataOps doesn't know about a pipeline file until it has been committed. So we committed with Skip CI selected, made another quick commit, and could then select the new file.

Checkpoint 4

We have now added to our DataOps project:

  • Ability to schedule jobs
  • Creation of additional pipelines

You have done it! You now have all the key skills to start building your own DataOps pipelines.

Where to go from here?