Taking this live
What you will learn
In this section, you learn how to take pipelines to production by scheduling them for regular execution. We also show you how to add a second pipeline that you can use in your environment.
Set aside 5 minutes to complete the section.
So far, we have only run these pipelines manually, but in operation we usually want them to run on a schedule.
Scheduling is simple to achieve. Navigate to CI / CD > Schedules > New Schedule and complete the form using demo-ci.yml as the pipeline type.
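As a rough aid for choosing a schedule interval, the snippet below lists standard five-field cron expressions for the frequencies used as examples in this section. It assumes the schedule form accepts standard cron syntax. The key names are hypothetical labels for illustration only, and this is not a file that DataOps reads.

```yaml
# Illustrative notes only, not a DataOps configuration file.
# Standard cron field order: minute hour day-of-month month day-of-week
every_30_minutes: "*/30 * * * *"   # at minute 0 and minute 30 of every hour
every_6_hours: "0 */6 * * *"       # at 00:00, 06:00, 12:00 and 18:00
```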
Creating different pipeline types
We have only worked on a single -ci.yml file until now, but it is most common to have a set of these, usually mapping to different frequencies and scopes of data ingestion. For example, you might have "HR data processed every 6 hours" called hr-6hourly-ci.yml, "Sales data processed every 30 mins" called sales-30mins-ci.yml, and so on.
Generally, these files simply include a different subset of the jobs.
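To make the idea of "a different subset of the jobs" concrete, here is a minimal sketch of what a second pipeline file might look like. It assumes the pipeline file is built from a list of included job files. Every path and job file name below is hypothetical and should be replaced with the files that exist in your own project.

```yaml
# sales-30mins-ci.yml (hypothetical contents; all paths are illustrative)
include:
  # Shared setup, assumed to be common to every pipeline in the project
  - /pipelines/includes/bootstrap.yml

  # Only the job files this pipeline needs. Jobs that belong to other
  # pipelines (for example HR ingestion) are simply not listed here.
  - /pipelines/includes/local_includes/ingest_sales.yml
  - /pipelines/includes/local_includes/transform_sales.yml
```

Keeping each pipeline file as a short list of includes means the job definitions live in one place and each -ci.yml file only decides which of them run together.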
- Let's create a new pipeline
We've made it similar to the previous pipeline definition but removed some of the jobs. We can now run this, and once we are happy it's working, schedule it to run every 30 mins.
Did you notice that the new -ci.yml file was not available to select to run when we created it? DataOps doesn't know about this file until it has been committed. So we selected Skip CI for the first commit, then made another quick commit, after which we could select the new file.
We have now added to our DataOps project:
- The ability to schedule jobs
- Additional pipelines
You have done it! You now have all the key skills to get into building your DataOps pipelines.
Where to go from here?
- Create your first real project by starting from the DataOps template project
- Familiarize yourself with the available orchestrators and their capabilities
- Dive into the Snowflake Object Lifecycle Engine (SOLE) fundamentals
- Dive into the Modelling and Transformation Engine (MATE) fundamentals
- If you are a developer, boost your productivity by setting up your developer experience