Adding More Configuration
Now that you have become familiar with the base SOLE configuration, we can do a few exercises looking at adding new objects in SOLE. You will need to continue using the new DataOps project you used in the previous section.
We will use the Web IDE for file editing in these exercises, although you can of course use a different IDE if you wish.
Exercise 1: Schemas and tables

For this section, we will set up a schema and a couple of tables.

- Open file `dataops/snowflake/databases.template.yml` and add the following `schemas` section after the database grants:

  ```yaml
  # dataops/snowflake/databases.template.yml
  databases:
    <database-name>:
      grants: ...
      schemas:
        TRAINING:
          tables:
            TRAINING_TABLE1:
              columns:
                COL1:
                  type: INT
                COL2:
                  type: VARCHAR
  ```

- Commit to main and run pipeline `full-ci.yml`.

- Open Snowflake and observe the new schema and table.

- Now, return to `databases.template.yml` and add a second table, named `TRAINING_TABLE2`. Set up any columns you wish - refer to the SOLE table reference for full configuration details.

  Why not try... adding a primary key to your new table?

- Commit and run the pipeline as before.
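For the primary-key hint above, a minimal sketch. SOLE table configuration generally accepts a `primary_key` block, but treat the exact keys, the table name, and the columns below as illustrative assumptions and verify them against the SOLE table reference:

```yaml
# Sketch only - confirm the primary_key shape in the SOLE table reference.
TRAINING_TABLE2:
  columns:
    ID:
      type: INT
    DESCRIPTION:
      type: VARCHAR
  primary_key:
    name: PK_TRAINING_TABLE2   # assumed constraint-name key
    keys:
      - ID
```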
Exercise 2: Production-only warehouse

Now let's venture a little into the realm of Jinja templating and create a warehouse that will only exist in the PROD environment.

- Rename file `dataops/snowflake/warehouses.yml` to `warehouses.template.yml`.

- Open the file and add the following new warehouse configuration to the bottom of the file:

  ```yaml
  # dataops/snowflake/warehouses.template.yml
  warehouses:
    PRODUCTION_WH:
      comment: Production use only!
      warehouse_size: XSMALL
      auto_suspend: 40
      auto_resume: true
      namespacing: prefix
  ```

- Commit to main and run pipeline `full-ci.yml`.

- Take a look at the new warehouse in Snowflake.

- Now, let's add the magic: alter the new warehouse config so it looks like this:

  ```yaml
  # dataops/snowflake/warehouses.template.yml
  warehouses:
  {% if env.DATAOPS_ENV_NAME == 'PROD' %}
    PRODUCTION_WH:
      comment: Production use only!
      warehouse_size: XSMALL
      auto_suspend: 40
      auto_resume: true
      namespacing: prefix
  {% endif %}
  ```

  Be careful not to introduce any additional indenting of content within `{% %}` blocks. Additional indenting may make the code slightly more readable, but may break the YAML structure!
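To make the templating behaviour concrete: the `.template.yml` file is rendered by Jinja before SOLE reads it, so the warehouse definition only survives into the rendered output when `DATAOPS_ENV_NAME` equals `PROD`. Roughly:

```yaml
# Rendered output when DATAOPS_ENV_NAME == 'PROD':
warehouses:
  PRODUCTION_WH:
    comment: Production use only!
    warehouse_size: XSMALL
    auto_suspend: 40
    auto_resume: true
    namespacing: prefix

# In any other environment, the {% if %} block renders to nothing,
# leaving only the warehouses: key and any other existing entries.
```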
- Commit to main and run pipeline `full-ci.yml` again.

- SOLE should not make any changes - as far as it's concerned, nothing has changed, as we're still running in the production environment. We will check on this warehouse in the next exercise to make sure it doesn't pop up in a non-production environment!

Why not try... adding some grants to your new warehouse so that `READER` and `WRITER` can use it?
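As a sketch for the grants hint (the `USAGE` privilege and the plain `READER`/`WRITER` role identifiers are assumptions; real projects often prefix role names, so check the SOLE warehouse reference and your project's roles):

```yaml
# Sketch only - verify privilege and role names for your project.
PRODUCTION_WH:
  comment: Production use only!
  warehouse_size: XSMALL
  auto_suspend: 40
  auto_resume: true
  namespacing: prefix
  grants:
    USAGE:
      - READER
      - WRITER
```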
Exercise 3: What happens in dev?

So far, we have limited ourselves to committing directly to main (production) and running pipelines there. Now, we will create a development branch and see what happens.

- Open your project's main page and navigate, using the left-hand bar, to Repository → Branches.

- Create a new branch called `dev` from main.

- Navigate to CI/CD → Pipelines and check that the `full-ci.yml` pipeline is running in dev. If not, click Run pipeline and start `full-ci.yml` in dev.

- Once the pipeline completes (it will show as Blocked, as the final job is manual), take a look in Snowflake. There should be a new database named `DATAOPS_SOLE_TRAINING_DEV`, a clone of the production database (check it has the same schemas/tables), but not a new warehouse, as this has been configured to only exist in the production environment.

Why not try... creating a feature branch from `dev` and running the pipeline there. What do the new Snowflake objects look like?
Exercise 4: Make a change in dev

Now that we have a development branch, and a handy cloned database to work in, let's create something new in there.

- In the Web IDE, switch to the dev branch.

- Open file `dataops/snowflake/databases.template.yml` and add the following `stages` block to the `TRAINING` schema, below the `tables` section (be careful with your indenting):

  ```yaml
  # dataops/snowflake/databases.template.yml
  tables: ...
  stages:
    TRAINING_STAGE:
      comment: Created for training purposes
      url: s3://no-such-bucket/no-such-path
  ```

- Commit to dev and run pipeline `full-ci.yml`.

- Once the pipeline completes, take a look at your new stage in the `DATAOPS_SOLE_TRAINING_DEV` database.

Why not try... creating a file format or two as well?
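For the file-format hint, one possible shape: a sketch assuming SOLE exposes a `file_formats` block alongside `tables` and `stages`, with option names (`format_type`, `skip_header`) borrowed from Snowflake's file format parameters - verify both against the SOLE reference before using:

```yaml
# Sketch only - key names are assumptions; check the SOLE reference.
tables: ...
stages: ...
file_formats:
  TRAINING_CSV_FORMAT:
    format_type: CSV
    skip_header: 1
```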
Exercise 5: Merge into production

Let's say that you've completed developing all your new database features, and they have been thoroughly tested (out of scope for this guide). You must now raise a Merge Request (MR) into main (the production environment).

In most real-world projects, there will be one or more test/pre-production environments in between. Let's pretend that's already happened!

- Open your project's main page and navigate, using the left-hand bar, to Merge requests.

- Click Create merge request and create a new MR from the dev branch into main.

- Enter `Production Release` for the title, anything you like for the description, and assign the MR to yourself.

- Important: Make sure the option to delete the source branch is unchecked.

  Otherwise... you won't be able to run the tear-down in the next exercise, as your dev branch will have gone!

- Click Submit to raise the request. If you're not that familiar with Merge Requests, feel free to explore the MR form, especially the Changes tab.

- Pretend to be someone else and merge the MR with the Merge button.

- Navigate to CI/CD → Pipelines and run a new pipeline for the main branch (`full-ci.yml` as before).

- Once the pipeline completes, take a look at your new features in the `DATAOPS_SOLE_TRAINING_PROD` database.
Exercise 6: Tear down dev

Assuming you didn't allow the MR process in the previous exercise to delete the dev branch, you can now go back and clean up all your development resources. We can then delete the branch itself.

In most DataOps projects, the dev branch is considered persistent and will not normally be cleaned up and deleted. We're simplifying things a little for this guide, so bear with us.

- Navigate to CI/CD → Pipelines and find the most recent dev branch pipeline.

- Click on the ⚙ Blocked status button to enter the pipeline details.

- Noting that it has not yet run, click the ▶ button on the Tear Down Snowflake job.

- Once the job has completed, take a look in Snowflake. The `DATAOPS_SOLE_TRAINING_DEV` database should have been dropped, along with any other DEV-suffixed resources (if you created any).

- Now you can safely delete the dev branch: navigate to Repository → Branches and delete it.