Now that we've covered the need for an engine like SOLE, and looked at some of the main use cases for Snowflake object lifecycle management in DataOps, this section will start to examine the main concepts within SOLE.
As we established back in Why SOLE?, this is a declarative approach to object management and, as such, needs a declarative configuration.
SOLE uses a YAML-based configuration format comprising several files, each specifying the objects of a certain type (e.g. databases, warehouses, roles, users). These files sit together in a dedicated directory within the DataOps project.
Configuration files can include Jinja templating that will allow the inclusion of dynamic content during pipeline run time. This can include:
- Variables from the pipeline environment and the DataOps Vault
- Conditional statements to include parts of the configuration under certain conditions
- Control structures that can loop through a set of values, reducing the code footprint.
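As a sketch of what templated configuration can look like (the filename, the `env.` variable lookup, and the object names here are illustrative assumptions, not exact SOLE syntax):

```yaml
# databases.template.yml -- illustrative sketch only
databases:
  ANALYTICS:
    comment: Analytics database for the {{ env.DATAOPS_ENV_NAME }} environment
    {% if env.DATAOPS_ENV_NAME == "PROD" %}
    data_retention_time_in_days: 30   # conditional: longer retention in production only
    {% endif %}

{% for dept in ["SALES", "FINANCE"] %}
  {{ dept }}_SANDBOX:
    comment: Sandbox database for the {{ dept }} team   # loop: one database per team
{% endfor %}
```

At render time, the variable references resolve to their pipeline values, the conditional block is kept or dropped, and the loop expands into one entry per value.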
If a configuration file includes templated content, its filename will include the `.template.` infix.
The DataOps Snowflake Object Lifecycle Engine is built on top of the popular infrastructure-as-code toolset Terraform, using the Chan-Zuckerberg Snowflake provider, along with a YAML-based configuration compiler, enhanced logging and reporting, and many other features.
When a SOLE job runs, configuration flows through the following system components in this order:
First, a core DataOps platform component identifies any files that contain Jinja template content and renders them into static files. This operation resolves any variables to their current values, optionally includes conditional content, and expands any looped sections.
Each rendered file is saved under the original filename but without the `.template` modifier. The original template file is then deleted to avoid issues in the later stages.
Next, the rendered YAML files go through a compiler that converts the syntax from the SOLE YAML format into Terraform's JSON structures, adding computed properties and other settings as required by the object type.
This is done in four steps, corresponding to SOLE's domains of operation:
- Account-level objects (warehouses, roles, users, etc.)
- Databases
- Database-level objects (schemas, tables, etc.)
- Grants
It is possible to export the Terraform configurations from a SOLE job if needed, using a DataOps artifact.
Once the Terraform configurations are ready, they are applied to Snowflake in order. Terraform has its own init, plan, and apply stages, but these are beyond the scope of this guide.
As part of this stage, the current Snowflake configuration is imported to compute the changes needed. If the SOLE configuration includes a database clone (such as in a feature branch), there is an additional import from the newly cloned database to detect its contents.
Ultimately, the SOLE configuration is fully applied to the Snowflake account, with only the minimum changes necessary made to update existing objects. The job will summarize the number of additions, changes and deletions that were applied to each domain of operation, e.g.:
| Resource type | Action | Objects Added | Objects Modified | Objects Deleted |
| --- | --- | --- | --- | --- |
| ACCOUNT_LEVEL | PLAN | 5 | 1 | 0 |
| ACCOUNT_LEVEL | APPLY | 5 | 1 | 0 |
| DATABASE | PLAN | 0 | 0 | 0 |
| DATABASE | APPLY | 0 | 0 | 0 |
| DATABASE_LEVEL | PLAN | 0 | 0 | 0 |
| DATABASE_LEVEL | APPLY | 0 | 0 | 0 |
| GRANT | PLAN | 0 | 8 | 3 |
| GRANT | APPLY | 0 | 8 | 3 |
Objects are defined in the YAML configuration files as mentioned above. Here is a simple example of a warehouse object in SOLE:

```yaml
warehouses:
  TRANSFORMATION:
    comment: Warehouse for Transformation operations
    warehouse_size: MEDIUM
    auto_suspend: 60
    auto_resume: true
    grants:
      USAGE:
        - READER
        - WRITER
```
The above configuration will create and maintain a medium warehouse in Snowflake, with the specified suspend/resume settings and access granted to two roles.
In the example configuration above, the warehouse name is specified as `TRANSFORMATION`, but that's not the full name that will end up in Snowflake, as SOLE can apply a namespacing prefix and suffix to all account-level objects.
Namespacing is important so that objects can coexist across different DataOps projects and environments. SOLE will apply a prefix, a suffix, or both to object names to achieve this.
All DataOps projects define a variable named `DATAOPS_PREFIX`, set in `pipelines/includes/config/variables.yml`, that defines the main identifier for the project. The default value is usually `DATAOPS`, but this can be (and often is) changed in projects.
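A minimal sketch of that file, assuming the GitLab-style `variables:` block used by DataOps pipeline configuration:

```yaml
# pipelines/includes/config/variables.yml
variables:
  DATAOPS_PREFIX: DATAOPS   # default value; often changed per project
```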
This prefix can allow multiple projects to coexist within the same Snowflake account, or will at least identify account-level objects as being managed by DataOps.
When a pipeline runs, the environment is determined from the current branch: the `master` branch maps to the `PROD` environment, `dev` maps to `DEV`, and so on. Feature branches (i.e. any branch not mapped to a named environment) get a generated environment name derived from the branch name.
This suffix allows each environment to have its own set of Snowflake objects, without the risk of name collisions.
So, from the warehouse example above, if our project's `DATAOPS_PREFIX` is set to `DATAOPS` and we're running a pipeline in the `dev` branch, the warehouse will be named `DATAOPS_TRANSFORMATION_DEV`. If we merge our dev code into `master` and run the pipeline again, a new warehouse, `DATAOPS_TRANSFORMATION_PROD`, will be managed.
Notes on Namespacing
When referring to another object in an object's configuration (for example, the `WRITER` role in the code above), always use the base, non-namespaced name. The SOLE compiler will sort out all the resultant names for you. The exception is when referring to an externally managed object; see below for more on that.
It is possible to disable namespacing, either to limit to prefix- or suffix-only or to turn it off altogether. This will be covered later.
Disabling namespacing can have an impact on your production environment, and even cause conflicts between different projects. It's there for a reason, so please ensure you understand it fully!
Setting namespacing to `none` or `prefix` prevents the deletion of the object. Please refer to Environment-Specific Snowflake Objects. To manage the lifecycle of an object but not its prefix and suffix, namespacing can be set to `external`.
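As an illustrative sketch (the warehouse name is hypothetical, and the per-object `namespacing` key is shown mirroring the layout of the warehouse example):

```yaml
warehouses:
  LEGACY_WH:                 # hypothetical warehouse created outside this project
    comment: Existing warehouse whose name SOLE should not rewrite
    namespacing: external    # manage the object's lifecycle, but not its prefix/suffix
```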
Roles and Grants
As the main use case for SOLE, the management of roles and grants is core functionality and should feature in all DataOps projects. The standard SOLE configuration has a file `roles.yml` that can be extended to include all the roles your project needs to define, along with how those roles are assigned to each other and to users.
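A sketch of what such a file might contain, following the conventions of the warehouse example above (the role names and the `users` key are illustrative assumptions):

```yaml
# roles.yml -- illustrative sketch only
roles:
  READER:
    comment: Read-only access for analysts

  WRITER:
    comment: Read/write access for transformation jobs
    users:
      - ETL_USER   # assumption: users listed here are granted the role
```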
Then, each object has a `grants` block (see the example above) that specifies which roles enjoy which privileges on that object.
Of course, DataOps will not manage every single object in your Snowflake account. Even if it manages most things, there will still be other objects outside its control, particularly the built-in roles, user accounts, the sample data database, and so on.
Fortunately, SOLE has the concept of `manage_mode` to define how to interact with objects outside its control. For most objects configured in SOLE, the default `manage_mode` gives SOLE full control over the object, but it is also possible to declare objects as externally managed.
For example, I may want to grant `MY_ROLE` to the built-in `SYSADMIN` role to maintain the role hierarchy:

```yaml
roles:
  MY_ROLE:
    comment: main role for my project
    roles:
      - SYSADMIN   # grant MY_ROLE to the built-in SYSADMIN role
```
Read more on `manage_mode` here.