SOLE Namespace and Environment Management
This topic discusses how the Snowflake Object Lifecycle Engine (SOLE) controls namespace and environment management for all account-level objects. And, then allied to the namespace and environment management, we'll look at what manage-mode is and how it impacts the overall management of Snowflake objects.
Overview
Before we dive into namespace and environment management, let's look at what a namespace and an environment are.
What is a namespace?
The Snowflake docs describe a namespace as a combination of the database and schema names. Additionally:
When performing any operations on database objects in Snowflake, the namespace is inferred from the current database and schema in use for the session. If a database and schema are not in use for the session, the namespace must be explicitly specified when performing any operations on the objects.
For instance, if we refer to the page on how to build MATE models, we see that both models are built on top of base tables that come from the raw
database (raw.stock_orders
and raw.product_types
). In these examples, raw
is the namespace part of the database table names.
stg_orders as (
select
id as order_id,
prod_type_code as product_type
items as items_ordered,
date as order_date
from
raw.stock_orders
)
stg_product_types as (
select
id as product_type_id,
code as product_type_code,
prod_type_desc as product_type_description
from
raw.product_types
)
There is no schema in these examples. However, if we add the schema, machine_shop
, to these examples, then the namespace becomes raw.machine_shop
, as the following SQL code snippet shows:
stg_orders as (
select
id as order_id,
prod_type_code as product_type
items as items_ordered,
date as order_date
from
raw.machine_shop.stock_orders
)
What is an environment?
The definition of a DataOps environment is:
A DataOps environment describes where code and data are deployed together.
There are essentially two types of environments:
- Static or long-lived environments such as PROD, QA, and DEV
- Short-lived environments such as feature branches
Depending on your engineering processes also DEV can be a short-lived environment.
Environment management
Environment management controls the environments in which Snowflake objects (and code) are managed. This feature ensures that objects created in a specific environment, such as Prod or QA, are only accessed and managed by the environment that created them.
It is important to note that environment management control can only be applied to account-level objects, ensuring that SOLE configures the environment to manage each object.
You can set the environment for each account-level object by adding the keyword environment
to the properties section of an object and setting its value to one of the project's environment names, typically one of PROD
, QA
, or DEV
.
For example:
roles:
ADMIN:
environment: PROD
QA:
environment: QA
databases:
DB:
environment: DEV
The following details apply to this configuration:
- The ADMIN role is only managed when the environment is PROD (main branch).
- The QA role is only managed when the environment is QA (qa branch).
- The DB database is only managed when the environment is DEV (dev branch).
The object is created in all environments if the environment
variable is not specified.
Setting the environment:
property on an object that already exists in the configuration will cause that object to be removed from other environments even if the namespacing is set to prevent deletion. For instance, setting environment: PROD
instructs SOLE to remove the object in the feature, qa, and dev branches. To avoid this from occurring, set the LIFECYCLE_STATE_RESET
variable for the first pipeline execution after setting the environment property, resetting the SOLE state file, so SOLE will no longer remove the object.
This rule applies to any environment or branch where this change is added (or merged). For example, if these changes are made in a feature branch and merged into the dev or main branch, LIFECYCLE_STATE_RESET
should be set for at least the following pipeline run where the changes have been integrated.
For more information on environments and environment management, see the Branching Strategies document.
Namespace management
In DataOps, namespacing (or namespace management) is similar to environment management in that it separates objects into different branches (for example, qa, dev, and feature branches) from each other so that each branch has authority over its objects. Therefore, namespace management controls the addition of prefixes and suffixes to account-level objects and databases, except for the default database.
Adding namespacing
to a configuration determines whether a prefix and/or a suffix is added to the account-level objects and databases (except for the default database). The parameters used to specify the prefix and suffix are as follows:
- The project variable
DATAOPS_PREFIX
value specifies the prefixwarningSupported characters in
DATAOPS_PREFIX
are letters (A-Z), decimal digits ( 0-9), and an underscore (_). If lowercase letters are used, SOLE adds a prefix and suffix to the value of the variableDATAOPS_DATABASE
available at pipeline run time to create a default database with an incorrect name. - The project variable
DATAOPS_ENV_NAME
value specifies the suffix
The accepted value for the namespacing
variable in the properties section of an account-level object is one of the following values:
-
both
(default)- Both the prefix and the suffix are added to the Snowflake object's name
- If the
namespacing
property is not defined, this is the default
-
none
-
Neither the prefix nor the suffix is added to the Snowflake object's name
Object DeletionSetting
namespacing
tonone
prevents SOLE from deleting an object. Refer to Environment-Specific Snowflake Objects for the reasons why.
-
-
prefix
-
Only the prefix is added to the Snowflake object's name
Object DeletionSetting
namespacing
toprefix
prevents SOLE from deleting an object. Refer to Environment-Specific Snowflake Objects for the reasons why.
-
-
suffix
- Only the suffix is added to the Snowflake object's name
- If the parameter
DATAOPS_NONDB_ENV_NAME
is set, its value is used as the suffix instead
-
external
- Neither the prefix nor the suffix is added to the Snowflake object's name
- The object is managed completely (
CREATE
,UPDATE
, andDELETE
), but no namespacing is applied external
can be used if the object names are being managed outside the engine, for example, by a script running before SOLE, to set the object names to follow a predefined naming convention.- Setting
external
asnamespacing
allows for complete lifecycle management without adding the suffix or the prefix.
Environment-specific Snowflake objects
Adding a suffix to an object name ensures that the object is scoped to the suffix's environment. This informs SOLE that it can completely manage the object lifecycle (CREATE
, UPDATE
, DELETE
). If a suffix is not set via namespacing
, then SOLE assumes the object is shared between all the different environments. And because deleting a shared object could have serious repercussions, SOLE does not delete shared objects.
For example:
roles:
ADMIN:
namespacing: none
QA:
namespacing: suffix
databases:
DB_1:
namespacing: both
DB_2:
namespacing: prefix
With the DATAOPS_PREFIX
set to DATAOPS
and DATAOPS_ENV_NAME
set to PROD
, the generated configuration is as follows:
- The ADMIN role remains as it is.
- The QA role is converted to QA_PROD.
- The DB_1 database is converted to DATAOPS_DB_1_PROD.
- The DB_2 database is converted to DATAOPS_DB_2.
Manage mode
SOLE supports the ability to define a specified object's manage mode (manage_mode
) feature.
What is manage mode?
Manage mode allows you to specify what object properties SOLE can manage and what the engine must ignore. For instance, this feature is essential if SOLE only manages an object's grants, or it must manage a child object (such as a table or pipe) but not its parent (like a schema).
SOLE supports the following three manage modes:
All
Setting manage_mode
to all
informs SOLE that it must manage all the object's properties. This setting also notifies SOLE that the engine must generate the resource and grant definitions for the object.
As all
is the default value for manage_mode
, you don't need to explicitly define it.
Grants
If only the defined grants must be managed for the object, then setting manage_mode
to grants
notifies SOLE that only the object's grants and revokes definitions must be generated by the engine.
None
In the configuration, you must define parent objects of database-level objects, such as schema, table, and stage. As a result, the following rules apply:
- Schemas must be defined inside a database
- Tables, stages, masking policies, and so on must be defined inside a schema.
In the scenario where SOLE must only manage a child object, setting manage_mode
to none
for the parent object will inform the engine that it must generate neither the resource nor the grant definition for the parent object. It must only generate the configurations for the child objects.
The following example describes how to use the different manage_mode
settings:
databases:
DATABASE_1:
manage_mode: all
comment: "Test Database 1"
grants:
USAGE:
- ROLE_1
- SYSADMIN
DATABASE_2:
manage_mode: grants
comment: "Test Database 2"
grants:
USAGE:
- ROLE_1
- SYSADMIN
DATABASE_3:
comment: "Test Database 3"
manage_mode: none
schemas:
SCHEMA_1:
comment: "Test Schema"
The following details apply to this configuration:
- All the specified properties are managed for
DATABASE_1
, including thecomment
and definedgrants
. - Only the defined grants are managed for
DATABASE_2
. The other properties, like thecomment
, are ignored. - None of the properties are managed for
DATABASE_3
. That means SOLE manages neither thecomment
nor the definedgrants
. - Because
SCHEMA_1
is defined insideDATABASE_3
, the defaultmanage_mode: all
applies, so SOLE manages the comment property.
Manage mode cannot only manage the object properties and ignore the grants. Therefore, if you don't want the grants managed, don't define them in the configuration. See Access Management for more information.
Changing manage mode
Changing the manage_mode
of a managed object from all
(default) to grants
or none
leads to deleting the object unless you do a state reset.
SOLE uses Terraform to manage the state of all managed objects. If manage_mode
is changed from all
, SOLE does not generate a configuration for the related object.
And as the state of the object is still managed, Terraform assumes that the object is to be deleted.
To prevent object deletion when changing manage_mode
:
- Set the variable
LIFECYCLE_STATE_RESET
to1
at the project level (recommended). Read more about state reset here. - Set
manage_mode
togrants
ornone
. - Run the pipeline.
- Comment out the declaration of the variable
LIFECYCLE_STATE_RESET
.