Skip to main content

SOLE Namespace and Environment Management

This topic discusses how the Snowflake Object Lifecycle Engine (SOLE) controls namespace and environment management for all account-level objects. And, then allied to the namespace and environment management, we'll look at what manage-mode is and how it impacts the overall management of Snowflake objects.

Overview

Before we dive into namespace and environment management, let's look at what a namespace and an environment are.

What is a namespace?

The Snowflake docs describe a namespace as a combination of the database and schema names. Additionally:

When performing any operations on database objects in Snowflake, the namespace is inferred from the current database and schema in use for the session. If a database and schema are not in use for the session, the namespace must be explicitly specified when performing any operations on the objects.

For instance, if we refer to the page on how to build MATE models, we see that both models are built on top of base tables that come from the raw database (raw.stock_orders and raw.product_types). In these examples, raw is the namespace part of the database table names.

stg_orders.sql
stg_orders as (
select
id as order_id,
prod_type_code as product_type
items as items_ordered,
date as order_date
from
raw.stock_orders
)
stg_product_types.sql
stg_product_types as (
select
id as product_type_id,
code as product_type_code,
prod_type_desc as product_type_description
from
raw.product_types
)

There is no schema in these examples. However, if we add the schema, machine_shop, to these examples, then the namespace becomes raw.machine_shop, as the following SQL code snippet shows:

stg_orders.sql
stg_orders as (
select
id as order_id,
prod_type_code as product_type
items as items_ordered,
date as order_date
from
raw.machine_shop.stock_orders
)

What is an environment?

The definition of a DataOps environment is:

A DataOps environment describes where code and data are deployed together.

There are essentially two types of environments:

  • Static or long-lived environments such as PROD, QA, and DEV
  • Short-lived environments such as feature branches

Depending on your engineering processes also DEV can be a short-lived environment.

Environment management

Environment management controls the environments in which Snowflake objects (and code) are managed. This feature ensures that objects created in a specific environment, such as Prod or QA, are only accessed and managed by the environment that created them.

It is important to note that environment management control can only be applied to account-level objects, ensuring that SOLE configures the environment to manage each object.

You can set the environment for each account-level object by adding the keyword environment to the properties section of an object and setting its value to one of the project's environment names, typically one of PROD, QA, or DEV.

For example:

dataops/snowflake/objects.yml
roles:
ADMIN:
environment: PROD
QA:
environment: QA
databases:
DB:
environment: DEV

The following details apply to this configuration:

  • The ADMIN role is only managed when the environment is PROD (main branch).
  • The QA role is only managed when the environment is QA (qa branch).
  • The DB database is only managed when the environment is DEV (dev branch).
note

The object is created in all environments if the environment variable is not specified.

Setting the environment: property on an object that already exists in the configuration will cause that object to be removed from other environments even if the namespacing is set to prevent deletion. For instance, setting environment: PROD instructs SOLE to remove the object in the feature, qa, and dev branches. To avoid this from occurring, set the LIFECYCLE_STATE_RESET variable for the first pipeline execution after setting the environment property, resetting the SOLE state file, so SOLE will no longer remove the object.

This rule applies to any environment or branch where this change is added (or merged). For example, if these changes are made in a feature branch and merged into the dev or main branch, LIFECYCLE_STATE_RESET should be set for at least the following pipeline run where the changes have been integrated.

For more information on environments and environment management, see the Branching Strategies document.

Namespace management

In DataOps, namespacing (or namespace management) is similar to environment management in that it separates objects into different branches (for example, qa, dev, and feature branches) from each other so that each branch has authority over its objects. Therefore, namespace management controls the addition of prefixes and suffixes to account-level objects and databases, except for the default database.

Adding namespacing to a configuration determines whether a prefix and/or a suffix is added to the account-level objects and databases (except for the default database). The parameters used to specify the prefix and suffix are as follows:

  • The project variable DATAOPS_PREFIX value specifies the prefix
    warning

    Supported characters in DATAOPS_PREFIX are letters (A-Z), decimal digits ( 0-9), and an underscore (_). If lowercase letters are used, SOLE adds a prefix and suffix to the value of the variable DATAOPS_DATABASE available at pipeline run time to create a default database with an incorrect name.

  • The project variable DATAOPS_ENV_NAME value specifies the suffix

The accepted value for the namespacing variable in the properties section of an account-level object is one of the following values:

  • both (default)

    • Both the prefix and the suffix are added to the Snowflake object's name
    • If the namespacing property is not defined, this is the default
  • none

    • Neither the prefix nor the suffix is added to the Snowflake object's name

      Object Deletion

      Setting namespacing to none prevents SOLE from deleting an object. Refer to Environment-Specific Snowflake Objects for the reasons why.

  • prefix

    • Only the prefix is added to the Snowflake object's name

      Object Deletion

      Setting namespacing to prefix prevents SOLE from deleting an object. Refer to Environment-Specific Snowflake Objects for the reasons why.

  • suffix

    • Only the suffix is added to the Snowflake object's name
    • If the parameter DATAOPS_NONDB_ENV_NAME is set, its value is used as the suffix instead
  • external

    • Neither the prefix nor the suffix is added to the Snowflake object's name
    • The object is managed completely (CREATE, UPDATE, and DELETE), but no namespacing is applied
    • external can be used if the object names are being managed outside the engine, for example, by a script running before SOLE, to set the object names to follow a predefined naming convention.
    • Setting external as namespacing allows for complete lifecycle management without adding the suffix or the prefix.

Environment-specific Snowflake objects

Adding a suffix to an object name ensures that the object is scoped to the suffix's environment. This informs SOLE that it can completely manage the object lifecycle (CREATE, UPDATE, DELETE). If a suffix is not set via namespacing, then SOLE assumes the object is shared between all the different environments. And because deleting a shared object could have serious repercussions, SOLE does not delete shared objects.

For example:

dataops/snowflake/objects.yml
roles:
ADMIN:
namespacing: none
QA:
namespacing: suffix
databases:
DB_1:
namespacing: both
DB_2:
namespacing: prefix

With the DATAOPS_PREFIX set to DATAOPSand DATAOPS_ENV_NAME set to PROD, the generated configuration is as follows:

  • The ADMIN role remains as it is.
  • The QA role is converted to QA_PROD.
  • The DB_1 database is converted to DATAOPS_DB_1_PROD.
  • The DB_2 database is converted to DATAOPS_DB_2.

Manage mode

SOLE supports the ability to define a specified object's manage mode (manage_mode) feature.

What is manage mode?

Manage mode allows you to specify what object properties SOLE can manage and what the engine must ignore. For instance, this feature is essential if SOLE only manages an object's grants, or it must manage a child object (such as a table or pipe) but not its parent (like a schema).

SOLE supports the following three manage modes:

All

Setting manage_mode to all informs SOLE that it must manage all the object's properties. This setting also notifies SOLE that the engine must generate the resource and grant definitions for the object.

tip

As all is the default value for manage_mode, you don't need to explicitly define it.

Grants

If only the defined grants must be managed for the object, then setting manage_mode to grants notifies SOLE that only the object's grants and revokes definitions must be generated by the engine.

None

In the configuration, you must define parent objects of database-level objects, such as schema, table, and stage. As a result, the following rules apply:

  • Schemas must be defined inside a database
  • Tables, stages, masking policies, and so on must be defined inside a schema.

In the scenario where SOLE must only manage a child object, setting manage_mode to none for the parent object will inform the engine that it must generate neither the resource nor the grant definition for the parent object. It must only generate the configurations for the child objects.

The following example describes how to use the different manage_mode settings:

dataops/snowflake/database.yml
databases:
DATABASE_1:
manage_mode: all
comment: "Test Database 1"
grants:
USAGE:
- ROLE_1
- SYSADMIN
DATABASE_2:
manage_mode: grants
comment: "Test Database 2"
grants:
USAGE:
- ROLE_1
- SYSADMIN
DATABASE_3:
comment: "Test Database 3"
manage_mode: none
schemas:
SCHEMA_1:
comment: "Test Schema"

The following details apply to this configuration:

  • All the specified properties are managed for DATABASE_1, including the comment and defined grants.
  • Only the defined grants are managed for DATABASE_2. The other properties, like the comment, are ignored.
  • None of the properties are managed for DATABASE_3. That means SOLE manages neither the comment nor the defined grants.
  • Because SCHEMA_1 is defined inside DATABASE_3, the default manage_mode: all applies, so SOLE manages the comment property.
info

Manage mode cannot only manage the object properties and ignore the grants. Therefore, if you don't want the grants managed, don't define them in the configuration. See Access Management for more information.

Changing manage mode

Changing the manage_mode of a managed object from all (default) to grants or none leads to deleting the object unless you do a state reset.

SOLE uses Terraform to manage the state of all managed objects. If manage_mode is changed from all, SOLE does not generate a configuration for the related object. And as the state of the object is still managed, Terraform assumes that the object is to be deleted.

To prevent object deletion when changing manage_mode:

  1. Set the variable LIFECYCLE_STATE_RESET to 1 at the project level (recommended). Read more about state reset here.
  2. Set manage_mode to grants or none.
  3. Run the pipeline.
  4. Comment out the declaration of the variable LIFECYCLE_STATE_RESET.