Skip to main content

SOLE Tag Management

DataOps lets you apply tags to Snowflake objects using the Snowflake Object Lifecycle Engine (SOLE). Once a tag is created, you can apply it to many objects with the same or different values.

This management via SOLE gives you a way to easily work with tags for the Database, Schema, and Table objects.

What is object tagging and why it is important?

According to the Snowflake docs, object tagging allows the tracking of sensitive data for compliance, discovery, protection, and resource usage.

It is also possible to apply tags within the Modelling and Transformation Engine (Tables and Columns only). See MATE Object Tags.

Tagging in SOLE

Tagging in SOLE is based on Snowflake object tagging. To understand how tag management is implemented, it is essential to understand what tags are in Snowflake.

Snowflake describes tags as follows:

A tag is a schema-level object that can be assigned to another Snowflake object. A tag can be assigned an arbitrary string value upon assigning the tag to a Snowflake object. Snowflake stores the tag and its string value as a key-value pair. The tag must be unique for your schema, and the tag value is always a string.

The first step in the tagging process is to create a tag with a name, a comment, and a list of allowed values. For how to configure the lifecycle of a tag object, see the managed object reference for Tag.

The example below shows how to define a tag object — in this case, a single tag named CONFIDENTIALITY in a schema named TAG_LIBRARY.

dataops/snowflake/database.yml
databases:
DATABASE_1:
comment: DATABASE 1
schemas:
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]

Using SOLE to Apply Tags

Applying Tags

SOLE supports defining files for the Snowflake object configuration where it applies a declarative approach for defining the objects.

You can define the tags you want to apply to an object in the configuration files using the keyword with_tags. Based on these configurations, SOLE internally generates and executes a set of statements to update or apply the tags to the related object.

The example below shows a tag called CONFIDENTIALITY configured with the value Sensitive in the with_tags section in the DATABASE_1 configuration.

dataops/snowflake/database.yml
databases:
DATABASE_1:
comment: DATABASE 1
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
schemas:
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]

Based on this configuration, SOLE sets the tag CONFIDENTIALITY to the value Sensitive for DATABASE_1, and unsets any other tags it finds in Snowflake.

Tag Lineage

Tag lineage means that if a tag is applied to an object, the tag is inherited by the children of that object. Inheritance follows the SOLE Object Group Hierarchy levels from account to item level. For example, assigning a tag to a table will inherit the tag to all the columns in that table. The value of the inherited tag can be overridden on the child object with a more specific value.

A simple example of overriding a tag:

dataops/snowflake/database.yml
databases:
DATABASE_1:
comment: DATABASE 1
schemas:
INGESTION:
tables:
TABLE_1:
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
columns:
COL1:
type: VARCHAR
COL2:
type: VARCHAR
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Highly Sensitive"
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]

The following details apply to this configuration:

  • COL2 overrides the CONFIDENTIALITY tag with a value of Highly Sensitive.
  • COL1 does not override the tag and will inherit the value Sensitive from TABLE_1.

A more complex example of applying tags at different levels and how tags are inherited or overridden:

dataops/snowflake/database.yml
databases:
DATABASE_1:
comment: DATABASE 1
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
schemas:
INGESTION:
tables:
TABLE_1:
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Highly Sensitive"
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY:
schema: TAG_LIBRARY
value: "PII"
TABLE_2:
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Highly Sensitive"
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY:
schema: TAG_LIBRARY
value: "PHI"
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]
CATEGORY:
comment: "CATEGORY Tag"
allowed_values: ["PII", "PHI", "PCI"]

The following details apply to this configuration:

  • All child objects of DATABASE_1 will have the tag of CONFIDENTIALITY with the value of Sensitive. This includes schemas INGESTION and TAG_LIBRARY.
  • TABLE_1 and TABLE_2 both set CONFIDENTIALITY to Highly Sensitive, this overrides what is set on DATABASE_1 for the tables and the tables' child objects, which includes the columns.
  • Columns set the CATEGORY tag. COL1 on TABLE_1 has the value PII, and COL1 on TABLE_2 has the value PHI. The CATEGORY tag has not been set anywhere else so will only apply to these columns.

Tag Quotas

A maximum of 20 unique tags can be set on a single object. The string value for each tag can be up to 256 characters, with the option to specify allowed values for a tag.

The maximum number of unique tags is slightly different for tables and views, including the columns in those tables and views. See the Snowflake documentation for a detailed explanation of Tag Quotas.

Example:

dataops/snowflake/database.yml
databases:
DATABASE_1:
schemas:
TAG_LIBRARY:
# Definition of all tags
tags:
TAG1:
comment: "TAG1 Tag"
TAG2:
comment: "TAG2 Tag"
# ... TAG3-TAG20
TAG21:
comment: "TAG21 Tag"

# Applying tags
tables:
TABLE_1:
with_tags:
TAG1: "Value"
TAG2: "Value"
# ... TAG3-TAG20
columns:
COL1:
type: VARCHAR
with_tags:
TAG21: "Value"

The above example would cause an error from Snowflake because more than 20 unique tags are trying to be applied to the object TABLE_1 or its columns.

For example: SQL compilation error: The maximum tag limit per object is 20. TABLE TABLE_1 currently has 20 associated tags. This request tries to add 1 new tag.

Working With Applied Tags

Syntax for applying tags

There are three scenarios when applying tags to objects:

  • When the tag is from the same schema as the object. In this case, you can use a shorthand syntax of a name-value pair.

    with_tags:
    <tag-name>: <tag-value>
  • When the tag is from a different schema but the same database as the object. In this case, you have to specify the schema and value as sub-keys.

    with_tags:
    <tag-name>:
    value: <tag-value>
    schema: <source-schema>
  • When the tag is from a different database to the object. In this case, you have to specify the database, schema and value as sub-keys.

    with_tags:
    <tag-name>:
    value: <tag-value>
    schema: <source-schema>
    database: <source-database>

For the above second and third scenarios, you can also use the YAML anchor syntax to assign a set of lines in different places in your configuration as described in the following section.

Reusing a Tag Reference

You can use the YAML anchor syntax to make the same cross-schema or cross-database reference more concise. For example:

dataops/snowflake/database.yml
category_pii: &category_pii
database: DATABASE_1
schema: TAG_LIBRARY
value: "PII"

databases:
DATABASE_1:
comment: DATABASE 1
schemas:
TAG_LIBRARY:
tags:
CATEGORY:
comment: "CATEGORY Tag"
allowed_values: ["PII", "PHI", "PCI"]

DATABASE_2:
comment: DATABASE 2
schemas:
SCHEMA_1:
tables:
TABLE_1:
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY: *category_pii
TABLE_2:
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY: *category_pii

Tag Management

SOLE manages tags only if you define the keyword with_tags in the object configuration. Any tags not specified in the configuration but found in Snowflake will be removed implicitly.

In the example below, SOLE sets the tag CONFIDENTIALITY to the value Sensitive for DATABASE_1 and removes any other tags it finds in Snowflake for this database. But SOLE will not manage any tags for DATABASE_2 as no with_tags section is defined for this database:

dataops/snowflake/database.yml
databases:
DATABASE_1:
comment: DATABASE 1
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
schemas:
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]
DATABASE_2:
comment: DATABASE 2
# no with_tags: means unmanaged

Tag Removal

To remove all tags applied to a Snowflake object, specify an empty with_tags key in the Snowflake object configuration, as the following example shows:

dataops/snowflake/database.yml
databases:
DATABASE_3:
comment: DATABASE 3
with_tags: # empty with_tags: means remove all

In this scenario, SOLE removes all the tags it finds in Snowflake for DATABASE_3.