Tags
DataOps lets you apply tags to Snowflake objects using the Snowflake Object Lifecycle Engine (SOLE). Once a tag is created, you can apply it to many objects with the same or different values.
This management via SOLE allows you to efficiently work with tags for the Database, Schema, and Table objects.
SOLE tag management is executed only during Sole grant management. If you have disabled SOLE grant management using the variable DONT_USE_SOLE_GRANT_MANAGEMENT: 1
, tagging will not work correctly. Read about this variable in supported parameters.
What is object tagging, and why is it important?
According to the Snowflake docs, object tagging allows tracking sensitive data for compliance, discovery, protection, and resource usage.
It is also possible to apply tags within the Modelling and Transformation Engine (Tables and Columns only). See MATE Object Tags.
Tagging in SOLE
Tagging in SOLE is based on Snowflake object tagging. To understand how tag management is implemented, it is essential to understand what tags are in Snowflake.
Snowflake describes tags as follows:
A tag is a schema-level object that can be assigned to another Snowflake object. A tag can be assigned an arbitrary string value upon assigning the tag to a Snowflake object. Snowflake stores the tag and its string value as a key-value pair. The tag must be unique for your schema, and the tag value is always a string.
The first step in the tagging process is to create a tag with a name, a comment, and a list of allowed values. For configuring a tag object's lifecycle, see the managed object reference for Tag.
The example below shows how to define a tag object — in this case, a single tag named CONFIDENTIALITY
in a schema named TAG_LIBRARY
.
databases:
DATABASE_1:
comment: DATABASE 1
schemas:
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]
Using SOLE to apply tags
Prerequisites
Ensure the variable DONT_USE_SOLE_GRANT_MANAGEMENT
isn't enabled, as SOLE tag management is executed only during Sole grant management. Read about this variable in supported parameters.
Applying Tags
SOLE uses files for the Snowflake object configuration, applying a declarative approach for defining the objects.
You can define the tags you want to apply to an object in the configuration files using the keyword with_tags
. Based on these configurations, SOLE internally generates and executes a set of statements to update or apply the tags to the related object.
The example below shows a tag called CONFIDENTIALITY
configured with the value Sensitive
in the with_tags
section in the DATABASE_1
configuration.
databases:
DATABASE_1:
comment: DATABASE 1
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
schemas:
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]
Based on this configuration, SOLE sets the tag CONFIDENTIALITY
to the value Sensitive
for DATABASE_1
, and unsets any other tags it finds in Snowflake.
Tag lineage
Tag lineage means that if a tag is applied to an object, the tag is inherited by the children of that object. Inheritance follows the SOLE Object Group Hierarchy levels from account to item level. For example, assigning a tag to a table will inherit the tag to all the columns in that table. The value of the inherited tag can be overridden on the child object with a more specific value.
A simple example of overriding a tag:
databases:
DATABASE_1:
comment: DATABASE 1
schemas:
INGESTION:
tables:
TABLE_1:
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
columns:
COL1:
type: VARCHAR
COL2:
type: VARCHAR
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Highly Sensitive"
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]
The following details apply to this configuration:
COL2
overrides theCONFIDENTIALITY
tag with a value ofHighly Sensitive
.COL1
does not override the tag and will inherit the valueSensitive
fromTABLE_1
.
A more complex example of applying tags at different levels and how tags are inherited or overridden:
databases:
DATABASE_1:
comment: DATABASE 1
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
schemas:
INGESTION:
tables:
TABLE_1:
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Highly Sensitive"
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY:
schema: TAG_LIBRARY
value: "PII"
TABLE_2:
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Highly Sensitive"
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY:
schema: TAG_LIBRARY
value: "PHI"
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]
CATEGORY:
comment: "CATEGORY Tag"
allowed_values: ["PII", "PHI", "PCI"]
The following details apply to this configuration:
- All child objects of
DATABASE_1
will have the tag ofCONFIDENTIALITY
with the value ofSensitive
. This includes schemasINGESTION
andTAG_LIBRARY
. TABLE_1
andTABLE_2
both setCONFIDENTIALITY
toHighly Sensitive
, this overrides what is set onDATABASE_1
for the tables and the tables' child objects, which includes the columns.- Columns set the
CATEGORY
tag.COL1
onTABLE_1
has the valuePII
, andCOL1
onTABLE_2
has the valuePHI
. TheCATEGORY
tag has not been set anywhere else, so it will only apply to these columns.
Tag quotas
A maximum of 20 unique tags can be set on a single object. The string value for each tag can be up to 256 characters, with the option to specify allowed values for a tag.
The maximum number of unique tags is slightly different for tables and views, including the columns in those tables and views. See the Snowflake documentation for a detailed explanation of Tag Quotas.
Example:
databases:
DATABASE_1:
schemas:
TAG_LIBRARY:
# Definition of all tags
tags:
TAG1:
comment: "TAG1 Tag"
TAG2:
comment: "TAG2 Tag"
# ... TAG3-TAG20
TAG21:
comment: "TAG21 Tag"
# Applying tags
tables:
TABLE_1:
with_tags:
TAG1: "Value"
TAG2: "Value"
# ... TAG3-TAG20
columns:
COL1:
type: VARCHAR
with_tags:
TAG21: "Value"
The above example would cause an error from Snowflake because more than 20 unique tags are trying to be applied to the object TABLE_1
or its columns.
For example:
SQL compilation error: The maximum tag limit per object is 20. TABLE TABLE_1 currently has 20 associated tags. This request tries to add 1 new tag.
Working with applied tags
Syntax for applying tags
There are three scenarios when applying tags to objects:
-
When the tag is from the same schema as the object. In this case, you can use a shorthand syntax of a name-value pair.
with_tags:
<tag-name>: <tag-value> -
When the tag is from a different schema but the same database as the object. In this case, you must specify the schema and value as sub-keys.
with_tags:
<tag-name>:
value: <tag-value>
schema: <source-schema> -
When the tag is from a different database than the object, you must specify the database, schema, and value as sub-keys.
with_tags:
<tag-name>:
value: <tag-value>
schema: <source-schema>
database: <source-database>
For the above second and third scenarios, you can also use the YAML anchor syntax to assign a set of lines in different places in your configuration, as described in the following section.
Reusing a tag reference
You can use the YAML anchor syntax to make the same cross-schema or cross-database reference more concise. For example:
category_pii: &category_pii
database: DATABASE_1
schema: TAG_LIBRARY
value: "PII"
databases:
DATABASE_1:
comment: DATABASE 1
schemas:
TAG_LIBRARY:
tags:
CATEGORY:
comment: "CATEGORY Tag"
allowed_values: ["PII", "PHI", "PCI"]
DATABASE_2:
comment: DATABASE 2
schemas:
SCHEMA_1:
tables:
TABLE_1:
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY: *category_pii
TABLE_2:
columns:
COL1:
type: VARCHAR
with_tags:
CATEGORY: *category_pii
Tag management
SOLE manages tags only if you define the keyword with_tags
in the object configuration. Any tags not specified in the configuration but found in Snowflake will be removed implicitly.
In the example below, SOLE sets the tag CONFIDENTIALITY
to the value Sensitive
for DATABASE_1
and removes any other tags it finds in Snowflake for this database. But SOLE will not manage any tags for DATABASE_2
as no with_tags
section is defined for this database:
databases:
DATABASE_1:
comment: DATABASE 1
with_tags:
CONFIDENTIALITY:
schema: TAG_LIBRARY
value: "Sensitive"
schemas:
TAG_LIBRARY:
tags:
CONFIDENTIALITY:
comment: "CONFIDENTIALITY Tag"
allowed_values: ["Sensitive", "Highly Sensitive"]
DATABASE_2:
comment: DATABASE 2
# no with_tags: means unmanaged
Tag removal
To remove all tags applied to a Snowflake object, specify an empty with_tags
key in the Snowflake object configuration, as the following example shows:
databases:
DATABASE_3:
comment: DATABASE 3
with_tags: # empty with_tags: means remove all
In this scenario, SOLE removes all the tags it finds in Snowflake for DATABASE_3
.