Unlocking Data Governance with Snowflake Object Tagging

Date
  • Jeff Skoldberg
    Jeff Skoldberg
    Sales Engineer at SELECT

What is Object Tagging in Snowflake?

Object tagging in Snowflake is a data governance feature that allows you to categorize and group objects (warehouses, tables, etc.). The purpose is to enhance discoverability, monitor usage, and monitor compliance on Snowflake objects. Tags are particularly handy to identify Snowflake objects with sensitive information such as PII or financial data. They are also useful for assigning ownership of resources, for cost monitoring or data governance purposes.

What are tags?

In order to tag objects in Snowflake, you first need to create the tag itself.

Tags are schema level objects in Snowflake that can be attached to other Snowflake objects such as databases, schemas, tables, columns, users, warehouses, roles, etc. The list of objects that can be tagged is quite extensive, and can be found here.

A tag is a key-value pair. The tag name ( create tag <tag name>) is considered the “key”, and the values can be a pre-defined list of allowed values or left unrestricted, allowing any string value.

For example, if you want to track the audience of a particular table, you can create a tag (key) called “audience” and the values can be something like: HR, Sales, Operations, IT, etc. These tags can easily be queried: show me all tables used by sales. We’ll see more about this later in this post.

Tag Inheritance

Here’s a quick refresher on Snowflake’s “Container Hierarchy”. A Snowflake Organization is a container of Accounts. An Account is a container of databases. A database is a container of Schemas. And a Schema is a container of tables, views, pipes, file formats, stages, and many more objects.

Tagging an object at any level of the container hierarchy pushes the tag down to all objects below it. For example, if I apply a tag to a database, every table, view, column, and other objects like pipes, stages, and so on, will all inherit the database tag.

Let’s say you have tags explicitly set at the account, database, and schema levels, like this example:

{
"account": {
"tags": {
"env": "prod",
"region": "us-east-1"
},
"databases": {
"analytics": {
"tags": {
"owner": "data_team",
"classification": "restricted"
},
"schemas": {
"sales": {
"tags": {

Since region is tagged at the account level, every database, schema, table, view, and even every column will inherit the us-east-1 value. Pretty cool!

Querying the tag lineage is done easily with the snowflake.account_usage.tag_references_with_lineage view, shown below. See “Observing Tags with SQL”.

How are Object Tags different than Query Tags?

In Snowflake, a query tag is a single string value attached to a query or session, making it easier to filter and search through query history. Query tags are not key-value pairs — they’re simply searchable strings. In contrast, object tags are key-value pairs and are considered part of Snowflake’s data governance framework. They’re used to classify, control, and audit access to data assets like tables, columns, or schemas.

When someone mentions “tagging” in Snowflake, be sure to clarify whether they mean query tags or object tags, as they serve very different purposes.

How to use tags in Snowflake

Create a tag

The syntax to create a tag is very simple. create tag <tag_name> <allowed values>;

Since tags are schema level objects, use fully qualified SQL or be mindful of your worksheet context. Like anything in Snowflake, tags can be created in one database (or schema) and used in another.

Examples:

use schema governance.tags;
create tag table_classification allowed_values 'metadata', 'fact', 'dimension';
-- creates a tag with a specific list of allowed values.
create tag user_group;
-- creates a tag with any value allowed.

Creating a tag without specifying the allowed values sets the allowed_values parameter to null, enabling any text string to be applied as a value. This is convenient when the list of values needs to be dynamic, but most often you will list the allowed values up front, and alter the tag when you need to add more allowed values.

Verify the allowed values of a tag

If you want to check the allowed values of an existing tag, there are a few easy ways.

The show tags; command will show metadata for all tags in your account. One of the columns returned will be the allowed_values.

show tags command in Snowflake

Another way is to call a system function called system$get_tag_allowed_values for a specific tag.

select system$get_tag_allowed_values(
'governance.tags.cost_center');

And yet another way is by calling the get_ddl function in Snowflake. For example:

select get_ddl('tag','dev.public.TESTING_TAG_1');

Use the tag by tagging an object:

To set a tag on an object, use the alter command. For example, if you want to set a tag on a table called select_blog.workday.workday_users:

alter table select_blog.workday.workday_users
set tag tutorial_db.public.table_classification = 'metadata';

In this example the tag is created in a database called tutorial_db and the tag is used in a database called select_blog.

Observing Snowflake Tags with SQL

Option 1: Query the account_usage.tag_references view

The view snowflake.account_usage.tag_references will show you everywhere a tag is used and what the tag value is. The most common columns for filtering will be tag_name and tag_value. This table does not show Tag Inheritance; it only shows objects that are directly tagged.

SELECT * FROM
snowflake.account_usage.tag_references
where tag_name ilike 'table_class%'
or tag_value ilike 'meta%'
ORDER BY TAG_NAME, DOMAIN, OBJECT_ID;
Results of tag_reference view

Be aware the data in this view has a significant delay. Snowflake docs say the data can be delayed up to two hours. I have found in most cases the delay is at least an hour.

Option 2: get_tag function

Since the snowflake.account_usage.tag_references view is delayed, sometimes you need a quick way of observing changes to tagged objects right away. The system function system$get_tag will always be up to date.

Syntax:

SELECT SYSTEM$GET_TAG( <fully qualified tag name>,
<'fully qualified object name'>, <'object class such as table, view, etc' > )
AS TAG_META;


Example:

SELECT SYSTEM$GET_TAG( 'tutorial_db.public.table_classification',
'SELECT_BLOG.WORKDAY.WORKDAY_USERS', 'TABLE' )
AS TAG_META;
Results of Snowflake's system$get_tag function

Option 3: the account_usage.tag_lineage function

Earlier we discussed “tag inheritance”; tags are automatically pushed down from higher levels to lower levels. The tag_lineage function let’s you see all objects that have been directly tagged or that have inherited a tag. Continuing with the table_classification example, we can write this SQL:

SELECT *
FROM TABLE(
snowflake.account_usage.tag_references_with_lineage(
'TUTORIAL_DB.PUBLIC.TABLE_CLASSIFICATION'
)
);

Here we can see that the tag has been applied manually to the table WORKDAY_USERS and every column in the table inherits the tag.

Result of Snowflake's tag_references_with_lineage view

In this screenshot many columns are de-selected in the Snowsight UI to get the screenshot to fit.

Observing Snowflake Tags Using Snowsight UI

Snowflake provides a nice dashboard where you can observe all tags in Snowflake.

Using accountadmin role or a role that is correctly configured to view the dashboard, use Snowsight’s side bar to navigate to Monitoring —> Governance.

Below is what the dashboard will look like. Since I’m using a demo account, tags are not set, but you can see how this dashboard will be useful.

View tags in the Snowsight UI in the Monitoring / Governance side bar

Click on the tab called “Tagged Objects” to view every table and view in your account, and what tags apply to those objects. The objects are listed whether or not a tag is applied. Unfortunately objects like users and warehouses are not found here. Snowflake provides nice filtering capability on this page, so be sure to click through the filters.

All tables and their tags, unfiltered

Toggle on the “has tags” filter to only view objects with tags:

Tagged Objects tab filtered to "has tags", shows only tables that have tags.

Best Practices for Snowflake Tagging

Develop a centralized tagging policy

Data governance and managing PII and sensitive data is difficult. This is mostly non-technical work that requires a lot of discussion and alignment!

The first step is having a well thought out strategy. Here are some questions to get you started.

  • What are we trying to accomplish with tags? (cost reporting, securing sensitive data, etc.)
  • Who is defining the list of tags and allowed values? (Strategy)
  • What roles can set tags and who gets these roles? (Tactics)
  • How can we leverage tag inheritance in the most effective way?

Utilized Allowed Values for tags

Especially in cases when tags will be set manually, leverage the “allowed values” setting. In cases where tag assignment is automated based on data in a table, this is less critical.

Automate tagging when possible

Tagging falls short when teams rely on manually applying tags. Think of creative ways to automate tagging. One example found below!

Integrate tagging with data pipelines when possible

There’s a useful dbt package called dbt_tags; this package helps you automate Object Tagging in Snowflake and align dynamic masking based on tags. If you require data masking based on tags, check out this package!

Monitor and Audit tag usage

Why create all those tags if you don’t review them?! Dedicate one hour a month to reviewing tags, usage, and upgrading your tagging policy.

Automate Snowflake Tagging - A practical example

The Scenario

Imagine a scenario where you maintain metadata about your employees in Workday (or ERP of choice) and that table is ETL’d to Snowflake. Let’s call it workday_users table. It has one row per user, and each column represents attributes that we want to maintain as tags on the Snowflake user. This allows us to audit, observe, and set permissions on users with the same tags.

As users change departments or job roles, the Snowflake tags should update automatically.

Create some mock data

Let’s create a table and mock data for our scenario:

-- This table represents a table that would be ETL'd from Workday to Snowflake, containing user metadata
-- I'm using a demo database, change yours.
create or replace table select_blog.workday.workday_users
(
employee_id text,
work_email text,
manger_email text,
is_manager boolean,
pillar_id text,
group_id text,
team_id text,
department_id text,
job_family_id text,
area text,
cost_center text,

Fill up the table with some mock data:

insert into select_blog.workday.workday_users
values(
3000,
null,
true,
'Pillar-C-SUITE',
'Group-C-SUITE',
'Team-Founders',
'Department-All',
'JF-Founders',
'Founders',
'CC Board of Directors',
'CC3625'
),

In this scenario, we want to translate any field that ends with _id into a tag on the Snowflake user.

Create the tags

Let’s assume the number of columns in the workday_users is relatively stable, and the tags can be created as a one-time (or infrequent) exercise that doesn’t need to be automated. If a column is added, we can create a new tag. (Or you can schedule the code below to run in a task! Just don’t “replace” tags if you schedule this.)

The code below will query the the columns metadata, filtering to columns that contain _id, per our requirement. These column names become our tag names. For each _id column, we simply create a new tag. In this case, I am not pre-defining the allowed values, as the values will be set from the data.

This code saves you from manually crafting a dozen or more create tag statements. It runs the create or replace tag ... for every row of a select statement.

BEGIN
-- Declare variables
LET create_tag_statements RESULTSET := (
SELECT
'CREATE OR REPLACE TAG "' || COLUMN_NAME || '" COMMENT = ''Tag for ' || COLUMN_NAME || ''';' AS create_tag_statement
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'WORKDAY_USERS'
AND TABLE_SCHEMA = 'WORKDAY'
AND LOWER(COLUMN_NAME) ILIKE '%_ID'
);
-- Loop through the result set and execute each statement
LET stmt VARCHAR DEFAULT '';
LET create_tag_cursor CURSOR FOR SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));

Alternatively, you can manually run commands like this:

use schema select_blog.workday;
create tag pillar_id;
create tag team_id;
-- drop tags if desired
drop tag my_obsolete_tag;

Create a view that compares actual user tags vs the desired state

The workday_users table is the “desired state”, as it has the most up to date information from Workday on how we want our user tags set. Now we need to compare the actual user tags to the desired state, and show any rows that do not have the correct tags set. (In this case, we can trust that the work_email column in the workday_users table corresponds to the email address of the Snowflake user in the snowflake.account_usage.users table.)

Let’s save this query as a view, as it will be reused often.

create or replace view tag_updates as
with desired_tags as (
select
upper(work_email) as user_email,
upper(tag_name) as tag_name,
replace(upper(desired_tag_value), '-', '_') as desired_tag_value,
from
workday_users unpivot (
desired_tag_value for tag_name in (pillar_id, group_id, group_id)
)
),
snowflake_user_email as (
select
upper(name) as user_name,

Let’s summarize what the view does:

  • Unpivot the workday table (generate rows from columns) to get one row per user and possible tag.
  • Get the Snowflake USER_NAME that will be tagged, and the Snowflake email address to be joined to the “desired state” table.
  • Join on email to get existing tags for each user:
  • finally, show all records where the existing tag value does not equal the desired tag value

These are the records we can automatically set using a stored procedure.

Since the tag_updates view is filtered rows where the existing tag value doesn’t match the desired value, ideally this view is empty when our tags are set correctly.

Create a stored procedure to set the tags on the user

The stored procedure does these basic steps:

  • For each row in tag_updates view:
    • alter user, set tag = desired tag

Below I offer two versions of the stored procedure. One is a simple version with no logging. It is very clear and easy to read. The second version is more verbose and is used for troubleshooting if your desired tags are not set. It includes output messages in the SQL results pane.

Simple Procedure:

CREATE OR REPLACE PROCEDURE SET_USER_TAGS()
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'main'
AS
$$
def main(session):
query = """
SELECT
user_name as USER_NAME,
tag_name as TAG_NAME,
desired_tag_value as DESIRED_TAG_VALUE
FROM tag_updates

Verbose Sproc with logging:

-- create the procedure using an admin role that has permissions to "alter user".
user role accountadmin; -- or, use role useradmin if your users are actually owned by useradmin
-- if you are using useradmin role, you need to grant usage to that role on the database. Therefore it is easire to use accountadmin.
CREATE OR REPLACE PROCEDURE SET_USER_TAGS()
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'main'
AS
$$
def main(session):
# Pull exactly the columns you need from the revised view
query = """
SELECT

Call the procedure:

call set_user_tags();

Now, take a lunch break, wait about an hour or so (maybe more) for the account_usage.tag_references view to update. Assuming the tags are correctly set and the view has updated, select * from tag_updates; returns nothing! Exactly what we wanted.

Create a task to schedule the stored procedure

This task will run the procedure at 4AM UTC daily.

CREATE TASK set_user_tags
SCHEDULE = 'USING CRON 0 4 * * * UTC' AS
CALL set_user_tags();
alter task set_user_tags resume;

And just like that, we have created a process to automate user tags based on our Workday metadata table.

Now it’s your turn

This example covers one possible scenario of automating tags. But, per our best practices of using tags, always seek to automate tagging when possible. The process will always be the same:

  • Create some known desired state.
  • Create the tags (manually or automated).
  • Automate setting of the tags by comparing desired state to actual state.

Wrap Up

In this article we covered uses for tags in Snowflake, how to create tags, best practices, and an advanced scenario. Get creative! Think of ways you can automate tagging in your Snowflake account!

Jeff Skoldberg
Jeff Skoldberg
Sales Engineer at SELECT

Jeff Skoldberg is a Sales Engineer at SELECT, helping customers get maximum value out of the SELECT app to reduce their Snowflake spend. Prior to joining SELECT, Jeff was a Data and Analytics Consultant with 15+ years experience in automating insights and using data to control business processes. From a technology standpoint, he specializes in Snowflake + dbt + Tableau. From a business topic standpoint, he has experience in Public Utility, Clinical Trials, Publishing, CPG, and Manufacturing.

Get up and running with SELECT in 15 minutes.

Snowflake optimization & cost management platform

Gain visibility into Snowflake usage, optimize performance and automate savings with the click of a button.

SELECT web application screenshot