As data breaches and cyber threads continue to rise, securing your data has become a top priority for businesses and organisations across all industries. Whether you want to safeguard sensitive customer data, protect your intellectual property or need to adhere to regulatory requirements, protecting your data is critical. This is where Snowflakes Object Tagging comes into play. Snowflakes Object Tagging capabilities allow you to strengthen your data security by classifying and managing data on a granular level, ensuring that only the right people have access to the right data.
What is Object Tagging?
With Object Tagging, Snowflake provides an easy way to classify objects by attaching a string value tag to it. Tags are schema-level objects that can be attached to other Snowflake objects like columns, users, databases or even the entire account.
This attached tag can label the object to help determine its security or protection level. Most importantly, they enable organisations to control data access. You can for example label sensitive data with tags like “PII” (Personally Identifiable Information) or “Confidential” and then apply security policies that ensure only authorized users can access this information. This way, securing your data becomes easily manageable.
What Should You Know About Tags?
NOTE: This does not include propagation to nested objects, which have to be recreated.
Can you have unlimited tags? Not quite, but the quotas are quite high, see for yourself:
- The number of tags used within an account is limited to 10.000.
- Usually, an object can have up to 50 unique tags.
- Inside a table or view, this varies a bit — for columns, this maximum of 50 is to be considered over all columns of a table or view combined.
- Be aware that this limit includes dropped tags (within 24 hours after dropping the tag), so best practice is to UNSET a tag from all objects it is attached to before dropping it. The reason for this is that dropped tags can be undropped within 24 hours and will then be restored with all references it had at the time it was dropped. Please refer to the Snowflake documentation about manage tag quotas for more information.
Speaking of dropping a tag: A tag can indeed be dropped even if it is assigned to an object. So before dropping it, determine all of the objects the tag is assigned to by calling the Account Usage table function “TAG_REFERENCES_WITH_LINEAGE” (more info below). An exception here are tags assigned to a masking policy. These cannot be dropped if attached to an object — they have to be unset before.
You cannot define future grants on tags.
Be aware that when you create a view or a secure view from a table that has tags attached, the tags will not be present in the view. However, if you clone an object, all tag associations in the source object (e.g. table) are maintained in the cloned objects.
Create and Apply a Tag
First you have to make sure you have the right privilege.
- To create a tag, you need the CREATE TAG privilege and at least USAGE on the database and schema.
- To apply a tag to an object, you need either:
- OWNER privilege on the Object and APPLY on the tag,
- Or your role has to be granted the ACCOUNT level (global) APPLY TAG privilege.
- If you want to alter a tag, like attaching a masking policy, your role has to have either:
- The global APPLY MASKING POLICY privilege and APPLY privilege on the TAG,
- Or the global APPLY MASKING POLICY and global APPLY TAG privilege
-- create a tag -------------------------- CREATE OR REPLACE TAG PII_TAG ALLOWED_VALUES 'confidential', 'internal', 'public' COMMENT = 'PII security level for column masking'; -- apply the tag to a column in a table -------------------------- ALTER TABLE <table_name> MODIFY COLUMN <column_name> SET TAG PII_TAG = 'internal';
Tags for Security
Before we move on to the details about the power couple Tags & Policies, I want to point out some general benefits of using object tagging to improve the security requirements in your organization:
- Granular Access Control
- Object tagging allows you to implement highly specific policies based on the applied tags. By tagging sensitive columns, you can prevent unauthorized access and increase the security standard on a granular level.
- Dynamic Security Policies
- Things change, and so might the security level of your data. Using Tags, these changes are easily implemented by simply updating the tag value or attaching a new policy to an existing tag.
- Compliance with Data Privacy Regulations
- In order to be on track with data privacy regulations like the GDPR (General Data Protection Regulation) in the EU, tags will help you to ensure that PII data is handled accordingly.
- Audit and Monitoring Capabilities
- Tags simplify identifying sensitive data and allow you to track usage of vulnerable information based on applied tags. Please read the following blog post of one of my colleagues for more on that topic: Microchip Your Data: Snowflake Column Lineage.
How Best to Combine Tags and Policies?
Best practice is to attach a policy to a tag which can then be attached to an object. This provides the following benefits:
- You can apply a policy to multiple objects by just adding it once to a tag.
- You can attach multiple policies to one tag.
- You can easily switch the policy on a tag, and all tagged objects will be protected by the new policy.
- You can apply multiple variations of one policy, based on different data types.
- You can apply the tag to the table and the policy automatically protects all of its columns, including new columns.
There are two general ways of utilizing tags in combination with policies:
- You can either use a more generic masking policy, which is just using the tag to identify securable objects
- A finer-grained control can be achieved when utilizing the tag to get different outcomes based on the tag value
The following code snippet is showing the first approach: it creates a masking policy and attaches a masking policy to a tag which can be applied to various objects. It even takes it one step further and includes role base access control: If the user is accessing the data with a privileged role they will be able to access the clear value, for all other roles the value in the content will be masked.
-- create a Masking Policy -------------------------- CREATE OR REPLACE MASKING POLICY <policy_name> AS (val string) RETURNS string -> CASE WHEN CURRENT_ROLE() IN ('PRIVILEGED_ROLE') THEN val ELSE '*** MASKED VIA TAG ***' END; -- attach the Masking Policy to the tag -------------------------- ALTER TAG HIDE_FROM_READER_TAG SET MASKING POLICY <policy_name>;
If you want to include different outcomes based on the value of the tag, you can include this in the policy by adding the SYSTEM$GET_TAG_ON_CURRENT_COLUMN function as the following script shows:
-- create a Masking Policy based on the Tag value -------------------------- CREATE OR REPLACE MASKING POLICY <policy_name> AS (val string) RETURNS string -> CASE -- highly_confidential columns will be masked WHEN SYSTEM$GET_TAG_ON_CURRENT_COLUMN('PII_TAG') = 'highly_confidential' THEN '*** highly_confidential ***' -- public columns will get the unmasked value WHEN SYSTEM$GET_TAG_ON_CURRENT_COLUMN('PII_TAG') = 'public' THEN val -- default to the masked value ELSE '******' END;
Some additional things to note of when using tags with policies:
- Be aware that you can have only one policy per tag per data type, i.e. you can create one policy per data type and then attach each to a tag to produce different outcome based on the data type. This is useful when applying the tag to a table where then each column inherits the tag no matter what data type they are using.
-
A policy directly attached to an object will overwrite a policy that is attached via a tag.
If you want to find out more about masking your data, please also have a look at this blog post: Snowflake Masking Policies Demystified.
Monitor Tags
There are several ways to discover and monitor tags. You can of course monitor tags via Snowsight or, as I prefer, using SQL. The most commonly used commands are listed below. As always, the outcome depends on the role executing the command, if your role does not have sufficiently high privileges, you might not get the expected results. Note that the SNOWFLAKE.ACCOUNT_USAGE schema might have a delay of up to 2 hours for items to be listed. More details can be found in the Snowflake documentation on monitor tags.
The SHOW_TAGS command lists all tags that exist within the database or a schema. The second one, ACCOUNT_USAGE.TAGS, is showing results on the account level:
-- show all tags within a database or schema SHOW TAGS; SHOW TAGS IN SCHEMA <database_name>.<schema_name>; -- show all tags within your Snowflake account, including deleted tags USE ROLE ACCOUNTADMIN; SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.TAGS ORDER BY TAG_NAME;
SYSTEM$GET_TAG
This system function will return the tag value associated with the specified object. You will need to provide the names of the tag and the object as well as the kind of object we’re looking at. This is referred to as the object domain which can be anything from account to column. The code below is using a table object.
Note that the two related tags SYSTEM$GET_TAG_ON_CURRENT_TABLE and SYSTEM$GET_TAG_ON_CURRENT_COLUMN, the latter of which was used in the masking policy above, are only available within a policy declaration and cannot be called in a SELECT query like the GET_TAG function below.
-- syntax SYSTEM$GET_TAG( '<tag_name>' , '<obj_name>' , '<obj_domain>' ); -- select the value for tag pii_tag attached to the my_users_table SELECT SYSTEM$GET_TAG('pii_tag', 'my_users_table', 'table');
TAG_REFERENCES
TAG_REFERENCES is available as a function in the INFORMATION_SCHEMA of each database and as a view in the ACCOUNT_USAGE schema of the SNOWFLAKE database.
The FUNCTION is bound to database level and returns a list of tags associated with the specified object, including tag lineage. The example uses the table object “my_users_table.” Possible object domains for this function include database, stage, table, column and many more.
The VIEW returns a list of all tags in the account and does not include tag lineage:
-- execute the function at database SELECT * FROM TABLE(<database_name>.INFORMATION_SCHEMA.TAG_REFERENCES('my_users_table', 'table')); -- query the view at account level SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES ORDER BY TAG_NAME, DOMAIN, OBJECT_ID;
TAG_REFERENCES_ALL_COLUMNS
This function is similar to TAG_REFERENCES listed above, but here, the object domain is restricted to table. It returns every tag set on every column in a given table or view, whether the tag is directly assigned to a column or through tag lineage:
-- syntax TAG_REFERENCES_ALL_COLUMNS('<object_name>', '<object_domain>') -- list all tags applied to a specific table SELECT * FROM TABLE(INFORMATION_SCHEMA.TAG_REFERENCES_ALL_COLUMNS('my_db.my_schema.my_table', 'table'));
TAG_REFERENCES_WITH_LINEAGE
This function is different from the ones above as it is applied on a tag and returns a list of all associations between this tag and the Snowflake object to which it is attached to. As the name suggests, this function also includes tag lineage:
-- syntax TAG_REFERENCES_WITH_LINEAGE('<name>') -- list all occurences of a tag SELECT * FROM TABLE(SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES_WITH_LINEAGE('my_db.my_schema.my_tag'));
POLICY_REFERENCES
To go one step further, you can even monitor your tags outside of Snowflake. Most data cataloguing tools provide functionality to monitor and import Snowflake meta data. This combination makes your data governance even more powerful, as you can, for example, centralize governance tasks more easily and provide direct access for qualified users to submit requests for tags or even handle masking themselves.
Two great examples of this combination are Atlan and Informatica, they provide native Snowflake integration which enables you to directly import your Snowflake tags into the data catalog. It would be beyond the scope of this post to go much deeper on this topic, but I wanted to at least mention this great option. If you want to read more about governance and data catalogues, please feel free to have a look at these two blog posts: What Is Data Governance? and De-Mystifying the Data Catalog: Find and Use Your Data.
Wrap up
I hope that with this blog post you have enough to get started and will be able to use Object Tagging to secure your data in no time. If you want to go on and dive deeper into this topic, here are the links to the relevant Snowflake documentation: