Why Do We Mask Data?

February 2, 2024 / March 19th, 2024
Data masking is a crucial aspect of data protection. Why? Because many industries have to protect sensitive and confidential information, such as personally identifiable information (PII) or protected health information (PHI). Masking sensitive data reduces the risk of data breaches and unauthorized access. Moreover, data masking ensures that specific roles or departments only see the data that is relevant to them, which helps to limit insider threats.

Data masking policies

Before I explain more about masking policies in Snowflake, it is important to know that masking policies are schema-level objects. A database and schema must therefore exist before a masking policy can be created and applied. In this blog I only highlight masking policies as a means of data protection; other options include secure views (https://docs.snowflake.com/en/user-guide/views-secure). In Snowflake, a masking policy is applied to individual columns of a table inside a schema of a database. However, the policy is enforced at query runtime and not as a static change to the table. This means that the policy's conditions determine whether a role sees the original, masked, or partially masked data when it runs a query, while the data stored in the table is never modified.
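
To make the idea of a column-level policy concrete, here is a minimal sketch in Snowflake SQL. All object names (`demo_db`, `demo_schema`, `customers`, `email`, `email_mask`) are hypothetical and only serve as an illustration, and `CUSTOMER_SUPPORT` is an assumed spelling of the role name.

```sql
-- Hypothetical names throughout; adjust them to your own environment.
CREATE DATABASE IF NOT EXISTS demo_db;
CREATE SCHEMA IF NOT EXISTS demo_db.demo_schema;
USE SCHEMA demo_db.demo_schema;

CREATE TABLE IF NOT EXISTS customers (
  id    INT,
  email STRING,
  phone STRING
);

-- The policy is a schema-level object, so it is created inside demo_schema.
CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('CUSTOMER_SUPPORT') THEN val  -- authorized role sees the real value
    ELSE '*****'                                          -- every other role sees a masked value
  END;

-- Attach the policy to one column. The stored data is not changed;
-- the CASE expression is evaluated each time the column is queried.
ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask;
```

Because the policy is evaluated at query time, the same `SELECT email FROM customers` returns different results depending on the role that runs it.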

Snowflake roles

Data masking policies can be set up for different roles. For example, sensitive information such as client emails or phone numbers can be masked for employees that have the role ANALYST. After all, such information typically has no value for an analyst, and masking it reduces the risk of data leaking from within the company. In contrast, a CUSTOMER SUPPORT role should have access to those details to be able to identify the customer. Always set the masking policy on a table's columns before granting roles access to that table. Otherwise, roles that should not be allowed to see sensitive information can still access the unmasked data.
Figure 1: Copied from Snowflake Documentation
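
The recommended order (policy first, grants second) could look like the sketch below. It assumes that a masking policy with the hypothetical name `email_mask` already exists, and that the table `demo_db.demo_schema.customers` and its `email` column are placeholders for your own objects.

```sql
-- Step 1: protect the sensitive column first.
-- email_mask is a hypothetical, previously created masking policy.
ALTER TABLE demo_db.demo_schema.customers
  MODIFY COLUMN email SET MASKING POLICY email_mask;

-- Step 2: only then give the role access to the table.
GRANT USAGE  ON DATABASE demo_db                      TO ROLE analyst;
GRANT USAGE  ON SCHEMA   demo_db.demo_schema          TO ROLE analyst;
GRANT SELECT ON TABLE    demo_db.demo_schema.customers TO ROLE analyst;
```

Running the statements in this order means there is no moment at which the ANALYST role can query the column without the policy in place.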
When specific columns should be masked for certain roles, be sure to write those roles in UPPER CASE in the policy. This seems contradictory, since identifiers in Snowflake queries are not case sensitive by default. However, Snowflake stores role names in UPPER CASE even if you create the role in lowercase (unless the name is wrapped in double quotes), and the role name inside a masking policy is compared as a case-sensitive string. Hence, it is important to write the role in UPPER CASE when the masking policy is created. Otherwise the masking policy does not apply to the specified role, which allows the role to access sensitive information!
Figure 2: Explanation of case-sensitive masking policies in Snowflake (fictional data)
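
The upper-case pitfall can be reproduced with a sketch like the following. The policy names `phone_mask_broken` and `phone_mask` and the mask string are hypothetical; the point is only the difference between `'analyst'` and `'ANALYST'` in the comparison.

```sql
-- The role is created in lowercase, but unquoted identifiers are stored in upper case.
CREATE ROLE IF NOT EXISTS analyst;
SHOW ROLES LIKE 'analyst';   -- the name column shows ANALYST

-- Broken: CURRENT_ROLE() returns 'ANALYST', which never equals 'analyst'.
-- The ANALYST role therefore falls through to ELSE and sees the real value.
CREATE OR REPLACE MASKING POLICY phone_mask_broken AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() = 'analyst' THEN '***-***-****'
    ELSE val
  END;

-- Working: the role name is written in upper case, so the comparison matches.
CREATE OR REPLACE MASKING POLICY phone_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() = 'ANALYST' THEN '***-***-****'
    ELSE val
  END;
```

A possibly safer design is to list the roles that may see the real value and mask in the `ELSE` branch, so that a misspelled role name results in too much masking instead of too little.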

Conclusion

Masking policies in Snowflake are schema-level objects requiring pre-existing databases and schemas. It’s crucial to set masking policies before granting role-specific access to tables to prevent unauthorized data exposure. Notably, roles must be written in UPPER CASE in masking policies, regardless of their creation case, to ensure proper enforcement and avoid unintended access to sensitive data.

Authors

  • Dominique
  • Jeroen Smits

    I am currently an analytical engineer at Nimbus Intelligence. My background is in human movement sciences, where I specialized in the optimization of human movements, particularly in cycling. During my research projects I started to grow a passion for data, which made me decide to pursue a career in information technology. Nimbus Intelligence offers the perfect introduction into the world as an analytical engineer! In my spare time I love to ride my bicycle and cook with family and friends.
