At Snowflake, data governance is all about offering our customers native capabilities to easily and efficiently govern data at scale. To date, we have introduced capabilities such as Object Tagging, Dynamic Data Masking, Row Access Policies, and Access History to help you keep track of sensitive data by tagging it, protect columns with sensitive data from unauthorized access by assigning masking policies, and audit access to sensitive columns using Access History.
For security and compliance, it’s critical that all columns containing certain types of sensitive data that constitutes personally identifiable information (PII) are protected by masking policies. A commonly used approach is to periodically scan for data that is tagged as sensitive and apply the appropriate masking policy. However, this approach can be cumbersome, and may leave data exposed until the policy is applied.
Tag-based masking addresses this challenge by automatically applying the designated policies from the moment that the column is tagged. For example, if you have tagged columns with phone numbers in your account as PII='Phone Number', you can assign a masking policy to the PII tag, and Snowflake will automatically mask all phone number columns as specified in the masking policy, thereby preventing access by those without the proper authorization.
Tag-based masking offers a scalable, uniform, and automated solution for sensitive data protection by:
- Making it easier to manage data at scale
- Applying policies uniformly to all correspondingly tagged columns
- Enforcing a policy immediately, as soon as sensitive data is tagged
How tag-based masking works
A tag-based masking policy combines the object tagging feature and the masking policy feature, allowing a masking policy to be set on a tag with an ALTER TAG command. When the data type in the masking policy signature matches the data type of the column, the tagged column is automatically protected by the conditions in the masking policy. A tag can support one masking policy for each data type that Snowflake supports. The masking policy conditions can be written to protect the column data based on the tag name or the tag value assigned to the column.
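As a minimal sketch of the one-policy-per-data-type rule, the following attaches a STRING policy and a NUMBER policy to the same tag. The database `demo_gov`, the tag `pii`, both policy names, and the `PII_ALLOWED` role check are placeholders chosen for illustration, and the statements assume a role that can create these objects and holds APPLY MASKING POLICY on the account.

```sql
-- All object names below are hypothetical placeholders.
create database if not exists demo_gov;
create schema if not exists demo_gov.tags;
create schema if not exists demo_gov.policies;

create tag if not exists demo_gov.tags.pii;

-- Policy for STRING columns: the signature type is what gets matched against the column type.
create masking policy if not exists demo_gov.policies.pii_mask_string as (val string) returns string ->
  case
    when is_role_in_session('PII_ALLOWED') then val
    else '***masked***'
  end;

-- Policy for NUMBER columns carrying the same tag.
create masking policy if not exists demo_gov.policies.pii_mask_number as (val number) returns number ->
  case
    when is_role_in_session('PII_ALLOWED') then val
    else -1
  end;

-- One ALTER TAG statement sets both policies, one per data type.
alter tag demo_gov.tags.pii set
  masking policy demo_gov.policies.pii_mask_string,
  masking policy demo_gov.policies.pii_mask_number;
```

With both policies attached, a tagged STRING column should be masked by the string policy and a tagged NUMBER column by the number policy. A tagged column of another type, such as DATE, would stay unmasked until a policy with a matching signature is added to the tag.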
Tag-based masking in action
Let’s take a look at a simple example to understand how tag-based masking works. This example demonstrates how tagged columns can be protected without directly assigning a masking policy to those columns. We will use the following setup:
- A DATA_GOVERNOR role for administering data governance responsibilities
- A PII_ALLOWED role for users who can access PII data unmasked
- A table HR.PRODUCT.EMPLOYEES with columns EMAIL and SSN that must be protected
use role accountadmin;

// data governor role setup
create role data_governor;
grant role data_governor to user alice; -- assuming user alice is available in the account
grant apply masking policy on account to role data_governor;
grant apply tag on account to role data_governor;

// create a database data_governance and grant the ownership of the database to data_governor
create database data_governance;
grant ownership on database data_governance to role data_governor;

// table setup
create database hr;
create schema hr.product;
create table hr.product.employees(email string, ssn string);
insert into hr.product.employees values ('[email protected]', 'aaa-aa-aaaa'),('[email protected]', 'bbb-bb-bbbb');

// pii_allowed role setup
create role pii_allowed;
grant role pii_allowed to user alice; -- assuming user alice is available in the account

// grant appropriate access on the table to the pii_allowed role and public
grant usage on database hr to role pii_allowed;
grant usage on schema hr.product to role pii_allowed;
grant select on table hr.product.employees to role pii_allowed;
grant usage on database hr to role public;
grant usage on schema hr.product to role public;
grant select on table hr.product.employees to role public;
The DATA_GOVERNOR role performs the following steps to protect the data with a tag-based masking policy:
- Creates a tag DATA_GOVERNANCE.TAGS.PII_DATA
- Creates a masking policy DATA_GOVERNANCE.POLICIES.PII_MASK_STRING
- Assigns PII_MASK_STRING to the tag PII_DATA
- Assigns the PII_DATA tag to the columns EMAIL and SSN
use role data_governor;
use database data_governance;

// create the tag
create schema data_governance.tags;
create tag data_governance.tags.pii_data;

// create the policy
create schema data_governance.policies;
create masking policy data_governance.policies.pii_mask_string as (data string) returns string ->
  case
    when is_role_in_session('PII_ALLOWED') then data
    else '***masked***'
  end;

// assign masking policy to tag
alter tag data_governance.tags.pii_data set masking policy data_governance.policies.pii_mask_string;

// assign tag to sensitive columns
use role data_governor;
alter table hr.product.employees alter column email set tag data_governance.tags.pii_data='EMAIL';
alter table hr.product.employees alter column ssn set tag data_governance.tags.pii_data='SSN';
Finally, let’s test to verify that only users with the PII_ALLOWED role are authorized to view the columns tagged as PII_DATA unmasked.
// role authorized to view pii_data unmasked
use role pii_allowed;
// EMAIL and SSN columns are unmasked
select * from hr.product.employees;

// role unauthorized to view pii_data
use role public;
// EMAIL and SSN columns are masked
select * from hr.product.employees;
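Beyond comparing query results, the tag and policy wiring can be inspected directly. The sketch below assumes the objects from the walkthrough exist and that the DATA_GOVERNOR role has enough privileges to call these functions in your account; the selected column list is a subset picked for illustration.

```sql
use role data_governor;

-- Read back the tag value set on a column; for EMAIL this should be the value assigned above.
select system$get_tag('data_governance.tags.pii_data', 'hr.product.employees.email', 'column');

-- List the policies protecting the table's columns. For a policy inherited
-- through a tag, the tag_* columns identify the tag that carries it.
select policy_name, policy_kind, ref_column_name, tag_database, tag_schema, tag_name, policy_status
from table(hr.information_schema.policy_references(
  ref_entity_name => 'hr.product.employees',
  ref_entity_domain => 'table'
));
```

If a column you expected to be protected is missing from the second result, one likely cause is a mismatch between the column’s data type and the signature of the policy set on the tag.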
In addition to the ability to assign a masking policy to the tag name, you can look up the value of the tag associated with the column at query execution time with the new SYSTEM$GET_TAG_ON_CURRENT_COLUMN system function. The tag value can be used to determine authorization to access the data unmasked. The following masking policy is a simple example that demonstrates a scenario where columns with the SENSITIVE_COLUMN tag and a 'YES' tag value are always masked.
create or replace masking policy mask_confidential_string as (data string) returns string ->
  case
    when system$get_tag_on_current_column('data_governance.tags.sensitive_column') = 'YES' then '***masked***'
    else data
  end;
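A value-based policy like this only takes effect once it is set on the tag and the tag is assigned to a column. The following sketch shows one way that wiring might look. The table `hr.product.projects` and its `codename` column are hypothetical, the policy is recreated under a fully qualified name in `data_governance.policies` so the block does not depend on the session’s current schema, and the schemas from the walkthrough are assumed to exist.

```sql
-- Hypothetical table used only for this illustration.
use role accountadmin;
create table if not exists hr.product.projects(codename string);

use role data_governor;

-- The tag that the policy inspects at query time.
create tag if not exists data_governance.tags.sensitive_column;

-- Same logic as the policy above, created under a fully qualified name.
create masking policy if not exists data_governance.policies.mask_confidential_string as (val string) returns string ->
  case
    when system$get_tag_on_current_column('data_governance.tags.sensitive_column') = 'YES' then '***masked***'
    else val
  end;

alter tag data_governance.tags.sensitive_column
  set masking policy data_governance.policies.mask_confidential_string;

-- With the value 'YES' the column should be masked for every role;
-- with any other value the policy returns the data unchanged.
alter table hr.product.projects alter column codename
  set tag data_governance.tags.sensitive_column = 'YES';
```

A separate column is used here rather than EMAIL or SSN because those columns already carry the PII_DATA tag with a string policy, and two tag-based policies for the same data type on one column would conflict.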
How to get started with tag-based masking
You need to perform only three simple steps to get started with tag-based masking and achieve automated protection of sensitive columns such as PII:
- Define masking policies for each data type you want to protect.
- Define a tag (for example, PII_DATA) and assign the masking policies to the tag.
- Assign the tag to any column with PII data.
If you have already tagged columns with an appropriate tag, you can protect all of those columns by simply assigning the masking policies to the tag.
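Before assigning a policy to a tag that is already in use, it can help to see which columns would become masked. The query below is a sketch that assumes a role with access to the shared SNOWFLAKE database (for example ACCOUNTADMIN) and uses the PII_DATA tag from the walkthrough as the tag of interest. Account Usage views are not real time, so recently tagged columns may not appear yet.

```sql
use role accountadmin;

-- Columns currently carrying the tag, and the tag value on each.
select object_database, object_schema, object_name, column_name, tag_value
from snowflake.account_usage.tag_references
where tag_database = 'DATA_GOVERNANCE'
  and tag_schema = 'TAGS'
  and tag_name = 'PII_DATA'
  and domain = 'COLUMN'
order by object_database, object_schema, object_name, column_name;
```

Columns in this list whose data type has no matching policy on the tag would remain unprotected, so the result can double as a checklist of data types that still need a policy.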
Start using tag-based masking today
Tag-based masking is now generally available. You can start using it right away to:
- Easily manage sensitive data protection at scale.
- Eliminate the need to periodically scan tagged columns in order to assign policies.
- Protect columns as soon as they are tagged appropriately.