Excluding field from atomic.events

NirSivan · February 21, 2018, 7:32am

Hey all,

GDPR is coming, and I want to minimize the PII we store, especially IP addresses. However, I do want to keep location data such as country and city.

Is there a way to exclude a field from the shredded JSON before it loads into Redshift?

Thanks

Nir Sivan

Colm · February 21, 2018, 10:39am

Hi @NirSivan,

Is the IP anonymisation enrichment a good solution for you? It doesn’t get rid of the events in S3, but blanks out the IP address before loading to Redshift.
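To illustrate the idea only: the sketch below masks the trailing octets of an IPv4 address, which is conceptually what anonymising an IP by a configurable number of octets does. It is not Snowplow's implementation; the function name `anonymise_ipv4` and the `x` placeholder are assumptions made for this example.

```python
def anonymise_ipv4(ip: str, octets: int = 1) -> str:
    """Replace the last `octets` octets of an IPv4 address with 'x'.

    Hypothetical helper for illustration; the 'x' placeholder is an assumption.
    """
    parts = ip.split(".")
    if len(parts) != 4:
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    if not 0 <= octets <= 4:
        raise ValueError("octets must be between 0 and 4")
    keep = 4 - octets
    # Keep the leading octets and mask the rest.
    return ".".join(parts[:keep] + ["x"] * octets)


if __name__ == "__main__":
    print(anonymise_ipv4("192.168.10.25", octets=2))
```

Masking more octets removes more identifying detail, and whatever is masked is not recoverable from the loaded data afterwards.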

Best,

alex · February 21, 2018, 10:52am

Further to @Colm’s note - if you want to anonymize any other field in your Snowplow data, then stay tuned for our next release (R100), which introduces a “PII Enrichment” which lets you do precisely that…

Colm · February 21, 2018, 12:52pm

To add some more detail on my previous answer:

IP anonymisation is available, and further PII anonymisation is scheduled for R100 as Alex points out. Those two take care of the Enriched Events on S3 and the data in Redshift.

However, the raw collector logs in S3 remain to be dealt with - the best solution at the moment is to set up lifecycle rules to handle this. Please note that deleting this information from the raw logs makes it impossible to reprocess the data, so it’s best to have a buffer period before you delete anything from Raw, in case there’s a pipeline failure.

A solution that has worked in the past is to set up a lifecycle rule that deletes files from the raw S3 buckets after a fixed period, for example 1 week.
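As a rough sketch of what such a rule could look like with boto3: the bucket name, the prefix and the helper names below are hypothetical, and the 7-day period is just the example figure from above. Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already on the bucket, so merge with existing rules first if there are any.

```python
def build_expiry_config(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix`.

    Hypothetical helper; the rule ID format is an arbitrary choice.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-raw-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_expiry_config(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    s3 = boto3.client("s3")
    # This call overwrites the bucket's existing lifecycle configuration.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


if __name__ == "__main__":
    # "my-raw-collector-logs" and "raw/" are placeholder names.
    config = build_expiry_config(prefix="raw/", days=7)
    apply_expiry_config("my-raw-collector-logs", config)
```

Pick the number of days so that it comfortably covers the time you would need to notice and recover from a pipeline failure.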

I hope this is helpful.

NirSivan · February 21, 2018, 1:12pm

Thanks for the assistance, much appreciated!

Colm · February 28, 2018, 11:04am

Hi @NirSivan,

A quick follow-up on Alex’s note:

Further to @Colm’s note - if you want to anonymize any other field in your Snowplow data, then stay tuned for our next release (R100), which introduces a “PII Enrichment” which lets you do precisely that…

This has now been released - you can see the details here.

Best,
