@moshesh, you can use the JavaScript script enrichment for that purpose. It runs before the PII enrichment, so the IPs are still visible to it. You can also use data from the IP lookup enrichment, which produces values such as geo_country, to filter events by their country of origin.
Do remember that the raw data (if the batch pipeline is used) would still contain the IP address in S3. You might want to set up lifecycle rules so the data is deleted after a reasonable number of days.
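To make the filtering idea concrete, here is a rough sketch. The country codes are hypothetical placeholders, and it assumes that throwing an exception inside process is how an event gets rejected (it should then end up in bad rows), so please verify that against your pipeline version before relying on it.

```javascript
// Hypothetical list of country codes to drop; replace with your own.
var BLOCKED_COUNTRIES = ['XX', 'YY'];

// Returns true only when a country value is present and in the list.
function isBlockedCountry(country) {
  if (country === null || country === undefined) {
    return false; // IP lookup found nothing, keep the event
  }
  return BLOCKED_COUNTRIES.indexOf(String(country)) !== -1;
}

function process(event) {
  if (isBlockedCountry(event.getGeo_country())) {
    // Assumption: a thrown exception rejects the event (bad rows).
    throw new Error('Event rejected: blocked geo_country');
  }
  return []; // no extra contexts to attach
}
```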
For a less permanent or retroactive application, you could try this script:
If you know your salt and the IP addresses, and push them through the same cryptographic hash function, you will get the same output as the Snowplow enrichment emits.
Then you can just filter by hashed user_ipaddress in SQL.
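As an illustration of the hashing idea, a small Node.js sketch follows. The choice of SHA-256 and the way the salt is combined with the value (appended here) are assumptions on my part, so check them against your own enrichment configuration. The IP and salt are made-up example values.

```javascript
const crypto = require('crypto');

// Assumptions: SHA-256, and the salt is appended to the value before hashing.
// Both must match whatever your enrichment is configured to do.
function hashIp(ip, salt) {
  return crypto.createHash('sha256').update(ip + salt, 'utf8').digest('hex');
}

// Hypothetical example values, for illustration only.
const hashed = hashIp('203.0.113.7', 'my-salt');
console.log(hashed); // hex digest to compare against user_ipaddress in SQL
```

If the digest does not match what you see in the warehouse, the hash function or the salt handling is most likely different from what is assumed here.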
Thanks @robkingston, but this is not what I’m looking for.
OK, so my original question was regarding filtering events.
So the JavaScript enrichment is great for that, but we’ve decided that it won’t be good to filter the events, and we prefer to mark them in some way instead.
Let’s say that we want to use a field that is not being used by us, e.g. ip_organization.
According to the example in the JavaScript enrichment documentation, I should return an array of objects with schema and data.
@moshesh, if I understood your goal correctly, you do not want to filter events by IP or geo_country but rather mark such events, using the field ip_organization for that purpose.
This can be achieved by mutating the field. No custom JSON schema (i.e. returned contexts) is required for that. Here’s how you could do it:
function process(event) {
  if (event.getUser_ipaddress() ...) { // set your condition
    event.setIp_organization(new String('TO BE FILTERED')); // mark it the way you want
  }
}
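For a filled-in version of the same idea, here is a sketch with one possible condition. The prefix list and the shouldMark helper are hypothetical, and returning an empty array (no extra contexts) is my assumption of what the enrichment expects when you only mutate the event.

```javascript
// Hypothetical IP prefixes whose events should be marked.
var MARKED_PREFIXES = ['203.0.113.', '198.51.100.'];

// Returns true when the IP starts with any of the configured prefixes.
function shouldMark(ip) {
  if (ip === null || ip === undefined) {
    return false;
  }
  var value = String(ip);
  for (var i = 0; i < MARKED_PREFIXES.length; i++) {
    if (value.indexOf(MARKED_PREFIXES[i]) === 0) {
      return true;
    }
  }
  return false;
}

function process(event) {
  if (shouldMark(event.getUser_ipaddress())) {
    // Mutate the otherwise unused field to flag the event.
    event.setIp_organization(new String('TO BE FILTERED'));
  }
  return []; // assumption: an empty array means no contexts are added
}
```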
Many thanks @ihor, this was really helpful.
I didn’t know that the event is mutable.
We don’t have the PII enrichment, but the IP anonymization enrichment.
At first it didn’t work because of this enrichment, but once I removed it everything worked as expected.
I added the IP anonymization logic to my JS script.
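For anyone following the same route, moving the anonymization into the script might look roughly like the sketch below. It handles IPv4 only, anonymizeIpv4 is a hypothetical helper, and the 'x' placeholder and the choice of two octets are assumptions meant to resemble the built-in enrichment rather than a copy of it.

```javascript
// Replaces the last `octets` octets of an IPv4 address with 'x'.
// Anything that is not a dotted IPv4 address is returned unchanged.
function anonymizeIpv4(ip, octets) {
  var parts = String(ip).split('.');
  if (parts.length !== 4) {
    return String(ip);
  }
  var n = Math.max(0, Math.min(4, octets));
  for (var i = 4 - n; i < 4; i++) {
    parts[i] = 'x';
  }
  return parts.join('.');
}

function process(event) {
  var ip = event.getUser_ipaddress();
  if (ip !== null && ip !== undefined) {
    // Do any checks that need the full IP before this line.
    event.setUser_ipaddress(new String(anonymizeIpv4(ip, 2)));
  }
  return []; // assumption: an empty array means no contexts are added
}
```

The point of doing it in this order is that the marking logic still sees the full address, and only the anonymized value is stored.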