Snowplow Collector Authentication

brandon.stanley · February 7, 2024, 9:50pm

Hi Snowplow Community,

I am reaching out to see if anyone has been able to implement authentication within their Snowplow Collector to prevent unauthenticated events from flowing through the system. For additional context, I am sending events via the Snowplow JS Tracker to my Snowplow Collector. Has anyone been able to add authentication/filtering before events are accepted and processed by the Snowplow Collector (perhaps via a Firewall or some other mechanism)?

Here are the reasons behind my ask:

I don’t want requests from outside approved domains (for example, *.example.com) to make it into the system. By default, the Snowplow Collector accepts traffic from anywhere and then forwards those events downstream.
I don’t want the Snowplow Collector to be susceptible to DoS attacks.

Thanks,

Brandon

davidher_mann · February 7, 2024, 11:09pm

Hi Brandon,
If a web application firewall (WAF) or CDN such as Akamai or Cloudflare is already in place on the website, it makes a lot of sense to route the Snowplow endpoint through it, because you can:

create filter rules (e.g. host, request path, ISP, country etc.)
circumvent Safari ITP
detect and block bots, or enrich the requests with additional headers to filter downstream (e.g. triggered WAF rules, proxy detection etc.)
set up DoS prevention
etc.
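
As a rough illustration of the kind of host-based filter rule mentioned in the list above, here is a minimal Python sketch of the matching logic. The function name `is_allowed_origin` and the `ALLOWED_DOMAINS` tuple are hypothetical; a real WAF or CDN expresses this in its own rule language, not in Python.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: each entry matches the domain itself and any subdomain.
ALLOWED_DOMAINS = ("example.com",)


def is_allowed_origin(origin, allowed_domains=ALLOWED_DOMAINS):
    """Return True if the Origin header value points at an allowed domain."""
    if not origin:
        return False
    host = urlsplit(origin).hostname  # None when the value has no scheme/host
    if host is None:
        return False
    host = host.lower().rstrip(".")
    # Match on a full label boundary so "evilexample.com" does not pass.
    return any(host == d or host.endswith("." + d) for d in allowed_domains)
```

A check like this only filters clients that send an honest `Origin` header. Browsers set it themselves, but a script or other non-browser client can put any value it likes there.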

We have this in place with Akamai incl. Akamai Bot Manager and I can definitely recommend it.

Edit: there was already a similar question with insightful answers: Snowplow JS Authentication - #6 by matus

brandon.stanley · February 9, 2024, 4:26am

Hey @davidher_mann,

Thanks for your response. I think a WAF is ideal for preventing DoS attacks, but I don’t see how it would prevent requests from outside approved domains from entering the system. Without some sort of secret, I don’t see how to stop a malicious user from fabricating a request that satisfies the WAF’s filter rules. I understand I could validate a token in the request during the Enrichment process (based on: Snowplow JS Authentication - #6 by matus), but I want to prevent the event from entering the Collector in the first place. This post from @mike explains the same problem:

Signing is likely to reduce users sending targeted data, but if your signing method executes client side, the secret has to be available on the client. If an attacker is determined enough, they can work out the secret and signing method and still send dummy data. As far as I’m aware there aren’t any analytics tools (or many other tools for that matter) that prevent request tampering. Data sent from the client is assumed to be untrusted by default, so folks who want to prevent tampering tend to move these events server side rather than relying on code that executes on the client.
Snowplow JS Authentication - #8 by mike
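
To make the server-side option in the quote concrete, here is a minimal Python sketch of HMAC signing, assuming the events are produced by a trusted backend that holds the secret. The names `sign_event` and `verify_event` and the shared secret are hypothetical and not part of any Snowplow API. Where the verification would run (a proxy in front of the Collector, or Enrichment) is left open.

```python
import hashlib
import hmac


def sign_event(payload: bytes, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 signature for the raw event payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_event(payload: bytes, signature: str, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_event(payload, secret)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

This only helps while the secret stays on the server. If the same `sign_event` logic ran in the browser, the secret would ship with the page, which is exactly the weakness the quote describes.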

Brandon
