Snowplow Collector Authentication

Hi Snowplow Community,

I am reaching out to see if anyone has been able to implement authentication within their Snowplow Collector to prevent unauthenticated events from flowing through the system. For additional context, I am sending events via the Snowplow JS Tracker to my Snowplow Collector. Has anyone been able to add authentication/filtering before events are accepted and processed by the Snowplow Collector (perhaps via a Firewall or some other mechanism)?

Here are the reasons behind my ask:

  1. I don’t want requests from outside approved domains (example: *.example.com) to make it into the system. By default, the Snowplow Collector accepts traffic from anywhere and forwards those events downstream.
  2. I don’t want the Snowplow Collector to be susceptible to DoS attacks.

Thanks,

Brandon

Hi Brandon,
If a web application firewall (WAF) or a CDN such as Akamai or Cloudflare is already in place on the website, it makes a lot of sense to route the Snowplow endpoint through the WAF/CDN, because you can:

  • create filter rules (e.g. host, request path, ISP, country etc.)
  • circumvent Safari ITP
  • detect and block bots, or enrich the requests with additional headers to filter downstream (e.g. triggered WAF rules, proxy detection etc.)
  • set up DoS prevention
  • etc.

We have this in place with Akamai, including Akamai Bot Manager, and I can definitely recommend it.
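
To make the "filter by host" idea concrete, here is a minimal sketch of the kind of check such a rule performs on the Origin header. It is not Akamai or Snowplow configuration; the function name `is_allowed_origin` and the `ALLOWED_DOMAINS` tuple are hypothetical. Keep in mind that a non-browser client can set Origin to anything, so this only filters honest browser traffic.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit

# Hypothetical allowlist: example.com itself and any subdomain of it.
ALLOWED_DOMAINS = ("example.com",)


def is_allowed_origin(origin: Optional[str],
                      allowed: Sequence[str] = ALLOWED_DOMAINS) -> bool:
    """Return True if the Origin header's host is an approved domain
    or a subdomain of one."""
    if not origin:
        return False
    try:
        host = urlsplit(origin).hostname  # None if the value has no host part
    except ValueError:
        return False  # malformed value, e.g. a broken IPv6 literal
    if host is None:
        return False
    host = host.rstrip(".")
    # Match on a full label boundary so "evilexample.com" is rejected.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Matching on ".example.com" rather than a bare suffix is the important detail: it keeps look-alike hosts such as evilexample.com or example.com.evil.net out.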

Edit: there was already a similar question with insightful answers: Snowplow JS Authentication - #6 by matus


Hey @davidher_mann,

Thanks for your response. I think a WAF is ideal for preventing DoS attacks, but I don’t see how it would prevent requests from outside approved domains from entering the system. Without some sort of secret, I don’t see how to stop a malicious user from fabricating a request that gets past the WAF’s filter rules. I understand I could authenticate a token within the request during the Enrichment process (based on: Snowplow JS Authentication - #6 by matus), but I want to prevent the event from entering the Collector in the first place. This post from @mike explains the same problem:

Signing is likely to reduce users sending targeted data, but if you have a signing method that executes client side then it necessitates having that secret available on the client. If an attacker is determined enough they can determine the secret and signing method and still send dummy data. As far as I’m aware there aren’t any analytics tools (or many other tools for that matter) that prevent request tampering. Data that is sent from the client is assumed to be untrusted by default, so folks that want to prevent tampering tend to move these events server side rather than relying on code that executes on the client.
Snowplow JS Authentication - #8 by mike

Brandon
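
For reference, the server-side approach mentioned in the quote could look roughly like the sketch below: the backend signs each event payload with a secret that never reaches the browser, and a proxy or gateway in front of the Collector rejects anything whose signature does not match. This is an assumption about how one might wire it up, not a Snowplow feature; `sign_event` and `verify_event` are hypothetical names.

```python
import hashlib
import hmac


def sign_event(payload: bytes, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the payload.
    Runs on the backend only, so the secret is never shipped to the client."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_event(payload: bytes, signature: str, secret: bytes) -> bool:
    """Recompute the signature and compare in constant time.
    Intended to run in a (hypothetical) gateway before the Collector."""
    expected = sign_event(payload, secret)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The same two functions in client-side JavaScript would not help, for exactly the reason given in the quote: the secret would have to be in the page.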