Filter/Discard Events At Collector

Hi Everyone,

I am trying to figure out whether there is any way to filter/discard unexpected events at the collector itself.
Is there a way to do it through the existing configs or by using a JS script?


@Jayant_Kumar No, the first place you can “discard” events is in Enrichment, and even there it's not possible to delete events; what you can do is redirect them to bad via the JS enrichment if you want.
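As a rough illustration of the "redirect to bad" approach, here is a minimal sketch of a JS enrichment script. It assumes the enrichment calls a `process(event)` function for each event, that the event exposes a `getUser_ipaddress()` getter, and that a thrown exception sends the event to bad. The `BLOCKED_IPS` list and its addresses are hypothetical placeholders, not something from this thread.

```javascript
// Hypothetical blocklist; these addresses are illustrative placeholders only.
var BLOCKED_IPS = ["203.0.113.7", "198.51.100.23"];

// Entry point that the JavaScript enrichment is assumed to call once per event.
function process(event) {
  var ip = event.getUser_ipaddress();
  if (ip !== null && BLOCKED_IPS.indexOf(String(ip)) !== -1) {
    // Throwing fails the event, so it is routed to bad instead of the enriched output.
    throw new Error("Blocked source IP: " + ip);
  }
  // Events that pass get no extra contexts attached.
  return [];
}
```

Note that this runs in Enrich, so a blocked event has still passed through the collector and the raw topic by the time it is rejected.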

Can you maybe elaborate more on the use-case you are trying to achieve with this?

Thank you @josh for the insight.

We occasionally see bursts of events coming from upstream. In some cases it's due to badly behaving applications, and in others it's some sort of spamming from random IPs.

Events flowing through the Kafka collector to the enrichment topics impact throughput, network, and storage costs.

As a workaround, we block or blacklist them at the ingestor, which is itself an in-house service.

Now that we are planning to use Snowplow infra, we are trying to set up something similar.

The easiest option is likely to invest in an upstream WAF / CDN / Proxy system where you can mitigate and handle that. On AWS you can integrate their WAF product directly with an Application Load Balancer to handle these sorts of patterns directly (and allow for blocklisting bad actor IP addresses).
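To make the idea concrete, here is a small sketch of the kind of per-IP rate limiting an upstream proxy could apply before traffic reaches the collector. This is not how AWS WAF is implemented; it is a plain fixed-window counter, and `WINDOW_MS`, `MAX_REQUESTS_PER_WINDOW`, and `allowRequest` are hypothetical names and values chosen for illustration.

```javascript
// Hypothetical limits, chosen only for illustration.
var WINDOW_MS = 60 * 1000;
var MAX_REQUESTS_PER_WINDOW = 1000;

// Per-IP counters for the current window, kept in memory.
var counters = new Map();

// Returns true if the request should be forwarded to the collector,
// false if the IP has exceeded its budget for the current window.
function allowRequest(ip, nowMs) {
  var windowStart = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
  var entry = counters.get(ip);
  if (!entry || entry.windowStart !== windowStart) {
    // First request from this IP in this window: start a fresh count.
    entry = { windowStart: windowStart, count: 0 };
    counters.set(ip, entry);
  }
  entry.count += 1;
  return entry.count <= MAX_REQUESTS_PER_WINDOW;
}
```

A proxy would call something like `allowRequest(clientIp, Date.now())` per request and reject when it returns false. The sketch never evicts stale entries, so a real deployment would need cleanup or, more likely, a managed WAF rule instead.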


I agree with this. Thank you @josh