On-premise Realtime Pipeline

Hi Snowplow Team,

It seems there’s a wide range of options to explore for setting up Snowplow. To be honest, we are trying to use an open source tech stack for our Snowplow experimentation. We would prefer to avoid relying on too many AWS services, since they add a lot of cost (though sadly Snowplow is designed to work in the cloud). I would also like to know whether it is already possible to run Snowplow as a fully on-premise real-time pipeline (or batch as well?). I came across the Discourse thread "On-premise Snowplow Realtime Pipeline with Spark Streaming Enrich" and would like to know if this is achievable, given that we would like to use a SQL or MySQL database for long-term storage.

I am not actually a dev, so I won’t be able to develop this from scratch. If anyone can share which specific open source tech (hopefully supported by Snowplow) you used for this type of use case, that would be super helpful! Thanks in advance.

This thread was re-posted as:

Please don’t re-post threads @jenfiner! It makes the forums much less friendly to navigate.