Snowflake DB & other storage targets

Hello @lehneres,

First of all, thanks for kicking off this discussion. The approach you described matches our vision of how data should be stored and processed quite precisely. Snowplow was never about a single data warehouse: we’re empowering smart people to do more with their data, and we don’t want them to be restricted in how they analyze it.

Redshift is a great warehouse, and it continues to serve our users very well as the most stable solution. But while working on Redshift improvements and on Snowplow in general, we developed a very generic and flexible framework on top of Iglu that allows us to load data into each storage target in a way that suits that particular target. Snowflake was just another step in this direction; for example, it shaped our load-tracking approach, which is going to be back-ported to Redshift soon.

You can think of the Iglu + Loaders platform as a kind of high-level, analytics-friendly generalization of the AWS Glue approach. “Generalization” here implies that the technologies can embrace each other and do not necessarily have to compete. What is also very important is that, unlike the AWS Glue approach, which is coupled to an S3 data lake, our approach is cloud-provider-independent. Our next big goal, as @travisdevitt mentioned, is Google BigQuery (as per the RFC). So stay tuned, and as always, new ideas and proposals are very welcome.