Snowplow and ClickHouse

Will you be doing an integration with ClickHouse?

How do you think I should store my custom events?
In one wide table or in many tables?

Hi @Antony_Aleksandrov,

We did some experiments with ClickHouse back in 2021. However, right now, as far as data destinations are concerned, we are focusing on 2 big things:

  • Making the pipeline cloud-agnostic, so that we could run it in other clouds (e.g. Azure)
  • The next generation of data lake loaders, which would play better with data lake tooling, compared to our existing S3 and GCS loaders

Unfortunately, this means that we had to put the ClickHouse support on hold :frowning:

Regarding your second question, we find the single-table approach better, as it avoids joins. Just keep in mind that for Redshift the “wide row” option is not available yet.
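
To make the trade-off concrete, here is a minimal sketch of the single wide table idea, assuming a local ClickHouse instance and the third-party `clickhouse-driver` Python package. The table and column names (`atomic_events`, `checkout_step`, `checkout_value`) are hypothetical, loosely inspired by Snowplow's atomic events layout, not an official schema:

```python
# Minimal sketch: one wide table for standard + custom event fields,
# assuming ClickHouse on localhost and the `clickhouse-driver` package.
from clickhouse_driver import Client

client = Client(host="localhost")

# Standard event fields plus typed columns for a hypothetical custom
# "checkout" event, flattened into the same table so queries need no joins.
client.execute("""
    CREATE TABLE IF NOT EXISTS atomic_events (
        event_id            UUID,
        collector_tstamp    DateTime,
        event_name          LowCardinality(String),
        user_id             Nullable(String),
        -- hypothetical custom event, flattened into typed columns
        checkout_step       Nullable(UInt8),
        checkout_value      Nullable(Decimal(10, 2))
    )
    ENGINE = MergeTree
    ORDER BY (event_name, collector_tstamp)
""")

# No join needed: filter on shared and custom fields in a single pass.
rows = client.execute("""
    SELECT event_id, checkout_value
    FROM atomic_events
    WHERE event_name = 'checkout' AND checkout_step = 3
""")
```

The multi-table alternative would put `checkout_step` and `checkout_value` in a separate table keyed by `event_id`, at the cost of a join on every query that touches both standard and custom fields.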

@stanch Would you be able to share some more details on the new-generation data lake tooling support?

I think that’s something I’m currently hunting for. I feel it’s better to let the data lake tooling deal with storage and processing, rather than adding tightly coupled support for each tool, since that ecosystem changes rapidly.

Are we already working on it? Any pointers on what’s coming next and when?

Hi @Jayant_Kumar,

This was referring to the Lake Loader, which was not released at the time but is now. It’s the same app discussed in your other threads.


Hey @stanch, we should have the biglake-loader Docker image available on Docker Hub. It would help users test it and share feedback.