Hi there, welcome to the community!
This package works out of the box only if the data in Snowflake was loaded using the Snowflake loader.
This will be quite a big challenge. The loader does a lot of work in formatting the data to be more usable, and the models are built for that input.
- Does it make sense to take this approach of building these sources myself from these contexts columns and make the package work?
Honestly, it doesn’t really make sense to me.
It is possible to do, but it requires you to manually implement in SQL what the open source Snowflake loader already does in Scala.
It's definitely less efficient to do it in SQL, and since Snowflake charges for compute, you're paying extra to do it as well.
Seems like a lot of work to be honest.
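To give a flavour of what that work involves, here is a minimal, hypothetical sketch of pulling one context out of a raw self-describing `contexts` JSON column in Snowflake. The table name, column names and context schema are assumptions for illustration only; the loader does this (and much more) automatically, for every context and self-describing event:

```sql
-- Minimal sketch (table, column and schema names are assumptions):
-- extract the web_page context from a raw self-describing JSON column.
select
    event_id,
    collector_tstamp,
    ctx.value:data as web_page_context   -- the context payload itself
from my_schema.raw_enriched_events,
    lateral flatten(input => parse_json(contexts):data) ctx
where ctx.value:schema::string
    like 'iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/%';
```

And that is just one context on one table - you would have to repeat, maintain and pay for that kind of transformation for everything the package reads.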
Regarding your actual plan for working with the data - I think this ticket outlines the task and the issues you might encounter. We also have a blog post on the topic due to be released soon (in the next couple of days, I think, so good timing!).
- The only thing I don't understand is that all of these sources have the following columns, and I have no idea where they come from:
root_id
root_tstamp
ref_root
ref_tree
ref_parent
These are all fields that are set by the Redshift loader. If they're in your data then you've got yet more challenges - the formats of the data in Redshift and in Snowflake are very different, because those databases are very different.
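For reference, those are the Redshift loader's shredding keys: each context or self-describing event lands in its own table and joins back to atomic.events on root_id = event_id (and root_tstamp = collector_tstamp). So to mimic what the Snowflake loader would have given you, you would also need to re-nest each of those tables into a single column, something like this rough, hypothetical sketch (table and column names are illustrative):

```sql
-- Rough sketch (table and column names are assumptions): re-nest a
-- Redshift-shredded context table into a Snowflake-loader-style column.
with web_page as (
    select
        root_id,
        root_tstamp,
        array_agg(object_construct('id', id)) as contexts_com_snowplowanalytics_snowplow_web_page_1
    from atomic.com_snowplowanalytics_snowplow_web_page_1
    group by root_id, root_tstamp
)

select
    e.*,
    wp.contexts_com_snowplowanalytics_snowplow_web_page_1
from atomic.events e
left join web_page wp
    on wp.root_id = e.event_id
    and wp.root_tstamp = e.collector_tstamp
```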
This makes the request quite confusing to be honest… It seems like the proposed architecture is:
Enriched stream → Redshift Loader → Redshift → Snowpipe → Snowflake → Heavy SQL processing to transform the data → Data model
Whereas the architecture only needs to be:
Enriched stream → Snowflake Loader → Snowflake → Data model