dbt-labs/snowplow to snowplow/snowplow_web migration

Hey folks,

I have a question about the Snowplow dbt package.

Does anyone have experience migrating from the dbt package dbt-labs/snowplow to snowplow/snowplow_web? Eager to hear other people’s experiences.

The two packages are quite different, and we have a requirement to migrate from one to the other. It doesn’t look straightforward, so I’m looking for any tips.

Thanks!

:wave: Hey @Radovan_Bacovic!

My own opinion is that you’re best off just doing a one-off full run of the model (or at least going as far back in history as will be useful to you; there is a variable to choose a start date).

Trying to migrate data from the old tables to the new ones will likely turn out to be a lot more pain than it’s worth, basically.

Back when dbt recommended that old package as their primary Snowplow one, they actually used to recommend a weekly full recompute, so the cost of doing this once (and not needing it again) with the new package is the same as the cost of each of those weekly recomputes. (Don’t quote me though, I might be misremembering the context of a years-old conversation.)

The main reason I don’t think it’s worth it, however, is that there are flaws in how the old one incrementalises, which makes me question the value of trying to preserve its output. (How impacted you are by that issue is something you can only really ascertain by looking at the two side by side.)
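For the side-by-side check, here is a minimal Python sketch of one way to do it, assuming you have already pulled daily row counts (sessions, say) from the old and the new tables into two dictionaries. The function name, the 1% tolerance, and the sample numbers are all made up for illustration; nothing here comes from either package.

```python
def compare_daily_counts(old_counts, new_counts, tolerance=0.01):
    """Return the days whose counts differ by more than `tolerance` (relative)."""
    mismatches = {}
    for day in sorted(set(old_counts) | set(new_counts)):
        old = old_counts.get(day, 0)
        new = new_counts.get(day, 0)
        baseline = max(old, new)
        if baseline == 0:
            continue  # nothing recorded on either side
        relative_diff = abs(old - new) / baseline
        if relative_diff > tolerance:
            mismatches[day] = (old, new, round(relative_diff, 4))
    return mismatches


# Illustrative numbers only, not real measurements.
old_counts = {"2023-01-01": 1000, "2023-01-02": 1200, "2023-01-03": 900}
new_counts = {"2023-01-01": 1000, "2023-01-02": 1100, "2023-01-03": 905}

mismatches = compare_daily_counts(old_counts, new_counts)
print(mismatches)
```

Days that show up in the result are the ones worth inspecting by hand before deciding how much the old incremental logic has affected you.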

Migrating any downstream reports is likely to be easier than you would expect, however. Much of the same information is available in the new tables, just a lot of column names will have changed. It’ll likely be an annoying toil task, but it likely won’t have you pulling your hair out too much.
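To take some of the toil out of the renames, one option is to keep a single old-to-new column mapping and generate a compatibility view from it, so existing reports keep reading the old names while you migrate them gradually. The sketch below is only an illustration: the column, table, and view names are placeholders (not a verified mapping between the two packages), and it assumes your warehouse accepts `create or replace view`.

```python
# Hypothetical mapping: old column name -> new column name.
# Replace these placeholders with the result of your own mapping exercise.
COLUMN_MAP = {
    "old_session_id": "new_session_id",
    "old_user_id": "new_user_id",
    "old_started_at": "new_started_at",
}


def compat_view_sql(view_name: str, source_table: str, column_map: dict) -> str:
    """Build a view that exposes the new table's columns under their old names."""
    select_list = ",\n".join(
        f"    {new} as {old}" for old, new in column_map.items()
    )
    return (
        f"create or replace view {view_name} as\n"
        f"select\n{select_list}\n"
        f"from {source_table}"
    )


# Placeholder object names, chosen for illustration.
sql = compat_view_sql("reporting.sessions_compat", "derived.new_sessions", COLUMN_MAP)
print(sql)
```

Old columns with no counterpart in the new tables simply have no entry in the mapping, which also gives you a list of what downstream reports would lose.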

That’s my perspective anyway, hope it helps!


Hey @Colm

Such a remarkable exposé.

For the most part, I discovered the same things as you: it is not feasible to do a “lift and shift” replacement, so we should be more tactical here.

> Trying to migrate data from the old tables to the new ones will likely turn out to be a lot more pain than it’s worth, basically.

100%. I explored that part a bit and reached the same conclusion.

> Back when dbt recommended that old package as their primary snowplow one, they actually used to recommend a weekly full recompute

Yep, this is something we should consider seriously, as the data set is quite large.

Generally speaking, we should do a proper impact analysis for the package replacement, including a column-mapping exercise, as the business partners could lose some information in the migration.

I will do a deeper dive into the details here and, hopefully, once I am done with the migration, I will make our migration plan public to make other people’s lives easier :slight_smile:

Thanks for your insights, highly appreciated.

Radovan
