We’ve just released the first release candidates for the upcoming 0.14.0 versions of both our web and utils packages. These versions contain one major change: we are moving from our custom snowplow_incremental materialization to an overwritten version of the standard incremental materialization.
To make sure your models are optimized, you need to add the dispatch configuration below to your dbt_project.yml. This ensures that dbt uses our version of the merge SQL rather than the default. Without it, you will not see the optimized upsert that our materialization has previously provided.
# dbt_project.yml
...
dispatch:
  - macro_namespace: dbt
    search_order: ['snowplow_utils', 'dbt']
Outside of this, we do not expect users to notice performance changes when running our packages. If you do see a sizeable increase in run time, please revert to an earlier version and then comment on this post or raise an issue in the GitHub repo.
We’d love to get your feedback on these changes, including any issues you find. As this is a pre-release, some features may change in upcoming versions. If you have custom models you need to migrate, please see here for how to do so.
Installing the pre-release
To install a pre-release package, you need to specify the exact version number of the package. For example, to test this version for web you should have the following in your packages.yml file:
packages:
  - package: snowplow/snowplow_web
    version: 0.14.0-rc1
Reason for the change
Years ago, when our snowplow_incremental materialization was built, the standard dbt incremental method was not well optimised and did not support injecting custom code in any way. Our materialization ensured that the destination table scan was limited to just the data that needed updating, saving costs for our users. However, maintaining a materialization for 4 warehouses was not easy, and the complexity meant that we did not add newer features such as the on_schema_change option. We also know that adding a new warehouse (e.g. Azure) would be a large amount of work.
With the release of dbt-core 1.4, incremental_predicates were added to the incremental materialization. This allows us to more easily inject the date range filters we need to optimize the upserts. Unfortunately the feature isn’t perfect, and due to a complicated interaction between compile-time and run-time configs, the dispatch code above needs to be added. We believe this is a fair trade-off to gain access to the newer features and to simplify our package.
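For readers unfamiliar with the feature, here is a minimal sketch of how incremental_predicates can be set on a standard dbt incremental model (dbt-core 1.4 or later). It illustrates the generic dbt config only and is not necessarily the configuration our packages use internally. The file name, the model name my_sessions_model, and the columns session_id and start_tstamp are hypothetical, and the dateadd syntax assumes a Snowflake-style warehouse.

```yaml
# models/schema.yml (hypothetical file and model names)
version: 2

models:
  - name: my_sessions_model          # hypothetical model
    config:
      materialized: incremental
      incremental_strategy: merge
      unique_key: session_id         # hypothetical column
      # Each predicate is added to the merge condition, so the warehouse
      # only needs to scan the recent part of the destination table.
      # DBT_INTERNAL_DEST is dbt's alias for the existing (target) table.
      incremental_predicates:
        - "DBT_INTERNAL_DEST.start_tstamp > dateadd(day, -7, current_date)"
```

With a filter like this, destination rows older than seven days are never matched by the merge, so it only suits models where updates to existing rows fall inside that window.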
Roadmap
We expect to have a second release candidate in mid to late March. This will mostly contain internal changes to our integration tests, as well as some changes to macros such as snowplow_is_incremental that are no longer required.
A full release is expected in late March or early April, assuming things go smoothly.
We expect all our packages will be migrated to the new materialization approach by the start of May, and the old items officially removed at a later date.
Snowplow utils 0.14.0-rc1
Summary
This is a pre-release version of the package. We believe it to be in working condition, but you may encounter bugs and some features may change before the final release.
This version of the package begins the migration away from our snowplow_incremental materialization and instead provides an overwrite to the standard incremental materialization, giving the same performance improvements in a simpler way. We expect users to see little to no performance change from the previous version. Please let us know if you see performance degradation for large volumes of data.
Users will need to add the following to their dbt_project.yml to benefit from the enhancements:
# dbt_project.yml
...
dispatch:
  - macro_namespace: dbt
    search_order: ['snowplow_utils', 'dbt']
For custom models and more details, please see our temporary docs page: Snowplow Materialization (Pre-Release) | Snowplow Documentation
Features
Deprecated old materialization
Add get_merge_sql for materialization
Fix a broken GitHub action for our GitHub pages
Installing
To install this version, use the following in your packages.yml file:
packages:
  - package: snowplow/snowplow_utils
    version: 0.14.0-rc1
Snowplow web 0.14.0-rc1
Summary
This is a pre-release version of the package. We believe it to be in working condition, but you may encounter bugs and some features may change before the final release.
This version of the package begins the migration away from our snowplow_incremental materialization and uses an overwrite to the standard incremental materialization, giving the same performance improvements in a simpler way. We expect users to see little to no performance change from the previous version. Please let us know if you see performance degradation for large volumes of data.
Users will need to add the following to their dbt_project.yml to benefit from the enhancements:
# dbt_project.yml
...
dispatch:
  - macro_namespace: dbt
    search_order: ['snowplow_utils', 'dbt']
For custom models and more details, please see our temporary docs page: Snowplow Materialization (Pre-Release) | Snowplow Documentation
Features
Use new materialization
Installing
To install this version, use the following in your packages.yml file:
packages:
  - package: snowplow/snowplow_web
    version: 0.14.0-rc1