We are very happy to announce the release of the snowplow-web v.0.4.0 dbt package.
This release brings support for the Postgres adapter (big thanks to @zloff!), a user mapping module and improved filtering of problematic long sessions.
Features
- Postgres adapter support (#45)
- A user mapping module, mapping between `domain_userid` and the latest `user_id`. This mapping is applied to the sessions table to produce a `stitched_user_id` (#42)
Improvements
- Improved filtering of problematic long sessions, reducing table scans and the chance of duplicate sessions (#41)
- All manifest tables are now created using dbt models rather than DDL. This allows these tables to be dropped using dbt's `full-refresh` flag as well as appear in the lineage graph (#39)
- Improved Redshift event dedupe logic (#33)
Fixes
- Fix cluster_by_fields macros to allow overriding (#35)
Under the hood
Breaking changes
- The user mapping module changes the schema of the sessions table by adding the `stitched_user_id` column.
- The `snowplow_delete_from_manifest` macro has been replaced with `snowplow_web_delete_from_manifest`. This should only affect users of the package running this macro as an operation. Refer to the README for more information.
- The `snowplow__manifest_custom_schema` var has been deprecated. The schema for all manifest tables is now set directly in the `dbt_project.yml` file. If you had previously set a custom manifest schema, you will need to update your `dbt_project.yml` file to reflect this. Please refer to the Output Schema section of the docs for more info.
- The mechanism to tear down all the manifest tables and start afresh has changed. This can now be achieved by using the native dbt `full-refresh` flag when running the manifest tables, rather than using the now deprecated `teardown_all` var. Note that, due to their critical nature, the manifest tables are protected from accidental full refreshes in production. Please refer to the Manifest Tables section for more details.
- BigQuery only: This release imports `snowplow-utils v0.4.0`, which introduced a breaking change to the `combine_column_versions` macro. If you use this macro for modelling, please update accordingly.
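As a rough illustration of the manifest schema change described above, a custom manifest schema might now be set in `dbt_project.yml` along these lines. This is a sketch only: the model path (`base` then `manifest`) and the schema name `my_manifest_schema` are assumptions for illustration, so check the Output Schema section of the docs for the exact structure.

```yaml
# dbt_project.yml (fragment)
# Assumption: the manifest models live under snowplow_web > base > manifest.
# "my_manifest_schema" is a placeholder name, not a package default.
models:
  snowplow_web:
    base:
      manifest:
        +schema: my_manifest_schema
```

If you never set `snowplow__manifest_custom_schema`, no change of this kind should be needed.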
Upgrading
To upgrade, bump the version of the package in your `packages.yml` file.
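For example, the entry might look like the following, assuming the package is installed from the dbt package hub under the name `snowplow/snowplow_web`. If you install it another way (for instance from Git), your entry will differ.

```yaml
# packages.yml
# Assumption: hub-style install; adjust if you pin a version range instead.
packages:
  - package: snowplow/snowplow_web
    version: 0.4.0
```

After editing the file, running `dbt deps` fetches the new version.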
This release contains breaking changes, namely to the sessions table schema. As a result, you will be required to do a full refresh of the package:
dbt run --models snowplow_web --full-refresh --vars 'snowplow__allow_refresh: true'