We are very excited to announce the release of the snowplow-mobile v0.1.0 dbt package. This package replicates the standard, sql-runner-based mobile model in dbt.
This package only supports Redshift and Postgres. BigQuery and Snowflake support will follow in later releases.
What the package brings
- Transforms and aggregates raw mobile event data collected from the Snowplow iOS tracker or Android tracker (up to v3) into a set of derived tables: screen views, sessions, users, and optionally app errors.
- Processes all mobile events incrementally. It is not constrained to screen view events: any custom events you are tracking will also be incrementally processed.
- Is designed in a modular manner, allowing you to easily integrate your own custom SQL into the incremental framework provided by the package.
- Includes a full suite of tests to ensure data integrity.
- Includes a comprehensive data dictionary.
Modules
This package consists of a series of modules, each producing a table which serves as the input to the next module.
The ‘standard’ modules are:
- Base: Performs the incremental logic, outputting the table snowplow_mobile_base_events_this_run, which contains a de-duped data set of all events required for the current run of the model.
- Screen Views: Aggregates event level data to a screen view level, screen_view_id.
- Sessions: Aggregates screen view level data to a session level, session_id.
- Users: Aggregates session level data to a users level, device_user_id.
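As a rough illustration of how each module's output feeds the next, the chain can be sketched in plain Python. This is not the package's SQL: the sample events, the aggregate helper, and the chosen metrics (simple row counts) are assumptions made for the example. Only the keys screen_view_id, session_id and device_user_id come from the module descriptions above.

```python
from collections import defaultdict

# Hypothetical, already de-duped events standing in for the base module output.
events_this_run = [
    {"event_id": "e1", "screen_view_id": "sv1", "session_id": "s1", "device_user_id": "u1"},
    {"event_id": "e2", "screen_view_id": "sv1", "session_id": "s1", "device_user_id": "u1"},
    {"event_id": "e3", "screen_view_id": "sv2", "session_id": "s1", "device_user_id": "u1"},
    {"event_id": "e4", "screen_view_id": "sv3", "session_id": "s2", "device_user_id": "u1"},
]


def aggregate(rows, key, carry, count_name):
    """Group rows by `key`, keep the `carry` columns, and count rows per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    out = []
    for group_key, members in groups.items():
        record = {key: group_key, count_name: len(members)}
        for column in carry:
            # Assumes the carried column is constant within a group.
            record[column] = members[0][column]
        out.append(record)
    return out


# Each step consumes the table produced by the previous one.
screen_views = aggregate(events_this_run, "screen_view_id", ["session_id", "device_user_id"], "events")
sessions = aggregate(screen_views, "session_id", ["device_user_id"], "screen_views")
users = aggregate(sessions, "device_user_id", [], "sessions")
```

The point of the sketch is the dependency order: users are built from sessions, sessions from screen views, and screen views from the events selected by the base module.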
Incremental framework
The majority of the incremental logic sits within the base module and performs the ‘heavy-lifting’ for you. The logic is as follows:
- Identify new or late arriving events since the last run of the package.
- Identify the session_id associated with these new/late events.
- Reprocess all events associated with the session_id. This ensures when aggregating to a session level we have all the events associated with the session.
This de-duped dataset is then written to an events_this_run table, containing all the required events for the given run of the mobile model.
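The three steps above can be sketched in Python to show what ends up in the events_this_run table. This is a simplified illustration under stated assumptions, not the package's implementation: it assumes new events are found by comparing collector_tstamp against a single last-run timestamp, and that duplicates are removed by keeping the first row per event_id. The function name build_events_this_run and the sample data are hypothetical.

```python
from datetime import datetime

# Hypothetical raw events table; e2 is duplicated and e4 arrived after the last run.
raw_events = [
    {"event_id": "e1", "session_id": "s1", "collector_tstamp": datetime(2021, 6, 1, 10, 0)},
    {"event_id": "e2", "session_id": "s1", "collector_tstamp": datetime(2021, 6, 1, 10, 5)},
    {"event_id": "e2", "session_id": "s1", "collector_tstamp": datetime(2021, 6, 1, 10, 5)},
    {"event_id": "e3", "session_id": "s2", "collector_tstamp": datetime(2021, 6, 1, 11, 0)},
    {"event_id": "e4", "session_id": "s1", "collector_tstamp": datetime(2021, 6, 2, 9, 0)},
]


def build_events_this_run(events, last_run_tstamp):
    # Step 1: events that are new or late arriving since the last run (assumed rule).
    new_events = [e for e in events if e["collector_tstamp"] > last_run_tstamp]
    # Step 2: the sessions those events belong to.
    session_ids = {e["session_id"] for e in new_events}
    # Step 3: every event of those sessions, de-duped on event_id (first row wins).
    seen = set()
    this_run = []
    for e in events:
        if e["session_id"] in session_ids and e["event_id"] not in seen:
            seen.add(e["event_id"])
            this_run.append(e)
    return this_run


events_this_run = build_events_this_run(raw_events, last_run_tstamp=datetime(2021, 6, 1, 12, 0))
```

Here only e4 is new, but because it belongs to session s1, the earlier events e1 and e2 of that session are pulled back in, while the untouched session s2 is left out of the run.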
Customizations
The events_this_run table removes complexity when adding your own customizations. You can now write drop-and-recompute style SQL using the events_this_run table as a source, without having to worry about which events to select.
Furthermore, this reduces cost and improves performance. Since the events_this_run table is shared between the standard modules and your customizations, there is no need to query the raw events table multiple times. For more information on writing custom SQL, please refer to the docs. An example dbt project demonstrating customizations can also be found within the repo.
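The cost argument can be illustrated with a small Python sketch in which a standard aggregate and a custom aggregate both read the same events_this_run data, so the raw events are scanned only once. Everything here is hypothetical and for illustration only: the function names, the scan counter, and the add_to_basket custom event are assumptions, and real customizations would be written as SQL models within the package's framework.

```python
from collections import Counter

raw_table_scans = 0  # counts how often the (hypothetical) raw events table is read


def read_events_this_run():
    """Stand-in for the base module: the only place that touches raw events."""
    global raw_table_scans
    raw_table_scans += 1
    return [
        {"event_id": "e1", "session_id": "s1", "event_name": "screen_view"},
        {"event_id": "e2", "session_id": "s1", "event_name": "add_to_basket"},
        {"event_id": "e3", "session_id": "s2", "event_name": "screen_view"},
    ]


def standard_sessions(events):
    """Standard-style aggregate: number of events per session."""
    return dict(Counter(e["session_id"] for e in events))


def custom_basket_adds(events):
    """Custom aggregate, recomputed from scratch each run: basket adds per session."""
    return dict(Counter(e["session_id"] for e in events if e["event_name"] == "add_to_basket"))


events_this_run = read_events_this_run()           # raw table read once
sessions = standard_sessions(events_this_run)      # standard module input
basket_adds = custom_basket_adds(events_this_run)  # custom module reuses the same input
```

Because the custom aggregate is recomputed in full from events_this_run on every run, it never has to decide which raw events are in scope; that selection has already been made upstream.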
More information
Check out the mobile data model section of Snowplow Docs for more information on the model's structure.
Check out the snowplow-mobile package docs for a quickstart guide as well as an explanation of operating and configuring the package.