Ok, so I’m set up in dbtCloud for data modelling - and have done some very primitive models myself but I wanted to integrate the Snowplow dbt packages because I thought it would just make everything easier.
Here’s my set up; and here’s the problem I’m having. I’m probably missing something, but when the run takes 1h 40m to fail, its really hard to debug efficiently.
So I have a Development environment set up; using dbt v1.0. It uses the development
custom schema prefix.
In dbt_project.yml
I have
# Configuring packages
# https://docs.snowplowanalytics.com/docs/modeling-your-data/the-snowplow-web-data-model/dbt-web-data-model/
# https://snowplow.github.io/dbt-snowplow-web/#!/overview/snowplow_web
# https://snowplow.github.io/dbt-snowplow-mobile/#!/overview/snowplow_mobile
vars:
snowplow_web:
snowplow__enable_iab: false
snowplow__enable_ua: true
snowplow__enable_yauaa: true
snowplow__start_date: '2021-07-06' # First events in the pipeline
snowplow__min_visit_length: 10 # 5 is default value (secs)
snowplow__heartbeat: 10 # Default value (secs)
snowplow_mobile:
snowplow__enable_mobile_context: true
snowplow__enable_geolocation_context: false
snowplow__enable_application_context: true
snowplow__enable_screen_context: true
snowplow__start_date: '2022-01-12'
In packages.yml
I have
packages:
- package: snowplow/snowplow_utils
version: 0.6.0
- package: snowplow/snowplow_web
version: 0.5.1
- package: snowplow/snowplow_mobile
version: 0.2.0
I’ve run dbt deps. Everything is installed fine.
When I do dbt run - it fails. Consistently on this part today (yesterday my dbtCloud IDE session expired before I could grab the error). So I’ve done two runs.
This is from cold start; so it went back to fill in events. From the first time period. It’s only 823 by the looks of it (just the first month of test data); so it’s not like there’s a million events killing it.
In snowplow_web_base_sessions_this_run:
2022-03-03T12:47:35.912779Z: 12:47:35 SQL status: SELECT 823 in 1.27 seconds
2022-03-03T12:47:35.915901Z: 12:47:35 Using postgres connection "model.snowplow_web.snowplow_web_base_sessions_this_run"
2022-03-03T12:47:35.916033Z: 12:47:35 On model.snowplow_web.snowplow_web_base_sessions_this_run: /* {"app": "dbt", "dbt_version": "1.0.3", "profile_name": "user", "target_name": "default", "node_id": "model.snowplow_web.snowplow_web_base_sessions_this_run"} */
alter table "snowplow"."development_scratch"."snowplow_web_base_sessions_this_run" rename to "snowplow_web_base_sessions_this_run__dbt_backup"
This is where the run fails. But it doesn’t give me any information as to why!
NOTE: I don’t think this has anything to do with DB permissions, because the mobile equivalent runs and succeeds fine
2022-03-03T12:49:24.393795Z: 12:49:24 SQL status: SELECT 119 in 95.11 seconds
2022-03-03T12:49:24.397172Z: 12:49:24 Using postgres connection "model.snowplow_mobile.snowplow_mobile_base_events_this_run"
2022-03-03T12:49:24.397323Z: 12:49:24 On model.snowplow_mobile.snowplow_mobile_base_events_this_run: /* {"app": "dbt", "dbt_version": "1.0.3", "profile_name": "user", "target_name": "default", "node_id": "model.snowplow_mobile.snowplow_mobile_base_events_this_run"} */
alter table "snowplow"."development_scratch"."snowplow_mobile_base_events_this_run" rename to "snowplow_mobile_base_events_this_run__dbt_backup"
2022-03-03T12:49:25.069136Z: 12:49:25 SQL status: ALTER TABLE in 0.67 seconds
2022-03-03T12:49:25.071897Z: 12:49:25 Using postgres connection "model.snowplow_mobile.snowplow_mobile_base_events_this_run"
2022-03-03T12:49:25.072030Z: 12:49:25 On model.snowplow_mobile.snowplow_mobile_base_events_this_run: /* {"app": "dbt", "dbt_version": "1.0.3", "profile_name": "user", "target_name": "default", "node_id": "model.snowplow_mobile.snowplow_mobile_base_events_this_run"} */
alter table "snowplow"."development_scratch"."snowplow_mobile_base_events_this_run__dbt_tmp" rename to "snowplow_mobile_base_events_this_run"
2022-03-03T12:49:25.205347Z: 12:49:25 SQL status: ALTER TABLE in 0.13 seconds
2022-03-03T12:49:25.206725Z: 12:49:25 On model.snowplow_mobile.snowplow_mobile_base_events_this_run: COMMIT
2022-03-03T12:49:25.206861Z: 12:49:25 Using postgres connection "model.snowplow_mobile.snowplow_mobile_base_events_this_run"
2022-03-03T12:49:25.206938Z: 12:49:25 On model.snowplow_mobile.snowplow_mobile_base_events_this_run: COMMIT
2022-03-03T12:49:25.407884Z: 12:49:25 SQL status: COMMIT in 0.2 seconds
2022-03-03T12:49:25.410228Z: 12:49:25 Using postgres connection "model.snowplow_mobile.snowplow_mobile_base_events_this_run"
2022-03-03T12:49:25.410361Z: 12:49:25 On model.snowplow_mobile.snowplow_mobile_base_events_this_run: /* {"app": "dbt", "dbt_version": "1.0.3", "profile_name": "user", "target_name": "default", "node_id": "model.snowplow_mobile.snowplow_mobile_base_events_this_run"} */
drop table if exists "snowplow"."development_scratch"."snowplow_mobile_base_events_this_run__dbt_backup" cascade
2022-03-03T12:49:26.019316Z: 12:49:26 SQL status: DROP TABLE in 0.61 seconds
2022-03-03T12:49:26.020650Z: 12:49:26 finished collecting timing info
2022-03-03T12:49:26.020830Z: 12:49:26 On model.snowplow_mobile.snowplow_mobile_base_events_this_run: Close
2022-03-03T12:49:26.021525Z: 12:49:26 Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '02046cee-86db-4d21-8607-229a840ae7d8', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f22f73640a0>]}
2022-03-03T12:49:26.021893Z: 12:49:26 18 of 48 OK created table model development_scratch.snowplow_mobile_base_events_this_run [SELECT 119e in 97.49s]
2022-03-03T12:49:26.022011Z: 12:49:26 Finished running node model.snowplow_mobile.snowplow_mobile_base_events_this_run