dbt snowplow_web always starts from `start_date`

snowplow_web 0.16.2 with dbt 1.7.3 runs successfully, but every subsequent run starts from `start_date` again and processes the same data.

I checked, and the `snowplow_web_incremental_manifest` table is empty after `dbt run`. Other tables in the same schema, such as `snowplow_web_base_sessions_lifecycle_manifest` and `snowplow_web_base_quarantined_sessions`, are populated with data as expected. I am using Postgres v15 as the backend database.

This is the relevant part of my dbt_project.yml configuration:

  - macro_namespace: dbt
    search_order: ['snowplow_utils', 'dbt']

  name_tracker: 'prod'
  snowplow__enable_load_tstamp: false
  snowplow__start_date: "2023-06-01"
  snowplow__enable_iab: false
  snowplow__enable_ua: true
  snowplow__enable_yauaa: true
  snowplow__allow_refresh: true

I run dbt with `dbt run --selector snowplow_web`.

Thank you.

Hi @ondraz

Can you post the output of your dbt run please? It sounds like the manifest table post hook is not completing correctly.

Do you have any custom models in your project, or any model configs set in your project yaml? And just to check, did you copy out the selectors file? Did it work on a previous version, or is this your first time trying to use the package?

Thank you, I just got it working finally.

I added `snowplow__backfill_limit_days: 7`. This caused `snowplow_web_incremental_manifest` to be populated correctly, but it was still overwritten on each run with data starting from `start_date`.

I had to remove snowplow__allow_refresh: true from dbt_project.yml to get incremental runs.
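
For anyone hitting the same problem, here is a rough sketch of the variables I ended up with. The top-level `vars:` key is an assumption about where these settings sit (the nesting is not visible in the snippet from my first post), so adjust it to match your own project layout.

```yaml
# dbt_project.yml (excerpt). The `vars:` nesting is assumed, not copied from the original file.
vars:
  snowplow__enable_load_tstamp: false
  snowplow__start_date: "2023-06-01"
  snowplow__enable_iab: false
  snowplow__enable_ua: true
  snowplow__enable_yauaa: true
  # Added while debugging; the manifest table was populated after this.
  snowplow__backfill_limit_days: 7
  # snowplow__allow_refresh: true   # removed: with this set, every run started again from start_date
```

With `snowplow__allow_refresh` left unset, the package falls back to its default and later runs continue from where the previous one finished.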

To answer your questions: the post hooks ran correctly. I do have other dbt models in the project, but they are not built on snowplow-web tables. This is my first time setting up snowplow-web.

Thank you.

Yeah, that variable is a little confusingly named. It allows refreshes of the manifest tables, but it also specifically causes them to be refreshed on every run at the same time. Glad you got it working though!