Tests fail: domain_sessionidx is NULL

al_x · July 19, 2023, 7:07am

Hi everyone,

My team is prototyping a solution using the snowplow-web dbt package (v0.12.4), and we’re having an issue with a test failing.

It’s a vanilla snowplow set-up, into Postgres, with no extensions enabled. The only defaults we’ve overridden are the start date and the backfill window.

There is a not_null constraint on domain_sessionidx, which can be NULL (and is so for some records in the source data, where the devs were getting the tracker set up correctly).

Should domain_sessionidx being NULL break the workflow, or is this test too strict?

Any thoughts?

Thanks!

Ryan · July 19, 2023, 3:14pm

Hi @al_x

First of all, I’d recommend using a more recent version of the web package if you can as it will contain new features and fixes, especially if you’re just starting now.

In general we would expect the domain_sessionidx to not be null from the tracking side for web data, however most of our tests are “best guess” that cover the majority of scenarios, but in certain use cases of Snowplow it might not be a valid test, we suggest in those cases you can disable the tests from our package and run your own on top of the tables if you wish.

al_x · July 19, 2023, 7:02pm

Hey, @Ryan.

Thanks - I’ll speak to the team about bumping the version number.

What seems to be making people nervous is that the dbt package will build successfully for some days where domain_sessionidx is NULL without issue, then fail on processing subsequent days.

It may be some other problem being masked by this… perhaps I need to confirm with these specific data and use cases that domain_sessionidx being NULL is not a problem in itself, and that the test can be safely disabled.

(or the problematic data controlled for)

Ryan · July 19, 2023, 8:03pm

From memory the tests only exist (or at least mostly) exist on the derived tables, so even though individual events may have a NULL value, the event(s) that is used to populate the derived tables maybe have a value for this.

I agree with your approach though, understand where these events are coming from and if it’s a tracking issue or if they are valid first, then make a decision.

Topic		Replies	Views
Why do i get domain_sessionid and domain_sessionidx as null while trying to send events using snowplow GA plugin? Tracking SDKs	3	1623	December 2, 2018
Snowplow_web dbt package expecting a different column name than what I have in my events table	1	388	November 29, 2023
Duplicate `domain_userid` causing job failure For engineers	1	104	June 10, 2024
Dbt-snowplow-web not working - "Unhandled error while executing" For data modelers & consumers	4	902	October 22, 2023
Help with sending session level data using snowplow android tracker For engineers	5	953	November 28, 2018

Tests fail: domain_sessionidx is NULL

Related topics