Hello,
We noticed something odd while trying to calculate time spent on a page. It came up when calculating the duration of a page_id as the difference between its page_view event and the max derived_tstamp of all subsequent events within that page_id.
Scenario: A visitor comes to our site for the first time, views 1 page and then starts their work day. 3 hours later, at lunch, they click around the existing page (filters or something like that) but they don't refresh the page. What you get in this scenario is 2 sessionidx values with 1 page_view_id crossing both sessions, and an engaged time of 10,800 seconds as per the Snowplow models.
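To make the scenario concrete, here is a minimal Python sketch of the calculation described above. The event tuples, the IDs and the helper names are made up for illustration, and the field names are a simplified stand-in for the real event columns; it is not how the Snowplow models are implemented, only the "max derived_tstamp minus page_view timestamp" arithmetic.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified events for one visitor:
# (page_view_id, sessionidx, event_name, derived_tstamp)
start = datetime(2024, 1, 8, 9, 0, 0)
events = [
    ("pv-1", 1, "page_view", start),
    # 3 hours later: same page, no refresh, but a new session has started
    ("pv-1", 2, "filter_click", start + timedelta(hours=3)),
]


def page_duration_seconds(events, page_view_id):
    """Max derived_tstamp for the page minus the page_view timestamp."""
    page_events = [e for e in events if e[0] == page_view_id]
    page_view_ts = min(e[3] for e in page_events if e[2] == "page_view")
    last_ts = max(e[3] for e in page_events)
    return int((last_ts - page_view_ts).total_seconds())


def sessions_for_page(events, page_view_id):
    """Distinct session indexes in which this page_view_id appears."""
    return sorted({e[1] for e in events if e[0] == page_view_id})


print(page_duration_seconds(events, "pv-1"))  # 10800
print(sessions_for_page(events, "pv-1"))      # [1, 2]
```

One page_view_id ends up attached to two sessions, and its duration spans the whole idle gap between them.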
Why is it that once the _sp session ID ends, the page_id does not reset until the page is manually refreshed?
Outside of the persistent domain_id cookie, should the tracker not dump all current event IDs on _sp session end? No existing event ID should exist outside the session.
The issue is that the page_id then crosses multiple sessions if it is kept (example below). I haven't cherry-picked this, as it's not a small issue.
Because of this, engaged time is very much artificially inflated. Then, when applying the SP models, the roll-ups to sessions and users are compounded/multiplied, as 1 page, along with its engaged time, crosses many sessions.
We could spend time rewriting the models and putting rules in place: that no page_view can exceed the 30-minute session, and that a page_id belongs to the first session it was seen in and not subsequent sessions. But what to do with the later interactions is the issue. Do we manually rekey the ID? That sounds messy.
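As a rough illustration of the kind of rule I mean, here is a minimal Python sketch. The function name, the input shape and the 30-minute constant are my own assumptions for the example, not anything taken from the Snowplow models: it assigns a page_id to the first session it was seen in, ignores events from later sessions, and caps the duration at 30 minutes.

```python
from datetime import datetime, timedelta

# Assumed cap: the 30-minute session length mentioned above
SESSION_CAP = timedelta(minutes=30)


def capped_page_duration(page_events):
    """page_events: list of (sessionidx, derived_tstamp) for ONE page_id.

    Returns (owning_session, duration_seconds), where the page is owned by
    the first session it was seen in and later sessions are ignored.
    """
    first_session = min(idx for idx, _ in page_events)
    stamps = sorted(ts for idx, ts in page_events if idx == first_session)
    duration = min(stamps[-1] - stamps[0], SESSION_CAP)
    return first_session, int(duration.total_seconds())


start = datetime(2024, 1, 8, 9, 0, 0)
page_events = [
    (1, start),                         # page_view
    (1, start + timedelta(minutes=5)),  # interaction in the same session
    (2, start + timedelta(hours=3)),    # interaction after the session expired
]
print(capped_page_duration(page_events))  # (1, 300)
```

This keeps the numbers sane, but it simply drops the session 2 interaction, which is exactly the part I am unsure how to handle.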
I feel that the tracker should probably end the page_id on session end, so that if a new session is created on the same page, the page event ID changes too. I do understand this would create a new page_view event, or possibly need to be handled by a new event type, something like a page_view resume event.
Any thoughts/suggestions are very much appreciated.
Thanks
Kyle