Hey @yali , just found this post, good overview. We’re actually in the process of implementing this , but we are aiming for multi-touch attribution. Going to copy/paste an email I sent to support
I’m hitting an issue where activity from specific domain_userid value have NO records where refr_medium != ‘internal’ . Can this occur? My impression is that, at the very least, all sessions from a domain_userid should be direct (no traffic source). I can possibly see this occurring if a user deletes cookings mid-session, but the amount of instances this is occurring too high.
Background:
I’m in the process of implementing attribution models, similar to what Yali posted here: First and last touch attribution models in SQL [tutorial]
My approach is different in that we are are aiming for multi-touch attribution: everything in between first and last touch (relative to an order event). I’ve worked through most of the logic and hitting issues with what I think may be due to the method of identity stitching, borrowed from to this post: Identifying users (identity stitching)
High level process:
- Create mapping list of user_id (from user context data) and all associated domain_userid
- Create marketing touch points - Exclude ‘event=page_ping’ and Internal IP addresses
- Join marketing touch points with user_id / domain_userid mapping
- De-dup records that have the same referring/marketing
- Join order data on user_id
At this point, there are marketing touch points from Step 2 where a domain_userid only has 'refr=‘internal’ records
On the marketing touch points SQL from this post:
where refr_medium != ‘internal’
and refr_medium is not null
Would this exclude Direct traffic? I had thought that refr_medium != ‘internal’ was sufficient to exclude events/clicks