First and last touch attribution models in SQL [tutorial]

Hey @yali , just found this post, good overview. We’re actually in the process of implementing this , but we are aiming for multi-touch attribution. Going to copy/paste an email I sent to support :slight_smile:

I’m hitting an issue where activity from specific domain_userid value have NO records where refr_medium != ‘internal’ . Can this occur? My impression is that, at the very least, all sessions from a domain_userid should be direct (no traffic source). I can possibly see this occurring if a user deletes cookings mid-session, but the amount of instances this is occurring too high.

Background:

I’m in the process of implementing attribution models, similar to what Yali posted here: First and last touch attribution models in SQL [tutorial]

My approach is different in that we are are aiming for multi-touch attribution: everything in between first and last touch (relative to an order event). I’ve worked through most of the logic and hitting issues with what I think may be due to the method of identity stitching, borrowed from to this post: Identifying users (identity stitching)

High level process:

  1. Create mapping list of user_id (from user context data) and all associated domain_userid
  2. Create marketing touch points - Exclude ‘event=page_ping’ and Internal IP addresses
  3. Join marketing touch points with user_id / domain_userid mapping
  4. De-dup records that have the same referring/marketing
  5. Join order data on user_id

At this point, there are marketing touch points from Step 2 where a domain_userid only has 'refr=‘internal’ records

On the marketing touch points SQL from this post:

where refr_medium != ‘internal’
and refr_medium is not null

Would this exclude Direct traffic? I had thought that refr_medium != ‘internal’ was sufficient to exclude events/clicks