Hi @sevenm,
Taking this as a “broad question”, what you really want to know is how to identify the user journey across multiple marketing touches.
There are already a few posts on this forum that address this topic in one way or another; I link to them below. But first, let me elaborate on domain_userid and network_userid.
The domain_userid is a UUID generated by the JavaScript tracker and stored in a first-party cookie. Because it lives in a cookie, it is not a perfect identifier. If a user deletes their cookies, or the cookie expires, a new domain_userid will be generated, which makes them look like a new user. Likewise, if the user visits from another browser or computer, they will look like a totally different user.
The network_userid is an ID stored in a third-party cookie, set against the domain of the collector, and it applies to the Clojure Collector. It is typically used when site visitors need to be uniquely identified across multiple domains (e.g. on a content or ad network). In a nutshell, the Clojure Collector receives events from the Snowplow JavaScript tracker, sets or updates a third-party user tracking cookie, and returns the pixel to the client. The ID in this cookie is stored in the network_userid field in Snowplow events. However, many browsers (e.g. Safari and Firefox) block third-party cookies by default.
The IP address is also not a reliable identifier because a single user can have many IP addresses and many users can have the same one (e.g. one office) - it’s mainly used for the geo IP lookup and as a possible input in an identity stitching process (see below).
That’s why we provide additional identifiers. For example, the user_id, the user_fingerprint (aka browser fingerprint), and perhaps the user_ipaddress.
The user_fingerprint is generated once with each page load unless the user explicitly calls the setUserFingerprint method. It takes the user agent, the screen dimensions and colour depth, the timezone, the existence of session storage and local storage, and the list of plugins as inputs, and uses the MurmurHash function to convert them into the final fingerprint.
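To make the fingerprinting idea concrete, here is a rough Python sketch of hashing those browser properties into a single number. It is an illustration only: the function names, the `###` separator, the input ordering and the seed are all assumptions of mine, so the result is not expected to match what the JavaScript tracker actually produces. MurmurHash3 (32-bit) is written out in plain Python so the block has no dependencies.

```python
def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit integer left by r bits."""
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Plain-Python MurmurHash3 (x86, 32-bit variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    n_body = len(data) - (len(data) % 4)
    # Mix the input four bytes at a time.
    for i in range(0, n_body, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    # Mix the remaining 1-3 bytes, if any.
    tail = data[n_body:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    # Final avalanche.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def browser_fingerprint(user_agent, screen_width, screen_height, colour_depth,
                        timezone_offset, has_session_storage,
                        has_local_storage, plugins, seed=0):
    """Hypothetical helper: join the inputs and hash them (format is assumed)."""
    parts = [
        user_agent,
        f"{screen_width}x{screen_height}",
        str(colour_depth),
        str(timezone_offset),
        str(has_session_storage),
        str(has_local_storage),
        ";".join(plugins),
    ]
    return murmur3_32("###".join(parts).encode("utf-8"), seed)


fp = browser_fingerprint("Mozilla/5.0 (example)", 1920, 1080, 24, -60,
                         True, True, ["PDF Viewer"])
print(fp)
```

The point to take away is that two visitors with the same browser setup get the same fingerprint, which is why it is a weak identifier on its own and better used as one input among several.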
You can have a process in SQL that creates a map (or graph) between the different identifiers. For example, if a user logs in from two different browsers, you’ll see the same user_id appear against two different domain_userid values. The same can happen if cookies are deleted. It’s then a reasonable assumption that all events belonging to these two domain_userid values actually belong to the same user_id (even the events where the user_id is not set, e.g. when the user is not logged in). This is what we call the identity stitching process.
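As a minimal sketch of that stitching logic (in Python rather than SQL, purely for illustration), the snippet below builds a map from domain_userid to user_id and applies it to every event. The sample events, the `stitched_user_id` field and both function names are made up for this example, and it assumes each domain_userid maps to at most one user_id (the last one seen wins).

```python
# Hypothetical sample events: only the two identifier fields matter here.
events = [
    {"event_id": "e1", "domain_userid": "d1", "user_id": None},
    {"event_id": "e2", "domain_userid": "d1", "user_id": "alice"},
    {"event_id": "e3", "domain_userid": "d2", "user_id": "alice"},
    {"event_id": "e4", "domain_userid": "d2", "user_id": None},
    {"event_id": "e5", "domain_userid": "d3", "user_id": None},
]


def build_id_map(events):
    """Map each domain_userid to the user_id seen on it (last one wins)."""
    id_map = {}
    for ev in events:
        if ev["user_id"] is not None:
            id_map[ev["domain_userid"]] = ev["user_id"]
    return id_map


def stitch(events):
    """Attach a stitched ID to every event, including logged-out ones."""
    id_map = build_id_map(events)
    return [
        # Fall back to the cookie ID when the user never logged in.
        {**ev, "stitched_user_id": id_map.get(ev["domain_userid"],
                                              ev["domain_userid"])}
        for ev in events
    ]


stitched = stitch(events)
for ev in stitched:
    print(ev["event_id"], ev["stitched_user_id"])
```

Here events e1 and e4 have no user_id of their own but are attributed to the same user as e2 and e3, while e5 keeps its domain_userid because that cookie was never seen with a login.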
Yali wrote a good post on this a while back: Identifying users (identity stitching) which should clarify the topic more.
More specifically on touch attribution models, here is the link to Yali’s tutorial: First and last touch attribution models in SQL [tutorial]. You will see that domain_userid is the identifier used there to track marketing touches by user.
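To give a feel for what first and last touch mean in practice, here is a small Python sketch that groups marketing touches by domain_userid and picks the earliest and latest one. It is not the tutorial's SQL; the sample data, the tuple layout and the function name are assumptions for illustration, and a real model would also restrict touches to those before a conversion.

```python
from collections import defaultdict

# Hypothetical touches: (domain_userid, timestamp, marketing source).
touches = [
    ("d1", "2016-01-03T10:00:00", "newsletter"),
    ("d1", "2016-01-01T09:00:00", "google"),
    ("d1", "2016-01-05T12:00:00", "facebook"),
    ("d2", "2016-01-02T08:00:00", "twitter"),
]


def first_and_last_touch(touches):
    """Return the first and last marketing source per domain_userid."""
    by_user = defaultdict(list)
    for domain_userid, tstamp, source in touches:
        by_user[domain_userid].append((tstamp, source))
    result = {}
    for domain_userid, rows in by_user.items():
        rows.sort()  # ISO 8601 strings in one timezone sort chronologically
        result[domain_userid] = {
            "first_touch": rows[0][1],
            "last_touch": rows[-1][1],
        }
    return result


attribution = first_and_last_touch(touches)
print(attribution)
```

If you have stitched identities first, you could group by the stitched ID instead of domain_userid so that touches from different browsers land in the same journey.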
Yet another tutorial on campaign tracking is here: Web traffic driven campaign tracking with Snowplow [tutorial]
Hopefully, the above is useful.