Which timestamp is the best to see when an event occurred?

leon · August 19, 2016, 3:08pm

A common question among Snowplow users is ‘what timestamp should I use?’. While this depends on what you want to achieve, most users want to know when an event occurred. For that, the derived_tstamp is the best choice.

In this post we’ll explain the different timestamps, what they mean and why the derived_tstamp is generally the best one to use.

Available timestamps

These are the timestamps Snowplow uses:

collector_tstamp
dvce_created_tstamp
dvce_sent_tstamp
derived_tstamp
true_tstamp

The first three (collector_tstamp, dvce_created_tstamp and dvce_sent_tstamp) are used to calculate the derived_tstamp. We’ll explain each of them below and why none of them is, on its own, the best choice for seeing exactly when an event happened. We’ll also show how the derived_tstamp is calculated and explain when to use the true_tstamp.

`collector_tstamp`

Timestamp for the event recorded by the collector

We can trust the collector timestamp to be accurate but it’s possible that there’s a delay between an event being created and the event arriving at the collector.

A classic example is when a device goes offline. New events will still be created, but they are cached in local storage until the connection is restored. When the connection is restored, all events in the cache are sent at once, so all events that were generated during this time will end up with the same collector timestamp.

The collector_tstamp is therefore not the best choice to see when an event happened or for building attribution models.

`dvce_created_tstamp`

Timestamp the event was recorded on the client device

This timestamp is created using the clock of the device. As a general rule, device clocks (which by definition are clocks that are not under our control) cannot be trusted to be accurate. What we can do is use them to calculate the relative time between events that were created on the same device.

`dvce_sent_tstamp`

Timestamp the event was sent by the client device

When an event is created it is not always sent immediately. Sometimes a device is offline and the event will be cached and sent once the connection is restored.

This timestamp has the same issue as the dvce_created_tstamp: we cannot trust the device clock to be accurate. It is, however, reasonable to assume that the clock is internally consistent. In other words, a 23-minute gap between the dvce_created_tstamp and the dvce_sent_tstamp really does represent 23 minutes.
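As a small illustration of why the gap can be trusted even when the clock itself cannot, here is a sketch in Python. The dates and the 3-hour clock skew are made-up values, and the `true_created` and `true_sent` names are hypothetical (in practice the real times are unknown). It only shows that a constant clock error cancels out when the two device timestamps are subtracted.

```python
from datetime import datetime, timedelta

# Assumption for illustration: the device clock runs 3 hours fast.
clock_skew = timedelta(hours=3)

# The (normally unknown) real times: created, then sent 23 minutes later.
true_created = datetime(2016, 8, 19, 15, 0, 0)
true_sent = true_created + timedelta(minutes=23)

# What the device reports: both timestamps carry the same skew.
dvce_created_tstamp = true_created + clock_skew
dvce_sent_tstamp = true_sent + clock_skew

# The skew cancels out in the difference.
gap = dvce_sent_tstamp - dvce_created_tstamp
print(gap)  # 0:23:00
```

This relies on the skew staying constant between creation and sending. If the device clock is changed in that window, the gap is no longer reliable.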

`derived_tstamp`

Timestamp making allowance for inaccurate device clock

So now that we know that:

The collector_tstamp is accurate but does not show when the event was created
The dvce_created_tstamp and the dvce_sent_tstamp are accurate in relation to each other

We can use this knowledge to calculate when the event actually happened: we take the difference between the two device timestamps and subtract that delta from the collector_tstamp, like this:

derived_tstamp = collector_tstamp - (dvce_sent_tstamp - dvce_created_tstamp)

This is why the derived_tstamp is the best one to use to see when the event actually took place.
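To make the formula above concrete, here is a worked example in Python with made-up values. The device clock is assumed to be several hours off, and the event is assumed to have been cached for 23 minutes before being sent.

```python
from datetime import datetime

# Illustrative values only: the device clock disagrees with the collector,
# and the event waited 23 minutes in the cache before it was sent.
collector_tstamp = datetime(2016, 8, 19, 15, 30, 0)
dvce_created_tstamp = datetime(2016, 8, 19, 18, 5, 0)
dvce_sent_tstamp = datetime(2016, 8, 19, 18, 28, 0)

# Subtract the time spent on the device from the collector timestamp.
derived_tstamp = collector_tstamp - (dvce_sent_tstamp - dvce_created_tstamp)
print(derived_tstamp)  # 2016-08-19 15:07:00
```

The result is on the collector's clock (15:07), not the device's (18:05), which is what we want.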

There are two exceptions:

If either of the dvce_ timestamps is not set, then the derived_tstamp == collector_tstamp.
If the true_tstamp is set, then the derived_tstamp == true_tstamp.

`true_tstamp`

User-set “true timestamp” for the event

The true_tstamp is a special timestamp that is only used in rare cases, most often when you want to ingest historical data. In that case, the collector timestamp is irrelevant (it would be set to the time of ingestion - not when the historical event happened).

In those cases, you can explicitly set the true timestamp, which will be passed on to the derived timestamp.

If the true timestamp is set, the pipeline will ignore all other inputs and set the derived timestamp to the true timestamp.

Full algorithm

Following on from the information above, this is the full algorithm for calculating the derived_tstamp:

Step 1
Check if the true_tstamp is set. If so, derived_tstamp = true_tstamp.

Step 2
Else, check if either dvce_sent_tstamp or dvce_created_tstamp is missing. If so, the derived_tstamp will simply be equal to the collector_tstamp.

Step 3
Else the derived_tstamp is calculated like this:

derived_tstamp = collector_tstamp - (dvce_sent_tstamp - dvce_created_tstamp)
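The three steps above can be sketched as a single Python function. This is only an illustration of the logic as described in this post, not the pipeline's actual enrichment code. The function name `derive_tstamp` and its signature are hypothetical, and a missing timestamp is assumed to be represented as `None`.

```python
from datetime import datetime
from typing import Optional


def derive_tstamp(
    collector_tstamp: datetime,
    dvce_created_tstamp: Optional[datetime] = None,
    dvce_sent_tstamp: Optional[datetime] = None,
    true_tstamp: Optional[datetime] = None,
) -> datetime:
    """Hypothetical helper mirroring steps 1 to 3 of the algorithm."""
    # Step 1: an explicitly set true_tstamp overrides everything else.
    if true_tstamp is not None:
        return true_tstamp
    # Step 2: without both device timestamps, fall back to the collector.
    if dvce_created_tstamp is None or dvce_sent_tstamp is None:
        return collector_tstamp
    # Step 3: subtract the time the event spent on the device.
    return collector_tstamp - (dvce_sent_tstamp - dvce_created_tstamp)


collector = datetime(2016, 8, 19, 15, 30, 0)
created = datetime(2016, 8, 19, 18, 5, 0)
sent = datetime(2016, 8, 19, 18, 28, 0)
historical = datetime(2015, 1, 1, 9, 0, 0)

print(derive_tstamp(collector, created, sent, true_tstamp=historical))  # step 1
print(derive_tstamp(collector, dvce_created_tstamp=created))            # step 2
print(derive_tstamp(collector, created, sent))                          # step 3
```

With these made-up inputs, the first call returns the historical timestamp, the second returns the collector timestamp unchanged, and the third returns the collector timestamp minus the 23 minutes the event spent on the device.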

Trackers by timestamp capability

Here is the current snapshot of timestamp capabilities across all of our trackers. If this is out of date, please add a comment to this thread, and we will update the table!

Tracker	dvce_created_tstamp	dvce_sent_tstamp	true_tstamp	derived_tstamp
ActionScript3 Tracker	Yes	No	No	collector_tstamp
Arduino Tracker	No	No	No	collector_tstamp
Android Tracker	Yes	Yes	Yes	All steps
CPP Tracker	Yes	Yes	Yes	All steps
Golang Tracker	Yes	Yes	Yes	All steps
iOS Tracker	Yes	Yes	No	Steps 2 & 3
Java Tracker	Yes	No	No	collector_tstamp
JavaScript Tracker	Yes	Yes	Yes	All steps
Lua Tracker	Yes	No	No	collector_tstamp
.NET Tracker	Yes	Yes	Yes	All steps
Node.js Tracker	Yes	No	No	collector_tstamp
PHP Tracker	Yes	No	No	collector_tstamp
Python Tracker	Yes	Yes	Yes	All steps
Ruby Tracker	Yes	Yes	Yes	All steps
Scala Tracker	Yes	Yes	Yes	All steps
Unity Tracker	Yes	Yes	No	Steps 2 & 3

The Pixel Tracker and the Google AMP Tracker do not use timestamps.

Ainul_Mutaqin · March 21, 2024, 10:27am

how about flutter?
