Passing values from atomic.events to a custom table

Hi Nir,

First of all – welcome to our forum!

Is there a specific reason not to join to atomic.events? Both tables should have the same DISTKEY (the event/root ID) and SORTKEY (the collector/root timestamp), so joining them is actually quite fast. You might, however, need to enforce uniqueness on the event ID, something neither Redshift nor Snowplow does at the moment (background & tutorial).
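For illustration, here is a minimal sketch of such a join. The table atomic.com_acme_my_context_1 and its my_field column are made-up names standing in for your custom table; root_id and root_tstamp are the join keys the Redshift loader creates on every shredded table:

```sql
-- Join a custom context table back to atomic.events.
-- The join keys are also the DISTKEY/SORTKEY of both tables, so Redshift
-- can colocate the rows and keep the join cheap.
SELECT
  e.event_id,
  e.collector_tstamp,
  e.geo_country,
  c.my_field
FROM atomic.events AS e
JOIN atomic.com_acme_my_context_1 AS c   -- hypothetical custom table
  ON  c.root_id     = e.event_id
  AND c.root_tstamp = e.collector_tstamp
WHERE e.collector_tstamp >= DATEADD(day, -7, CURRENT_DATE);  -- limit the scan
```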

To answer your specific question, it depends on what you want to track. With Snowplow, you can define your own events and contexts (we call them self-describing or unstructured events; is this a process you're familiar with?). If the information is available in the tracker, you can send it that way and it will end up in the corresponding custom table.

A good example is the user ID. You could either use setUserId (in which case the user ID ends up in the user_id column of atomic.events), or you could define your own event or context and send the user ID as part of a self-describing JSON (in which case there will be a user_id field in that custom event or context table).
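To make that concrete, the shredded table generated for a hypothetical iglu:com.acme/user_context/jsonschema/1-0-0 schema would look roughly like this in Redshift (exact column types depend on the JSON Schema you define):

```sql
-- Sketch of the DDL for a hypothetical custom context table; the schema_*,
-- root_* and ref_* columns are standard, user_id comes from your own schema.
CREATE TABLE atomic.com_acme_user_context_1 (
  schema_vendor  VARCHAR(128)  NOT NULL,
  schema_name    VARCHAR(128)  NOT NULL,
  schema_format  VARCHAR(128)  NOT NULL,
  schema_version VARCHAR(128)  NOT NULL,
  root_id        CHAR(36)      NOT NULL,  -- event_id of the parent event
  root_tstamp    TIMESTAMP     NOT NULL,  -- collector_tstamp of the parent event
  ref_root       VARCHAR(255)  NOT NULL,
  ref_tree       VARCHAR(1500) NOT NULL,
  ref_parent     VARCHAR(255)  NOT NULL,
  user_id        VARCHAR(255)             -- your custom field
)
DISTSTYLE KEY
DISTKEY (root_id)
SORTKEY (root_tstamp);
```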

That said, it’s a little more difficult with geo_location, because that field is calculated during enrichment (one of the pipeline stages) by looking up the IP address against a location database. If you want, you can write an enrichment that takes certain values, transforms them, and adds them to a derived context. An alternative is to use SQL Runner to run a set of SQL queries that perform the join once each time new events are loaded into Redshift, so you don’t need to repeat it every time you consume the data.
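As a rough sketch of the SQL Runner approach (everything except atomic.events and its columns is an assumed name), each run could copy the geo fields for newly loaded events into a derived table, so downstream queries never need the join:

```sql
-- One-off setup: a derived table holding the custom field plus geo columns.
CREATE TABLE IF NOT EXISTS derived.my_context_with_geo (
  root_id      CHAR(36),
  root_tstamp  TIMESTAMP,
  my_field     VARCHAR(255),
  geo_country  CHAR(2),
  geo_city     VARCHAR(75)
)
DISTKEY (root_id)
SORTKEY (root_tstamp);

-- Run by SQL Runner after each load: append only events newer than the
-- latest one already processed.
INSERT INTO derived.my_context_with_geo
SELECT
  c.root_id,
  c.root_tstamp,
  c.my_field,
  e.geo_country,
  e.geo_city
FROM atomic.com_acme_my_context_1 AS c   -- hypothetical custom table
JOIN atomic.events AS e
  ON  e.event_id         = c.root_id
  AND e.collector_tstamp = c.root_tstamp
WHERE c.root_tstamp > (
  SELECT COALESCE(MAX(root_tstamp), TIMESTAMP '1970-01-01')
  FROM derived.my_context_with_geo
);
```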

Does that answer your question? Don’t hesitate to follow up if you’d like more detail on any of the features I mentioned.

Christophe