Choosing event types: structured vs unstructured

Hi community,

Has anyone combined structured and unstructured events in the same project for product analytics use cases?

These two event types are quite different (there's a quick tracking sketch after the list):

  • Structured
    • + easy to add new events
    • + simple structure
    • + easy for analysts to consume (event category and name are already columns in the atomic table)
    • − se_property, se_label and se_value mean different things in different events
    • − can't carry a complex structure (only via custom contexts)
  • Unstructured
    • + any complex event structure is possible
    • − harder for developers to add new events (a new event means a new schema)
    • − needs additional data modeling to unwrap unstruct_event into a fat table
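For anyone less familiar with the two types, this is roughly what the tracking calls look like with the browser tracker (v3 API; the collector URL, schema URI and field names are just made-up examples for illustration):

```ts
import { newTracker, trackStructEvent, trackSelfDescribingEvent } from '@snowplow/browser-tracker';

// Hypothetical collector endpoint and app id, just for illustration
newTracker('sp1', 'https://collector.example.com', { appId: 'my-app' });

// Structured event: five fixed fields, no schema required,
// lands in se_category / se_action / se_label / se_property / se_value
trackStructEvent({
  category: 'checkout',
  action: 'step_completed',
  label: 'shipping',
  property: 'express',
  value: 1,
});

// Unstructured (self-describing) event: any shape you like,
// validated against a JSON schema and landing in unstruct_event
trackSelfDescribingEvent({
  event: {
    schema: 'iglu:com.acme/checkout_step/jsonschema/1-0-0',
    data: { step: 2, shippingMethod: 'express', items: 3 },
  },
});
```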

I see at least two obvious solutions, but I wanted to know if anyone has been in the same situation.
Option 1. Use structured events with custom contexts (sketched below).
Option 2. Use both types, and do additional data modeling on top of the atomic table to prepare events for downstream consumption (e.g. populate final_event_name and final_event_category columns based on the event type, since unstructured events take their event_name from the schema, while structured events always have event_name = 'event').
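For Option 1, the idea would be to keep the structured event and hang any richer payload off a custom context, roughly like this (again, the context schema and fields are hypothetical):

```ts
import { trackStructEvent } from '@snowplow/browser-tracker';

// Structured event carrying a custom (self-describing) context entity,
// so the complex part of the payload still gets a proper schema
trackStructEvent({
  category: 'checkout',
  action: 'step_completed',
  label: 'shipping',
  context: [
    {
      schema: 'iglu:com.acme/checkout_context/jsonschema/1-0-0',
      data: { step: 2, shippingMethod: 'express', items: 3 },
    },
  ],
});
```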

Thanks!

Hi @ostap
We recently released a new blog post, written by @carabaestlein, that covers this topic in some detail:

From my point of view, I would suggest always using unstructured (i.e. self-describing), JSON schema'd events. They have all of the benefits and very few of the negatives. The extra work on the tracking side for the developers is well worth it: you get a better understanding of your data (increased data meaning) and more consistency in your data (increased data quality).


@ostap thanks for a great question. Personally, we do both Option 1 and Option 2, but as Paul mentioned, I think going only with unstructured events is usually enough: the se_* fields are useful only in limited contexts, and they end up being much more work at consumption time because of the ambiguity of the actual values.

Also, as you mentioned, post-processing the atomic table afterwards can be useful, as it gives you the flexibility to normalize the raw data and do additional prefiltering etc. based on data consumers' needs - though it of course comes with the added cost of maintaining those transformations and making sure they run properly.
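To make that concrete, the final_event_name / final_event_category idea from the original question essentially boils down to a branch on event_name (which is 'event' for structured events and the schema name for self-describing ones). In practice this would usually live in SQL/dbt on top of atomic.events, but the branching logic looks roughly like this - the column selection and fallbacks are just one illustrative way to do it, not a prescribed mapping:

```ts
// Simplified view of the atomic.events columns involved
interface AtomicEventRow {
  event_name: string;        // 'event' for structured events, schema name for self-describing ones
  event_vendor: string;      // schema vendor for self-describing events
  se_category: string | null;
  se_action: string | null;
}

// Derive the columns downstream consumers actually query on
function normalise(row: AtomicEventRow): { final_event_category: string; final_event_name: string } {
  const isStructured = row.event_name === 'event';
  return {
    final_event_category: isStructured ? (row.se_category ?? 'unknown') : row.event_vendor,
    final_event_name: isStructured ? (row.se_action ?? 'unknown') : row.event_name,
  };
}
```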