Thanks for the excellent guide, @Colm. I can now safely indulge in a JSON parsing adventure to materialize the contexts.
I feel adopting the RDB shredder would involve more work at this point (but I could be proven wrong soon).
Also thanks for clarifying the key (event_id, collector_tstamp). The web_page context gives an output similar to below:
{schema=iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0, data={id=e8b5c86f-af7d-4d87-aa47-c1c03bb28ea6}}
So if I understand correctly, I will need to add extra properties to the context JSONs (copying event_id and collector_tstamp into each child JSON) and then join them to the atomic table in the modelling phase.
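To make the idea concrete, here is a rough sketch in Python of what I have in mind. It assumes each enriched event is already available as a dict with event_id, collector_tstamp and a list of context JSONs under a contexts key. That key name, the web_page_id column and both helper functions are my own placeholders, not anything Snowplow provides.

```python
def shred_contexts(event):
    """Turn each context of one event into a child row carrying the parent key."""
    rows = []
    for ctx in event.get("contexts", []):
        # Copy the context's own fields first, then add the parent key,
        # so the parent key wins if a field name happens to collide.
        row = dict(ctx["data"])
        row["schema"] = ctx["schema"]
        row["event_id"] = event["event_id"]
        row["collector_tstamp"] = event["collector_tstamp"]
        rows.append(row)
    return rows


def join_to_atomic(atomic_rows, context_rows):
    """Inner-join child rows back to atomic rows on (event_id, collector_tstamp)."""
    index = {(r["event_id"], r["collector_tstamp"]): r for r in atomic_rows}
    joined = []
    for child in context_rows:
        parent = index.get((child["event_id"], child["collector_tstamp"]))
        if parent is not None:
            # web_page_id is a hypothetical column name for the context's id.
            joined.append({**parent, "web_page_id": child["id"]})
    return joined


# Example with made-up event values and the web_page context shown above.
event = {
    "event_id": "evt-1",
    "collector_tstamp": "2020-01-01 00:00:00",
    "event": "page_view",
    "contexts": [
        {
            "schema": "iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0",
            "data": {"id": "e8b5c86f-af7d-4d87-aa47-c1c03bb28ea6"},
        }
    ],
}
atomic = [{k: v for k, v in event.items() if k != "contexts"}]
joined = join_to_atomic(atomic, shred_contexts(event))
```

The join here only uses the copied event_id and collector_tstamp; the context's own id just comes along as an extra column.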
So it seems the id present in the web_page context plays no part in the join step. (I reached a similar understanding from reading "Purpose of the web page context".) My understanding is that the web_page context exists only to link all actions (watching a video, scrolling, etc.) within the same page view event. The moment I reload the page, I get a different id, which is also the behaviour I see when I test.
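As a small illustration of how I picture that linking, the sketch below groups events by the web_page id. It assumes rows that already carry event_id plus the context id in a column I am calling web_page_id; the column name, the helper and the sample values are all hypothetical.

```python
from collections import defaultdict


def group_by_page_view(rows):
    """Collect the event_ids that share the same web_page id."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["web_page_id"]].append(row["event_id"])
    return dict(groups)


# Made-up rows: two events on one page load, one event after a reload.
rows = [
    {"event_id": "evt-1", "event": "page_view", "web_page_id": "page-a"},
    {"event_id": "evt-2", "event": "page_ping", "web_page_id": "page-a"},
    {"event_id": "evt-3", "event": "page_view", "web_page_id": "page-b"},
]
page_views = group_by_page_view(rows)
```

If my understanding is right, the reload shows up as a second group with its own id.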
Can you let me know if the above understanding is right?