Hello,
I’m using Snowflake, so Derived Contexts is read in directly as-is.
I’ve noticed that, depending on the UA type, schemas can change position within Derived Contexts, so a given schema is not always at the same index (0, 1, 2, 3, etc.), and every element in the array is keyed as “data”.
In the example below, “data” position 0 is the UA Parser and “data” position 1 is Yet Another UserAgent Analyzer (YAUAA).
{
  "data": [
    {
      "data": {
        "deviceFamily": "Mac",
        "osFamily": "Mac OS X",
        "osMajor": "10",
        "osMinor": "15",
        "osPatch": "7",
        "osPatchMinor": null,
        "osVersion": "Mac OS X 10.15.7",
        "useragentFamily": "Chrome",
        "useragentMajor": "88",
        "useragentMinor": "0",
        "useragentPatch": "4324",
        "useragentVersion": "Chrome 88.0.4324"
      },
      "schema": "iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-0-0"
    },
    {
      "data": {
        "agentClass": "Browser",
        "agentName": "Chrome",
        "agentNameVersion": "Chrome 88.0.4324.192",
        "agentNameVersionMajor": "Chrome 88",
        "agentVersion": "88.0.4324.192",
        "agentVersionMajor": "88",
        "deviceBrand": "Apple",
        "deviceClass": "Desktop",
        "deviceCpu": "Intel",
        "deviceCpuBits": "32",
        "deviceName": "Apple Macintosh",
        "layoutEngineClass": "Browser",
        "layoutEngineName": "Blink",
        "layoutEngineNameVersion": "Blink 88.0",
        "layoutEngineNameVersionMajor": "Blink 88",
        "layoutEngineVersion": "88.0",
        "layoutEngineVersionMajor": "88",
        "operatingSystemClass": "Desktop",
        "operatingSystemName": "Mac OS X",
        "operatingSystemNameVersion": "Mac OS X 10.15.7",
        "operatingSystemNameVersionMajor": "Mac OS X 10",
        "operatingSystemVersion": "10.15.7",
        "operatingSystemVersionMajor": "10"
      },
      "schema": "iglu:nl.basjes/yauaa_context/jsonschema/1-0-1"
    }
  ],
  "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1"
}
In the next example, “data” position 0 is again the UA Parser, but “data” position 1 is now Spiders and Robots, and “data” position 2 is now Yet Another UserAgent Analyzer (YAUAA).
{
  "data": [
    {
      "data": {
        "deviceFamily": "Other",
        "osFamily": "Windows",
        "osMajor": "10",
        "osMinor": null,
        "osPatch": null,
        "osPatchMinor": null,
        "osVersion": "Windows 10",
        "useragentFamily": "Chrome",
        "useragentMajor": "87",
        "useragentMinor": "0",
        "useragentPatch": "4280",
        "useragentVersion": "Chrome 87.0.4280"
      },
      "schema": "iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-0-0"
    },
    {
      "data": {
        "category": "SPIDER_OR_ROBOT",
        "primaryImpact": "UNKNOWN",
        "reason": "FAILED_UA_INCLUDE",
        "spiderOrRobot": true
      },
      "schema": "iglu:com.iab.snowplow/spiders_and_robots/jsonschema/1-0-0"
    },
    {
      "data": {
        "agentClass": "Browser",
        "agentName": "Chrome",
        "agentNameVersion": "Chrome 87.0.4280.88",
        "agentNameVersionMajor": "Chrome 87",
        "agentVersion": "87.0.4280.88",
        "agentVersionMajor": "87",
        "deviceBrand": "Unknown",
        "deviceClass": "Desktop",
        "deviceCpu": "Intel x86_64",
        "deviceCpuBits": "64",
        "deviceName": "Desktop",
        "layoutEngineClass": "Browser",
        "layoutEngineName": "Blink",
        "layoutEngineNameVersion": "Blink 87.0",
        "layoutEngineNameVersionMajor": "Blink 87",
        "layoutEngineVersion": "87.0",
        "layoutEngineVersionMajor": "87",
        "operatingSystemClass": "Desktop",
        "operatingSystemName": "Windows NT",
        "operatingSystemNameVersion": "Windows 10.0",
        "operatingSystemVersion": "10.0"
      },
      "schema": "iglu:nl.basjes/yauaa_context/jsonschema/1-0-0"
    }
  ],
  "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1"
}
An issue comes up when extracting data from Atomic Events for views/models. Because everything is called “data”, the easiest way is to specify which “data” position to read. A simple example for deviceClass looks like this:
SELECT DERIVED_CONTEXTS:data[1].data.deviceClass, DERIVED_CONTEXTS
FROM SNOWPLOW_DB.ATOMIC.EVENTS_V
WHERE DERIVED_CONTEXTS IS NOT NULL
AND collector_tstamp > '2021-04-01 00:00:00'
LIMIT 100
;
The problem is that data[1] moves: it can sometimes be data[2], and since everything is called “data” it’s like The Hunt for Red October. Is there a way to configure the enricher to keep the enrichments in fixed positions, so that Derived Contexts is always in the same order (by run order or something similar)?
It’s probably unlikely that “data” will be renamed to the schema name, so failing that I guess I could use the schema path to find the array element I want. Or is there a recommended approach here?
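A minimal sketch of that schema-path idea, assuming Snowflake's LATERAL FLATTEN is acceptable in the view/model and that matching on the schema prefix (ignoring the version suffix) is what is wanted. The aliases e and ctx are just illustrative names, and this has not been run against the real table:

```sql
-- Explode the derived-contexts array into one row per context, then keep
-- only the YAUAA context by matching on its schema URI instead of its index.
SELECT
    ctx.value:data:deviceClass::string AS device_class,
    e.DERIVED_CONTEXTS
FROM SNOWPLOW_DB.ATOMIC.EVENTS_V AS e,
     LATERAL FLATTEN(input => e.DERIVED_CONTEXTS:data) AS ctx
-- Prefix match so both 1-0-0 and 1-0-1 of the YAUAA schema are picked up.
WHERE STARTSWITH(ctx.value:schema::string, 'iglu:nl.basjes/yauaa_context/jsonschema/')
  AND e.collector_tstamp > '2021-04-01 00:00:00'
LIMIT 100
;
```

Because the filter is on the schema string, it no longer matters whether YAUAA sits at data[1] or data[2]. One caveat: events with no YAUAA context drop out of the result entirely, which may or may not be desirable for a given model.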
Thanks
Kyle