Enriched TSV, columns contexts and derived_contexts schema order

Hi Snowplowers!

When working with enriched TSV files, the columns contexts and derived_contexts contain JSON with schemas.

Is the order of the schemas always the same?

If you are writing a query in SQL how do you make sure when you query data you are querying data for a particular schema?

{
  "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1",
  "data": [
    {
      "schema": "iglu:org.ietf/http_header/jsonschema/1-0-0",
      "data": {
        "name": "X-Forwarded-For",
        "value": "129.56.83.158"
      }
    },
    {
      "schema": "iglu:org.ietf/http_header/jsonschema/1-0-0",
      "data": {
        "name": "Host",
        "value": "sp.xxxxxxxxx.com"
      }
    },
    {
      "schema": "iglu:org.ietf/http_header/jsonschema/1-0-0",
      "data": {
        "name": "User-Agent",
        "value": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36"
      }
    },
    {
      "schema": "iglu:org.ietf/http_header/jsonschema/1-0-0",
      "data": {
        "name": "Origin",
        "value": "https://www.xxxxxxxxx.com"
      }
    },
    {
      "schema": "iglu:org.ietf/http_header/jsonschema/1-0-0",
      "data": {
        "name": "Referer",
        "value": "https://www.xxxxxxxx.com/"
      }
    }
  ]
}

Any ideas how to tackle this?

Thank you
Joao

Hi @joaocorreia

You probably want to use the SP loader to avoid this headache entirely. See this post which goes into it in more detail.

Kind regards,
Kyle

1 Like

Agree with @kfitzpatrick. Snowflake uses our Analytics SDK with “lossy” transformation, which basically turns the contexts inside out and surfaces their schemas as top-level keys. You can use Analytics SDK (or even the whole Snowflake Transformer) without actual Snowflake DB if you need to have schemas exposed on top level.

2 Likes