Snowplow Normalize, multiple versions

In normalize_config, is it possible to use multiple versions of the event?

    {
        "event_names": ["foo_bar"],
        "self_describing_event_schemas": ["iglu:com.omedastudios/foo_bar/jsonschema/1-0-0", "iglu:com.omedastudios/foo_bar/jsonschema/1-0-1"]
    },

And take advantage of this other macro? (GitHub: snowplow/dbt-snowplow-utils at 0.15.2)

Hi @Ben_Davison , the normalize package does already make use of that macro for BigQuery (https://github.com/snowplow/dbt-snowplow-normalize/blob/7bc8a055105e377bded9f999ec16fe90a6e785c5/macros/normalize_events.sql#L114). Just note that the extraction of parameters is based on the version of the schema you provide, so ideally you should provide the version with the most fields.

Hopefully that helps answer your question

Thanks Ryan,

I should have probably put the error here lol

If I change the config to look like this:

    {
        "event_names": ["generic_error"],
        "self_describing_event_schemas": [
            "iglu:com.omedastudios/generic_error/jsonschema/1-0-0",
            "iglu:com.omedastudios/generic_error/jsonschema/1-0-1"
        ]
    },

It generates most of the resulting SQL correctly:

```
{%- set event_names = ['generic_error'] -%}
{%- set flat_cols = [] -%}
{%- set sde_cols = ['UNSTRUCT_EVENT_COM_OMEDASTUDIOS_GENERIC_ERROR_1_0_0', 'UNSTRUCT_EVENT_COM_OMEDASTUDIOS_GENERIC_ERROR_1_0_1'] -%}
{%- set sde_keys = [['errorType', 'errorMessage', 'errorSeverity', 'errorCode', 'scope', 'errorContext', 'timestamp', 'sessionId', 'namespace', 'platform', 'playerId', 'changeList', 'inputType'], ['errorType', 'errorMessage', 'errorSeverity', 'errorCode', 'scope', 'errorContext', 'timestamp', 'sessionId', 'namespace', 'platform', 'playerId', 'changeList', 'inputType']] -%}
{%- set sde_types = [['string', 'string', 'string', 'string', 'string', 'array', 'string', 'string', 'string', 'string', 'string', 'string', 'string'], ['string', 'string', 'string', 'string', 'string', 'array', 'string', 'string', 'string', 'string', 'string', 'string', 'string']] -%}
{%- set sde_aliases = ['generic_error', 'generic_error'] -%}
```

But it complains about the aliases being the same. I know there is the sde_aliases option:

    {
        "event_names": ["generic_error"],
        "self_describing_event_schemas": [
            "iglu:com.omedastudios/generic_error/jsonschema/1-0-0",
            "iglu:com.omedastudios/generic_error/jsonschema/1-0-1"
        ],
        "sde_aliases": ["generic_error1"]
    },

But I always get an error about an invalid file format when trying to generate the config.

The package will already coalesce all sub-major versions of a schema's columns in BigQuery for you, so you only need to provide the schema once. In your case I can see it's the same fields in both versions as well, so it doesn't even matter which one you provide.

    {
        "event_names": ["generic_error"],            
        "self_describing_event_schemas": [
            "iglu:com.omedastudios/generic_error/jsonschema/1-0-1"
        ],
        "sde_aliases": ["generic_error1"]
    },

This will get you the values from either column in BigQuery, with no need for you to do anything special yourself. The same goes if you're using a non-BigQuery warehouse, as in those cases there is only one column anyway.
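For illustration, the BigQuery coalescing amounts to something like the following. The versioned column names come from the generated SQL shown earlier; the field name and source relation are placeholders, and the real macro output will differ in detail (field renaming, null handling, etc.):

```
select
    -- take the value from whichever versioned column is populated
    coalesce(
        unstruct_event_com_omedastudios_generic_error_1_0_1.error_type,
        unstruct_event_com_omedastudios_generic_error_1_0_0.error_type
    ) as error_type
    -- ...and likewise for each remaining field
from my_events_source  -- placeholder relation name
```

On non-BigQuery warehouses the self-describing event lives in a single column regardless of minor/patch version, so there is nothing to coalesce.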

Thanks Ryan,

One last question: what is the purpose of self_describing_event_schemas being a list? Are you expecting us to have multiple event schemas all coalesced together?

Because at the moment we have one line per Snowplow event like so.

    {
        "event_names": ["generic_error"],            
        "self_describing_event_schemas": [
            "iglu:com.omedastudios/generic_error/jsonschema/1-0-1"
        ],
        "sde_aliases": ["generic_error1"]
    },
    {
        "event_names": ["foo"],            
        "self_describing_event_schemas": [
            "iglu:com.omedastudios/foo/jsonschema/1-0-1"
        ],
        "sde_aliases": ["foo1"]
    },
    },
etc etc

It allows you to have multiple different events in the same output table - it’s a rare use case for sure.
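For anyone who does want that, combining two different events into one output table would look something like this single config entry (sketched from the configs above; the event and schema names are just the examples from this thread, so check the package docs for the exact shape):

    {
        "event_names": ["generic_error", "foo"],
        "self_describing_event_schemas": [
            "iglu:com.omedastudios/generic_error/jsonschema/1-0-1",
            "iglu:com.omedastudios/foo/jsonschema/1-0-1"
        ],
        "sde_aliases": ["generic_error", "foo"]
    },

One entry per event (as in the configs above) remains the common setup, producing one table per event.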