Hi Snowplow team,
I am currently deploying Snowplow with a Postgres loader using self-describing events. I encountered a problem when trying to push a non-breaking schema update (adding optional fields to the schema). The schema appears to be pushed to the schema repository, but the payload with the new schema is not ingested into Postgres.
Here are the steps I’ve taken so far:
- I have schema version 1-0-0 with the structure below:
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a content context",
  "self": {
    "vendor": "com.mycompany",
    "name": "myschema",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "content_type": {
      "description": "A content type",
      "type": "string",
      "minLength": 0,
      "maxLength": 4096
    },
    "item_id": {
      "description": "An item id",
      "type": "string",
      "minLength": 0,
      "maxLength": 4096
    },
    "item_name": {
      "description": "An item name",
      "type": ["string", "null"],
      "minLength": 0,
      "maxLength": 4096
    }
  },
  "required": ["content_type", "item_id"],
  "additionalProperties": false
}
- I sent a test payload using the above schema, and it worked correctly: the data was loaded into Postgres, and the schema_version value in the database showed 1-0-0.
- Then I changed the schema to version 1-0-1 (adding the optional field “status”) with the structure below:
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a content context",
  "self": {
    "vendor": "com.mycompany",
    "name": "myschema",
    "format": "jsonschema",
    "version": "1-0-1"
  },
  "type": "object",
  "properties": {
    "content_type": {
      "description": "A content type",
      "type": "string",
      "minLength": 0,
      "maxLength": 4096
    },
    "item_id": {
      "description": "An item id",
      "type": "string",
      "minLength": 0,
      "maxLength": 4096
    },
    "item_name": {
      "description": "An item name",
      "type": ["string", "null"],
      "minLength": 0,
      "maxLength": 4096
    },
    "status": {
      "description": "A status of item",
      "type": ["string", "null"],
      "minLength": 0,
      "maxLength": 4096
    }
  },
  "required": ["content_type", "item_id"],
  "additionalProperties": false
}
- I then sent a payload with schema version 1-0-1 that included data for every field from step 3 (including status), but the data is not loaded into Postgres and is not sent to the bad stream either.
Note: I also tested sending a payload that referred to schema version 1-0-1 but used the data structure of version 1-0-0. That works correctly: the data is loaded into Postgres, and the schema_version value in the database shows 1-0-1.
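
To make the two test cases concrete, here is a minimal Python sketch of the payloads and a local check against the 1-0-1 schema. It is only an illustration: the field values are placeholders rather than the exact data sent, and `validate` and `self_describing` are hypothetical helpers that cover only the JSON Schema keywords used in the schema above. They are not Snowplow's own validator.

```python
import json

# Schema 1-0-1 as posted above (descriptions and the "self" block omitted).
SCHEMA_1_0_1 = {
    "type": "object",
    "properties": {
        "content_type": {"type": "string", "minLength": 0, "maxLength": 4096},
        "item_id": {"type": "string", "minLength": 0, "maxLength": 4096},
        "item_name": {"type": ["string", "null"], "minLength": 0, "maxLength": 4096},
        "status": {"type": ["string", "null"], "minLength": 0, "maxLength": 4096},
    },
    "required": ["content_type", "item_id"],
    "additionalProperties": False,
}


def validate(data, schema):
    """Hypothetical helper: check only required, additionalProperties,
    string/null types, and minLength/maxLength. Returns a list of errors."""
    errors = []
    props = schema["properties"]
    for name in schema.get("required", []):
        if name not in data:
            errors.append(f"missing required field: {name}")
    for name, value in data.items():
        if name not in props:
            if schema.get("additionalProperties") is False:
                errors.append(f"unexpected field: {name}")
            continue
        rule = props[name]
        allowed = rule["type"]
        if isinstance(allowed, str):
            allowed = [allowed]
        if value is None:
            ok = "null" in allowed
        elif isinstance(value, str):
            ok = ("string" in allowed
                  and rule.get("minLength", 0) <= len(value)
                  and len(value) <= rule.get("maxLength", len(value)))
        else:
            ok = False
        if not ok:
            errors.append(f"invalid value for field: {name}")
    return errors


def self_describing(data, version):
    """Hypothetical helper: wrap data in a self-describing JSON envelope."""
    return {
        "schema": f"iglu:com.mycompany/myschema/jsonschema/{version}",
        "data": data,
    }


# Placeholder values, not the real payloads.
old_shape = {"content_type": "article", "item_id": "123", "item_name": "Example"}
new_shape = dict(old_shape, status="published")

for label, data in [("1-0-0 structure", old_shape), ("1-0-1 structure", new_shape)]:
    payload = self_describing(data, "1-0-1")
    print(label, json.dumps(payload), validate(payload["data"], SCHEMA_1_0_1))
```

If both cases print an empty error list, the payloads satisfy the posted 1-0-1 schema under this simplified check, which would suggest looking past schema validation itself for the cause.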