Add extra fields to the default Iglu JSON schema

Hi there! I need a way to add extra fields to the Iglu database table that stores schemas. The goal is to track schema owners. Right now I work around this by maintaining separate dictionaries with extra information about every event, but it would be much more convenient to put this metadata directly in the event's JSON schema. Is that possible without forking Iglu Server and igluctl?

If the schema owner is a static property of the schema, then it won’t be injected into the events, which wouldn’t solve the issue at hand. You’d need to fork the transformer and schema-ddl too to achieve that.

Making it dynamic does not work either, as many central schemas are emitted by the enrichments, which won’t populate the fields in the forked versions.

As described, I think the easiest way of solving this would be during the modelling step. For example, create a set of views for each owner. You could script it with dbt on top of our models.
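To make the per-owner views idea concrete, here is a rough Python sketch that generates one view per owner. The owner mapping, the view naming, and filtering on event_name in atomic.events are all assumptions for illustration; in a real dbt project you would more likely express the same loop in a Jinja macro on top of the modelled tables.

```python
# Hypothetical mapping from event name to owning team, maintained by hand.
OWNERS = {
    "add_to_cart": "checkout_team",
    "remove_from_cart": "checkout_team",
    "search": "discovery_team",
}


def owner_views(owners: dict[str, str], table: str = "atomic.events") -> dict[str, str]:
    """Return one CREATE VIEW statement per owner, keyed by owner name.

    Names are interpolated directly into the SQL, so this assumes the
    mapping is trusted input (it is written by you, not by users).
    """
    # Group event names by owner, in a stable (sorted) order.
    by_owner: dict[str, list[str]] = {}
    for event_name, owner in sorted(owners.items()):
        by_owner.setdefault(owner, []).append(event_name)

    views = {}
    for owner, names in by_owner.items():
        in_list = ", ".join(f"'{name}'" for name in names)
        views[owner] = (
            f"CREATE OR REPLACE VIEW {owner}_events AS "
            f"SELECT * FROM {table} WHERE event_name IN ({in_list});"
        )
    return views


if __name__ == "__main__":
    for statement in owner_views(OWNERS).values():
        print(statement)
```

The mapping still lives outside the schema registry, which is the trade-off of this approach.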

Another approach is to create a schema with a field “owner” of type “string” and attach it (multiple times if needed) to your events during tracking. It would apply to the entire event, which is a little different from your requirement, but might be practical in some situations.
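As a sketch of what that owner schema and the attached entity could look like, here they are built as plain Python dictionaries. The vendor com.acme and the name schema_owner are made up for illustration; the "self" block and the iglu: URI follow the usual self-describing JSON layout.

```python
import json

# Hypothetical vendor, name and version; substitute your own.
VENDOR = "com.acme"
NAME = "schema_owner"
VERSION = "1-0-0"

# Self-describing JSON schema for an entity carrying a single "owner" string.
owner_schema = {
    "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
    "description": "Entity naming the owner of an event (illustrative)",
    "self": {
        "vendor": VENDOR,
        "name": NAME,
        "format": "jsonschema",
        "version": VERSION,
    },
    "type": "object",
    "properties": {"owner": {"type": "string"}},
    "required": ["owner"],
    "additionalProperties": False,
}


def owner_entity(owner: str) -> dict:
    """Build the self-describing JSON entity to attach to an event."""
    return {
        "schema": f"iglu:{VENDOR}/{NAME}/jsonschema/{VERSION}",
        "data": {"owner": owner},
    }


if __name__ == "__main__":
    print(json.dumps(owner_schema, indent=2))
    print(json.dumps(owner_entity("serhii")))
```

The entity returned by owner_entity is what you would pass as a context to whichever tracker you use.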

Thanks for the fast response. I definitely don’t want to send such data with every event. Creating views also doesn’t seem like a perfect solution, because then you have to manage schemas and their metadata in different places. By the way, is this issue solved somehow in Snowplow BDP?

Yes - in a way. In BDP we have support for custom metadata as part of the Data Structures API but this doesn’t really exist in Iglu / Iglu Server at the moment. If you add custom metadata though I’m pretty sure the Iglu Server API will very happily store it in the database (I haven’t tested this though).

For example you could likely do:

{
  "type": "object",
  "owner": "serhii",
  "properties": {
    "x": {
      "type": "string"
    }
  }
}

which will work in most versions of JSON Schema (we support draft 4, but this is valid up to the 2020-12 draft). There’s a high likelihood that it may not work in drafts beyond that, because “owner” is what the specification refers to as an “unknown keyword”. You can read through active discussion on that here.
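If the extra keyword does survive storage (untested, as noted above), reading it back is just a dictionary lookup. A minimal Python sketch, assuming the schema arrives as JSON text with “owner” at the top level; the schema_owner helper and its default value are hypothetical:

```python
import json

# The example schema from above, as it might come back from storage.
schema_text = """
{
  "type": "object",
  "owner": "serhii",
  "properties": {
    "x": {"type": "string"}
  }
}
"""


def schema_owner(schema: dict, default: str = "unknown") -> str:
    """Return the custom top-level "owner" keyword, or a default.

    Falls back to the default when the keyword is missing or is not a
    string, since nothing in JSON Schema itself constrains its value.
    """
    owner = schema.get("owner", default)
    return owner if isinstance(owner, str) else default


if __name__ == "__main__":
    print(schema_owner(json.loads(schema_text)))
```

Validators that follow draft 4 ignore keywords they do not recognise, so this lookup is the only place the owner would have any effect.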