Iglu JVM embedded repo at runtime?

Hey Team,

I’m currently trying to use the JVM embedded repo to manage my JSON schemas. As I understood the documentation,
I should be able to mount the repository into my application (at runtime) under /snowplow-enricher/src/main/resources/repo/schemas/..., but no matter how I configure my resolver config, the embedded Iglu client does not find my repo. Currently, it looks like:

  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
  "data": {
    "cacheSize": 500,
    "repositories": [
        "name": "Iglu Private",
        "priority": 0,
        "vendorPrefixes": ["com.some.company"],
        "connection": {
          "http": {
            "uri": "/repo"

Is it possible to mount a repo that way, or is it only possible to bake the repository into the JVM application before even building a Docker container?


Hi @jonas ,

Welcome to the Snowplow community!

I don’t think that this is possible. From the documentation you linked:

> As an embedded repo, there is no mechanism for updating the schemas stored in the repository following the release of the host application.

May I ask which application you’re trying to do that for?

If you want to test tracking, Snowplow Micro makes it possible to use schemas at runtime (thanks to this line).

Actually I think this might be possible, although the documentation is currently wrong.

The trick is to put it in a path where the JVM class loader can see it. If you use the standard Docker image for enrich, then classes are loaded from /home/snowplow/lib. So try putting it under:


Disclaimer: I haven’t tried this yet! If it works though then we should add it to the documentation.
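For reference, an embedded repository is normally declared with an `embedded` connection rather than `http` in the resolver config. A minimal sketch of that repository entry, assuming the schemas sit under `/repo/schemas/...` somewhere on the classpath:

```json
{
  "name": "Iglu Private",
  "priority": 0,
  "vendorPrefixes": ["com.some.company"],
  "connection": {
    "embedded": {
      "path": "/repo"
    }
  }
}
```

With this connection type the resolver looks for schemas at `<classpath>/repo/schemas/<vendor>/<name>/jsonschema/<version>`, so the mounted directory has to be visible to the class loader for it to resolve.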

Yeah, I use snowplow/snowplow-enrich-kinesis:3.1.2.
I just tried to mount the repo that way, i.e.:


I also tried:


and I tried posting the event with:

```shell
curl 'https://adress.../com.snowplowanalytics.iglu/v1' \
  -H 'Content-Type: application/json; charset=UTF-8' \
  --data-raw '{"schema": "iglu:com.company.test/test_event/jsonschema/1-0-0", "data": {"example_value": "test_value"}}'
```

But sadly, the events end up in the bad-events bucket with a resolution error, so I think it’s probably not possible this way, unless I missed something.

Thanks for the kind welcome!
I am trying to build a server-side tracking solution with Snowplow. We want to run Snowplow in Kubernetes and keep the JSON schemas inside one Git repository.

My idea was to sync the Git repository and mount the schemas directly into the pod. That way, I could just add a new schema via a merge request and it would automatically be pulled into the enricher. I wouldn’t need to build a separate static repository, and the enricher would not need to communicate with anything other than the Kinesis stream.

It should be possible to use GitHub directly as a static HTTP server holding the schemas.

For instance, let’s say that your schemas are there; then you can use GitHub’s raw content feature and set https://raw.githubusercontent.com/snowplow/iglu-central/master/ as the Iglu URI in your resolver.

master can be replaced with any branch, by the way.
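As an illustration, a repository entry pointing the resolver at schemas served from GitHub raw could look like the sketch below (the priority and vendor prefixes here are illustrative, not prescribed):

```json
{
  "name": "Iglu Central (GitHub raw)",
  "priority": 1,
  "vendorPrefixes": ["com.snowplowanalytics"],
  "connection": {
    "http": {
      "uri": "https://raw.githubusercontent.com/snowplow/iglu-central/master"
    }
  }
}
```

The resolver appends `/schemas/<vendor>/<name>/jsonschema/<version>` to that base URI, which matches the directory layout of the iglu-central repository.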

Would that also be possible with a private repository? I forgot to mention that keeping the schema repository private was one of the reasons I came up with the idea of syncing it directly into the Kubernetes cluster.

I’m afraid it’s not, as downloading would require providing an authorization token in the HTTP headers, and that’s not possible with the resolver.

In case it can help you, your schemas could simply be served by a static HTTP server, e.g. with `python3 -m http.server` (`python -m SimpleHTTPServer` on Python 2) run in the folder that contains schemas/.
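A minimal sketch of that, using the vendor and event name from the curl example earlier in this thread (all names here are placeholders): create the directory layout the resolver expects, drop a self-describing schema in it, and serve the folder over HTTP.

```shell
# Static Iglu repo layout: <base-uri>/schemas/<vendor>/<name>/jsonschema/<model>-<revision>-<addition>
mkdir -p iglu/schemas/com.company.test/test_event/jsonschema

# A minimal self-describing JSON schema for the test event
cat > iglu/schemas/com.company.test/test_event/jsonschema/1-0-0 <<'EOF'
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "self": {
    "vendor": "com.company.test",
    "name": "test_event",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "example_value": { "type": "string" }
  }
}
EOF

# Serve the repo root; the resolver's "http" uri then points at http://<host>:8000
# (cd iglu && python3 -m http.server 8000)
```

The server command is commented out here; in a cluster you would run it as a long-lived process (or use any static file server) and point the resolver’s `http.uri` at the in-cluster address.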

That’s a great idea! That way I should be able to quickly sync new schemas while keeping things private inside the cluster.
Thanks for taking the time.
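Putting the two ideas together, one way the in-cluster setup could be sketched is a pod where a git-sync sidecar keeps the schema checkout fresh in a shared volume and a static file server exposes it over HTTP. This is untested; the image tags, repository URL, and paths below are all assumptions:

```yaml
# Hypothetical pod fragment: git-sync pulls the private schema repo into a
# shared emptyDir volume; nginx serves it to enrich inside the cluster.
containers:
  - name: git-sync
    image: registry.k8s.io/git-sync/git-sync:v4.2.1      # assumed tag
    args:
      - --repo=https://git.example.com/company/schemas.git  # placeholder URL
      - --root=/sync
      - --period=60s
    volumeMounts:
      - name: schemas
        mountPath: /sync
  - name: http
    image: nginx:1.25                                    # assumed tag
    volumeMounts:
      - name: schemas
        mountPath: /usr/share/nginx/html
        readOnly: true
volumes:
  - name: schemas
    emptyDir: {}
```

The enricher’s resolver would then use an `http` connection whose `uri` points at the Kubernetes Service in front of this pod, so schema updates arrive via merge request without rebuilding any image.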
