Running iglu-server (schema repo) locally for snowplow-micro

Hey,

We’re running snowplow-micro to test and validate our schemas. For this purpose, we created four Docker containers:

    ├── iglu-server
    │   ├── Dockerfile
    │   └── iglu_conf.conf
    ├── igluctl
    │   ├── Dockerfile
    │   └── schemas
    │       └── com.myapp
    │           └── minimal_tracking_event
    │               └── jsonschema
    │                   └── 1-0-0
    ├── postgres
    │   ├── Dockerfile
    │   └── init.sql
    └── sp-micro
        ├── Dockerfile
        ├── iglu.json
        └── micro_conf.conf

In the igluctl container we’re trying to push our schemas to our local iglu server.

FROM snowplow-docker-registry.bintray.io/snowplow/igluctl:0.6.0

COPY schemas/ ./schemas

CMD ["static", "push", "--public", "./schemas", "iglumock:80/iglu-server", "4ab49682-d0d5-11ea-87d0-0242ac130003"]

Linting the schemas in this container worked without a problem, but pushing them to the iglu repo results in the following error:

No usable value for read
Did not find value which can be converted into java.lang.String

For our iglu-server we use the following config:

repo-server {
  interface = "0.0.0.0"
  port = 8080
  pool = "cached"
}

# 'postgres' contains configuration options for the Postgres instance the server is using
# 'dummy' is in-memory only storage
database {
  type = "postgres"
  host = "postgres"
  port = 5432
  dbname = "igludb"
  username = "postgres"
  password = "iglusecret"
  driver = "org.postgresql.Driver"
  maxPoolSize = 5
  pool = {
  }
}

# Enable additional debug endpoint to output all internal state
debug = true

And the iglu-server is listening at:

[2020-07-29T08:15:26.313Z] [ioapp-pool-1-thread-1] INFO org.http4s.server.blaze.BlazeServerBuilder - http4s v0.20.19 on blaze v0.14.11 started at http://0.0.0.0:8080/

Maybe our approach to setting up snowplow-micro is too complicated, but it would be good to know whether we are using the igluctl static push command correctly, or whether something in our iglu-server config is wrong.


Hi @mgloel,

You don’t need to build a new image to run igluctl subcommands or Iglu Server. I’d recommend using the official Docker images of Iglu Server and igluctl, which let you use the subcommands as if you were running the executable directly.

Please have a look at a sample run. In your scenario, you need the following to start with:

$ docker run snowplow-docker-registry.bintray.io/snowplow/igluctl:0.6.0 lint

It’ll show you the expected parameters and flags with their meanings.

Please also check

$ docker run snowplow-docker-registry.bintray.io/snowplow/igluctl:0.6.0 static push

to see usage, options and flags with their meanings.

To run Iglu Server with Postgres, please have a look at our docker-compose example that runs Iglu Server and Postgres together; its README shows how to push your schemas to Iglu Server.
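For orientation, a minimal docker-compose sketch of that kind of setup might look like the following. The image tags, database credentials, config path, and the `--config` flag are assumptions here; check the README of the example repo for the exact values:

```yaml
version: '3'

services:
  postgres:
    image: postgres:11                  # tag is an assumption
    environment:
      POSTGRES_DB: igludb
      POSTGRES_PASSWORD: iglusecret

  iglu-server:
    image: snowplow/iglu-server:0.6.1   # tag is an assumption
    depends_on:
      - postgres
    ports:
      - "8080:8080"
    volumes:
      # mount your HOCON config into the container
      - ./iglu_conf.conf:/iglu-server/config.hocon
    command: --config /iglu-server/config.hocon
```

Once both services are up, igluctl can push schemas against `http://localhost:8080` with the API key you configured.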

I believe using official images will ease your workflow. Please let us know how it goes!


Hi @mgloel,

I can suggest a much simpler setup of micro and iglu, which might help you out.

The trick is to create an iglu.json file that fetches schemas from local files instead of from an HTTP server. My iglu.json file looks like this:

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-0",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "Iglu Central",
        "priority": 0,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": {
          "http": {
            "uri": "http://iglucentral.com"
          }
        }
      },
      {
        "name": "local Iglu repository",
        "priority": 5,
        "vendorPrefixes": [ "test.example.iglu" ],
        "connection": {
          "http": {
            "uri": "file:///local-iglu"
          }
        }
      }
    ]
  }
}

Notice the line "uri": "file:///local-iglu"
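For that file:// connection to resolve anything, the mounted directory has to follow the standard Iglu layout under a schemas folder. A sketch of the expected structure (the vendor matches the vendorPrefixes above; the event name is a placeholder):

```
local-iglu
└── schemas
    └── test.example.iglu
        └── my_event
            └── jsonschema
                └── 1-0-0
```

The 1-0-0 file itself is the JSON schema, with no file extension.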

Finally, you need to mount the schemas into the micro docker container when you run it:

docker run \
  --mount type=bind,source=$PWD/example,destination=/config \
  --mount type=bind,source=$PWD/schemas,destination=/local-iglu \
  -p 9090:9090 \
  snowplow/snowplow-micro:latest --collector-config /config/micro.conf --iglu /config/iglu.json

With this setup you don’t need any of the iglu-server, igluctl or postgres docker containers in order for micro to see your schemas.

Hi @istreeter ,

I’ve been trying to use (without success) your solution.

I have in snowplow-config directory the following:

schemas/com.myapp/<event>/jsonschema/1-0-0
iglu.json
micro.conf

In iglu.json I added the local repo after Iglu Central:

        {
          "name": "local repo",
          "priority": 5,
          "vendorPrefixes": [ "com.myapp" ],
          "connection": {
            "http": {
              "uri": "file:///config"
            }
          }
        }

I’m starting the container with:

docker run \
--mount type=bind,source=$(pwd)/snowplow-config,destination=/config \
-p 9090:9090 \
snowplow/snowplow-micro:1.1.0 --collector-config /config/micro.conf --iglu /config/iglu.json

and no matter what I tried, I always get an error on http://localhost:9090/micro/bad:

Error while validating the event {
  schema_violations {
    ResolutionError{
      RepoFailure{
          "message": "sun.net.www.protocol.file.FileURLConnection:file:/config/schemas/com.myapp/<event>/jsonschema/1-0-0 (of class sun.net.www.protocol.file.FileURLConnection)"
      }
    }
  }
}

Do you have any idea what might be happening here?

Hi @rvp-diconium ,

The suggestion I made above (July 2020) does not work with the latest versions of snowplow-micro (since September 2020). Briefly, this is because the new micro uses a newer version of the Iglu client, which is stricter about the URI format.

I have two alternative suggestions for how to test and validate schemas during development:

Option 1 - host schemas in github

This is what we do in the snowplow-micro-examples repo. The schemas live in an iglu directory within the repo, and micro is then configured with an iglu.json file that points at the schemas on GitHub.
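For reference, a repository entry for that GitHub-hosted setup might look like the fragment below. The raw.githubusercontent.com path and the org/repo names are placeholders, and the uri must point at the directory that contains the schemas folder:

```json
{
  "name": "My GitHub schemas",
  "priority": 5,
  "vendorPrefixes": [ "com.acme" ],
  "connection": {
    "http": {
      "uri": "https://raw.githubusercontent.com/<org>/<repo>/master/iglu"
    }
  }
}
```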

Option 2 - use a static file server locally

This one is easiest to describe using a docker-compose file. In this example I use halverneus/static-file-server to serve the schemas, but there are plenty of alternative static file servers. Notice that this uses two Docker containers, so it is simpler than the original setup in this thread, which required four.

version: '3'

services:
  micro:
    image: snowplow/snowplow-micro:1.1.0
    ports:
      - "9095:9095"
    volumes:
      - ./micro:/config
    command: "--collector-config /config/micro.conf --iglu /config/iglu.json"

  iglu:
    image: halverneus/static-file-server
    volumes:
      - ./iglu:/web

My micro directory is mounted into the micro Docker container. This directory holds the micro.conf and iglu.json files. My iglu.json file points to the local iglu server on port 8080, plus Iglu Central:

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-0",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "Iglu Central",
        "priority": 0,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": {
          "http": {
            "uri": "http://iglucentral.com"
          }
        }
      },
      {
        "name": "My Schemas",
        "priority": 10,
        "vendorPrefixes": [ "com.acme" ],
        "connection": {
          "http": {
            "uri": "http://iglu:8080"
          }
        }
      }
    ]
  }
}

My iglu directory is mounted into the file-server container, and it has this structure:

iglu
└── schemas
    └── com.acme
        ├── my_schema1
        │   └── jsonschema
        │       └── 1-0-0
        └── my_schema2
            └── jsonschema
                └── 1-0-0


I’ve tried this with a very simple custom schema, but the record always ends up in micro’s ‘bad’ bucket.
Is there a way I can check whether snowplow is able to see my local iglu server?

What does the error message in the bad bucket say?