Trouble Using Custom Iglu Schemas for Snowplow Micro

I’m trying to set up automated tests for a website using Snowplow Micro. The website is a simple webpage with some text and a form that prompts the user to submit their name. Currently, events such as form submissions, link clicks, and page pings are being successfully sent to and enriched by Snowplow.

I want to use a custom schema that would mark an event as ‘bad’ if the name submitted is under 5 characters. I think I was able to add this to the schema registry because when I run curl --request GET --header "apikey:<MY_API_KEY>" <URL_TO_SCHEMA_REGISTRY/schemas>, my output shows me the custom schemas I made.

To add this custom schema, I updated my iglu.json file to include

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-2",
  "data": {
    "cacheSize": 500,
    "cacheTtl": 600,
    "repositories": [
      {
        "connection": {
          "http": {
            "uri": "http://iglucentral.com"
          }
        },
        "name": "Iglu Central",
        "priority": 10,
        "vendorPrefixes": []
      },
      {
        "connection": {
          "http": {
            "apikey": "MY_API_KEY",
            "uri": "URL_TO_SCHEMA_REGISTRY"
          }
        },
        "name": "SCHEMA_REGISTRY_NAME",
        "priority": 0,
        "vendorPrefixes": ["com.MY_VENDOR_NAME"]
      }
    ]
  }
}

And under the same directory that is mounted when starting Micro, I have the following directory structure:


With the following code in my 1-0-0 file:

{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a username changed event",
  "self": {
    "vendor": "com.MY_VENDOR_NAME",
    "name": "username_changed",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "username": {
      "description": "Identifier for user",
      "type": "string",
      "minLength": 5,
      "maxLength": 255
    }
  },
  "additionalProperties": false,
  "required": [
    "username"
  ]
}

When I run Snowplow Micro on docker with docker run --mount type=bind,source=$(pwd)/example,destination=/config -p 9090:9090 snowplow/snowplow-micro:1.3.0 --collector-config /config/micro.conf --iglu /config/iglu.json and use cat on /config/iglu.json, I see the iglu.json file I’ve written above.

However, since adding this new schema, seemingly nothing has changed. Even when I type in a username under 5 characters, no bad events are created. Further, using the Poplin Data Snowplow Chrome extension, I’m unable to find the custom schema I created under the “Manage Schemas” tab. Any assistance would be greatly appreciated, and I’m happy to provide additional details if needed!

Thanks so much!

Hello @eileen_dover and welcome to the Snowplow community!

Whenever you have a minute, it’d be great if you could share some more details to help us confirm what is going on!

  1. Expected outcome:

I want to use a custom schema that would mark an event as ‘bad’ if the name submitted is under 5 characters.

Does that mean you would prefer a username_changed event to end up in your bad-events queue if it contains a “bad” username? Note that such events, even though they may represent valid user behavior, will not appear in your warehouse, as with any event that does not conform to its schema.
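
To make the rule concrete, here is a minimal sketch of what the username_changed schema above would accept or reject in the event's data payload. It is a hand-rolled check written only for illustration (the function name `validateUsernameChanged` is hypothetical); it is not how Snowplow validates events, which relies on a full JSON Schema validator.

```javascript
// Hand-rolled illustration of the constraints in the username_changed 1-0-0 schema.
// Not a JSON Schema validator; the function name is hypothetical.
function validateUsernameChanged(data) {
  if (typeof data !== 'object' || data === null || Array.isArray(data)) {
    return ['data must be an object'];
  }
  const errors = [];
  // "required": ["username"]
  if (!('username' in data)) {
    errors.push('username is required');
  } else if (typeof data.username !== 'string') {
    errors.push('username must be a string');
  } else {
    // JSON Schema string lengths are counted in Unicode code points
    const length = [...data.username].length;
    if (length < 5) errors.push('username is shorter than minLength 5');
    if (length > 255) errors.push('username is longer than maxLength 255');
  }
  // "additionalProperties": false
  for (const key of Object.keys(data)) {
    if (key !== 'username') errors.push(`unexpected property: ${key}`);
  }
  return errors;
}

console.log(validateUsernameChanged({ username: 'user' }));   // one minLength error
console.log(validateUsernameChanged({ username: 'eileen' })); // no errors
```

A payload that produces any error here is the kind that would be expected to land in the bad events rather than the good ones.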

  2. Tracking details:

since adding this new schema, seemingly nothing has changed

Could you also please confirm that since you added the new schema, you are also tracking such an event (e.g. when a user submits the form)? For example, with something like:

// assuming Snowplow JavaScript tracker v3
snowplow('trackSelfDescribingEvent', {
  event: {
    schema: 'iglu:com.MY_VENDOR_NAME/username_changed/jsonschema/1-0-0',
    data: {
        username: 'user'         // intentionally a "bad" one
    }
  }
});

If yes, do you actually see those events firing in the Poplin Chrome extension, but can’t find them in Micro’s good or bad REST API endpoints?
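
For reference, a minimal sketch of checking Micro's counters from a script. It assumes Micro is reachable at http://localhost:9090 (the port published in the docker command above) and Node.js 18+ for the global `fetch`; the helper names `summarizeCounts` and `fetchCounts` are made up for this example.

```javascript
// Assumption: Micro listens on localhost:9090; adjust to your setup.
const MICRO_URL = 'http://localhost:9090';

// Turns the counts object returned by /micro/all into a one-line summary.
function summarizeCounts(counts) {
  return `total=${counts.total} good=${counts.good} bad=${counts.bad}`;
}

// Fetches the counters from /micro/all.
// /micro/good and /micro/bad return the events themselves.
async function fetchCounts(baseUrl = MICRO_URL) {
  const response = await fetch(`${baseUrl}/micro/all`);
  if (!response.ok) {
    throw new Error(`Micro responded with HTTP ${response.status}`);
  }
  return response.json();
}

// Example usage (requires a running Micro):
// fetchCounts().then((counts) => console.log(summarizeCounts(counts)));
```

If the bad counter stays at zero after submitting a short username, the self-describing event is most likely not being tracked at all.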

  3. About embedded iglu:

The embedded iglu capabilities of Micro are great for testing (and can be set up independently of the iglu resolver configuration in your iglu.json). For Micro to find your embedded schemas, the directory structure has to “match” your schema name’s structure. For example:

example
└── iglu-client-embedded
    └── schemas
        └── com.MY_VENDOR_NAME
            └── username_changed
                └── jsonschema
                    └── 1-0-0
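
The mapping from schema name to directory can be sketched as a small helper. This is only an illustration of the naming convention shown in the tree; the function name `embeddedSchemaPath` and its `root` argument are assumptions, not part of Micro.

```javascript
// Maps an Iglu schema URI to its expected location under the embedded registry.
// The helper name and the root argument are illustrative assumptions.
function embeddedSchemaPath(igluUri, root = 'iglu-client-embedded') {
  const match = /^iglu:([^/]+)\/([^/]+)\/([^/]+)\/(\d+-\d+-\d+)$/.exec(igluUri);
  if (!match) {
    throw new Error(`Not a valid Iglu URI: ${igluUri}`);
  }
  const [, vendor, name, format, version] = match;
  return [root, 'schemas', vendor, name, format, version].join('/');
}

console.log(embeddedSchemaPath('iglu:com.MY_VENDOR_NAME/username_changed/jsonschema/1-0-0'));
// prints: iglu-client-embedded/schemas/com.MY_VENDOR_NAME/username_changed/jsonschema/1-0-0
```

Note that the last path segment, 1-0-0, is the schema file itself and not a directory.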

To quickly check whether Micro sees your embedded schemas:

curl -X GET http://localhost:9090/micro/iglu/com.MY_VENDOR_NAME/username_changed/jsonschema/1-0-0

At this point you can test your webpage’s tracking without any change to your iglu.json.
In fact, since you are using the brand new v1.3.0, you could spin up Micro with just:

docker run -p 9090:9090 snowplow/snowplow-micro:1.3.0

And once you are happy with the above, you could switch over to using your private iglu repository, exactly as you have already done in your iglu.json!

Hi @Ada! Thank you so much for your response, this was exactly the walkthrough I needed. It turns out that we never added the code that actually tracks the event. Thank you again for your help!

That’s great to hear! Thank you very much for letting us know!

By the way: the brand new Micro v1.3.1 has just been released, and it resolves an issue around embedded iglu that you may have come across too. Just thought I’d let you know, as it is definitely worth it and easy to upgrade!
