schemaKey vs schemaCriterion

I created a super simple schema and I am trying to test it with a curl request; both the schema and the command are below.

I implemented the quickstart on AWS and followed this to register the schema.

The error message I am getting is a tracker protocol violation that refers to a schemaKey vs schemaCriterion conflict.


What am I doing wrong here? I can test the example event given at the end of the quickstart, but I can't move forward with custom events like this.


{
	"$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
	"description": "simpleEvent",
	"self": {
		"vendor": "com.simplebet",
		"name": "simpleEvent",
		"format": "jsonschema",
		"version": "1-0-0"
	},
	"type": "object",
	"properties": {
		"some_words": {
			"description": "Example string field",
			"type": "string",
			"maxLength": 255
		},
		"some_integer": {
			"description": "Example integer field",
			"type": "integer",
			"minimum": 0,
			"maximum": 100000
		},
		"some_numeric": {
			"description": "Example number field",
			"type": ["number", "null"],
			"multipleOf": 0.0001,
			"minimum": -1000000,
			"maximum": 1000000
		}
	},
	"required": ["some_words"],
	"additionalProperties": false
}
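To sanity-check what `required` and `additionalProperties: false` mean for incoming events, here is a minimal hand-rolled check in Python. This is illustrative only; it is not the Iglu validator and only covers the two constraints named, not types or `maxLength`.

```python
# Hand-rolled sketch of the schema's "required" and "additionalProperties"
# rules; NOT the real Iglu validation logic.
schema = {
    "properties": {
        "some_words": {"type": "string", "maxLength": 255},
        "some_integer": {"type": "integer", "minimum": 0, "maximum": 100000},
        "some_numeric": {"type": ["number", "null"]},
    },
    "required": ["some_words"],
    "additionalProperties": False,
}

def check(event: dict) -> list:
    """Return a list of violations (empty list means these checks pass)."""
    errors = []
    props = schema["properties"]
    # Every field in "required" must be present.
    for field in schema["required"]:
        if field not in event:
            errors.append(f"missing required field: {field}")
    # With additionalProperties: false, unknown fields are rejected.
    if not schema.get("additionalProperties", True):
        for field in event:
            if field not in props:
                errors.append(f"unexpected field: {field}")
    return errors

print(check({"some_words": "hey there"}))  # []
print(check({"words": "oops"}))            # two violations
```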


curl '{{COLLECTOR_URL}}/com.snowplowanalytics.snowplow/tp2' \
   -H 'Content-Type: application/json; charset=UTF-8' \
   -H 'Cookie: _sp=305902ac-8d59-479c-ad4c-82d4a2e6bb9c' \
   --data-raw '{
     "schema": "iglu:com.simplebet/simpleEvent/jsonschema/1-0-0",
     "data": {
       "some_words": "hey there",
       "some_integer": 8675,
       "some_numeric": 309
     }
   }'
You’re not sending a tracker protocol request to the collector - the format is wrong.

Rather than trying to construct a tracker protocol format, it’s much easier to use one of our trackers to send data, or if you want to do it via command line, use the tracking cli.
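For readers curious what the tracker protocol body looks like, here is a sketch in Python of the kind of envelope the `/com.snowplowanalytics.snowplow/tp2` endpoint expects: a `payload_data` wrapper containing tracker-protocol events, with the custom self-describing event nested inside an `unstruct_event` wrapper in the `ue_pr` field. The field values (`p`, `tv`) are illustrative, and a tracker or the tracking CLI handles all of this for you.

```python
import json

# The custom self-describing event from the original post.
custom_event = {
    "schema": "iglu:com.simplebet/simpleEvent/jsonschema/1-0-0",
    "data": {"some_words": "hey there"},
}

# Sketch of a tracker protocol POST body: a payload_data envelope,
# not a bare self-describing JSON. Values for p/tv are placeholders.
body = {
    "schema": "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4",
    "data": [
        {
            "e": "ue",            # "unstructured" (self-describing) event
            "p": "srv",           # platform
            "tv": "manual-0.0.1", # tracker version (arbitrary here)
            "ue_pr": json.dumps({
                "schema": "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0",
                "data": custom_event,
            }),
        }
    ],
}

print(json.dumps(body, indent=2))
```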


Thanks for the hint @Colm - I started using the CLI. The sdjson and json commands are below, but I see nothing happening in the events database. I do see a receipt for these events in the “raw” S3 bucket and not the “bad” S3 bucket, so I guess that's good? A couple of questions though:

  1. Am I doing something wrong? It's not really clear how the sdJSON or JSON should be built, but I provided a key-value pair for the only required attribute in the specified schema.
  2. Hypothetically, where is this event going to land? In the pre-existing events schema? Will a new schema be created automatically?

Thanks in advance for your help

./snowplow-tracking-cli --protocol "http" --collector {{COLLECTOR}} --method POST --sdjson "{\"schema\":\"iglu:com.simplebet/simpleEvent/jsonschema/1-0-0\", \"data\":{\"some_words\":\"hey\"}}"

./snowplow-tracking-cli --protocol "http" --collector {{COLLECTOR}} --method POST --schema iglu:com.simplebet/simpleEvent/jsonschema/1-0-0 --json "{\"some_words\":\"hey\"}"

Bumping this - any feedback on the last comment? I have a very simple three-column custom event I can't get to work, so either I am overthinking things or Snowplow is way more complicated than I thought.

@simpledave ,

am I doing something wrong?

I see nothing wrong with your CLI usage. Both variations should produce the same result.

hypothetically, where is this event going to land?

It depends on how you have set up your pipeline. With the standard configuration, I would expect the events to land in the enriched bucket as well (which is not the same as the raw bucket). It is the enriched data that would typically be processed further and loaded to your final destination (again, this depends on your pipeline setup).

For better visibility, you can refer to the pipeline architecture (in its various configurations) depicted in Batch pipeline steps.

You might wish to spin up Snowplow Mini and see what your test events look like when they land in OpenSearch/Elasticsearch, for example.

Thanks so much for your reply @ihor - I obviously need to read more about the batch pipeline steps and possibly spin up Snowplow Mini to help with testing.

Just trying to answer what I think is a simple question, though. I used the quick-start setup for AWS, so I would assume my setup is very “vanilla”. Out of the box, I got an “events” table in the “atomic” schema of the database I set up. Would new custom events route to the “events” table? Do I need to build new tables in “atomic” for each custom event, or are they created upon the first event/schema registration?

Thank you again for your help

@simpledave , if all pipeline components are set up properly and you are using their latest versions, then the custom tables will be created automatically. RDB Loader is smart enough to know that a new custom event got processed and requires a dedicated table, which will be created in accordance with the JSON schema that describes it.

Here’s the diagram that shows this process. It is outdated and depicts the old workflow, but the general principle is still the same.

Note that while custom data is loaded to a dedicated (child) table, it still has its root in the events table. That is, the custom table will have a root_id field that shares its value with the events.event_id field. Note that this is specific to Redshift; any other supported data store would have a single events table (no shredding takes place).
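The parent/child relationship described above can be mimicked with an in-memory SQLite sketch. This is illustrative only: the table and column names follow Redshift shredding conventions (a hypothetical `com_simplebet_simple_event_1` child table), not the loader's actual DDL.

```python
import sqlite3

# Mimic (in SQLite) the Redshift layout: a shredded child table
# joins back to atomic.events via root_id = event_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT, event_name TEXT)")
conn.execute("""
    CREATE TABLE com_simplebet_simple_event_1 (
        root_id TEXT,      -- shares its value with events.event_id
        some_words TEXT,
        some_integer INTEGER,
        some_numeric REAL
    )
""")
conn.execute("INSERT INTO events VALUES ('e-1', 'simpleEvent')")
conn.execute(
    "INSERT INTO com_simplebet_simple_event_1 VALUES ('e-1', 'hey there', 8675, 309)"
)

# Joining the child table back to its root event row.
rows = conn.execute("""
    SELECT e.event_id, c.some_words
    FROM events e
    JOIN com_simplebet_simple_event_1 c ON c.root_id = e.event_id
""").fetchall()
print(rows)  # [('e-1', 'hey there')]
```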


@ihor thank you once again for explaining this! The issue was that the tutorial I was following (GitHub - snowplow/iglu-example-schema-registry: Example static schema registry for Iglu) explicitly requested that I create a table for each event. If this had been a little clearer in the quick-start, I would have been able to get this done in hours vs weeks. Really appreciate the clarification.

Hi @simpledave, thanks for pointing that out! The iglu-example-schema-registry repo is quite outdated (we no longer need JSONPaths or the sql directory). The quick start guide and also this page should be the correct source of information.

May I ask how you came across iglu-example-schema-registry? I’d like to archive it to avoid confusion, but would need to ensure everything else that links to it is updated. I found one reference to it in the docs which I am about to remove.

EDIT: I think I’ve now removed all references to this repo and marked it as outdated. Hopefully this saves future readers some time!