I have recently been trying to add some extra validation to my pipeline using Iglu and unstruct events, all the way through to the stream enricher.
It seems that the Iglu client within the stream enricher is trying to regex-validate my event against the self-describing JSON schema instead of the one I have specified. Yet when I send the event through my tracker without wrapping it in SelfDescribingJson(), it fails the check that it is a SelfDescribingJson. What should I do?
Details below:
- enrich error
- tracking code
- intended schema
- resolver.json
<-------------- ENRICH ERROR EXAMPLE --------------->
{
  "line": "{lots of base 64 line}",
  "errors": [
    {
      "level": "error",
      "message": "error: ECMA 262 regex \"^iglu:[a-zA-Z0-9-_.]+/[a-zA-Z0-9-_]+/[a-zA-Z0-9-_]+/[0-9]+-[0-9]+-[0-9]+\" does not match input string \"http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0\"\n level: \"error\"\n schema: {\"loadingURI\":\"#\",\"pointer\":\"/properties/schema\"}\n instance: {\"pointer\":\"/schema\"}\n domain: \"validation\"\n keyword: \"pattern\"\n regex: \"^iglu:[a-zA-Z0-9-_.]+/[a-zA-Z0-9-_]+/[a-zA-Z0-9-_]+/[0-9]+-[0-9]+-[0-9]+\"\n string: \"http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0\"\n"
    },
    {
      "level": "error",
      "message": "Unstructured event couldn't be extracted"
    }
  ],
  "failure_tstamp": "2016-11-15T12:53:07.633Z"
}
<-------------- END ENRICH ERROR EXAMPLE --------------->
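The mismatch reported in the error can be reproduced outside the pipeline. The following is a minimal sketch, assuming Python's `re` module treats this simple pattern the same way as the ECMA 262 engine used by the validator; `IGLU_URI` and `is_iglu_uri` are illustrative names, not part of any Snowplow library.

```python
import re

# Pattern copied from the "regex:" field of the enrich error.
IGLU_URI = re.compile(
    r"^iglu:[a-zA-Z0-9-_.]+/[a-zA-Z0-9-_]+/[a-zA-Z0-9-_]+/[0-9]+-[0-9]+-[0-9]+"
)


def is_iglu_uri(value):
    """Return True if value starts with a well-formed iglu: schema URI."""
    return IGLU_URI.match(value) is not None


# The schema reference passed to the tracker has the expected shape.
tracker_ref = "iglu:com.busuu/standard_event/jsonschema/1-0-1"

# The string quoted in the error is an http:// URL, so it cannot match.
rejected = "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0"

print(is_iglu_uri(tracker_ref))
print(is_iglu_uri(rejected))
```

The error's `instance` pointer is `/schema`, so the rejected string is the value of a `schema` field in whatever JSON was being validated at that point.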
This is my Python code to send the event:
<-------------- TRACKER CODE --------------->
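from snowplow_tracker import Subject, SelfDescribingJson

# t is an already-initialised Tracker; platform, uid, ip and the event
# fields below are variables set elsewhere in my code.
s = Subject()
t.subject.set_platform(platform).set_user_id(uid).set_lang("enc").set_ip_address(ip)
event = SelfDescribingJson(
    schema="iglu:com.busuu/standard_event/jsonschema/1-0-1",
    data={
        # Plain values: writing {value} here would build a Python set,
        # which cannot be serialised to JSON.
        "event": event_name,
        "uid": uid,
        "language_learnt": language_learnt,
        "interface_language": interface_language,
        "params": custom_context,
        "platform": platform,
        "app_id": app_id,
        "version": version,
        "environment": environment,
        "user_agent": user_agent,
    })
t.track_unstruct_event(event)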
<-------------- END TRACKER CODE --------------->
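For context, what the enricher receives for an unstructured event is a nested self-describing JSON. Below is a rough sketch using plain dictionaries, assuming the tracker wraps the event in the standard `unstruct_event` 1-0-0 envelope; `self_describing` is a hypothetical helper and the field values are made up for illustration.

```python
import json

# Envelope schema assumed to be the one the tracker adds around the event.
UNSTRUCT_EVENT_SCHEMA = (
    "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"
)


def self_describing(schema, data):
    """Build the {"schema": ..., "data": ...} shape of a self-describing JSON."""
    return {"schema": schema, "data": data}


# Inner event: the custom schema reference plus example (made-up) values.
event = self_describing(
    "iglu:com.busuu/standard_event/jsonschema/1-0-1",
    {"event": "lesson_started", "uid": "12345", "platform": "web"},
)

# Outer envelope around the inner event.
envelope = self_describing(UNSTRUCT_EVENT_SCHEMA, event)

# Serialising confirms every value is JSON-compatible.
payload = json.dumps(envelope)
print(payload)
```

Both `schema` fields in this structure are `iglu:` URIs. The `http://iglucentral.com/...` URL from the error normally appears only as the `$schema` key of a schema file, not inside an event.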
This is the schema that I am trying to validate against:
<-------------- SCHEMA VALIDATOR CODE --------------->
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for the busuu ",
  "self": {
    "vendor": "com.busuu",
    "name": "standard_event",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "event": {
      "type": "string",
      "maxLength": 255
    },
    "uid": {
      "type": "string",
      "maxLength": 255
    },
    "ts": {
      "type": "string",
      "maxLength": 255
    },
    "language_learnt": {
      "type": "string",
      "maxLength": 255
    },
    "interface_language": {
      "type": "string",
      "maxLength": 255
    },
    "params": {
      "type": "string",
      "maxLength": 500
    },
    "platform": {
      "type": "string",
      "maxLength": 255
    },
    "app_id": {
      "type": "string",
      "maxLength": 255
    },
    "version": {
      "type": "string",
      "maxLength": 255
    },
    "environment": {
      "type": "string",
      "maxLength": 255
    },
    "user_agent": {
      "type": "string",
      "maxLength": 255
    }
  },
  "additionalProperties": false
}
<-------------- END SCHEMA VALIDATOR CODE --------------->
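One thing that can be checked mechanically is whether the `self` block of the schema agrees with the reference used in the tracker. The sketch below assumes the usual Iglu convention that a schema key is `iglu:vendor/name/format/version` and that a static repository stores the file under `schemas/vendor/name/format/version`; `schema_key` and `repo_path` are illustrative helper names.

```python
def schema_key(self_block):
    """Build the iglu: URI that identifies a schema from its "self" block."""
    return "iglu:{vendor}/{name}/{format}/{version}".format(**self_block)


def repo_path(self_block):
    """Build the relative path where a static repo is assumed to host the schema."""
    return "schemas/{vendor}/{name}/{format}/{version}".format(**self_block)


# "self" block copied from the schema in this post.
self_block = {
    "vendor": "com.busuu",
    "name": "standard_event",
    "format": "jsonschema",
    "version": "1-0-0",
}

# Reference copied from the tracker code in this post.
tracker_ref = "iglu:com.busuu/standard_event/jsonschema/1-0-1"

key = schema_key(self_block)
print(key)
print(repo_path(self_block))
print(key == tracker_ref)
```

With the values posted here, the schema declares version 1-0-0 while the tracker refers to 1-0-1, so the two keys differ.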
And finally, my resolver.json:
<-------------- RESOLVER CODE --------------->
{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "Iglu Central",
        "priority": 0,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": {
          "http": {
            "uri": "http://iglucentral.com"
          }
        }
      },
      {
        "name": "busuu Iglu Repo",
        "priority": 5,
        "vendorPrefixes": [ "com.busuu" ],
        "connection": {
          "http": {
            "uri": "{ip of my resolver}"
          }
        }
      }
    ]
  }
}
<-------------- END RESOLVER CODE --------------->
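To see which URLs a resolver configured like this would be expected to request, here is a simplified sketch. It assumes that repositories whose `vendorPrefixes` match the schema vendor are tried first, then the rest in ascending `priority`, and that each HTTP repository serves schemas under `/schemas/`. The function `lookup_urls` is hypothetical, and `http://iglu.example.com` is a stand-in for the private repository address, which is not given in this post.

```python
def lookup_urls(resolver, iglu_uri):
    """Return candidate schema URLs in the assumed lookup order."""
    path = iglu_uri[len("iglu:"):]   # vendor/name/format/version
    vendor = path.split("/")[0]

    def sort_key(repo):
        # Prefix matches sort first (False < True), then lower priority value.
        matches = any(vendor.startswith(p) for p in repo["vendorPrefixes"])
        return (not matches, repo["priority"])

    repos = sorted(resolver["data"]["repositories"], key=sort_key)
    return [
        repo["connection"]["http"]["uri"].rstrip("/") + "/schemas/" + path
        for repo in repos
    ]


# Same structure as the resolver.json in this post, with a placeholder URI.
resolver = {
    "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
    "data": {
        "cacheSize": 500,
        "repositories": [
            {
                "name": "Iglu Central",
                "priority": 0,
                "vendorPrefixes": ["com.snowplowanalytics"],
                "connection": {"http": {"uri": "http://iglucentral.com"}},
            },
            {
                "name": "busuu Iglu Repo",
                "priority": 5,
                "vendorPrefixes": ["com.busuu"],
                "connection": {"http": {"uri": "http://iglu.example.com"}},
            },
        ],
    },
}

urls = lookup_urls(resolver, "iglu:com.busuu/standard_event/jsonschema/1-0-1")
for url in urls:
    print(url)
```

Fetching the first URL by hand (for example with curl) is a quick way to confirm that the repository returns the schema file at the version the tracker asks for.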