I used this command (include the path for the iglu file):
igluctl static push schemas/com.vendor/event_name/jsonschema/ hostname key
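For reference, a fully expanded version of that command might look like the following (the host and API key are placeholder values, not from this thread):

./igluctl static push schemas/com.vendor/event_name/jsonschema/ http://<iglu-host>:8080 <api-key>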
Hello Kim,
Thank you so much for your response.
I am also using the same command; let me explain our scenario.
We have a file called button_click.json, and the path for this file is:
schemas
└── com.snowplowanalytics
└── button_click
└── jsonschema
└── 1-0-0
└── button_click.json
The content of the button_click.json file:
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a button click event",
  "self": {
    "vendor": "com.snowplowanalytics",
    "name": "button_click",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "minLength": 1
    },
    "target": {
      "type": "string"
    },
    "content": {
      "type": "string"
    }
  },
  "required": ["id"],
  "additionalProperties": false
}
After that, I first used the lint command to validate the JSON file:
./igluctl lint /Users/ridhamshah/Documents/CDP/igluctl/schemas/com.snowplowanalytics.self-desc/button_click/jsonschema/1-0-0/
and we are getting an error as shown below:
Cannot read [/Users/ridhamshah/Documents/CDP/igluctl/schemas/com.snowplowanalytics.self-desc/button_click/jsonschema/1-0-0]: Path is neither file or directory
Cannot read [/Users/ridhamshah/Documents/CDP/igluctl/schemas/com.snowplowanalytics.self-desc/button_click/jsonschema/1-0-0]: no valid JSON Schemas
TOTAL: 1 files corrupted
Could you please advise where I am going wrong?
Is there any configuration we need to do regarding the vendor prefix or the JSON file?
Do we need to configure the iglu resolver file as well, and if so, where?
Thanks
The file path should be this:
└── com.snowplowanalytics
└── button_click
└── jsonschema
└── 1-0-0
The file 1-0-0 is where you store your JSON, not in button_click.json.
The correct command is: ./igluctl lint /Users/ridhamshah/Documents/CDP/igluctl/schemas/com.snowplowanalytics/button_click/jsonschema/1-0-0/
It is recommended to have the vendor name set as your company name, btw (com.snowplowanalytics → com.[yourcompany]).
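A minimal sketch of that fix on disk, assuming the layout from earlier in the thread (the schema content ends up in a file literally named 1-0-0, with no extension):

cd schemas/com.snowplowanalytics/button_click/jsonschema
# move the schema content out, drop the now-empty directory, and rename
mv 1-0-0/button_click.json 1-0-0.tmp
rmdir 1-0-0
mv 1-0-0.tmp 1-0-0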
Hi Mike, why would a schema be considered unrecognised, and how would I validate it? I did an igluctl push and pull and received a response of 200 for both. Unable to determine what the cause is. Thanks
Following on from what @Mike and @josh said - as previously mentioned, I’m working with @kim on this as well. In our iglu resolver config files, we correctly set the IP to the address that was printed when the iglu server was created using terraform. The API key is also the same one (double-checked it!) between both the pipeline and the server. We can see that the server is receiving and storing the correct data (visible with igluctl static push/pull respectively), but we are not sure why the events are not flowing into BQ.
Are you seeing any messages in the output of enrich bad in pubsub (or in failed events)? This would indicate a failure to send the data from the enriched good PubSub topic to BigQuery.
Hey folks, a couple of things you mightn’t be aware of and I can’t see in my cursory glance at the thread:
Your events might be landing in failed events - if you weren’t aware of this concept, it’s worth reading those docs.
For a faster test loop while setting up tracking, you can use Snowplow Micro locally. You can use the same iglu resolver as your main pipeline, and it’ll give you instant feedback and easy access to error messages. That is probably the fastest path to figuring this out. There’s a quickstart guide here, with which it shouldn’t take long to get up and running.
I think the simplest path is to use Micro, and reason about the error messages it gives you. If you don’t understand the errors, post them here.
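Once Micro is running, a quick way to inspect results is its REST API (the port assumes Micro’s default of 9090):

# summary counts of good and bad events
curl http://localhost:9090/micro/all
# failed events in full, including the validation error details
curl http://localhost:9090/micro/bad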
PS. I wouldn’t worry about the schema displaying as ‘unrecognised’ in the browser debugger tool - I think it’s a red herring in this thread. It means that client-side tool can’t recognise the schema, but this doesn’t relate to what’s happening in the pipeline. The client-side (ie the tracker) doesn’t need access to schemas in order for tracking to work.
Hi all -
@kim and I are following up to the last couple responses with a few questions:
@Colm -
You mentioned that our events might be landing in the failed events table. To our knowledge, these should end up in the dead letter queue configured in the pipeline script, right? If this is the case, we are not seeing any activity in this queue at all - even under the monitoring section there is no data flowing through. We are considering using Micro to see if these events are actually not being delivered, or if they are ending up somewhere else.
@mike -
We are not seeing anything in failed events (again, assuming that this is the dead letter queue configured in the pipeline tfvars file). If we were to examine the PubSub topic, is there a preconfigured named bucket that it goes to? If so, what is the name, or how can we figure out the topic name that the PubSub events are going to?
We have also been looking through various Snowplow documentation and resources, and we are wondering where the listener is created in the quick start process. We were also looking at this to see if we were missing a step in configuring the loader + mutator. How would we run these if we’re hosting everything on a GCP managed iglu server? Do these not apply to Snowplow CE and only to BDP? If that’s the case, where in the tfvars file do we configure the listener + mutator? I’m not seeing any variables or configurable options there for the mutator/listener.
Thanks all for your continued help and looking forward to your responses!
The dead letter queue is not what we are referring to, no. We are referring to the failed events stream.
This is the loader configuration; there is no other config required - but I don’t think it’s worth spending time thinking about the loader or mutator until you have checked the failed events stream and/or verified the tracking with Micro. From experience, this is the explanation in the vast majority of similar cases.
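If you want to peek at the failed events stream directly, one option is to attach a temporary subscription to the bad-rows topic. The topic and subscription names below are hypothetical - check your terraform outputs for the real topic name:

gcloud pubsub subscriptions create debug-bad-sub --topic=bad-1-topic
gcloud pubsub subscriptions pull debug-bad-sub --limit=5 --auto-ack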
Thank you for the help! We’re now able to see some forced failed events using the cURL commands on the Snowplow page for forcing a failed custom event.
One more question - with Snowplow Community Edition, do we need to set up a mutator and an additional BigQuery loader? If so, are we to install them ourselves by ssh’ing into a Google Kubernetes Engine instance (one that was already created by the terraform script) and then running the docker commands mentioned here? Is this necessary for the Community Edition, or is it not needed for tracking custom events?
Hi, I wanted to revisit this.
Is this the correct URL for the iglu server?
http://{hostname}/schemas/com.{vendor}/{event_name}/jsonschema/{version}
Thanks
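One way to verify what the server is actually serving is to request the schema over the Iglu Server’s standard HTTP read API (the hostname, port, and key below are placeholders):

curl -H "apikey: <your-api-key>" \
  http://<hostname>:8080/api/schemas/com.<vendor>/<event_name>/jsonschema/<version>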
Using Snowplow Micro, we are finally able to see a bad event. However, we are unsure how to resolve the error we are seeing in Snowplow Micro. Does it have to do with our $schema property? (http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#) If that’s the case, what should the correct value be? http://{hostname}/schemas/com.{vendor}/{event_name}/jsonschema/{version}? Thanks
Error in Snowplow Micro:
Our working schema (validated using --lint):
{
  "description": "Schema for Striim Developer sign up",
  "properties": {
    "name": {
      "description": "",
      "type": ["string"],
      "maxLength": 128
    },
    "email": {
      "description": "",
      "type": ["string"],
      "maxLength": 128
    },
    "source_adapter": {
      "description": "",
      "type": ["string"],
      "maxLength": 128
    },
    "target_adapter": {
      "description": "",
      "type": ["string"],
      "maxLength": 128
    }
  },
  "additionalProperties": false,
  "type": "object",
  "required": ["name", "email", "source_adapter", "target_adapter"],
  "self": {
    "vendor": "com.striim",
    "name": "developer-signup-2",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#"
}
That means it couldn’t find the schema - if you look at the lookupHistory section of the error, you’ll see where it looked and what results it got.
Are you using the same iglu resolver config for Micro as your pipeline? Perhaps you can share the iglu_resolver.json with any sensitive values removed?
The $schema value is correct; I don’t see any issue with the schema itself.
PS. I see you’ve named the event developer-signup-2. Schemas are versioned, so this is effectively developer-signup version 2-0-0. We have docs on versioning here.
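For example, rather than creating a new event name, a breaking change would normally be published as a new version alongside the original (paths assume the com.striim layout from earlier in the thread):

schemas
└── com.striim
    └── developer-signup
        └── jsonschema
            ├── 1-0-0
            └── 2-0-0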
We didn’t create an iglu_resolver.json, as it wasn’t mentioned during the setup process for the Community Edition. According to a previous post here in this thread, since we’re using the Community Edition, the resolver is just the few lines in the terraform script indicating the IP and the API key, correct?
Apologies, I didn’t explain this very well, allow me to try again:
What you need to do is make sure your Micro can communicate with the same Iglu Server that the pipeline uses - this is now easier to configure than I thought. You just need to set these env vars in Micro to the values for your pipeline’s Iglu, as in the sketch below.
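A minimal sketch of launching Micro against a custom Iglu Server (the registry URL and API key are placeholders; substitute the values for your pipeline’s Iglu):

docker run -p 9090:9090 \
  -e MICRO_IGLU_REGISTRY_URL=http://<your-iglu-host>:8080 \
  -e MICRO_IGLU_API_KEY=<your-api-key> \
  snowplow/snowplow-micro:latest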
If you have already configured this and still get a ResolutionError, then please copy and paste the full ResolutionError, including the lookupHistory (I don’t think it would contain a sensitive value, but please do check and remove it if it does).
Forget about iglu_resolver.json - previously it was needed, but now both the terraform quickstart and Snowplow Micro create it for you under the hood. I understand the confusion; my previous message was more confusing than it needed to be.
I tried with Micro, but it still fails with the following error. It still points to Iglu Central. How does the resolver get updated with a custom vendor in the Community Edition?
{
  "value": {
    "Iglu Central": {
      "errors": [{"error": "NotFound"}],
      "attempts": 1,
      "lastAttempt": "2024-05-29T12:09:01.996Z"
    },
    "Iglu Central - Mirror 01": {
      "errors": [{"error": "NotFound"}],
      "attempts": 1,
      "lastAttempt": "2024-05-29T12:09:03.295Z"
    },
    "Iglu Client Embedded": {
      "errors": [{"error": "NotFound"}],
      "attempts": 1,
      "lastAttempt": "2024-05-29T12:09:00.852Z"
    }
  }
}
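The lookup history above lists only Iglu Central, its mirror, and the embedded registry, which means the custom Iglu Server was never added to Micro’s resolver. For reference, a minimal sketch of the resolver entry that has to exist for a custom registry (the URI, API key, and vendor prefix are placeholders - with recent Micro versions, the env vars mentioned above generate this for you):

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-3",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "Custom Iglu Server",
        "priority": 0,
        "vendorPrefixes": ["com.yourcompany"],
        "connection": {
          "http": {
            "uri": "http://<iglu-host>:8080/api",
            "apikey": "<api-key>"
          }
        }
      }
    ]
  }
}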