Enricher lost connection to S3 buckets. Can't read iglu schemas

We have an enricher running in AWS as an Auto Scaling group, and it can't access the S3 buckets with Iglu schemas. Here is the error from the enriched bad files:
{
  "schema": "iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0",
  "data": {
    "processor": {
      "artifact": "streamCommon",
      "version": "2.0.5"
    },
    "failure": {
      "timestamp": "2024-12-11T20:23:38.429Z",
      "messages": [
        {
          "schemaKey": "iglu:com.datarobot.ui.contexts/user/jsonschema/1-0-0",
          "error": {
            "error": "ResolutionError",
            "lookupHistory": [
              {
                "repository": "DataRobot Iglu Server",
                "errors": [
                  {
                    "error": "NotFound"
                  }
                ],
                "attempts": 17,
                "lastAttempt": "2024-12-11T20:21:57.746Z"
              },
              {
                "repository": "Iglu Central",
                "errors": [
                  {
                    "error": "NotFound"
                  }
                ],
                "attempts": 17,
                "lastAttempt": "2024-12-11T20:21:57.775Z"
              },
              {
                "repository": "Iglu Central - GCP Mirror",
                "errors": [
                  {
                    "error": "NotFound"
                  }
                ],
                "attempts": 17,
                "lastAttempt": "2024-12-11T20:21:57.886Z"
              },
              {
                "repository": "Iglu Client Embedded",
                "errors": [
                  {
                    "error": "NotFound"
                  }
                ],
                "attempts": 1,
                "lastAttempt": "2024-12-11T20:04:20.536Z"
              }
            ]
          }
        }
      ]
    },

This error suggests that the schema can't be found, rather than that the enricher can't access the bucket (in that case you would generally get a RepoFailure or a ClientFailure). A NotFound error suggests that Enrich was able to connect to the repo but could not find the schema.
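To see at a glance which error each repository returned, something like the following could help. It is only a rough sketch: the field names (`data.failure.messages[].error.lookupHistory`) are taken from the bad row above, while the function name `summarize_lookup_history` is made up for illustration and is not part of any Snowplow tooling.

```python
from typing import Any


def summarize_lookup_history(bad_row: dict[str, Any]) -> list[tuple[str, str, list[str], int]]:
    """Return (schemaKey, repository, error kinds, attempts) for each lookup.

    Hypothetical helper: it only walks the structure seen in the
    schema_violations bad row above and makes no network calls.
    """
    rows = []
    for message in bad_row["data"]["failure"]["messages"]:
        error = message.get("error", {})
        for lookup in error.get("lookupHistory", []):
            # Each lookup entry carries a list of error objects, e.g. {"error": "NotFound"}.
            kinds = [e.get("error", "unknown") for e in lookup.get("errors", [])]
            rows.append(
                (
                    message.get("schemaKey", ""),
                    lookup.get("repository", ""),
                    kinds,
                    lookup.get("attempts", 0),
                )
            )
    return rows
```

If you load one line of the bad rows file with `json.loads` and pass it in, you get one tuple per repository, which makes it easy to spot whether any repository reported something other than NotFound.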

Is the schema published in one of these repositories? How has the connection to DataRobot Iglu Server been set up in the resolver file?

Here is the Iglu config:

{
  "name": "DataRobot Iglu Server",
  "priority": 2,
  "vendorPrefixes": [ "com.datarobot", "com.datarobot.ui.contexts", "com.datarobot.ui.events" ],
  "connection": {
    "http": {
      "uri": "http://iglu-schemas.s3.amazonaws.com"
    }
  }
}

Are the schemas published and publicly accessible in this bucket?

e.g., http://iglu-schemas.s3.amazonaws.com/schemas/com.datarobot.ui.contexts/user/jsonschema/1-0-0 returns a 403, where I would expect it to return a 200.
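If you want to repeat that check for other schemas, here is a minimal sketch using only the Python standard library. It assumes the static repo layout seen in the URL above (`<uri>/schemas/<vendor>/<name>/<format>/<version>`); the helper names `schema_url` and `http_status` are hypothetical, not from any Snowplow library.

```python
import urllib.error
import urllib.request


def schema_url(base_uri: str, schema_key: str) -> str:
    """Build the static-repo URL for an Iglu schema key.

    Assumes the layout <base_uri>/schemas/<vendor>/<name>/<format>/<version>,
    as in the URL quoted above.
    """
    prefix = "iglu:"
    if not schema_key.startswith(prefix):
        raise ValueError(f"not an Iglu schema key: {schema_key!r}")
    path = schema_key[len(prefix):]
    return f"{base_uri.rstrip('/')}/schemas/{path}"


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still what we want.
        return err.code
```

Calling `http_status(schema_url("http://iglu-schemas.s3.amazonaws.com", "iglu:com.datarobot.ui.contexts/user/jsonschema/1-0-0"))` from the machine running Enrich should give a 200 if the schema is publicly readable. A 403 points at bucket or object permissions rather than at the resolver config.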

Thanks, Mike, for the pointer. Yes, the issue was public access to the S3 bucket. Once we restored public access, it started working.
