Bad Rows Mapping Kibana

Hi there, we switched from managed to self-hosted a while back and I was wondering if there are any guides out there to help make our bad rows mapping a little prettier in Kibana.

When we were on managed, we were able to pull out the different context properties and see individual values, like what the error was and which context was affected. Currently we can only see the entire payload and have to sift through the JSON to see what the issue is.

@ks107 , bad data has a different structure depending on the rejection reason and the point of rejection. Therefore, you cannot have a universal approach to all the bad data. Each bad data failure type has its own structure. The JSON schemas that describe those structures can be found in Iglu Central.
The visualization of the data in Kibana can be done as per their documentation by selecting the fields of interest - something like what is shown in their doc here. Additionally, you can use a search query to narrow down the results. For better visibility, take a look at the screenshot below (very basic).
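For example, a query like the one below in the search bar narrows the results down to adapter failures (a rough sketch only; the exact field names depend on the mapping you use and on the failure types you actually receive):

    schema:"iglu:com.snowplowanalytics.snowplow.badrows/adapter_failures/jsonschema/1-0-0" AND data.failure.messages:*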

Hi Ihor,

Thanks for the reply. I understand the different structures of the different rejection reasons.

However, even data.failure.messages is not shown as its own data field. It is currently inside data.payload_str, which is why I've asked for help here with parsing data.payload_str out into more usable and viewable fields.

We are using the same mapping provided here, but I do not see the same fields being parsed out as in your screenshot. My screenshot of the usable fields is below:
Screenshot 2023-05-15 at 9.21.56 AM

Then if I uncheck “hide missing fields” I get a lot of these fields that don’t seem to map to anything.

@ks107 , if you do not see all the fields, try refreshing the fields list. Note that the issue you are dealing with relates to a product that is not created or managed by Snowplow. Please refer to the Kibana documentation.

Hi @ihor, thanks. I did refresh and I see no change in my bad rows.

I understand that this is a Kibana/Elasticsearch issue; I was just looking for guidance on how to replicate the bad rows columns that we saw in our managed instance of Snowplow in our self-hosted instance, to make debugging easier on the team. I also wondered if anyone in the Discourse community has faced similar issues and how they went about solving them, and whether there is anything we missed in setting up our instance of Snowplow Mini that is causing our bad row data to show up like this rather than like the screenshot you initially sent.

@ks107 , to my knowledge, when you set up Snowplow Mini everything is done for you automatically - there is no need to create the mapping manually, as per the guides below:

@ihor Thanks for the links. That was my understanding as well, which is why I was confused as to why our bad rows columns look so different from the screenshots in the examples.
This is the example image from the usage guide, where you can see all the different properties of the payload. In my example above, I only have one long JSON string within data.payload_str, rather than data.payload_refererurl or data.failure.messages as shown in the screenshot below.

If everything is supposed to be mapped automatically, I'm trying to figure out why our bad row data is so limited to explore, unlike the example screenshots.

@ks107 , I do not see anything wrong with the last screenshot. Note that all the bad data displayed there is of the “adapter_failures” type. As such, the data representation (fields) corresponds to that type. Once you get bad data of a different type, the corresponding (new) fields will be added dynamically. If they do not appear in the fields panel once a new bad data type has been captured, you can refresh that list as I mentioned earlier.

@ihor, that last screenshot is from the link you sent above, so of course it has no issues. Please review my earlier posts for the screenshots of what I see in my bad rows instance. I will post them here again:
Screenshot 2023-05-15 at 9.21.56 AM

You'll notice that the available fields list does not match the fields from the sample screenshot.

@ihor Revisiting this very old thread, as we're still not able to see our bad rows in a way that makes them easy to debug.

I noticed that data.payload_str is mapped as “text” in the bad-mapping.json file. However, in our bad rows, data.payload_str contains all the different JSON objects that we'd like to have parsed out for better readability.

I can see these fields in our index pattern and in the JSON file above, but because they are all nested inside data.payload_str it seems they don't get parsed out properly, and therefore we get the limited options I've shown in my screenshot. Any ideas on how to fix this?
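For reference, this is roughly what that part of the mapping looks like (a trimmed sketch based on the bad-mapping.json linked earlier, not the full file):

    {
      "mappings": {
        "properties": {
          "data": {
            "properties": {
              "payload_str": { "type": "text" }
            }
          }
        }
      }
    }

So anything that ends up inside payload_str is indexed as one opaque text blob rather than as separate fields.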

@ks107 , the failed events are categorized depending on the nature of the failure. As such, they come with different data structures. These data structures are defined by the corresponding JSON schemas.

The fact that you see only payload_str (and not payload.body) leads me to believe that the failed events in question correspond to one of the following failed event types, where the payload is only available as “text” (i.e. a string). The reason is that the text cannot be parsed into JSON, as the failure took place before the events could be shaped into Snowplow events according to the Snowplow Event Specification (that is, data that still contains HTTP headers, for example).

If you can see your bad data in payload_str, can you also see which failed event type it belongs to? If it is one of those three, then you cannot get parsed JSON for such data.

Again, there are many different failed event data types. They do not have the same structure, and hence you might not be able to extract the properties in a uniform way. The property that should be present in all of them, however, is schema. It describes which failed event type the bad data belongs to.
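Its value follows this pattern (placeholders rather than literal values; the concrete type and version come from Iglu Central):

    "schema": "iglu:com.snowplowanalytics.snowplow.badrows/<failure_type>/jsonschema/<version>"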

@ihor thanks for the response. All of our errors (schema_violation, adapter_failure, enrichment_failures, etc.) are showing up as generic_error.

The schema at the beginning of the _source always shows schema:iglu:com.snowplowanalytics.snowplow.badrows/generic_error/jsonschema/1-0-0

Then the rest of the details are found in payload_str, where data.payload_str.schema denotes the error type (schema_violation, adapter_failures, etc.) at the very beginning of the string.
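Roughly, the documents look like this (a heavily trimmed, anonymized sketch; the inner type and version are just illustrative):

    {
      "schema": "iglu:com.snowplowanalytics.snowplow.badrows/generic_error/jsonschema/1-0-0",
      "data": {
        "payload_str": "{\"schema\":\"iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/...\",\"data\":{...}}"
      }
    }

So the “real” failed event is a JSON string stuffed inside the generic_error wrapper.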

Is there something we can fix on our end to not have all of these come through as generic_error to ensure they get parsed correctly?

Screenshot example:

@ks107 , the screenshot was enough for me to tell you this: that kind of error can be ignored. It is a product of HTTP requests sent to the collector endpoint “/akamai/sureroute-test-object.html”.

Most likely it was caused by a bot/crawler going over your collector domain (possibly probing for vulnerabilities). The Snowplow collector has a few default endpoints, such as “/com.snowplowanalytics.snowplow/tp2” for POST requests or “/r/tp2” for redirects. In other words, that bad event is not a Snowplow event, and hence it was categorized as “generic_error”. It can be safely ignored.

Sorry @ihor, yes, I know this specific error can be ignored; I only used it as an example because it didn't contain any company information. As I mentioned, it's happening on all of our other, important errors as well. Below is a screenshot of a schema_violation that is also coming through as a generic_error:

@ks107 , I need to see the whole failed event to comment on that. As it is listed under “generic_error”, it would still mean the event could not be recognized as a valid Snowplow event.

@ihor I’ve pasted below most of the error payload.

The data.payload_str clearly states that it's a schema_violation error, so I'm trying to understand why it's still being classified as a generic_error. Is there something in our setup that could be causing the mapping error?

Please note, this is happening for ALL of our bad row events. Every single one comes through as generic_error.

Hi @ks107 ,

generic_error is the bad row type emitted by the Elasticsearch loader when something goes wrong inside it.

NotTSV is an error from the analytics SDK when it can’t parse the enriched event.

So it seems that your ES loader is trying to read the bad rows as enriched events.

After reading the thread, it's not clear to me whether you are using Mini or your own ES loader instance, and whether you picked the mapping from the Mini repo?

If you’re running your own instance, I suspect that you have this field misconfigured.
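For context, in recent Elasticsearch Loader configs that field looks roughly like this (a sketch only; the exact key name and allowed values depend on the loader version you run):

    # config.hocon of the bad-rows Elasticsearch loader (sketch)
    # "purpose" tells the loader how to interpret incoming records:
    #   ENRICHED_EVENTS - parse them as enriched TSV events
    #   BAD_ROWS        - treat them as failed-event JSON and index them as-is
    purpose = "BAD_ROWS"

If the bad loader runs with the enriched-events behaviour, every bad row it reads fails to parse as TSV and gets wrapped in a generic_error.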

@BenB Thanks for the response.

Upon checking, we are using an older version of Mini. Our bad loader config.hocon looks identical to this, where the purpose property was not required at the time.

Could this be the cause?

In this file here

In the out object, the reference file has:
bad = BadEnrichedEvents

In our file we have:
bad = EnrichedEvents

Could this be the issue?

I think so! Can you try with bad = BadEnrichedEvents, please?
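For reference, this is roughly what the relevant part of the file should look like after the change (a sketch with surrounding keys omitted; the topic names are the Snowplow Mini defaults):

    out {
      # Failed events must go to the bad topic. With bad = EnrichedEvents they
      # end up in the good stream, where the ES loader rejects them as NotTSV
      # and wraps them in generic_error bad rows.
      bad = BadEnrichedEvents
    }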