How to collect all the bad rows & build a complete bad rows classification

I have built a bad rows pipeline with the Google Cloud Storage loader. For now, I use the GCS loader to consume only the enriched-bad topic in Pub/Sub, and I have only received five kinds of errors: schema_violations, adapter_failures, enrichment_failures, collector_payload_format_violation, and tracker_protocol_violations. But I see eleven errors listed in this repo (snowplow-badrows-tables/bigquery at master · snowplow-incubator/snowplow-badrows-tables · GitHub). I am also not quite sure how I can collect all the existing bad rows in my pipeline. Do I need to set up a separate Dataflow job for each topic, like bad, bq-bad-rows, and bq-failed-inserts?

Another question is about how to classify all the error types into the four categories shown in the image below.

I found this image online, and it appears to be the official classification. But I only found six errors on the page (maybe some errors are collapsed). Is there any way to find the complete classification and its detailed subdivisions?


The full list can be found in the docs here.

I’m making a guess as to how to categorise them, but I’d probably go with:

adapter_failures (collector)
collector_payload_format_violation (collector)
enrichment_failures (enrichment)
generic_error (any)
loader_iglu_error (destination)
loader_parsing_error (destination)
loader_recovery_error (recovery)
loader_runtime_error (destination)
recovery_error (recovery)
relay_failure (destination)
schema_violations (enrichment)
size_violation (size validation / collector)
snowflake_error (destination)
tracker_protocol_violations (tracker protocol)

Note that each of these schemas has a processor.artifact field associated with it, which will tell you which component raised the failed event.
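If it helps, here is a minimal sketch of how you could classify incoming bad rows programmatically. It assumes the bad row is self-describing JSON whose schema key names the failure type (e.g. `iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0`) and whose `data.processor.artifact` names the emitting component; the category mapping is just my guess from the list above, not an official Snowplow classification.

```python
import json

# Guessed mapping from failure schema name to pipeline stage --
# this mirrors the list above and is not an official classification.
CATEGORY_BY_SCHEMA = {
    "adapter_failures": "collector",
    "collector_payload_format_violation": "collector",
    "enrichment_failures": "enrichment",
    "generic_error": "any",
    "loader_iglu_error": "destination",
    "loader_parsing_error": "destination",
    "loader_recovery_error": "recovery",
    "loader_runtime_error": "destination",
    "recovery_error": "recovery",
    "relay_failure": "destination",
    "schema_violations": "enrichment",
    "size_violation": "collector",
    "snowflake_error": "destination",
    "tracker_protocol_violations": "tracker protocol",
}

def classify(bad_row_json: str):
    """Return (failure type, guessed category, raising component).

    Assumes self-describing JSON: the failure type is the second
    segment of the schema URI, and the emitting component sits at
    data.processor.artifact.
    """
    row = json.loads(bad_row_json)
    schema_name = row["schema"].split("/")[1]
    artifact = row["data"].get("processor", {}).get("artifact", "unknown")
    return schema_name, CATEGORY_BY_SCHEMA.get(schema_name, "unknown"), artifact

# Hypothetical example payload for illustration.
example = json.dumps({
    "schema": "iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0",
    "data": {"processor": {"artifact": "snowplow-enrich-pubsub", "version": "3.0.0"}},
})
print(classify(example))
```

You could run something like this in a small consumer on each bad-row topic (or over the files the GCS loader writes) to bucket failures by stage before deciding which ones need recovery.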