I use the Terraform modules for the open-source Snowplow pipeline, and I recently noticed that none of my bad events were loading into my Postgres database. They did load into S3 (which is where I ended up going to look), but I wanted to track down why this was happening.
I hadn’t updated the Postgres loaders, but the EC2 instances had been rebuilt in the past week, so it wasn’t a case of needing to restart them. My bad-events Kinesis stream was set to 24 hours of data retention, so if it were a single poison event, it should have cleared by now.
I found that the last bad event had loaded into the database on 6 January 2022, just before I updated the collector to patch the Log4j vulnerability, bringing it up to version 2.4.5.
I then saw an error in the logs when forcing a bad event with a curl request:
[ioapp-compute-1] ERROR c.s.s.postgres.streaming.Sink - Failed StatementExecution: ERROR: value too long for type character varying(32)
Okay, so I took a look at all the tables in atomic_bad to see what had a 32-character limit: payload collector, encoding, processor.version, payload raw loader name… I then downloaded the raw event JSON from S3 and ran it through prettier.
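If you want to hunt these columns down yourself rather than eyeballing each table, something like this against Postgres’s standard information_schema view should work (atomic_bad is the schema name the loader uses in my setup; adjust if yours differs):

```sql
-- List every varchar(32) column in the atomic_bad schema
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_schema = 'atomic_bad'
  AND data_type = 'character varying'
  AND character_maximum_length = 32;
```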
Samples from a couple of bad events:
whereas it used to be ssc-2.3.1-kinesis (taken from the database).
So that’s the culprit: a schema violation. The collector name is now more than 32 characters long.
It’s not a huge deal; it just makes it harder to catch schema violations when testing on iOS (which is what I was doing). However, it is broken and will affect everyone on v2.4.5 of the collector.
So I can personally just create a migration to fix the offending rows, but I figured (a) someone else might be in the same situation if they keep their modules as obsessively up to date as I do, and (b) it should probably get an official bug fix.
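For anyone in the same boat, here’s a sketch of the sort of migration I mean, assuming the problem column turns out to be a varchar(32) in one of the atomic_bad tables. The table and column names below are placeholders, not the loader’s real identifiers, so substitute whatever your own schema shows:

```sql
-- Hypothetical workaround: widen the offending varchar(32) column so that
-- new bad events fit. Replace the placeholder table/column names with the
-- ones that actually hit the 32-character limit in your atomic_bad schema.
ALTER TABLE atomic_bad.example_bad_rows_table
  ALTER COLUMN "collector" TYPE varchar(64);
```

Widening the column only fixes loading going forward; any bad events that already failed would still need to be replayed from the stream or from the S3 copies.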