Hey @Frank_Fineis!
It looks like RDB Loader “fails” while dogfooding its monitoring details back to Snowplow. When Loader finishes its work, it reports back to the Snowplow collector you have configured in the `monitoring.snowplow` section of your `config.yml` file, and it seems that the collector is not responding in time. This reporting is completely optional and doesn’t carry much useful information, so unless you specifically need it, my guess is that the `monitoring.snowplow` section was configured by mistake.
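If you don’t need that reporting, commenting the section out should be enough. A rough sketch of what I mean is below. The key names and placeholder values are assumptions based on the sample `config.yml` as I remember it, so match them against your own file rather than copying this verbatim:

```yaml
monitoring:
  tags: {}
  logging:
    level: DEBUG
  # Assumed layout: commented out so the Loader has no collector to report to
  # snowplow:
  #   method: get
  #   app_id: ADD HERE
  #   collector: ADD HERE
```

Everything else under `monitoring` stays as it is; only the `snowplow` block goes away.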
What confuses me most is that this is reported as an error, when it clearly should be just a warning.
I’ll investigate whether this is still the case in the latest 1.0.0 version, but very likely it is not. There have been a lot of changes since the version you use (presumably R32). An important detail is that RDB Loader no longer runs on an EMR cluster (and it has never been a Spark job, btw, so setting a timeout wouldn’t help anyway). I think you might want to consider upgrading to the latest version: