We are attempting to reprocess events (not bad events) from the Clojure collector, and are running into trouble.
We created a new S3 bucket and set it as the ‘in’ bucket in the EmrEtlRunner config, in order to reprocess a few months' worth of raw Clojure events from the archive. Now that the Clojure files are being renamed per the R91 release (https://snowplowanalytics.com/blog/2017/08/17/snowplow-r91-stonehenge-released-with-important-bug-fix/), we keep receiving the following error:
ERROR FileFormatWriter: Aborting job null. java.io.IOException: Not a file:
The bucket structure after the staging has been completed looks like the following:
run=2018-08-28-12-00-00
i-1/
var_log_tomcat8_rotated_localhost_access_log.txt1502463662.gz
i-2/
var_log_tomcat8_rotated_localhost_access_log.txt1502467262.gz
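To make the layout above easier to check across many runs, here is a minimal Python sketch that flags keys sitting more than one level below a run= folder. It assumes the i-1/ and i-2/ folders are nested inside the run= folder, as the listing suggests, and that this nesting is related to the "Not a file" error, which is only a guess. The function name nested_keys_by_run is hypothetical, and the keys are typed in by hand rather than listed from S3.

```python
from collections import defaultdict


def nested_keys_by_run(keys):
    """Group keys that sit more than one level below a run= folder.

    Hypothetical helper: it only inspects key strings and does not talk to S3.
    """
    nested = defaultdict(list)
    for key in keys:
        parts = key.split("/")
        for i, part in enumerate(parts):
            if part.startswith("run="):
                # Segments after the run folder: [file] is flat,
                # [folder, file] means the file is nested in a subfolder.
                remainder = [p for p in parts[i + 1:] if p]
                if len(remainder) > 1:
                    nested[part].append(key)
                break
    return dict(nested)


# Keys copied from the listing above (nesting under run= is an assumption).
keys = [
    "run=2018-08-28-12-00-00/i-1/var_log_tomcat8_rotated_localhost_access_log.txt1502463662.gz",
    "run=2018-08-28-12-00-00/i-2/var_log_tomcat8_rotated_localhost_access_log.txt1502467262.gz",
]

for run, found in nested_keys_by_run(keys).items():
    print(run, "has", len(found), "file(s) inside subfolders")
```

If the function returns an empty dict for a run, every file in that run sits directly under the run= folder.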
The ‘in’ setting for the bucket looks like:
s3n://bucket-name/
Archive folders that were created prior to the R91 release do not trigger this issue. Is this the expected behavior?