I’ve had the batch pipeline running for over a week now, and the other day EmrEtlRunner began failing with the following message:
`D, [2016-06-18T00:05:07.233000 #8] DEBUG -- : Waiting a minute to allow S3 to settle (eventual consistency)
D, [2016-06-18T00:06:07.237000 #8] DEBUG -- : Initializing EMR jobflow
F, [2016-06-18T00:06:09.814000 #8] FATAL -- :
Snowplow::EmrEtlRunner::UnmatchedLzoFilesError (Processing bucket contains 4775 .lzo and .lzo.index files, expected an even number):
/usr/local/bin/snowplow-emr-etl-runner!/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:99:in initialize'
/usr/local/bin/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/method_reference.rb:46:in send_to'
/usr/local/bin/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts.rb:305:in call_with'
/usr/local/bin/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/decorators.rb:159:in common_method_added'
/usr/local/bin/snowplow-emr-etl-runner!/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:67:in run'
/usr/local/bin/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/method_reference.rb:46:in send_to'
/usr/local/bin/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts.rb:305:in call_with'
/usr/local/bin/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/decorators.rb:159:in common_method_added'
file:/usr/local/bin/snowplow-emr-etl-runner!/emr-etl-runner/bin/snowplow-emr-etl-runner:39:in (root)'
org/jruby/RubyKernel.java:1091:in load'
file:/usr/local/bin/snowplow-emr-etl-runner!/META-INF/main.rb:1:in (root)'
org/jruby/RubyKernel.java:1072:in require'
file:/usr/local/bin/snowplow-emr-etl-runner!/META-INF/main.rb:1:in (root)'
/tmp/jruby5459262531968505237extract/jruby-stdlib-1.7.20.1.jar!/META-INF/jruby.home/lib/ruby/shared/rubygems/core_ext/kernel_require.rb:1:in (root)'
Error running EmrEtlRunner, exiting with return code 1. StorageLoader not run`
I can clear out the events staged for processing and it will run normally again for a while, but eventually the same error always comes back. I don’t want to lose a day’s worth of data each time this happens. What is causing this, and how can I fix it?
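For anyone trying to narrow this down: the error only reports that the total count is odd, so presumably at least one `.lzo` or `.lzo.index` file is missing its partner. Below is a minimal Python sketch for spotting the unpaired files. It assumes the keys in the processing bucket have already been listed into a plain list, and that each index file is named after its data file with `.index` appended. The function name and the sample keys are hypothetical, purely for illustration.

```python
def find_unpaired_lzo(keys):
    """Return (lzo_without_index, index_without_lzo) for a list of object keys.

    Assumes an index file is named exactly "<data file>.index",
    e.g. "part-0001.lzo" pairs with "part-0001.lzo.index".
    """
    key_set = set(keys)
    # Data files whose matching index file is absent.
    lzo_without_index = sorted(
        k for k in key_set
        if k.endswith(".lzo") and k + ".index" not in key_set
    )
    # Index files whose matching data file is absent.
    index_without_lzo = sorted(
        k for k in key_set
        if k.endswith(".lzo.index") and k[: -len(".index")] not in key_set
    )
    return lzo_without_index, index_without_lzo


if __name__ == "__main__":
    # Hypothetical key names, not taken from a real bucket.
    sample = [
        "processing/part-0001.lzo",
        "processing/part-0001.lzo.index",
        "processing/part-0002.lzo",
    ]
    missing_index, missing_data = find_unpaired_lzo(sample)
    print("data files without an index:", missing_index)
    print("index files without a data file:", missing_data)
```

If this turns up a single orphan, moving just that file (or its partner) out of the processing location might be enough to get a run through without discarding the whole day, though that does not explain why the pair ends up split in the first place.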