A few days ago my pipeline suddenly broke down, with the following output:
D, [2017-02-08T00:00:19.333000 #16545] DEBUG -- : Waiting a minute to allow S3 to settle (eventual consistency)
D, [2017-02-08T00:01:19.337000 #16545] DEBUG -- : Initializing EMR jobflow
D, [2017-02-08T00:01:20.782000 #16545] DEBUG -- : EMR jobflow j-1OM63IVDKT8P2 started, waiting for jobflow to complete...
F, [2017-02-08T00:17:22.217000 #16545] FATAL -- :
Snowplow::EmrEtlRunner::EmrExecutionError (EMR jobflow j-1OM63IVDKT8P2 failed, check Amazon EMR console and Hadoop logs for details (help: Troubleshooting jobs on Elastic MapReduce · snowplow/snowplow Wiki · GitHub). Data files not archived.
au Snowplow ETL: TERMINATED_WITH_ERRORS [VALIDATION_ERROR] ~ elapsed time n/a [ - 2017-02-08 00:16:55 +0000]
- Elasticity S3DistCp Step: Shredded HDFS -> S3: CANCELLED ~ elapsed time n/a [ - ]
- Elasticity Scalding Step: Shred Enriched Events: CANCELLED ~ elapsed time n/a [ - ]
- Elasticity S3DistCp Step: Enriched HDFS _SUCCESS -> S3: CANCELLED ~ elapsed time n/a [ - ]
- Elasticity S3DistCp Step: Enriched HDFS -> S3: CANCELLED ~ elapsed time n/a [ - ]
- Elasticity Scalding Step: Enrich Raw Events: CANCELLED ~ elapsed time n/a [ - ]):
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:475:in `run'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:68:in `run'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:39:in `<main>'
    org/jruby/RubyKernel.java:973:in `load'
    uri:classloader:/META-INF/main.rb:1:in `<main>'
    org/jruby/RubyKernel.java:955:in `require'
    uri:classloader:/META-INF/main.rb:1:in `(root)'
    uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `<main>'
Error running EmrEtlRunner, exiting with return code 1. StorageLoader not run
D, [2017-02-08T06:00:11.778000 #17104] DEBUG -- : Staging raw logs...
F, [2017-02-08T06:00:13.640000 #17104] FATAL -- :
Snowplow::EmrEtlRunner::DirectoryNotEmptyError (Should not stage files for enrichment, processing bucket s3n://au-snowplow-analytics-etl/processing/ is not empty):
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/s3_tasks.rb:124:in `stage_logs_for_emr'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:51:in `run'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:39:in `<main>'
    org/jruby/RubyKernel.java:973:in `load'
    uri:classloader:/META-INF/main.rb:1:in `<main>'
    org/jruby/RubyKernel.java:955:in `require'
    uri:classloader:/META-INF/main.rb:1:in `(root)'
    uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `<main>'
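The second failure (DirectoryNotEmptyError at 06:00) looks like a consequence of the first: the 00:00 run reported "Data files not archived", so the next run presumably found the earlier files still sitting in the processing bucket. A minimal sketch for checking what is left there, assuming Python with boto3 and working AWS credentials; the helper name `leftover_keys` is hypothetical, and the bucket and prefix are taken from the error message above:

```python
def leftover_keys(s3_client, bucket, prefix, limit=20):
    """Return up to `limit` object keys found under s3://<bucket>/<prefix>.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=limit)
    # S3 omits the "Contents" field entirely when nothing matches the prefix.
    return [obj["Key"] for obj in response.get("Contents", [])]


# Example usage (assumption: boto3 is installed and credentials are configured):
#
#   import boto3
#   s3 = boto3.client("s3")
#   for key in leftover_keys(s3, "au-snowplow-analytics-etl", "processing/"):
#       print(key)
```

An empty result would mean the processing location is clear; any keys listed are what the staging step is objecting to.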
I don’t understand what that VALIDATION_ERROR means. In the EMR web console on AWS, all I see under Events is this failure, and all steps seem to have been cancelled:
Feb 8 01:16 AM Amazon EMR Cluster j-1OM63IVDKT8P2 (Au Snowplow ETL) has terminated with errors at 2017-02-08 00:16 UTC with a reason of VALIDATION_ERROR. j-1OM63IVDKT8P2 Cluster Cluster State Change CRITICAL February 8, 2017 at 01:16:58 AM (UTC+1)
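The event above only carries the reason code. A minimal sketch, assuming Python with boto3 and working AWS credentials, for reading the message that EMR stores alongside the VALIDATION_ERROR code; the helper name `state_change_reason` and the region in the usage comment are placeholders, and there is no guarantee the message is more detailed than the console event:

```python
def state_change_reason(emr_client, cluster_id):
    """Return (code, message) describing why the cluster last changed state.

    `emr_client` is expected to behave like boto3.client("emr").
    """
    cluster = emr_client.describe_cluster(ClusterId=cluster_id)["Cluster"]
    # StateChangeReason may be missing or empty, so fall back to None values.
    reason = cluster["Status"].get("StateChangeReason", {})
    return reason.get("Code"), reason.get("Message")


# Example usage (assumption: boto3 is installed; the region is a placeholder):
#
#   import boto3
#   emr = boto3.client("emr", region_name="eu-west-1")
#   code, message = state_change_reason(emr, "j-1OM63IVDKT8P2")
#   print(code, message)
```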
I can’t find anything useful in the job’s log folder on S3 (though I might be mistaken; I find it pretty hard to tell what the different log files are). Inside the job folder, only the node folder, with its log files and subdirectories, had been created before the job was terminated. This page doesn’t help much either, as I think it’s unfortunately quite outdated: Troubleshooting jobs on Elastic MapReduce · snowplow/snowplow Wiki · GitHub