EmrEtlRunner skip issues configuration

yali · July 21, 2016, 1:26pm

Hi Asaf,

When the EMR process runs, any data that cannot be successfully processed is written to the ‘bad’ bucket. It is normal to have several of these lines generated with each run. We have guides to using these bad rows to debug upstream issues here, here and here.

However, this process is non-blocking: bad rows do not stop the job, so they will not be the reason your EMR job is failing.

When the EMR job fails, you should get an error message back from EmrEtlRunner. In addition, it should be possible to look in the AWS EMR console and see an error message there.
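
If the console is awkward to dig through, the same failure details can usually be pulled programmatically. The following is only a rough sketch, assuming Python with boto3 and an EMR client created via `boto3.client("emr")`; the function name `failed_step_details` and the cluster ID are hypothetical placeholders, and it only reads the first page of results.

```python
def failed_step_details(emr_client, cluster_id):
    """Collect failure details for the FAILED steps of one EMR cluster.

    emr_client is assumed to behave like boto3.client("emr").
    Only the first page of list_steps results is read (no pagination).
    """
    response = emr_client.list_steps(
        ClusterId=cluster_id,
        StepStates=["FAILED"],
    )
    details = []
    for step in response.get("Steps", []):
        # FailureDetails may be absent, so fall back to an empty dict.
        failure = step.get("Status", {}).get("FailureDetails", {})
        details.append({
            "name": step.get("Name"),
            "reason": failure.get("Reason"),
            "message": failure.get("Message"),
            "log_file": failure.get("LogFile"),
        })
    return details

# Example use (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   emr = boto3.client("emr")
#   for d in failed_step_details(emr, "j-XXXXXXXXXXXXX"):  # placeholder cluster ID
#       print(d)
```

The `log_file` value, when present, points at the step log that is most useful to share alongside the EmrEtlRunner output.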

If you can share those error messages with us, we’ll be in a better position to help you diagnose the root cause of the failures.

All the best,

Yali
