i don’t get this. we had that AWS Redshift outage last night for an hour, so EMR failed at the load step. usually i can just run this command to reload the data stuck in the shredded folder, but this time it’s stuck on enriched, and the good folder it’s complaining about is right there in the enriched folder. also i remember an older release wrote a file to S3 that listed which files had been put into enriched processing. if i could find that file again i could just start over and re-process those files from scratch, but i can’t find it anymore.
@anton or @alex, any ideas? i was going to rerun from scratch if i knew which files enrichment had processed, so i could re-process them, but i can’t find the list. the other problem is that DynamoDB already has the shredded items on file from going through the shred step.
Sorry, I’m not sure I entirely follow the problem, but what command are you using to run recovery? If your previous run failed at the load step, then you should try this:
./snowplow-emr-etl-runner --resume-from load ...
This should load data from shredded good and then archive both enriched and shredded.
just to close this out: i had no choice but to clean out the shredded and enriched folders, move the raw files in the archive directory back into processing, and then run with --skip staging. i also had to disable DynamoDB (cross-batch dedupe) for that batch since those events had already been run through it.
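for anyone hitting the same thing later, the rough sequence was something like this (the S3 paths below are just placeholders, use whatever is in your config.yml):

# clear out the half-processed enriched and shredded good folders
aws s3 rm s3://ga-snowplow-production/snowplow-enriched/good/ --recursive
aws s3 rm s3://ga-snowplow-production/snowplow-shredded/good/ --recursive
# move the already-archived raw files back into processing (paths are placeholders)
aws s3 mv s3://ga-snowplow-production/archive/raw/ s3://ga-snowplow-production/processing/ --recursive
# re-run, skipping staging since the raw files are already in processing
./snowplow-emr-etl-runner --skip staging ...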
btw our dev cluster had the same exact problem as prod.
E, [2018-06-01T11:59:16.549000 #7229] ERROR -- : No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found
This error occurs when EmrEtlRunner can’t extract the latest run ID because of the large number of empty *$folder$ placeholder files left in the snowplow-enriched and snowplow-shredded buckets. These files are left behind by the S3DistCp step. There’s an open issue (#3439) to add a maintenance step for cleaning them up.
For now, you can create a script to remove these files regularly, or remove them manually with the aws s3 rm command when the need arises. Once the files are removed, --resume-from rdb_load should work as expected.
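For example, something along these lines should clear them out (the enriched bucket is taken from your error above; the shredded path is an assumption, adjust to your config.yml; run with --dryrun first to confirm only the empty placeholder files match):

# preview which keys would be deleted
aws s3 rm s3://ga-snowplow-production/snowplow-enriched/good/ --recursive --exclude '*' --include '*$folder$' --dryrun
# then remove them for real in both buckets
aws s3 rm s3://ga-snowplow-production/snowplow-enriched/good/ --recursive --exclude '*' --include '*$folder$'
aws s3 rm s3://ga-snowplow-production/snowplow-shredded/good/ --recursive --exclude '*' --include '*$folder$'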