S3 Extended Request ID

buddhi_weragoda · January 17, 2020, 3:30am

Hi,

To investigate my issue I pulled the cluster details and found that every cluster failed to execute the step, and all of them failed with the same error:

com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Slow Down (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down; Request ID: B45741D03; S3 Extended Request ID: ZK7GWdk03GRanA0EGKSQBYV48PxSkWQgepd2ke795DPLxliAiaYPwF7kIj1q+=), S3 Extended Request ID: ZK7GWdk03GRanA0EGKSQBYV48PxSkWQgepd2ke795v97zIYs=

Looking further into the same step function error logs, I also found the following error:

Error: java.lang.RuntimeException: Reducer task failed to copy 488 files: s3://wogaa-snowplow-production-sentiments-kinesis/2020-01-15-4960066067674028270618312387666095132115106-496006606767402827061831029376723819246242.gz etc

Digging further, I found that the s3-dist-cp job launched 31 reduce tasks, of which 13 passed and the rest failed. This is because the s3-dist-cp command launches as many reducers as possible to speed up the copy. That is usually effective in getting the copy done as soon as possible. However, when the EMR cluster is big, you can quickly reach the API rate limit imposed by S3, which is described in the AWS documentation. When you copy data from HDFS to S3, the relevant limit is 3,500 PUT requests per second per prefix.

Can I reduce the request rate by limiting the number of reducers writing to S3 with the property -Dmapreduce.job.reduces=X? Fewer reducers might slow down the job, but it could help it complete successfully without any S3 issues.

Or do you have a better solution for this? This has been a very painful issue for me for the past 2 weeks.
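As a rough way to pick X, here is a back-of-the-envelope sketch in Python. It assumes each reducer issues a roughly constant number of PUT requests per second; that per-reducer figure (250 below) and the safety factor are hypothetical values you would have to measure or choose yourself, and neither helper function is part of any EMR tooling.

```python
import math

# Per-prefix PUT limit quoted in the post above.
S3_PUT_LIMIT_PER_SECOND = 3500


def max_reducers(puts_per_reducer_per_second: float, safety_factor: float = 0.5) -> int:
    """Largest reducer count whose combined PUT rate stays under the budget.

    safety_factor is an assumption: it leaves headroom for other writers
    hitting the same prefix and for bursts.
    """
    if puts_per_reducer_per_second <= 0:
        raise ValueError("per-reducer PUT rate must be positive")
    budget = S3_PUT_LIMIT_PER_SECOND * safety_factor
    # Always allow at least one reducer, otherwise the job cannot run at all.
    return max(1, math.floor(budget / puts_per_reducer_per_second))


def reducer_property(reducers: int) -> str:
    """Format the Hadoop property mentioned in the post."""
    return f"-Dmapreduce.job.reduces={reducers}"


# Hypothetical measurement: each reducer issues about 250 PUTs per second.
print(reducer_property(max_reducers(250)))  # prints -Dmapreduce.job.reduces=7
```

The estimate is only a starting point: the real per-reducer rate depends on file sizes and instance types, so it is worth lowering X further if 503 Slow Down errors persist.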

ihor · January 18, 2020, 12:18am

@buddhi_weragoda, you can control the size of the files being written to S3 by tweaking the S3 Loader config. Also, when moving shredded files from HDFS to S3, you can enable consolidate_shredded_output so that the shredding step produces fewer files, which helps avoid the AmazonS3Exception.

buddhi_weragoda · January 23, 2020, 1:47am

Thanks a lot for your reply. It works after setting consolidate_shredded_output to true.

Aurimas_Griciunas · June 25, 2020, 2:38pm

Hi, Ihor,

Is it possible to apply consolidation only to shredded types?
I am asking because setting “consolidate_shredded_output: true” also consolidates both enriched events and atomic-events (in my case s3DistCp on enriched events takes more time than the enrichment itself).

ihor · June 25, 2020, 3:46pm

@Aurimas_Griciunas, consolidate_shredded_output affects only shredded data (it consolidates, meaning fewer files in the output, and compresses). Compression (without consolidation) of the enriched data is controlled by

enrich:
  output_compression: GZIP

Aurimas_Griciunas · June 25, 2020, 3:54pm

@ihor I don’t think that’s the case, since the s3distcp tool is being run with the flags “--groupBy .*(part-)\d+-(.*) --targetSize 24”, which means that the data is also being consolidated. We only just upgraded to an EmrEtlRunner version that supports consolidation, and the “[enrich] spark: Enriched HDFS -> S3” step now runs longer than the enrichment itself. Before, when we only had “--srcPattern .*part-.*” without consolidation, s3distcp finished in a minute or so. Am I missing something?
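To illustrate why those flags imply consolidation, here is a minimal Python sketch of how a --groupBy pattern buckets files, assuming the pattern is `.*(part-)\d+-(.*)` and that the capture groups are concatenated to name each output file. The file paths are made up for illustration, and the sketch does not model --targetSize or the actual copy.

```python
import re
from collections import defaultdict

# Assumed --groupBy pattern from the post above.
GROUP_BY = re.compile(r".*(part-)\d+-(.*)")


def group_key(path):
    """Return the concatenated capture groups, or None if the path does not match."""
    match = GROUP_BY.fullmatch(path)
    if match is None:
        return None
    return "".join(match.groups())


def plan_groups(paths):
    """Bucket paths by group key; non-matching paths are skipped in this sketch."""
    groups = defaultdict(list)
    for path in paths:
        key = group_key(path)
        if key is not None:
            groups[key].append(path)
    return dict(groups)


# Hypothetical enriched output files on HDFS.
files = [
    "hdfs:///enriched/part-00000-abc.txt",
    "hdfs:///enriched/part-00001-abc.txt",
    "hdfs:///enriched/part-00002-abc.txt",
    "hdfs:///enriched/_SUCCESS",
]

for key, members in plan_groups(files).items():
    print(key, len(members))  # prints: part-abc.txt 3
```

Because the numeric part index sits outside the capture groups, every part file from the same run maps to the same key, so the files are merged rather than copied one by one. That would be consistent with the step taking longer than a plain --srcPattern copy.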
