S3 Extended Request ID


To investigate my issue I pulled the cluster details and found that every cluster failed to execute the step, and all of them failed with the same error:

com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Slow Down (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down; Request ID: B45741D03; S3 Extended Request ID: ZK7GWdk03GRanA0EGKSQBYV48PxSkWQgepd2ke795DPLxliAiaYPwF7kIj1q+=), S3 Extended Request ID: ZK7GWdk03GRanA0EGKSQBYV48PxSkWQgepd2ke795v97zIYs=

Looking further into the same step function error logs, I also found the following error:

Error: java.lang.RuntimeException: Reducer task failed to copy 488 files: s3://wogaa-snowplow-production-sentiments-kinesis/2020-01-15-4960066067674028270618312387666095132115106-496006606767402827061831029376723819246242.gz etc

Taking it further, I found that your s3-dist-cp job launched 31 reducers, of which 13 passed and the rest failed. This is because the s3-dist-cp command launches as many reducers as possible to speed up the copy. This is usually effective in getting the copy done as soon as possible. However, when the EMR cluster is big, you can quickly reach the API rate limit imposed by S3, which is described in the AWS documentation. When you copy data from HDFS to S3, the corresponding rate limit is 3,500 PUT requests per second per prefix.

Can I reduce the request rate by limiting the number of reducers writing to S3 with the property -Dmapreduce.job.reduces=X? Reducing the number of reducers might slow the job down, but it could help it complete successfully without any S3 issues.
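
For illustration, here is a minimal Python sketch of how the argument list for such an s3-dist-cp call could be assembled with a capped reducer count. The helper name `build_s3distcp_args`, the HDFS path, the bucket name, and the value 10 are all hypothetical; only `-Dmapreduce.job.reduces`, `--src`, and `--dest` come from the discussion above or from s3-dist-cp itself, and placing the `-D` property before the other options is an assumption.

```python
def build_s3distcp_args(src, dest, max_reducers):
    """Build an s3-dist-cp argument list that caps the number of reducers.

    Hypothetical helper for illustration; it only assembles strings and
    does not run anything on a cluster.
    """
    if max_reducers < 1:
        raise ValueError("max_reducers must be at least 1")
    return [
        "s3-dist-cp",
        # Fewer reducers means fewer concurrent writers to S3.
        f"-Dmapreduce.job.reduces={max_reducers}",
        "--src", src,
        "--dest", dest,
    ]


# Placeholder paths, not taken from the failing job.
args = build_s3distcp_args(
    "hdfs:///local/snowplow/shredded-events/",
    "s3://example-bucket/shredded/good/",
    max_reducers=10,
)
print(" ".join(args))
```

The right reducer count depends on the cluster size and on how the target keys are spread across prefixes, so any value would need to be tuned by trial.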

Or do you have a better solution for this? This has been a very painful issue for me for the past 2 weeks.

@buddhi_weragoda, you can control the size of the files being written to S3 by tweaking the S3 Loader config. Also, when moving shredded files from HDFS to S3, you can enable consolidate_shredded_output to ensure fewer files are produced by the shredding step and thus prevent the AmazonS3Exception.

Thanks a lot for your reply. It works after setting consolidate_shredded_output to true.

Hi, Ihor,

Is there a possibility to only apply consolidation on shredded-types?
I am asking since setting “consolidate_shredded_output: true” also tries to consolidate both enriched events and atomic-events (in my case s3DistCp on enriched events takes more time than the enrichment itself).

@Aurimas_Griciunas, consolidate_shredded_output affects only shredded data (consolidate - fewer files in output - and compress). Compression (no consolidation) of the enriched data is affected by

  output_compression: GZIP

@ihor I don’t think that’s the case, since the s3distcp tool is being run with the “--groupBy .*(part-)\d+-(.*) --targetSize 24” flags, which means that the enriched data is also being consolidated. We only just upgraded to an EmrEtlRunner that supports consolidation, and the “[enrich] spark: Enriched HDFS -> S3” step now runs longer than the enrichment itself. Previously, when we only had “--srcPattern .*part-.*” without consolidation, s3distcp finished in about 1 minute. Am I missing something?
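
To illustrate why a groupBy pattern implies consolidation, here is a rough Python sketch of the grouping idea: the capture groups of the pattern are concatenated into a key, and every file that yields the same key ends up in the same output file. This is a simplified model and not S3DistCp's actual implementation. The pattern is my reading of the flags as they appear in the step, and the file names are made up.

```python
import re
from collections import defaultdict

# Assumed pattern, reconstructed from the flags quoted in this thread.
GROUP_BY = re.compile(r".*(part-)\d+-(.*)")


def group_files(paths):
    """Group paths by the concatenation of the pattern's capture groups.

    Paths that do not match the pattern are left out of the result.
    """
    groups = defaultdict(list)
    for path in paths:
        match = GROUP_BY.fullmatch(path)
        if match is None:
            continue
        key = "".join(match.groups())
        groups[key].append(path)
    return dict(groups)


# Hypothetical enriched output files.
paths = [
    "hdfs:///enriched/run=2020-01-15/part-00000-abc.csv",
    "hdfs:///enriched/run=2020-01-15/part-00001-abc.csv",
    "hdfs:///enriched/run=2020-01-15/_SUCCESS",
]
groups = group_files(paths)
for key, members in groups.items():
    print(key, len(members))
```

Both part files share the key "part-abc.csv" because the numeric index sits outside the capture groups, so in this model they would be merged into one output file, which is the extra work a plain srcPattern copy does not do.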