Hi,
With snowplow-emr-etl-runner-r117, our ETL job is failing at the “[shred] spark: Shred Enriched Events” step. Many *.gz files are left behind in S3 under enriched/good/run=2021-06-01-08-30-14/stream/.
stderr for the failed step doesn’t offer much of a clue:
21/06/01 16:07:05 INFO Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: 10.0.1.79
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1622563598944
    final status: FAILED
    tracking URL: http://ip-10-0-1-129.ec2.internal:20888/proxy/application_1622563416514_0002/
    user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1622563416514_0002 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1104)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1150)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/06/01 16:07:05 INFO ShutdownHookManager: Shutdown hook called
When I try to restart the job with “-f shred” I get the same error.
I am trying to troubleshoot this service, which was installed by Someone Who Is No Longer With The Company, so I am really groping in the dark here.
Is there another place I should be looking for more informative error logs?
Any advice appreciated.
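
The stderr above only comes from the YARN client, so the underlying failure is probably recorded in the container logs of the application itself. As a rough sketch (not a confirmed fix), the snippet below pulls the application ID out of that stderr and prints the matching `yarn logs -applicationId` command. The `STDERR` sample and the helper name `yarn_log_commands` are only illustrative. It also assumes the command is run on the EMR master node while the cluster is still up and that YARN log aggregation is enabled.

```python
import re

# Two lines copied from the step's stderr; in practice, read the whole file.
STDERR = """\
tracking URL: http://ip-10-0-1-129.ec2.internal:20888/proxy/application_1622563416514_0002/
Exception in thread "main" org.apache.spark.SparkException: Application application_1622563416514_0002 finished with failed status
"""

# YARN application IDs look like application_<cluster timestamp>_<sequence>.
APP_ID = re.compile(r"application_\d+_\d+")


def yarn_log_commands(stderr_text):
    """Return one `yarn logs` command per distinct application ID found.

    Hypothetical helper for illustration; it only builds the command strings
    and does not run anything.
    """
    app_ids = sorted(set(APP_ID.findall(stderr_text)))
    return [f"yarn logs -applicationId {app_id}" for app_id in app_ids]


if __name__ == "__main__":
    for cmd in yarn_log_commands(STDERR):
        print(cmd)
```

The output of that command is usually long, so redirecting it to a file and searching for the first "Exception" or "ERROR" line from the driver container is probably the quickest way in.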