I downloaded snowplow_emr_r89_plain_of_jars.zip and tried to use JDecompiler to get the classes in snowplow-emr-etl-runner.jar. I can see the file names in the left pane of JD, but the right pane shows no contents for any of those files.
Thanks for any help,
Hi @RichardJ - what is your goal here?
Sometimes our data volume is much larger than usual, and the pipeline doesn’t seem to scale up well (at least in certain steps - and that might not be a Snowplow problem but rather our configuration). Right now emr-etl-runner/storage-loader is basically a black box to us. We rely on the Jenkins logs to guess what’s going on there. If we had all the source code, we would know exactly what each step of the pipeline is doing and what each log line really means; then we might be able to skip or tune certain steps to get the performance we need.
In case you didn’t know, both EmrEtlRunner and StorageLoader are open-source applications, so you don’t need to reverse-engineer them.
More importantly, they are just thin wrappers and don’t do any of the heavy lifting. All of your data is processed on the EMR cluster, which you need to scale manually according to your needs.
I know it’s an open-source application, but I’m still wondering why JD did not decompile it.
That’s a question for the JD community.
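Independent of JD, one quick check is to look at what the jar actually contains: a jar is a zip archive, so its entries can be listed and counted by file extension. The following is only a minimal sketch, assuming the jar is on local disk; the function name `count_entries_by_extension` is hypothetical and not part of any Snowplow or JD tooling.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_entries_by_extension(jar_path):
    """Return a Counter mapping file extension -> number of file entries in the archive."""
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            suffix = PurePosixPath(info.filename).suffix.lower()
            counts[suffix or "(none)"] += 1
    return counts


# Example call (the path is whatever jar you extracted from the zip):
# for ext, n in count_entries_by_extension("snowplow-emr-etl-runner.jar").most_common():
#     print(ext, n)
```

If the archive holds few or no `.class` entries, a Java decompiler has little it can display, whatever the file names in the tree look like.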