EmrEtlRunner Config / environment variables not recognized

Hi @dadasami,

Does "RDB Loader R35" offer the same functionality?

Yes.

EmrEtlRunner used to be Snowplow's orchestration tool. Back in the day it was responsible for launching enrichment, shredding, loading, archiving, etc., which was very sensible in the batch-first world of 2014.

But as the pipeline became more stream-oriented and less batch-oriented, we moved data collection and enrichment out into separate components and made them independent of EmrEtlRunner. The last functionality that could not work without EmrEtlRunner was shredding and loading into Redshift, and in R35 we made that independent as well.

To put it another way: if you set up EmrEtlRunner, you will still have to set up RDB Loader and Shredder from R34 (or earlier), although EmrEtlRunner is responsible for launching them. So you’d have EmrEtlRunner + Shredder + Loader R34. But if you go with R35 you can remove EmrEtlRunner from that chain. Not to mention that the R35 setup is cheaper and more efficient.

One problem I foresee, though, is that RDB Shredder and Loader accept their configuration as base64-encoded strings, so every setting has to be written out literally and you cannot reference environment variables.
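
One possible workaround (not something RDB Loader provides itself) is to substitute the environment variables yourself before encoding. Here is a minimal Python sketch, assuming a template file with `${VAR}`-style placeholders and assuming standard base64 is what the apps expect. The function name `render_and_encode` and the template fields are hypothetical. Any literal `$` in the template has to be escaped as `$$`.

```python
import base64
import os
from string import Template


def render_and_encode(template_text, env=None):
    """Fill ${VAR} placeholders from the environment, then base64-encode.

    Hypothetical helper, not part of any Snowplow tooling.
    Raises KeyError if a placeholder has no matching variable.
    """
    env = os.environ if env is None else env
    rendered = Template(template_text).substitute(env)
    # Assumption: standard base64 alphabet, output as one line.
    return base64.b64encode(rendered.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Hypothetical template fragment; real config fields will differ.
    template = '{"username": "${DB_USER}", "password": "${DB_PASSWORD}"}'
    fake_env = {"DB_USER": "loader", "DB_PASSWORD": "s3cret"}
    print(render_and_encode(template, fake_env))
```

The printed string is what you would pass wherever the base64-encoded config is expected. The secrets then live only in the environment of the machine that renders the template.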
