Snowplow R90 Lascaux released

We’re tremendously proud to announce the release of Snowplow 90 Lascaux:

This release introduces RDB Loader, a new EMR step that replaces the StorageLoader JRuby app, as well as several enhancements to EmrEtlRunner.


Hey @anton!

Quick question - since RDB Loader runs exclusively from the master node, does it make sense for the job to automatically kill the (expensive) core nodes after Shred completion? Is this possible at all?



Hey @bernardosrulzon,

This could be quite a good idea. We hadn’t considered it.

One problem, though, is that S3DistCp always has to be launched right after RDB Loader, and from my understanding it works on the core nodes. Also, if this is possible to do, I think it should be the responsibility of an orchestration tool (such as EmrEtlRunner or Dataflow Runner), not of the job itself.

However, feel free to create an issue in either the snowplow or the snowplow-rdb-loader repository. We would like to explore it.

It’s pretty complex to achieve this as those same nodes need to be made available again for the final S3DistCp step. This will change when we switch from file moves to use of manifests, but that’s un-specced and a long way off…
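
To make the constraint concrete, here is a minimal Python sketch of the step ordering discussed in this thread. It is only an illustration: the `Step` class, the `needs_core_nodes` flags and both helper functions are hypothetical and are not part of EmrEtlRunner, Dataflow Runner or RDB Loader. The flags simply encode what was said above (RDB Loader runs on the master node, the final S3DistCp step needs the core nodes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    needs_core_nodes: bool  # assumption taken from the discussion above


# Step order as described in this thread; flags are illustrative assumptions.
PIPELINE = [
    Step("Shred", True),
    Step("RDB Loader", False),  # runs exclusively on the master node
    Step("S3DistCp", True),     # final step, needs the core nodes again
]


def idle_core_steps(steps):
    """Return the names of steps during which the core nodes sit idle."""
    return [s.name for s in steps if not s.needs_core_nodes]


def can_release_core_after(steps, index):
    """True if no step after position `index` still needs the core nodes."""
    return not any(s.needs_core_nodes for s in steps[index + 1:])
```

With this ordering, `can_release_core_after(PIPELINE, 0)` is `False`: the core nodes are idle during RDB Loader but cannot simply be terminated after Shred, because S3DistCp still needs them afterwards.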