Hi team-snowplow,
I have a Clojure collector and batch pipeline set up in an AWS account dedicated to Snowplow, but the Redshift instance it loads into is in another AWS account. When the EMR-initiating instance used to run the storageloader step itself, the cross-account data load worked via a whitelisted security group configured on that persistent instance. To replicate that, I believe I’ll need to apply the same group to the EMR slave instance(s), which means Snowplow would need to set it in AdditionalSlaveSecurityGroups[1] when launching the EMR cluster, right?
Is there a config option that lets me pass this specifically, or a general EMR config that gets passed through to the cluster launch?
[1] https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-additional-sec-groups.html
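
To make the ask concrete, here is a rough Python sketch of the EMR-level setting I'm after. It only builds the `Instances` structure that the EMR RunJobFlow API accepts (the same shape boto3's `run_job_flow` takes). The helper name and the security group ID are placeholders I made up, and I'm not claiming this is how Snowplow launches the cluster internally.

```python
def build_instances_config(extra_slave_sg_ids, subnet_id=None):
    """Build the `Instances` portion of an EMR RunJobFlow request.

    Hypothetical helper for illustration only; the key names follow the
    EMR API's JobFlowInstancesConfig structure.
    """
    if not extra_slave_sg_ids:
        raise ValueError("at least one security group ID is required")

    instances = {
        # Extra groups attached to core/task (slave) nodes, on top of
        # the EMR-managed security group.
        "AdditionalSlaveSecurityGroups": list(extra_slave_sg_ids),
    }
    if subnet_id is not None:
        # Security groups referenced by ID belong to a VPC, so the
        # cluster would be launched into a subnet of that VPC.
        instances["Ec2SubnetId"] = subnet_id
    return instances


if __name__ == "__main__":
    # Placeholder ID; in my case this would be the whitelisted group.
    print(build_instances_config(["sg-0123456789abcdef0"]))
```

With boto3 the result would go in as the `Instances` argument of `run_job_flow`, alongside the instance types and counts, so what I'm really looking for is a way to get that one key into whatever Snowplow sends.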