SendGrid+Snowplow+AWS S3&Redshift

anton · October 22, 2019, 4:06am

Yes, that’s right. I’m not even sure what other options we could have.

Yup, this is just an EC2 SSH key. You need to create one via EC2 console and specify its name. Using this key you’ll be able to log in to EMR master node (but I’m 99.9% sure you won’t ever need to do this). You can get list of existing SSH keys (or create new one) in AWS Console → EC2 → Key pairs.

subnet_id

Same here - it can be any EC2 subnet in your account. You can get list of existing Subnets (or create new one) in AWS Console → VPC → Subnets. IIRC pretty much any default configuration would work.

UPD: let’s use single thread, this one or RDB Loader, Storage Loader, EmrEtlRunner - #2 by ihor. Otherwise it’s hard to track what’s going on.

Topic		Replies	Views
Shredding to Redshift in the Scala Collector Flow AWS batch pipeline (Legacy)	2	2118	September 24, 2017
Snowplow Kinesis to EmrEtl For engineers	4	1773	July 31, 2019
'Serverless' Snowplow architecture For engineers	7	3513	June 1, 2017
How to shred events into Redshift from the real-time pipeline? AWS real-time pipeline	2	3013	May 25, 2016
Enriched event stream into Redshift using Kinesis Firehose AWS real-time pipeline	7	5765	May 31, 2016

SendGrid+Snowplow+AWS S3&Redshift

Related topics