Yes, that’s right. I’m not even sure what other options we could have.
Yup, this is just an EC2 SSH key. You need to create one via the EC2 console and specify its name. With this key you’ll be able to log in to the EMR master node (though I’m 99.9% sure you’ll never need to). You can list existing SSH keys (or create a new one) in AWS Console → EC2 → Key pairs.
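If you’d rather check from a script than click through the console, here is a minimal sketch using boto3’s `describe_key_pairs` call. The helper name `list_key_pair_names` and the region in the usage comment are my own assumptions for illustration, not part of any Snowplow config.

```python
def list_key_pair_names(ec2_client):
    """Return the sorted names of the EC2 key pairs visible to this client.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    response = ec2_client.describe_key_pairs()
    return sorted(kp["KeyName"] for kp in response.get("KeyPairs", []))


# Usage (assumes boto3 is installed and AWS credentials are configured;
# the region is an assumption, use the one your EMR cluster runs in):
#   import boto3
#   print(list_key_pair_names(boto3.client("ec2", region_name="us-east-1")))
```

Whatever name shows up in that list is the value to put in the key name setting. Key pairs are per region, so query the same region the cluster will run in.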
subnet_id
Same here - it can be any EC2 subnet in your account. You can list existing subnets (or create a new one) in AWS Console → VPC → Subnets. IIRC pretty much any default configuration will work.
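To look up subnet IDs from a script instead of the console, something like the sketch below should do. It relies on boto3’s `describe_subnets`; the helper name `list_subnets` and the region in the usage comment are hypothetical, chosen only for this example.

```python
def list_subnets(ec2_client):
    """Return (subnet_id, vpc_id, availability_zone) tuples, sorted by subnet ID.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    response = ec2_client.describe_subnets()
    return sorted(
        (s["SubnetId"], s["VpcId"], s["AvailabilityZone"])
        for s in response.get("Subnets", [])
    )


# Usage (assumes boto3 is installed and AWS credentials are configured;
# the region is an assumption):
#   import boto3
#   for subnet_id, vpc_id, az in list_subnets(boto3.client("ec2", region_name="us-east-1")):
#       print(subnet_id, vpc_id, az)
```

Any `subnet-...` ID from that output can go into `subnet_id`.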
UPD: let’s use a single thread, either this one or “RDB Loader, Storage Loader, EmrEtlRunner - #2 by ihor”. Otherwise it’s hard to track what’s going on.