- Events that are too large are generally unrecoverable. A new bad row format (soon to be released) attempts to address this better.
- If you're concerned about having only the enriched events in S3, you could use Kinesis Analytics to pipe your raw stream to a Kinesis Firehose that is configured to send data to S3. I think many would consider that overkill, though, if you properly test your event payloads against something like snowplow-mini.
- You can run the s3-loader on the same EC2 instance as the collector if you want. Some users run many of the pipeline components in containers, on hosts that also run other things.
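
If you do want the raw-stream backup mentioned above, here is a rough sketch of the Firehose side using boto3. Note that it is a simplification of what I described: it has Firehose read straight from the Kinesis stream rather than going through Kinesis Analytics. The function names, the delivery stream name, the `raw/` prefix, and the buffering values are all placeholders I made up, so adjust them for your setup.

```python
def build_raw_backup_request(stream_arn, bucket_arn, role_arn,
                             name="raw-events-backup"):
    """Build the keyword arguments for Firehose's create_delivery_stream.

    All names and values here are illustrative assumptions, not settings
    taken from any Snowplow documentation.
    """
    return {
        "DeliveryStreamName": name,
        # Read from an existing Kinesis stream instead of direct PUTs.
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            "Prefix": "raw/",  # hypothetical key prefix
            "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
            "CompressionFormat": "GZIP",
        },
    }


def create_raw_backup(stream_arn, bucket_arn, role_arn):
    """Create the delivery stream; needs AWS credentials and boto3."""
    import boto3  # imported here so the builder above works without boto3

    firehose = boto3.client("firehose")
    request = build_raw_backup_request(stream_arn, bucket_arn, role_arn)
    return firehose.create_delivery_stream(**request)
```

The IAM role has to allow Firehose to read from the stream and write to the bucket. Firehose writes the record bytes as they arrive, so anything that later reads the backup has to understand the collector's payload format.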