I am a Snowplow Insights customer, so the pipeline infrastructure exists in an AWS Sub-Account. I’d like to set up a consumer for the Kinesis enriched stream using a subscribed Lambda, an approach documented here:
The problem is that a Lambda can only be subscribed to a Kinesis stream in the same account, so I'm wondering what the best practice would be for making the stream consumable from our main account.
Our primary use-case is to trigger processes in our application in reaction to events in the stream.
My current approach is to have a Lambda in the sub-account subscribed to the stream, with permissions to assume a role in the main account and invoke a second Lambda there, passing in the payload, as suggested below. I have this working as a proof of concept, but I'd appreciate advice on whether there is a better way to do it.
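For anyone curious what that proof of concept looks like, here is a minimal sketch of the sub-account Lambda in Python with boto3. The role and function ARNs are hypothetical placeholders, and it assumes the main-account role's trust policy names this Lambda's execution role as a principal and grants `lambda:InvokeFunction` on the target:

```python
import base64
import json

import boto3

# Hypothetical ARNs -- substitute your own.
ROLE_ARN = "arn:aws:iam::111111111111:role/snowplow-event-forwarder"
TARGET_LAMBDA = "arn:aws:lambda:eu-west-1:111111111111:function:handle-snowplow-event"


def handler(event, context):
    """Triggered by the Kinesis enriched stream in the sub-account;
    assumes a role in the main account and forwards each record."""
    # Assume the cross-account role for the duration of this batch.
    creds = boto3.client("sts").assume_role(
        RoleArn=ROLE_ARN,
        RoleSessionName="snowplow-forwarder",
    )["Credentials"]

    lam = boto3.client(
        "lambda",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

    for record in event["Records"]:
        # Kinesis record payloads arrive base64-encoded; enriched
        # events are TSV strings, so wrap them in a JSON envelope.
        payload = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        lam.invoke(
            FunctionName=TARGET_LAMBDA,
            InvocationType="Event",  # async, fire-and-forget
            Payload=json.dumps({"enriched_event": payload}),
        )
```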
As a Snowplow Insights customer, this kind of question is best directed through our dedicated support and customer success teams. Could you pop this into an email to support@snowplowanalytics.com?
This will make it easier for our Engineering and Tech Ops team to collaborate to get it sorted for you.
For the benefit of anyone stumbling across this, it boiled down to two serverless options:
- Sub-account Lambda pushes to a replica Kinesis Stream in the main account
- Sub-account Lambda publishes events to an SQS Queue in the main account
SQS was the clear choice for us, as we wanted our solution to auto-scale. AWS-managed auto-scaling isn't available for Kinesis Data Streams, and implementing our own shard-scaling solution would have added too much complexity to our project.
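For reference, the SQS variant looks roughly like the sketch below. The queue URL is a hypothetical placeholder, and it assumes the main-account queue has a resource policy granting `sqs:SendMessage` to the sub-account Lambda's execution role:

```python
import base64

import boto3

# Hypothetical queue URL -- the main-account queue needs a resource
# policy granting sqs:SendMessage to this Lambda's execution role.
QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/111111111111/snowplow-enriched"

sqs = boto3.client("sqs")


def handler(event, context):
    """Triggered by the Kinesis enriched stream in the sub-account;
    fans the records out to an SQS queue owned by the main account."""
    entries = [
        {
            "Id": str(i),  # batch-entry id, unique within the request
            "MessageBody": base64.b64decode(r["kinesis"]["data"]).decode("utf-8"),
        }
        for i, r in enumerate(event["Records"])
    ]

    # SendMessageBatch accepts at most 10 entries per call.
    for start in range(0, len(entries), 10):
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=entries[start:start + 10],
        )
```

Because SQS supports cross-account access via the queue policy alone, this version needs no role assumption, and the queue absorbs bursts without any shard management on our side.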
On auto-scaling: it depends on what you want to do. Kinesis Data Firehose and Kinesis Data Analytics applications (Apache Flink) do scale automatically. We have had good results with a Flink application that is auto-scaled by AWS; it does real-time analytics of newspaper articles (around a million read events per day).