Kinesis S3 Sink not reading stream

bryce · September 6, 2017, 8:37pm

I’ve launched a new environment for our Snowplow pipeline and the Kinesis S3 Sink doesn’t appear to be correctly reading from the Kinesis stream.

The Scala Stream Collector is logging that it is publishing events to the stream, and I see in the AWS console monitoring that the stream is indeed receiving records.

The S3 sink runs, but never outputs any information and I never see files in S3. When the app starts it outputs the KinesisConnectorConfiguration which shows the correct input stream name, S3 bucket, etc.

That configuration output is the last thing it shows.

I do see some monitoring events (heartbeat) coming into the collector from the Kinesis S3 Sink app.

I’ve previously set up the pipeline with what I believe is the exact same configuration (with just different stream/bucket names) and it’s working properly, but this environment isn’t.

I tried deleting the DynamoDB table created by the sink and restarting the sink. That didn’t seem to help.

Any idea what could cause this?

bryce · September 6, 2017, 8:50pm

Oh just kidding, I think deleting the DynamoDB table did solve the problem. Still not sure what was really going on though; if anybody knows I’d love to understand!

alex · September 6, 2017, 10:15pm

Did you re-use the same application name (and thus the same DynamoDB table) between your different environments? This plays havoc with consumers like Kinesis S3.

bryce · September 7, 2017, 4:44pm

@alex The application names were different between environments.

However, I did create this environment with a given name, then tore down and rebuilt the infrastructure with the same name (including deleting/recreating the Kinesis streams).

I understand the DynamoDB table gets created when the Kinesis S3 Sink is started for the first time. Do you think the fact that I recreated the streams but didn’t manually delete the DynamoDB table before starting the sink in the rebuilt environment could cause similar havoc?

Thanks for your reply!

alex · September 7, 2017, 5:35pm

Yep, I suspect that your rebuild left the DynamoDB table referencing invalid Kinesis sequence numbers, so the Kinesis S3 app could not operate.

bryce · September 7, 2017, 5:40pm

This makes sense! Thanks again!
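
For anyone hitting the same symptom, here is a minimal Python sketch of the fix discussed above: look at the checkpoints stored in the sink's DynamoDB table, then delete the table so it is recreated on the next start. It assumes the table is a standard Kinesis Client Library lease table named after the sink's application name, with `leaseKey` and `checkpoint` string attributes; that layout is an assumption, so check your own table before relying on it. The `dynamodb` argument is expected to be a boto3 DynamoDB client, and the table name in the usage comment is hypothetical. Stop the sink before running the reset.

```python
def list_checkpoints(dynamodb, table_name):
    """Return {leaseKey: checkpoint} for every row in the lease table.

    Assumes KCL-style rows with string attributes `leaseKey` (the shard ID)
    and `checkpoint` (the last processed sequence number or a sentinel).
    """
    checkpoints = {}
    kwargs = {"TableName": table_name}
    while True:
        page = dynamodb.scan(**kwargs)
        for item in page.get("Items", []):
            shard_id = item["leaseKey"]["S"]
            checkpoints[shard_id] = item.get("checkpoint", {}).get("S")
        # DynamoDB scans are paginated; keep going until no key is returned.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return checkpoints
        kwargs["ExclusiveStartKey"] = last_key


def reset_lease_table(dynamodb, table_name):
    """Delete the lease table and block until DynamoDB reports it gone.

    The sink is expected to recreate the table when it next starts,
    as described earlier in this thread.
    """
    dynamodb.delete_table(TableName=table_name)
    dynamodb.get_waiter("table_not_exists").wait(TableName=table_name)


# Usage sketch (requires boto3 and AWS credentials; table name is hypothetical):
#   import boto3
#   client = boto3.client("dynamodb")
#   print(list_checkpoints(client, "my-kinesis-s3-sink-app"))
#   reset_lease_table(client, "my-kinesis-s3-sink-app")
```

Printing the checkpoints first keeps a record of what the table held before it is deleted. Deleting the table discards the sink's stored position, so where it resumes reading depends on the initial position set in its configuration.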
