RT pipeline - Kinesis stream read throughput exceeded

moshesh · April 2, 2019, 3:46pm

Hi,

I’ve set up an events pipeline, similar to the lambda architecture described here: Is my version of snowplow lambda architecture correct

I have a Scala Stream Collector writing into a Kinesis stream. The stream has 20 shards.
This stream has 2 consumers: Stream Enrich and Kinesis Firehose.

When I run a load test at about 700 requests per second, I get a provisioned read throughput exceeded alert from AWS, and I feel that this shouldn’t happen with 20 shards.
20 shards means that the consumers can read up to 40MiB per second in total (2MiB per second per shard).
I really don’t think I’m reaching this: I don’t get a write throughput exceeded alert, even though the collector is only allowed to write at 1MiB per second per shard (half of the allowed read rate).

Another read limit is 5 requests per second per shard, so I suspect that the consumers are trying to read at a higher rate than that.
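As a rough illustration of the arithmetic above, here is a minimal sketch that totals the per-shard limits quoted in this post for a 20-shard stream. The function name `stream_limits` and the constant names are made up for this example, and the units follow the post (MiB per second).

```python
# Per-shard limits as quoted in the post. They are shared by all consumers
# that poll the shard with GetRecords (i.e. without enhanced fan-out).
READ_MIB_PER_SHARD = 2            # read throughput, MiB/s
WRITE_MIB_PER_SHARD = 1           # write throughput, MiB/s
GET_RECORDS_CALLS_PER_SHARD = 5   # read requests per second


def stream_limits(shards):
    """Return the aggregate limits for a stream with the given shard count."""
    return {
        "read_mib_per_s": shards * READ_MIB_PER_SHARD,
        "write_mib_per_s": shards * WRITE_MIB_PER_SHARD,
        "get_records_calls_per_s": shards * GET_RECORDS_CALLS_PER_SHARD,
    }


limits = stream_limits(20)
print(limits["read_mib_per_s"])           # 40
print(limits["write_mib_per_s"])          # 20
print(limits["get_records_calls_per_s"])  # 100
```

Note that the byte limits and the request limit are independent: a stream can sit far below 40MiB per second of reads and still be throttled if any single shard receives more than 5 read requests in a second.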

Another interesting thing is that Stream Enrich also writes to a Kinesis stream with two consumers: the Elasticsearch loader and the S3 loader. This stream has only 10 shards, but here I don’t get any alerts on the read operations.

Did anyone run into this issue or have any idea what could be the cause?

BTW, I checked with AWS, kinesis firehose doesn’t support enhanced fanout at the moment…

Thanks.

mike · April 2, 2019, 10:51pm

What do your Get Records and Read Throughput Exceeded CloudWatch metrics look like?

20 shards is quite high for 700 requests/second on the write side, so it sounds like you’re probably just hitting read limits from Firehose.

moshesh · April 11, 2019, 6:41am

Thanks @mike,

This is probably the case. I talked to AWS support and it seems that I reached 6 GetRecords requests per second with those two consumers.
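To make the numbers concrete, here is a small sketch comparing the observed 6 requests per second against the 5 requests per second per shard limit mentioned earlier, and working out how often each of two polling consumers could read if they split that budget evenly. The even split and the helper names are assumptions for illustration only; real consumers such as Firehose choose their own polling rate.

```python
# Limit quoted earlier in the thread: read requests per second per shard,
# shared by every consumer that polls the shard.
GET_RECORDS_LIMIT_PER_SHARD = 5


def per_consumer_budget(consumers):
    """Requests per second each consumer may make, assuming an even split."""
    return GET_RECORDS_LIMIT_PER_SHARD / consumers


def min_poll_interval_ms(consumers):
    """Shortest polling interval (ms) that keeps the shard within the limit."""
    return 1000 / per_consumer_budget(consumers)


observed_calls_per_s = 6  # figure reported by AWS support in this thread

print(observed_calls_per_s > GET_RECORDS_LIMIT_PER_SHARD)  # True: throttled
print(per_consumer_budget(2))                              # 2.5
print(min_poll_interval_ms(2))                             # 400.0
```

Under these assumptions, two consumers stay within the limit only if each polls a shard no more often than once every 400 ms.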

I think that adding more shards is the only solution at the moment, until Kinesis Firehose and/or Stream Enrich support the enhanced fan-out feature.
