Stream Enrich in Kubernetes cluster

Hi,

I was wondering whether anybody has had the chance to set up multiple Stream Enrich containers that work against the same Kinesis stream?
I want to see if I can improve the performance of the pipeline.

I just want to understand how the shard assignment is managed (if at all), or whether I need to handle it myself (and if so, how).

I couldn’t find anything in the documentation about that.

Thanks.

This definitely works with Kafka, but I have never tried it with Kinesis.

Thanks @evaldas!

Did you have to do anything special for that, or did you just add more containers?

If you run Stream Enrich as a separate pod, just use the kubectl scale command to increase the pod count. From what I understand, it uses a Kafka consumer group to coordinate message consumption between the containers (the enrich Kafka consumers), which avoids duplication.
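
To make the consumer group idea concrete, here is a rough Python sketch of how partitions end up split across the containers in one group. It is only an illustration: the round-robin rule and the names (assign_partitions, enrich-0, and so on) are made up for this example and are not Kafka's actual assignor or anything taken from Stream Enrich. The point is that each partition is owned by exactly one container, so nothing is read twice, and containers beyond the partition count sit idle.

```python
def assign_partitions(partitions, consumers):
    """Toy round-robin assignment: each partition goes to exactly one consumer.

    Hypothetical helper for illustration only; real Kafka assignors differ.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# A topic with 4 partitions, consumed by 2 and then by 6 enrich containers.
partitions = list(range(4))
two = assign_partitions(partitions, ["enrich-0", "enrich-1"])
six = assign_partitions(partitions, [f"enrich-{i}" for i in range(6)])

print(two)  # every partition has a single owner
print(six)  # the two extra containers get no partitions
```

So scaling the pod count only helps up to the number of partitions on the input topic; beyond that the extra pods have nothing to read.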

Hi @moshesh,

I’m running the Snowplow pipeline on AWS ECS, and Stream Enrich runs in multiple containers.

As far as I know, Stream Enrich uses the KCL (Kinesis Client Library). This library handles the shard assignment (and re-assignment on scaling) for you. Here are some references: