Enricher doesn't scale out correctly and keeps getting errors

Hi,

We use the Snowplow collector and enricher Docker containers on ECS. Both run 4 tasks in parallel, each with 2 vCPUs and 4 GB of RAM. However, when I add more tasks to the enricher (for instance 6 tasks instead of 4), I don't see any difference in the total amount of data read by the enrichers. It seems like adding more enricher tasks does not help to read more data from the Kinesis stream after the collector. Any idea why?

Also, looking at the logs of the enrichers, I see these two errors coming back all the time:

1. `[RecordProcessor-0005] ERROR com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Caught shutdown exception, skipping checkpoint.`
2. `com.amazonaws.services.kinesis.clientlibrary.exceptions.ShutdownException: Can't update checkpoint - instance doesn't hold the lease for this shard.`
Does anyone have an idea of what could cause these errors?

Thanks in advance,

Alexandre

Hi @Alexandre5602 ,

> It seems like adding more enricher tasks does not help to read more data from the Kinesis stream after the collector. Any idea why?

How many shards do you have in the Kinesis stream that holds the collector payloads? A shard can be consumed by only one instance at a time, so if you have more Enrich instances than shards, the additional instances will sit idle (you should be able to confirm this in the logs).
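If you want to quickly compare the number of open shards with your task count, here is a minimal sketch using boto3 (the stream name `collector-good` and the region are placeholders, adjust them to your setup):

```python
import boto3

# Placeholders -- replace with the raw stream your collector writes to
# (the one Enrich reads from) and the region you run in.
STREAM_NAME = "collector-good"
REGION = "eu-west-1"

kinesis = boto3.client("kinesis", region_name=REGION)

summary = kinesis.describe_stream_summary(StreamName=STREAM_NAME)
open_shards = summary["StreamDescriptionSummary"]["OpenShardCount"]

print(f"Open shards in {STREAM_NAME}: {open_shards}")
# If you run more Enrich tasks than this number, the extra tasks have no
# shard lease to acquire and will not read any records.
```

If the shard count is lower than 6, that would explain why going from 4 to 6 tasks changes nothing: you would need to reshard the stream first, then scale the tasks.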

> Also, looking at the logs of the enrichers, I see these two errors coming back all the time:
>
> 1. `[RecordProcessor-0005] ERROR com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Caught shutdown exception, skipping checkpoint.`
> 2. `com.amazonaws.services.kinesis.clientlibrary.exceptions.ShutdownException: Can't update checkpoint - instance doesn't hold the lease for this shard.`
>
> Does anyone have an idea of what could cause these errors?

It might be that the additional instances are trying to steal the leases held by the other instances. When an instance loses a lease and then tries to checkpoint the shard it no longer owns, it logs exactly the ShutdownException you quoted.
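One way to see this happening is to look at the lease table that the KCL (used by Stream Enrich) keeps in DynamoDB; by default the table is named after the enricher's application name. A rough sketch, assuming boto3 and a placeholder table name `snowplow-enrich`:

```python
import boto3

# Placeholder -- the KCL lease table is named after your enricher's appName.
LEASE_TABLE = "snowplow-enrich"

dynamodb = boto3.client("dynamodb", region_name="eu-west-1")

# Each item in the lease table represents one shard (leaseKey) and the
# worker currently holding its lease (leaseOwner).
resp = dynamodb.scan(TableName=LEASE_TABLE)

for item in resp["Items"]:
    shard = item["leaseKey"]["S"]
    owner = item.get("leaseOwner", {}).get("S", "<unassigned>")
    print(f"{shard} -> {owner}")
```

If the owners keep changing every time you run this, the workers are continuously taking leases from each other, which matches the checkpoint errors in your logs.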