Snowplow Enricher - CPU utilization issue

Shalini_Balakrishnan · January 30, 2023, 8:09am

Is it possible to scale Enricher vertically?
We are using EKS pods and streams are handled through Kinesis.
The CPU utilization of the collector and loader seems fine, but the Enricher is creating a bottleneck.

Please suggest the best options available.

BenB · January 30, 2023, 8:47am

Hi @Shalini_Balakrishnan ,

Can you provide more details, please? I.e. number and size of containers, throughput, CPU utilization.

Do you mean that it’s not able to process events as quickly as they arrive?

Alexandre5602 · March 1, 2023, 4:22pm

Hi Ben,

I have a similar issue: the enricher cannot keep up with the rate at which events arrive and seems to be the bottleneck in my setup. Both the enricher and the collector run on ECS, and we use Kinesis streams with on-demand capacity between the collector and the enricher.
The maxRecords value in the enricher config is already set to 10 000, which I believe is the maximum possible value for fetching records from a Kinesis stream anyway.

Any idea how to scale up the enricher?

Thanks for the help,

Alexandre

josh · March 2, 2023, 5:26am

Hi @Alexandre5602, both the Collector and Enrich can scale horizontally, so you should be able to add auto-scaling rules based on CPU for both of them to add extra pods as demand increases. Scaling up at 60-70% CPU across the group is generally a good rule of thumb, and scaling down when you hit less than 20% CPU should work well.

The only caveat here is that for Enrich you shouldn’t have more pods than you have shards in the Kinesis stream. Even though you are using on_demand mode, under the hood a certain number of shards is still allocated, and those shards need to be distributed among the pods. If you have more pods than shards, the extra pods will not be able to grab any shards to process, so they sit idle and skew the auto-scaling logic (average CPU will be artificially low).

This caveat is true for any Kinesis consumer application.
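
As a rough illustration of the caveat above, here is a minimal Python sketch that clamps a desired number of Enrich pods/tasks to the number of open shards. The function name `cap_tasks_to_shards` and its parameters are hypothetical, not part of Snowplow or AWS; the shard count is assumed to be looked up separately (for example from the `OpenShardCount` field returned by the Kinesis DescribeStreamSummary API).

```python
def cap_tasks_to_shards(desired_tasks: int, open_shards: int, min_tasks: int = 1) -> int:
    """Clamp a desired consumer count so it never exceeds the open shard count.

    Hypothetical helper: a consumer beyond the shard count has nothing to
    lease, so it idles and drags the group's average CPU down.
    """
    if open_shards < 1:
        raise ValueError("expected at least one open shard")
    # Never go above the shard count, never go below the configured minimum.
    return max(min_tasks, min(desired_tasks, open_shards))


if __name__ == "__main__":
    # With 3 shards, asking for 5 tasks is capped at 3.
    print(cap_tasks_to_shards(desired_tasks=5, open_shards=3))
```

In practice the same effect is usually achieved by setting the maximum capacity of the auto-scaling group or service to the shard count rather than by running code like this.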

Hope this helps!

Alexandre5602 · March 2, 2023, 9:22am

Hi Josh,

Thanks a lot for your response. I had thought about scaling the ECS task vertically but not horizontally; it does make sense to run several tasks in parallel for more read throughput.

The only issue with what you mentioned above is with the auto-scaling, right? Let’s say I fix the Kinesis stream at 3 shards while the number of enricher tasks is driven by CPU. I might sometimes end up with only 1 ECS task reading all 3 shards (when traffic is low) and sometimes with 3 ECS tasks reading the 3 shards (which would be optimal, of course).

What’s your solution or opinion on this?

Thanks a lot,

josh · March 2, 2023, 11:00pm

Hi @Alexandre5602 this is the whole function of auto-scaling! During peaks you have more consumers and during lulls in traffic it reduces the number of tasks. This is how we configure scaling to work internally for our customers and it works quite well.

I would start by implementing the auto-scaling policies and tuning the CPU thresholds that trigger scaling until you achieve the stability and throughput you need.
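
To make the threshold tuning concrete, here is a minimal Python sketch of the decision a CPU-based policy makes on each evaluation. It is only an illustration under assumptions: the function `next_task_count`, the one-task step size, and the default of 3 for `max_tasks` (matching the 3-shard example in this thread) are hypothetical, and the 70% / 20% defaults are just the rule-of-thumb values mentioned earlier. A real setup would express this as ECS or Kubernetes auto-scaling policies rather than code.

```python
def next_task_count(
    current: int,
    avg_cpu: float,
    scale_out_at: float = 70.0,  # assumed scale-out threshold (% CPU)
    scale_in_at: float = 20.0,   # assumed scale-in threshold (% CPU)
    min_tasks: int = 1,
    max_tasks: int = 3,          # assumed equal to the shard count
) -> int:
    """Return the task count after one evaluation of a simple step policy."""
    if avg_cpu >= scale_out_at:
        target = current + 1      # group is hot: add one consumer
    elif avg_cpu < scale_in_at:
        target = current - 1      # group is idle: remove one consumer
    else:
        target = current          # inside the dead band: do nothing
    # Keep the result inside the allowed range.
    return max(min_tasks, min(target, max_tasks))


if __name__ == "__main__":
    # Average CPU readings over successive evaluation periods.
    tasks = 1
    for cpu in [85.0, 90.0, 75.0, 45.0, 10.0, 5.0]:
        tasks = next_task_count(tasks, cpu)
        print(f"avg CPU {cpu:5.1f}% -> {tasks} task(s)")
```

The gap between the two thresholds acts as a dead band: widening it reduces flapping between task counts, while narrowing it makes the group react faster to changes in traffic.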
