High CPU utilization on startup resulting in unnecessary scaling operations

Rob_Ellison · February 12, 2024, 10:26am

Hi All,
I have Snowplow Enricher Kinesis deployed through EKS. I’ve noticed that when a pod first starts up, its CPU utilisation is very high, which triggers further scaling operations.
Neither the traffic profile nor the collector performance seems to justify these scaling operations.

Autoscaling is based on a target CPU utilisation of 50%. We are currently running version 3.4.0.
The probes are set up as follows:

  livenessProbe:
    exec:
      command:
      - sh
      - -c
      - "pgrep java"
    initialDelaySeconds: 15
    periodSeconds: 20
  readinessProbe:
    exec:
      command:
      - sh
      - -c
      - "pgrep java"
    initialDelaySeconds: 15
    periodSeconds: 20
  startupProbe:
    exec:
      command:
      - sh
      - -c
      - "pgrep java"
    failureThreshold: 20
    periodSeconds: 10

Do you have any suggestions to prevent these issues?
Thanks

istreeter · February 12, 2024, 11:52am

Hi @Rob_Ellison, I suggest taking a look at HPA scaling policies. It is possible to configure the Kubernetes autoscaler with a larger periodSeconds in its scaling policy, which limits how often scaling events can occur. In effect, this means you can suppress autoscaling during the short period after Enrich has first started up and is consuming lots of CPU.
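
For illustration, such a policy might look roughly like the manifest below. This is a minimal sketch, not an official Snowplow recommendation: it assumes an autoscaling/v2 HorizontalPodAutoscaler targeting a Deployment, and the enrich-kinesis names, the replica bounds, and the 300-second values are placeholders made up for the example. Only the 50% CPU target comes from the question above.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: enrich-kinesis            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: enrich-kinesis          # hypothetical Deployment name
  minReplicas: 1                  # placeholder bounds
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50    # the 50% target from the question
  behavior:
    scaleUp:
      # Act on the lowest recommendation seen over this window, so a
      # short CPU spike does not immediately cause a scale-up.
      stabilizationWindowSeconds: 300   # assumed value, tune to startup time
      policies:
      # Add at most one pod per 300-second period.
      - type: Pods
        value: 1
        periodSeconds: 300              # assumed value
```

With settings like these, a brief CPU spike from a newly started pod should not by itself trigger another scale-up, and at most one pod is added in any five-minute period. The values would need tuning against how long Enrich actually takes to settle after startup.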

In future, we are likely to add a health probe to Enrich, so that it does not report itself as healthy until it has finished configuring itself. Once we have implemented this, you won’t need a custom scaling policy, because the HPA ignores the CPU of pods that have not yet become healthy.
