Hi!
I remember from when I set up Snowplow in the past that I needed to set some Java options for the various parts of the pipeline, something like -Xms512m -Xmx512m. However, I can’t find any documentation on this. Are these still necessary and, if so, what would be the recommended settings?
Thank you!
Hi @volderette, by default a JVM (Java) application will configure itself to use up to 25% of the memory available on the server for its heap. That default applies to all JVM apps, not just Snowplow pipeline apps. E.g. if you use an EC2 instance with 2 GB of memory, then the app will self-set its max heap size to 512 MB.
That 25% default is generally “safe”, meaning there is no chance you could exceed the memory available on the server. But… it also means you are wasting the remaining 75% of the memory available.
You might get slightly better performance out of your pipeline by setting the flags -XX:MinRAMPercentage=75 -XX:MaxRAMPercentage=75, which means use 75% of the memory available. Or you might use syntax like -Xms512m -Xmx512m to explicitly set the heap to 512 MB.
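If you want to check what heap size the JVM actually ended up with, a small sketch like the one below can help. It is not part of Snowplow; the class name HeapCheck and the helper toMiB are made up for illustration. It only uses the standard Runtime API to print the max and currently committed heap.

```java
public class HeapCheck {
    // Hypothetical helper: convert a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most memory the JVM will attempt to use for the heap.
        System.out.println("Max heap (MiB):       " + toMiB(rt.maxMemory()));
        // totalMemory() is the heap currently committed by the JVM.
        System.out.println("Committed heap (MiB): " + toMiB(rt.totalMemory()));
    }
}
```

On Java 11 or later you should be able to run it straight from source with the flags you want to try, e.g. java -Xms512m -Xmx512m HeapCheck.java, and compare that against a run with no flags. Depending on the garbage collector in use, the reported max can be slightly below the value you passed.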
I found a configuration reference for what each flag means. And there are plenty of guides around, like this one or this one.
Thank you @istreeter! This is super helpful!