BigQuery Stream Loader becomes very non-performant after processing large numbers of events

Hello.

We have been implementing the snowplow pipeline on GCP using the BQ stream-loader to insert our events into BiqQuery. While performing load testing to see how the pipeline would handle running in production, we noticed that we were building up a significant number of unacked events in our enriched events subscription and saw that the ack rate was very low (1-2 messages per second). Restarting the BQ loader container fixed it temporarily, but after an hour or so, we saw the ack rate decline again. We attempted using a larger VM (n2-standard-4: 4 vCPUs 16 gigs RAM) which significantly increased our ack rate to around 750 messages/second but again after about an hour we saw that drastically decline while still sitting at 100% vCPU usage. I looked into the VM running htop and saw that there were 4 processes that had been running for over an hour each that were each taking up 100% of their respective vCPUs. When checking the container logs I then saw this same warning repeating over and over again:

[WARNING] Your app's responsiveness to a new asynchronous event (such as a new connection, an upstream response, or a timer) was in excess of 100 milliseconds. Your CPU is probably starving. Consider increasing the granularity of your delays or adding more cedes. This may also be a sign that you are unintentionally running blocking I/O operations (such as File or InetAddress) without the blocking combinator.

It then eventually culminated in this error:

com.google.cloud.bigquery.BigQueryException: Error writing request body to server
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:507)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at flatMap @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.go$1(Bigquery.scala:54)
        at flatMap @ fs2.Stream.$anonfun$parEvalMapAction$6(Stream.scala:2140)
Caused by: java.io.IOException: Error writing request body to server
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source)
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flush(Unknown Source)
        at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1200)
        at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46)
        at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
        at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:53)
        at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
        at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
        at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:174)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:505)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at cats.effect.unsafe.WorkerThread.blockOn(WorkerThread.scala:668)
        at scala.concurrent.package$.blocking(package.scala:124)
        at cats.effect.IOFiber.runLoop(IOFiber.scala:953)
        at cats.effect.IOFiber.execR(IOFiber.scala:1319)
        at cats.effect.IOFiber.run(IOFiber.scala:118)
        at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:585)
com.google.cloud.bigquery.BigQueryException: Error writing request body to server
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:507)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at flatMap @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.go$1(Bigquery.scala:54)
        at flatMap @ fs2.Stream.$anonfun$parEvalMapAction$6(Stream.scala:2140)
Caused by: java.io.IOException: Error writing request body to server
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source)
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flush(Unknown Source)
        at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1200)
        at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46)
        at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
        at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:53)
        at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
        at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
        at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:174)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:505)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at cats.effect.unsafe.WorkerThread.blockOn(WorkerThread.scala:668)
        at scala.concurrent.package$.blocking(package.scala:124)
        at cats.effect.IOFiber.runLoop(IOFiber.scala:953)
        at cats.effect.IOFiber.execR(IOFiber.scala:1319)
        at cats.effect.IOFiber.run(IOFiber.scala:118)
        at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:585)
com.google.cloud.bigquery.BigQueryException: Error writing request body to server
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:507)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at flatMap @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.go$1(Bigquery.scala:54)
        at flatMap @ fs2.Stream.$anonfun$parEvalMapAction$6(Stream.scala:2140)
Caused by: java.io.IOException: Error writing request body to server
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source)
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flush(Unknown Source)
        at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1200)
        at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46)
        at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
        at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:53)
        at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
        at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
        at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:174)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:505)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at cats.effect.unsafe.WorkerThread.blockOn(WorkerThread.scala:668)
        at scala.concurrent.package$.blocking(package.scala:124)
        at cats.effect.IOFiber.runLoop(IOFiber.scala:953)
        at cats.effect.IOFiber.execR(IOFiber.scala:1319)
        at cats.effect.IOFiber.run(IOFiber.scala:118)
        at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:585)
com.google.cloud.bigquery.BigQueryException: Error writing request body to server
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:507)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at flatMap @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.go$1(Bigquery.scala:54)
        at flatMap @ fs2.Stream.$anonfun$parEvalMapAction$6(Stream.scala:2140)
Caused by: java.io.IOException: Error writing request body to server
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source)
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flush(Unknown Source)
        at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1200)
        at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46)
        at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
        at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:53)
        at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
        at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
        at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:174)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:505)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at cats.effect.unsafe.WorkerThread.blockOn(WorkerThread.scala:668)
        at scala.concurrent.package$.blocking(package.scala:124)
        at cats.effect.IOFiber.runLoop(IOFiber.scala:953)
        at cats.effect.IOFiber.execR(IOFiber.scala:1319)
        at cats.effect.IOFiber.run(IOFiber.scala:118)
        at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:585)
[io-compute-0] ERROR com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Shutdown - Error on sinking and checkpointing events
com.google.cloud.bigquery.BigQueryException: Error writing request body to server
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:507)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at flatMap @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.go$1(Bigquery.scala:54)
        at flatMap @ fs2.Stream.$anonfun$parEvalMapAction$6(Stream.scala:2140)
Caused by: java.io.IOException: Error writing request body to server
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source)
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flush(Unknown Source)
        at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1200)
        at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46)
        at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
        at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:53)
        at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
        at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
        at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:174)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:505)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at cats.effect.unsafe.WorkerThread.blockOn(WorkerThread.scala:668)
        at scala.concurrent.package$.blocking(package.scala:124)
        at cats.effect.IOFiber.runLoop(IOFiber.scala:953)
        at cats.effect.IOFiber.execR(IOFiber.scala:1319)
        at cats.effect.IOFiber.run(IOFiber.scala:118)
        at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:585)
com.google.cloud.bigquery.BigQueryException: Error writing request body to server
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:507)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at flatMap @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.go$1(Bigquery.scala:54)
        at flatMap @ fs2.Stream.$anonfun$parEvalMapAction$6(Stream.scala:2140)
Caused by: java.io.IOException: Error writing request body to server
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(Unknown Source)
        at java.base/sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.base/java.io.BufferedOutputStream.flush(Unknown Source)
        at java.base/java.util.zip.DeflaterOutputStream.flush(Unknown Source)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1200)
        at com.google.api.client.json.jackson2.JacksonGenerator.flush(JacksonGenerator.java:46)
        at com.google.api.client.http.json.JsonHttpContent.writeTo(JsonHttpContent.java:77)
        at com.google.api.client.http.GZipEncoding.encode(GZipEncoding.java:53)
        at com.google.api.client.http.HttpEncodingStreamingContent.writeTo(HttpEncodingStreamingContent.java:48)
        at com.google.api.client.http.javanet.NetHttpRequest$DefaultOutputWriter.write(NetHttpRequest.java:76)
        at com.google.api.client.http.javanet.NetHttpRequest.writeContentToOutputStream(NetHttpRequest.java:174)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:117)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
        at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:505)
        at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
        at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
        at cats.effect.unsafe.WorkerThread.blockOn(WorkerThread.scala:668)
        at scala.concurrent.package$.blocking(package.scala:124)
        at cats.effect.IOFiber.runLoop(IOFiber.scala:953)
        at cats.effect.IOFiber.execR(IOFiber.scala:1319)
        at cats.effect.IOFiber.run(IOFiber.scala:118)
        at cats.effect.unsafe.WorkerThread.run(WorkerThread.scala:585)

Here is also a chart from when we were running the load test that shows the message rate decline:

We started the test around 2:30pm and you can see the ack rate jump to about 700 messages/second while our number of unacked messages slowly climbs then at around 4:00 the ack rate dropped drastically to about 30 messages/second then slowly declines over the course of the next 8 hours while the unacked count climbs until our load test finished running at around 5:00pm. Around midnight is when I restarted the container and you can see the ack rate jump to almost 800 messages/second and pretty quickly clear out the backlog of unacked messages.

Are there any configurations we can look at or recommendations you all have on how to ensure that we don’t run into this issue in production?

Thank you in advance for your help!

I will also attach the docker command we are running and our config file:

sudo docker run \
  -d \
  --name bigquery-loader \
  --restart unless-stopped \
  --network host \
  --log-driver gcplogs \
  -v ${CONFIG_DIR}:/snowplow/config \
  snowplow/snowplow-bigquery-streamloader:1.6.3 \
{
  "projectId": "HIDDEN"
  "loader": {
    "input": {
      "subscription": "flagship-bq-loader-dev"
    }
    "output": {
      "good": {
        "datasetId": "flagship_dev"
        "tableId": "events"
      }
      "bad": {
        "topic": "sp-bad-row-topic"
      }
      "types": {
        "topic": "flagship-types-topic"
      }
      "failedInserts": {
        "topic": "sp-failed-insert-topic"
      }
    }
  }
  "mutator": {
    "input": {
      "subscription": "types-sub"
    }

    "output": {
      "good": {
        "datasetId": "flagship_dev"
        "tableId": "events"
      }
    }
  }
  "repeater": {
    "input": {
      "subscription": "failed-inserts-sub"
    }

    "output": {
      "good": {
        "datasetId": "flagship_dev"
        "tableId": "events"
      }

      "deadLetters": {
        "bucket": "gs://dead-letter-bucket"
      }
    }
  }
  "monitoring": {} # disabled
}
1 Like

Hi @rellingson ,

Welcome to Snowplow community !

Thank you for providing all the details.

We’ve also observed the same behavior on our pipelines with 1.6.3 :

Memory usage keeps growing until it reaches the limit and the CPU stays at 100% (probably due to continuous GCs).

We’re investigating the root cause. New version with the fix will be released and announced on Discourse, stay tuned !

Meanwhile you could use an older version that doesn’t have this problem, for instance 1.5.2 (we are not sure yet in which version the problem started to appear).

Hey, @BenB.

Thank you for the response! We’ve downgraded the version of bigquery-streamloader to 1.5.2 and run through our load tests again and did not experience a drop in performance!
A couple of points that may be of interest to you as you look for the issue in the latest version:

  • We noticed that our overall message rate was lower in this version than in 1.6.3 (about 500/s instead of 750)
  • CPU usage never peaked above ~50% but all 4 vCPUs were in use
  • We were thinking that it could be a memory leak issue and garbage collection is what’s causing the high CPU usage.

Hope y’all find the issue and I’ll be keeping my eye out for the new release with the fix.

Thanks again!

4 Likes

Hey @rellingson ,

Awesome !

That’s actually good to hear ! In 1.6.0 we introduced some caching in the operations of the loader, which was aiming at improving the performance.


We’ve made progress on the troubleshooting, the memory leak was very likely introduced in 1.6.3 (here). We’ll release a new version with the fix soon, but meanwhile you could already use 1.6.2 that has the performance improvement.