Hi,
We regularly see errors on our BigQuery Stream Loader machines, at a volume of 10-20 million events per day:
2023-01-27 08:10:58.738 CET [io-compute-3] INFO com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Shutdown - Source of events was cancelled
2023-01-27 08:10:58.740 CET [io-compute-3] ERROR com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Main - Application shutting down with error
2023-01-27 08:10:58.741 CET com.google.cloud.bigquery.BigQueryException: An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support.
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.insertAll(HttpBigQueryRpc.java:507)
at com.google.cloud.bigquery.BigQueryImpl.insertAll(BigQueryImpl.java:1097)
at com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.$anonfun$mkInsert$2(Bigquery.scala:91)
at blocking @ com.permutive.pubsub.producer.grpc.internal.PubsubPublisher$.$anonfun$createJavaPublisher$1(PubsubPublisher.scala:46)
at flatMap @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.Bigquery$.go$1(Bigquery.scala:54)
at *> @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.StreamLoader$.$anonfun$run$1(StreamLoader.scala:66)
at flatMap @ fs2.Stream.$anonfun$parEvalMapAction$6(Stream.scala:2133)
at *> @ com.snowplowanalytics.snowplow.storage.bigquery.streamloader.StreamLoader$.$anonfun$run$1(StreamLoader.scala:66)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 500 Internal Server Error
2023-01-27 08:10:58.743 CET POST https://www.googleapis.com/bigquery/v2/projects/project/datasets/snowplow/tables/events/insertAll?prettyPrint=false
{
  "code" : 500,
  "errors" : [ {
    "domain" : "global",
    "message" : "An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support.",
    "reason" : "internalError"
  } ],
  "message" : "An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support.",
  "status" : "INTERNAL"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:428)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
This looks like a problem on Google’s side. However, since the error message suggests it could be solved by “Retrying the job with back-off as described in the BigQuery SLA”, my question is: does BigQuery Stream Loader already implement any back-off mechanism?
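To make clear what I mean by back-off, here is a minimal sketch of the general pattern in Python. It is not taken from the Stream Loader code, and I am not claiming the loader works (or should work) this way. The names `TransientError` and `retry_with_backoff`, as well as the default attempt count and delays, are hypothetical and only there for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure, e.g. an HTTP 500."""


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=32.0, sleep=time.sleep):
    """Call operation(); on TransientError, wait and try again.

    The upper bound of the wait doubles after every failed attempt
    (exponential back-off) and the actual wait is drawn at random
    below that bound (jitter). All defaults are assumed values.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

With something like this wrapped around the insert call, a single 500 response would lead to a short pause and another attempt instead of an application shutdown; only repeated failures would still surface as an error.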
Thanks,
Andreas