Hello!
I am currently setting up Snowplow on GCP. I got a first simple setup without enrichments working: the BigQuery table schema was created automatically and data ingestion worked fine. Even with the Google Analytics and Page contexts enabled, the schema was adjusted accordingly.
But after I enabled the enrichments listed below, the BigQuery connection stopped working and schemas were no longer created automatically. It also doesn’t work if I manually add the previous schema to the table. I always get the same error message in Dataflow:
java.lang.RuntimeException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "The destination table has no schema.",
    "reason" : "invalid"
  } ],
  "message" : "The destination table has no schema.",
  "status" : "INVALID_ARGUMENT"
}
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:777)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:813)
org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:122)
org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:94)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "The destination table has no schema.",
    "reason" : "invalid"
  } ],
  "message" : "The destination table has no schema.",
  "status" : "INVALID_ARGUMENT"
}
com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1067)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.lambda$insertAll$0(BigQueryServicesImpl.java:724)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
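For reference, my manual attempt to re-apply the previous schema looked roughly like this (a sketch using the bq CLI; `my_dataset.events` stands in for my actual dataset and table, and `previous_schema.json` is the schema file I saved from the working setup):

```shell
# Check whether the destination table currently has a schema attached
bq show --schema --format=prettyjson my_dataset.events

# Re-apply the schema that worked before enabling the enrichments
bq update my_dataset.events previous_schema.json
```

Even after the update reported success, the Dataflow job kept failing with the same "destination table has no schema" error.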
The enrichments I enabled are:
- ua_parser_config
- referer_parser
- campaign_attribution
- anon_ip
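Each of these follows the standard Snowplow enrichment config format. As an example, my anon_ip config looks roughly like this (a sketch; the `anonOctets` value of 2 is just what I chose, not a requirement):

```json
{
  "schema": "iglu:com.snowplowanalytics.snowplow/anon_ip/jsonschema/1-0-0",
  "data": {
    "name": "anon_ip",
    "vendor": "com.snowplowanalytics.snowplow",
    "enabled": true,
    "parameters": {
      "anonOctets": 2
    }
  }
}
```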
@anton, thank you very much for your great work; everything else works perfectly so far! Do you have an idea what could cause the automatic schema creation and/or altering to fail?
Best regards,
Ian
p.s.: I’m currently working on some improvements for the GCP setup and will share them once I’ve tested everything.