We’re setting up the GCP pipeline with the following components:
- GCP collector
- FS-2 enrich
- Streamloader
- Biquery mutator
- Biquery repeater
We’ve got everything running, and the events are being passed through enrich, however they aren’t getting processed by the loader. When we check the pubsub logs, the following error appears in loader bad rows pubsub:
{"schema":"iglu:com.snowplowanalytics.snowplow.badrows/loader_parsing_error/jsonschema/2-0-0","data":{"processor":{"artifact":"snowplow-bigquery-streamloader","version":"0.6.1"},"failure":{"type":"NotTSV"},
The enricher config is:
# "Gcp" is the only valid option now
auth = {
type = "Gcp"
}
# Collector input
input = {
type = "PubSub"
subscription = "projects/dh-event-pipe/subscriptions/enricher_in"
}
# Enriched events output
good = {
type = "PubSub"
topic = "projects/dh-event-pipe/topics/enricher_good"
}
# Bad rows output
bad = {
type = "PubSub"
topic = "projects/dh-event-pipe/topics/enricher_bad"
}
assetsUpdatePeriod = "7 days"
metricsReportPeriod = "1 second"
The loader config is:
{
"schema": "iglu:com.snowplowanalytics.snowplow.storage/bigquery_config/jsonschema/1-0-0",
"data": {
"name": "GCP BigQuery test",
"id": "4c09e258-1ca7-41cc-9c09-700b0a0910ed",
"projectId": "dh-event-pipe",
"datasetId": "atomic",
"tableId": "events",
"input": "enricher_in",
"typesTopic": "event_types",
"typesSubscription": "mutator_in",
"badRows": "loader_bad",
"failedInserts": "loader_retry",
"load": {
"mode": "FILE_LOADS",
"frequence": 1800,
"frequency": 1800
},
"purpose": "ENRICHED_EVENTS"
}
}
I’m guessing we’ve got a config option wrong somewhere, any idea what could be causing this?