Bigquery stream loader - pubsub and bq in different projects?

iain · December 3, 2020, 5:37pm

Is there any way to set the GCP stream loader up so that you can have the pubsub topics in one project and the BigQuery table in another? At the moment, it looks like you need to use the same project ID for both in the config. Is there a way around that?

anton · December 5, 2020, 3:11pm

Hey @iain, indeed, I don’t think that is possible. We were wondering whether this is something users would be interested in, though. If there are standard tools for forwarding data between Pub/Sub topics in different projects, you could try a hacky setup: enrich the data in project A without BQ, so that all data is immediately sent to the failed inserts topic. If you manage to forward this topic to another project, you can just insert it there with BQ Repeater. I think the Loader’s and Repeater’s performance is comparable.

riwi · March 16, 2022, 11:50pm

Sorry to revive this topic. In fact, we also have a situation where exactly this scenario would be helpful: We have differently configured pipelines in several projects. We would therefore like to define a common dataset and table in a central analytics project as a sink for the different BigQuery StreamLoader apps.
Is this now possible with the BigQuery StreamLoader? Otherwise, do you have any other idea how to implement this requirement (loading the enriched events into a BigQuery table of another project)?

mike · March 17, 2022, 12:40am

It might be possible, but I suspect you are going to have to deal with edge cases that will be a pain, particularly around things like mutation, where you would potentially have multiple different projects mutating the same table.

I think this depends on what you are using the table for (e.g. debugging, real-time analytics), but I’d consider a view in that project that unifies the tables from the separate projects or, if you don’t need real-time data, a table that is incrementally rebuilt from the source tables using load_tstamp.
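
As a rough sketch of the view idea: the snippet below only builds the SQL for a view that unions one events table per project. The project, dataset, and table names are made up for illustration, and the helper names (`quote_table`, `build_union_view_sql`) are hypothetical. It also assumes every source table has the same columns in the same order, which `SELECT *` with `UNION ALL` requires; if the pipelines track different schemas, you would need to list the shared columns explicitly.

```python
from typing import Sequence


def quote_table(table_id: str) -> str:
    """Wrap a fully qualified project.dataset.table ID in backticks."""
    if table_id.count(".") != 2 or "`" in table_id:
        raise ValueError(f"expected project.dataset.table, got {table_id!r}")
    return f"`{table_id}`"


def build_union_view_sql(view_id: str, source_tables: Sequence[str]) -> str:
    """Build a CREATE OR REPLACE VIEW statement that unions the source tables.

    Assumes all source tables share the same column layout.
    """
    if not source_tables:
        raise ValueError("need at least one source table")
    selects = [f"SELECT * FROM {quote_table(t)}" for t in source_tables]
    body = "\nUNION ALL\n".join(selects)
    return f"CREATE OR REPLACE VIEW {quote_table(view_id)} AS\n{body}"


if __name__ == "__main__":
    # Hypothetical names: one central analytics project, two pipeline projects.
    print(
        build_union_view_sql(
            "central-analytics.snowplow.events_all",
            [
                "pipeline-a.snowplow.events",
                "pipeline-b.snowplow.events",
            ],
        )
    )
```

You would run the generated statement in the central project. Whoever queries the view also needs read access to the underlying tables in the other projects, unless you set it up as an authorized view.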

riwi · March 18, 2022, 3:00pm

The view variant turned out to be the solution for my requirements. Thanks for the tip!
