BigQuery Forwarder for "Successful" Inserts

I’m wondering if it is possible to get a feed of successful BigQuery inserts, delivered as fully mapped JSON, much like how the failedInserts forwarder works.

Subscribing to the enriched-good PubSub topic returns data in a values-only TSV format (no column names), like so:

value_one    value_two    value_three

While I can subscribe to this, it means I have to do the column name mapping myself. Ideally I’d let the BigQueryLoader perform the mapping, then retrieve the mapped JSON like so:

{ "header_one":"value_one","header_two":"value_two","header_three":"value_three"}

Is this possible, or should I stick with performing the header mapping myself, which does seem error-prone? I’ve been using Snowplow for a few weeks now and it’s been great; it’s only now that I’m trying to carry out tasks which don’t appear to be documented.
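
For reference, the manual header mapping described above might look like the following minimal Python sketch. The header names are the placeholders from the example, not the real enriched event field list, and `tsv_line_to_json` is a hypothetical helper rather than part of any Snowplow library.

```python
import json

# Placeholder column order taken from the example above.
# Assumption: the real enriched event has many more fields, and this
# list would have to be kept in sync with them by hand.
HEADERS = ["header_one", "header_two", "header_three"]


def tsv_line_to_json(line, headers=HEADERS):
    """Pair each tab-separated value with its column name and return JSON."""
    values = line.rstrip("\n").split("\t")
    # Fail loudly on a column-count mismatch instead of silently
    # shifting values under the wrong headers.
    if len(values) != len(headers):
        raise ValueError(
            "expected %d columns, got %d" % (len(headers), len(values))
        )
    return json.dumps(dict(zip(headers, values)))


if __name__ == "__main__":
    print(tsv_line_to_json("value_one\tvalue_two\tvalue_three"))
```

The column-count check is where the error-prone part shows up: if the field order or count changes upstream, a hand-maintained header list goes stale.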

Hi @James_Buckley,

You have two options. You can query the data from BigQuery and export it as JSON, or you can use the transform function from one of the analytics SDKs to convert the TSV into JSON.

Hope that helps.

Hi @Colm,

Thanks for the quick response. Option 2 is what I’m after at the moment, so thanks for pointing me in that direction.

Using the node.js Analytics SDK, I tried running a transform on the enriched-good TSV feed, but got a “Wrong schema format” error. I think this is because I am passing “” schema events, and perhaps the SDK is only accepting “com.snowplowanalytics” schema events, unless I’m mistaken. Is there a way to change this?

I’m building a real-time dashboard, so currently I’m trying to feed the data into it as JSON events in real time. Querying the BigQuery data every X minutes is what I’d attempt next if this doesn’t work out.

Hi @James_Buckley,

Sorry for taking a while to respond; I haven’t logged into Discourse in a while.

I’m not entirely sure why you’re running into that error message - looking at the source code it’s produced when the schema string for the event doesn’t conform to the expected format. I doubt it’s because it’s a schema though - the SDK should work with any valid schema path (indeed in the case of custom events/contexts this must be the case, since you can choose any vendor name).

If you haven’t resolved this yet, could you provide the full schema path string in question?
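
As a rough way to check a schema path before passing events to the SDK, here is a minimal Python sketch. It assumes the usual Iglu layout of `iglu:vendor/name/format/version` with a `MODEL-REVISION-ADDITION` version. The regular expression and the `parse_schema_uri` helper are illustrative assumptions, not the SDK's actual validation code, and `com.acme` is a made-up vendor.

```python
import re

# Assumed shape of an Iglu schema URI: iglu:vendor/name/format/version
# This pattern is an illustration only; the SDK's own check may differ.
IGLU_URI = re.compile(
    r"^iglu:"
    r"(?P<vendor>[a-zA-Z0-9_.-]+)/"
    r"(?P<name>[a-zA-Z0-9_-]+)/"
    r"(?P<format>[a-zA-Z0-9_-]+)/"
    r"(?P<version>[0-9]+-[0-9]+-[0-9]+)$"
)


def parse_schema_uri(uri):
    """Split a schema URI into its parts, or raise if it is malformed."""
    match = IGLU_URI.match(uri)
    if match is None:
        raise ValueError("schema URI does not match expected format: %r" % uri)
    return match.groupdict()


if __name__ == "__main__":
    # A custom vendor passes as long as the overall shape is right.
    print(parse_schema_uri("iglu:com.acme/button_click/jsonschema/1-0-0"))
```

Under these assumptions any vendor name is accepted, so a failure would point at the overall shape of the string (a missing segment, an empty vendor, or a malformed version) rather than at the vendor itself.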