Bigquery streamloader unexpected option: --runner

Hello,

I’m following Simo Ahava’s guide to set up Snowplow in GCP. The guide was published 3 years ago, and I’m not sure how I should update some of the command lines.

Right now, I’m trying to set up the VM instance for the ETL process.

According to Simo’s guide I need to run the following command:

java -jar snowplow-bigquery-streamloader-1.0.1.jar --config=$(cat bigquery_config.hocon | base64 -w 0) --resolver=$(cat iglu_resolver.json | base64 -w 0) --runner=DataFlowRunner --project=$project_id --region=$region --gcpTempLocation=gs://$bucket_name/temp-files

When I run it directly on the VM, it returns this error:

unexpected option: --runner

What does this mean? Why do I get this error?

Other commands that rely on bigquery_config.hocon and iglu_resolver.json are working well, so I think both files are configured correctly.

Thanks.

Hi @simonbreton ,

When that guide was put together, we didn’t have a streamloader, only loader.

The difference between them is that streamloader is a standalone JVM application, which you can run from our official docker image or indeed using:

java -jar snowplow-bigquery-streamloader-1.0.1.jar --config=$(cat bigquery_config.hocon | base64 -w 0) --resolver=$(cat iglu_resolver.json | base64 -w 0)

It only needs two arguments: --config and --resolver. All the other options, including --runner, are inapplicable for streamloader, which is why you get the "unexpected option" error.
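As a rough sketch of the two-argument invocation described above, the snippet below builds the --config and --resolver arguments from the two files. The helper names encode_file and streamloader_args are hypothetical, made up for illustration, and the sketch assumes GNU coreutils base64 (the -w 0 flag disables line wrapping). The java call is left as a comment, so the snippet launches nothing by itself.

```shell
#!/usr/bin/env bash
# Sketch only. encode_file and streamloader_args are hypothetical helpers,
# not part of Snowplow. File names are the ones used in this thread.

# Base64-encode a file onto a single line (GNU base64: -w 0 disables wrapping).
encode_file() {
  base64 -w 0 < "$1"
}

# Print the two arguments StreamLoader accepts, one per line.
# Dataflow options such as --runner, --project, --region are deliberately absent.
streamloader_args() {
  local config_file="$1"
  local resolver_file="$2"
  printf '%s\n' \
    "--config=$(encode_file "$config_file")" \
    "--resolver=$(encode_file "$resolver_file")"
}

# Possible usage (bash), left commented out on purpose:
#   mapfile -t args < <(streamloader_args bigquery_config.hocon iglu_resolver.json)
#   java -jar snowplow-bigquery-streamloader-1.0.1.jar "${args[@]}"
```

Keeping the argument list in one place makes it easy to confirm that nothing beyond --config and --resolver is being passed.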

They do apply to loader, which is still supported for now. Unlike streamloader, loader is designed to be run as a custom-container Google Cloud Dataflow job. So you can’t launch it from a jar file, only by using the official image from Docker Hub.

For more information, check out the setup guide, especially the command line options for StreamLoader and Loader.

You can also find a detailed configuration reference for the HOCON file here.
