Snowplow Collector on GCP Cloud Run

Hi,

I was trying to run the Snowplow collector on Cloud Run using the provided Docker image, but the attempt failed because I wasn't able to pass the config file to the container through the gcloud command. Is there a way to get around this?

I tried another approach: I created an Alpine Java Docker image containing the provided jar file and the config file, and this was a success.
The contents of the Dockerfile are as follows:

FROM openjdk:18-jdk-alpine

# Bake the collector jar and its config into the image
COPY snowplow-stream-collector-google-pubsub-2.3.0.jar /home/snowplow-stream-collector-google-pubsub-2.3.0.jar
COPY application.config /home/application.config

# The collector listens on 8080
EXPOSE 8080/tcp
CMD ["java","-jar","/home/snowplow-stream-collector-google-pubsub-2.3.0.jar","--config","/home/application.config"]

Is there a better approach to this?

Thank you!

Hi @siv ,

Can you share the full gcloud command that you are using?

Hi @BenB

I have uploaded the docker image to GCR

The command is:
gcloud run deploy snowplow-collector --project project-name --region region --allow-unauthenticated --image gcr-image-url --port 8080 --args "--config application.config"

I also tried keeping the config file in a GCS bucket:

gcloud run deploy snowplow-collector --project project-name --region region --allow-unauthenticated --image gcr-image-url --port 8080 --args "--config link-to-config-file-in-gcs"

The issue is that here you're referring to a local file inside the Docker container, but that file doesn't exist there. You need to mount a volume with the file so that you can access it.
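
As a quick sanity check (the image URL is a placeholder, and this assumes the image ships standard shell utilities), you can list the container's filesystem to confirm the config file really isn't there:

# Override the entrypoint just to inspect what is inside the image;
# application.config only exists if it was baked in at build time
docker run --rm --entrypoint ls gcr-image-url /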


Understood! Thank you!

Would you recommend mounting a volume and using the Snowplow-provided Docker image over the custom Docker image I've created?

It's up to you, but personally, yes, I would prefer to use the original Docker image directly rather than needing to rebuild one each time the config or version changes.


Hi @BenB ,

We can't mount a volume on Cloud Run as it is stateless, so please suggest some other way of doing it.

For your reference:

Thanks

Hi @Hanumanth ,

Another possibility is to run the Docker image inside a Compute Engine instance.

We provide the Terraform module to do so.
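
If you go the Compute Engine route, running the stock image with the config mounted from the host looks roughly like this (a sketch: the tag, paths and file name are placeholders, and it assumes the image's entrypoint is the collector binary, so the trailing arguments are passed straight to it):

# On the VM: mount the host directory holding the config into the container
docker run -d \
  -p 8080:8080 \
  -v $PWD/config:/snowplow/config \
  snowplow/scala-stream-collector-pubsub:2.3.0 \
  --config /snowplow/config/config.hocon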

Hi @BenB ,

Just a quick question: can we pass the content of application.config, instead of the file, directly to the Snowplow Docker image?

Like:

gcloud run deploy snowplow-dev-collector-mount --project mcd-japan-analytics --region asia-northeast1 --allow-unauthenticated --image asia.gcr.io/mcd-japan-analytics/snowplow/snowplow-collector-mount --port 8080 --service-account snowplow-service-account-202@mcd-japan-analytics.iam.gserviceaccount.com --args "--config \"$(gcloud secrets versions access 1 --secret='snowplow-secret-test')\""

It was able to read the config content, but the deployment is not working. Do you have something in mind for how we can get this done?

Note: We don't want to use a VM here. That's why we have switched to Cloud Run.

Thanks

Hi @Hanumanth ,

It's not possible yet, but we plan to support this in the next collector release; you can follow the issue here.

In that case, until the next collector release, I would recommend doing like @siv and creating your own Docker image with the config file in it.
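
If you'd rather keep the config out of the image entirely, one possible workaround (just a sketch, not something we ship: the wrapper name, env var name and Secret Manager wiring are assumptions, while the binary path matches the image layout shown further down) is a tiny entrypoint script in your own image that writes a secret-provided environment variable to a file before starting the collector:

#!/bin/sh
# docker-entrypoint.sh (hypothetical wrapper baked into your own image)
# COLLECTOR_CONFIG is an env var populated from Secret Manager, e.g. via
# gcloud run deploy ... --set-secrets=COLLECTOR_CONFIG=snowplow-secret-test:1
set -e
printf '%s' "$COLLECTOR_CONFIG" > /tmp/application.config
exec /opt/docker/bin/snowplow-stream-collector --config /tmp/application.config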

Thank you @BenB for the information.

I used the following, which I feel is an easy, code-driven deploy process. Obviously there are multiple ways, but this way all the code can be kept in the repo with no manual intervention.

Below is the Dockerfile that builds the image for Cloud Run. The image is then stored in Container Registry.

# Extend the official collector image and bake the config in
FROM snowplow/scala-stream-collector-pubsub:2.3.1
USER root

RUN mkdir -p /var/snowplow/collector/
ENV _SNOWPLOWPATH=/var/snowplow
WORKDIR $_SNOWPLOWPATH

# Copy the local collector/ directory (containing the HOCON config) into the image
COPY collector/ /var/config/snowplow/collector/

EXPOSE 80 443 8080

WORKDIR /opt/docker/bin/

# Start the collector with the baked-in config
ENTRYPOINT ["/opt/docker/bin/snowplow-stream-collector", "--config", "/var/config/snowplow/collector/config.collector.pubsub.hocon"]

And inside cloudbuild.yaml, use the following:

steps:
- name: 'gcr.io/cloud-builders/docker'
  entrypoint: 'docker'
  args: ['build',
          '--tag=gcr.io/$PROJECT_ID/snowplow/collector-pubsub:$SHORT_SHA',
          '--file=Dockerfile_collector_deployment_cloudbuild',  # the Dockerfile shown above
          '.']

- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/snowplow/collector-pubsub:$SHORT_SHA']

- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  entrypoint: gcloud
  args: [
    'run', 'deploy',
    'snowplow-collector', '--image', 'gcr.io/$PROJECT_ID/snowplow/collector-pubsub:$SHORT_SHA',
    '--region', 'us-central1',
    '--platform', 'managed',
    '--allow-unauthenticated',
    ]
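
One note: $SHORT_SHA is filled in automatically when the build runs from a Cloud Build trigger. For a one-off manual run, passing it yourself should work (the substitution override here is an assumption about your setup):

gcloud builds submit --config=cloudbuild.yaml \
  --substitutions=SHORT_SHA=$(git rev-parse --short HEAD) .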

Hope this helps!!
