How to set CORS headers for the Scala Collector

Hi, new here! I’m on the first steps of setting up Snowplow on GCP. I have the Scala Collector running, and I’ve verified it is healthy and receiving events by using curl.

I’m now testing in a web browser on localhost with the JavaScript tracker, and calls to my collector are blocked because of a CORS issue. Could someone help me set the headers correctly?

I get the standard CORS error in Chrome when I try to track an event:

Access to XMLHttpRequest at 'http://track.unjank.com/com.snowplowanalytics.snowplow/tp2' from origin 'http://localhost:5000' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.

The tracking code that triggers it is:

window.unjankTracker('newTracker', 'sp', '//track.unjank.com', {
  appId: 'some-app-id',
  cookieDomain: 'track.unjank.com'
});

window.unjankTracker('trackPageView');

My intention is that I’d be able to run this on multiple domains, so would need to set cross-origin headers to allow “*”, at least for now while I’m prototyping.

My setup script for the collector is this:

#! /bin/bash
sudo apt-get update
sudo apt-get -y install default-jre
sudo apt-get -y install unzip
sudo apt-get -y install wget
archive=snowplow_scala_stream_collector_google_pubsub_1.1.0_rc4.zip
wget https://dl.bintray.com/snowplow/snowplow-generic/$archive
gsutil cp gs://snowplow-unjank-collector-bucket/application.conf .
unzip $archive
java -jar snowplow-stream-collector-google-pubsub-1.1.0-rc4.jar --config application.conf &

The application.conf file has the following apparently relevant config set:

  # Cross domain policy configuration.
  # If "enabled" is set to "false", the collector will respond with a 404 to the /crossdomain.xml
  # route.
  crossDomain {
    enabled = true
    # Domains that are granted access, *.acme.com will match http://acme.com and http://sub.acme.com
    domains = [ "*" ]
    # Whether to only grant access to HTTPS or both HTTPS and HTTP sources
    secure = true
  }

  # Configuration related to CORS preflight requests
  cors {
    # The Access-Control-Max-Age response header indicates how long the results of a preflight
    # request can be cached. -1 seconds disables the cache. Chromium max is 10m, Firefox is 24h.
    accessControlMaxAge = 5 seconds
    accessControlMaxAge = ${?COLLECTOR_CORS_ACCESS_CONTROL_MAX_AGE}
  }

I’m running on Compute Engine with a load-balanced cluster of Debian 9 instances as per these instructions: https://www.simoahava.com/analytics/install-snowplow-on-the-google-cloud-platform/#step-1-create-the-instance-template

I also have Cloudflare in front of the endpoint, and I’ve considered adding a Cloudflare Worker to add the appropriate headers, but I would have thought that I’d be able to change a setting to enable the correct CORS headers.
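
A Worker along those lines might look roughly like the sketch below. It assumes the tracker sends credentialed requests, in which case browsers reject a wildcard Access-Control-Allow-Origin and the request's Origin has to be echoed back instead. The helper names corsHeadersFor and handleRequest, and the allowed methods and headers, are illustrative assumptions, not anything the collector or Cloudflare prescribes.

```javascript
// Hypothetical helper: build CORS headers for the Origin of an incoming request.
// Assumption: requests are credentialed, so "*" is only used when no Origin is sent.
function corsHeadersFor(origin) {
  if (!origin) {
    return { 'Access-Control-Allow-Origin': '*' };
  }
  return {
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Allow-Credentials': 'true',
    'Access-Control-Allow-Methods': 'GET, POST, OPTIONS', // assumed method list
    'Access-Control-Allow-Headers': 'Content-Type',       // assumed header list
    'Vary': 'Origin'
  };
}

// Hypothetical handler: answer preflights directly, otherwise forward the
// request to the collector and add the CORS headers to its response.
async function handleRequest(request) {
  const cors = corsHeadersFor(request.headers.get('Origin'));
  if (request.method === 'OPTIONS') {
    return new Response(null, { status: 204, headers: cors });
  }
  const upstream = await fetch(request);
  // Copy the response so that its headers can be modified.
  const response = new Response(upstream.body, upstream);
  for (const [name, value] of Object.entries(cors)) {
    response.headers.set(name, value);
  }
  return response;
}

// Register the handler only where a Worker-style global event API exists.
if (typeof addEventListener === 'function') {
  addEventListener('fetch', (event) => {
    event.respondWith(handleRequest(event.request));
  });
}
```

Echoing every Origin is as permissive as "*", so this is only suitable while prototyping; an allow-list check inside corsHeadersFor would be the obvious next step.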

I hope someone can point me in the right direction for setting the CORS headers? Thanks in advance, folks!

Hi @Stef, the most common reason for CORS errors is some conflict between secure and unsecured connections. The most suspicious thing I see at first glance is that the collector is set to only grant access to HTTPS requests:

  crossDomain {
    enabled = true
    # Domains that are granted access, *.acme.com will match http://acme.com and http://sub.acme.com
    domains = [ "*" ]
    # Whether to only grant access to HTTPS or both HTTPS and HTTP sources
    secure = true
  }

but the error indicates requests via plain HTTP:

Access to XMLHttpRequest at 'http://track.unjank.com/com.snowplowanalytics.snowplow/tp2' from origin 'http://localhost:5000' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.

Is it possible to use secured connections all the way? If not, for testing you might be able to use the tracker’s forceSecureTracker: false setting when initialising it; but you’ll still need to ensure that the collector plays ball.

Let us know how you fare and if you need further help.

Thanks Dilyan, I appreciate the help!

This was my initial instinct too. So I’ve run ngrok so I can use https in development, and I’ve also added the force secure setting.

If I remove the service from behind the standard Cloudflare proxy (no workers), this seems to work fine.

But if I turn the proxy back on, I get a CORS failure. Do other users not put Cloudflare in front of the service? This is the first time I’ve seen it cause a problem like this.

Hi @Stef. Yes, people do use Cloudflare for this but the CORS errors are not a known issue…

Is it possible that Cloudflare has cached the CORS headers and needs to be forced to retrieve new headers now that the traffic is fully secured?
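
One way to check is to send the preflight by hand and compare what comes back through the proxy with what comes back directly from the collector. The sketch below assumes Node 18+ (for the built-in fetch; browsers will not let a script set these headers) and uses made-up helper names, findCorsProblems and checkPreflight. It treats the request as credentialed, which is an assumption about how the tracker is configured.

```javascript
// Hypothetical helper: list CORS problems in a set of response headers.
// `headers` is a plain object with lower-cased header names.
function findCorsProblems(headers, origin, credentialed) {
  const problems = [];
  const allowOrigin = headers['access-control-allow-origin'];
  if (!allowOrigin) {
    problems.push('missing Access-Control-Allow-Origin');
  } else if (allowOrigin === '*' && credentialed) {
    problems.push('wildcard origin is not allowed for credentialed requests');
  } else if (allowOrigin !== '*' && allowOrigin !== origin) {
    problems.push('Access-Control-Allow-Origin does not match the request origin');
  }
  if (credentialed && headers['access-control-allow-credentials'] !== 'true') {
    problems.push('missing Access-Control-Allow-Credentials: true');
  }
  return problems;
}

// Hypothetical helper: send a preflight-style OPTIONS request and report on it.
async function checkPreflight(url, origin) {
  const res = await fetch(url, {
    method: 'OPTIONS',
    headers: {
      'Origin': origin,
      'Access-Control-Request-Method': 'POST',
      'Access-Control-Request-Headers': 'content-type'
    }
  });
  // Header names come back lower-cased when iterating a Headers object.
  const headers = Object.fromEntries(res.headers.entries());
  return {
    status: res.status,
    cfCacheStatus: headers['cf-cache-status'], // undefined if Cloudflare did not set it
    problems: findCorsProblems(headers, origin, true)
  };
}
```

Running checkPreflight('https://track.unjank.com/com.snowplowanalytics.snowplow/tp2', 'http://localhost:5000') once with the proxy on and once with it off, then comparing the two results, should show whether the headers are being dropped or served stale.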