I have set up the Scala Stream Collector, and it is working fine. I checked it with the command
$ curl http://localhost:8080/health
and it returns status OK.
Could you ensure the port you are sending your request to is the same one you configured your collector with?
Since your health check works fine via port 8080, that is the port you have set in the collector configuration file. But it looks like you are using the default port 80 in your tracker.
You could either change the port in the configuration file to 80 or add the relevant port to the collector endpoint in your tracker initialization.
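As a rough sketch of the second option, the port can be appended to the endpoint string that is passed to the tracker. The helper name `withPort` and the host `collector.example.com` are placeholders made up for illustration, not part of the tracker API, and this assumes the tracker accepts a `host:port` endpoint string.

```javascript
// Hypothetical helper: append an explicit port to a collector host unless it
// is the default HTTP port (80), which the browser assumes anyway.
function withPort(host, port) {
  return port === 80 ? host : `${host}:${port}`;
}

// Placeholder host; replace it with your collector's public address.
const endpoint = withPort("collector.example.com", 8080);
console.log(endpoint); // collector.example.com:8080
```

The resulting string would then be used as the third argument of the `newTracker` call.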
I have already configured the port in the config file as 80. Below is the part of the config where I set it:
# The collector runs as a web service specified on the following
# interface and port.
interface = "0.0.0.0"
port = 80
# Production mode disables additional services helpful for configuring and
# initializing the collector, such as a path '/dump' to view all
# records stored in the current stream.
production = false
Also, inbound and outbound traffic is open on all ports for now to test this, but the GET call from the browser to the collector still fails with an ERR_PROXY_CONNECTION_FAILED error.
Could it be related to a firewall issue? If so, how can I validate it?
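One way to check for a firewall problem is to attempt a plain TCP connection to the collector's port from a machine outside its network and look at how the attempt fails. The following is a minimal Node.js sketch under that assumption; the function names `diagnose` and `probe` and the address `203.0.113.10` are placeholders, and the diagnoses are rules of thumb rather than a definitive test. Note also that ERR_PROXY_CONNECTION_FAILED is normally reported when the browser cannot reach its own configured proxy, so the browser's proxy settings may be worth checking as well.

```javascript
const net = require("net");

// Map a probe result code to a rough diagnosis (rules of thumb only).
function diagnose(code) {
  switch (code) {
    case "OK":
      return "port reachable: something accepted the TCP connection";
    case "ECONNREFUSED":
      return "host reachable, but nothing is listening on that port (or a firewall rejected it)";
    case "TIMEOUT":
      return "no reply: packets are probably being dropped (firewall or security group)";
    default:
      return `other error: ${code}`;
  }
}

// Try to open a TCP connection and resolve with a code understood by diagnose().
function probe(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => { socket.destroy(); resolve("OK"); });
    socket.once("timeout", () => { socket.destroy(); resolve("TIMEOUT"); });
    socket.once("error", (err) => resolve(err.code || "UNKNOWN"));
  });
}

// Usage (placeholder address; substitute the collector's public IP and port):
// probe("203.0.113.10", 80).then((code) => console.log(diagnose(code)));
```

A timeout from outside combined with a working health check on the instance itself would point towards a firewall or security group rule rather than the collector.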
Hey @PuneetBabbar - just to note that we strongly recommend putting your Scala Stream Collectors behind an Elastic Load Balancer in an Auto Scaling Group. This is a much more robust approach than exposing a single collector to the world.
What was the solution to your problem? Did you have to put in your PublicIP:port, and did it then work fine?
So basically:
window.snowplow("newTracker", "scalaCollectore", "12.345.67.89:8080", {
appId: "anyID",
cookieDomain: "aDomain"
});
See the response from @alex above - you should be putting your Scala Stream collectors behind a load balancer (which will easily allow you to proxy traffic from port 80 to 8080 if required).
Thanks Mike. I have put my collector behind a load balancer, but I am still confused about what to put in as the collector endpoint in my JavaScript tracker initialization. Would it be the public DNS name of the load balancer? Apologies if these questions are too naive; I am new to AWS and Snowplow.
window.sp("newTracker", "ssc", "what goes here", {
appId: "snowplowPOC",
cookieDomain: "cookiedomain.com"
});
The public DNS name can be anything you’d like it to be. Typically people create a subdomain of the main site (e.g. if my site was domain.com I might use collector.domain.com) and then point the DNS entry for this subdomain towards the AWS Elastic load balancer.
I’d then specify collector.domain.com as my Snowplow collector endpoint.
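Applied to the snippet from the question, that could look roughly like the sketch below. It assumes `collector.domain.com` already has a DNS record (for example a CNAME) pointing at the load balancer, and that the Snowplow loader snippet has defined `window.sp`; the `trackerArgs` variable and the guard are only there for illustration.

```javascript
// Assumed placeholder: a subdomain whose DNS record points at the load balancer.
const collectorEndpoint = "collector.domain.com";

// Arguments for the tracker, mirroring the snippet in the question.
const trackerArgs = ["newTracker", "ssc", collectorEndpoint, {
  appId: "snowplowPOC",
  cookieDomain: "cookiedomain.com"
}];

// Only call the tracker in a browser where the Snowplow snippet has defined window.sp.
if (typeof window !== "undefined" && typeof window.sp === "function") {
  window.sp(...trackerArgs);
}
```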
Thanks Mike.
So as an example, my page is snowplowpoc-env.wvq2iagxas.us-east-2.elasticbeanstalk.com/
and my ELB DNS name is sp-poc-collector-lb-93272599.us-east-2.elb.amazonaws.com
When I put both the collector instance and my Elastic Beanstalk instance in the ELB, the image request to the collector seems to go through. I don't think the response is right, though: copying and pasting the request URL into the browser shows me the landing page of my website, which is served through Elastic Beanstalk. Ideally it should be just a 1x1 pixel, right?
But when I remove my Elastic Beanstalk instance from the ELB, the image request times out.
I have updated the security groups for my EC2 instance to allow inbound traffic from the ELB, but I am not sure what I am doing wrong here.
Thanks again for your help in advance.
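A quick way to tell which instance answered a given request is to look at the response's Content-Type header. The sketch below assumes the collector answers the pixel request with an image type such as `image/gif`, whereas the website's landing page comes back as HTML; the function name `looksLikePixel` is made up for illustration.

```javascript
// Rough check: a collector pixel response should be an image, whereas the
// website's landing page would come back as HTML.
function looksLikePixel(contentType) {
  return typeof contentType === "string" &&
    contentType.toLowerCase().startsWith("image/");
}

console.log(looksLikePixel("image/gif"));                // true
console.log(looksLikePixel("text/html; charset=UTF-8")); // false
```

If the pixel URL returns HTML while both instances are registered, the load balancer is presumably routing that request to the website instance rather than to the collector.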