GCP Setup: Instance connection refused

Hello,

First time setting up a Snowplow pipeline on GCP, using this guide, which is very similar to the Snowplow documentation.

I’ve managed to set everything up as far as the last step, where I test the connection. When I send the test request via curl, I get the following:
(7) Failed to connect to XXX.XXX.XXX.XXX port 8080: Connection refused

I’ve not found a solution anywhere else. From what I’ve read I think the issue is that the port is not being listened to, and not a firewall issue (which would give me connection denied or similar). I’ve tried on my local machine and through the instance SSH, with the same result.
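As a rough sketch of that reasoning, curl's exit code can be mapped to a likely cause. The function name `explain_curl_exit` is made up for this example, and the interpretations are rules of thumb, not a definitive diagnosis:

```shell
# Map a curl exit code to a likely cause (rules of thumb only).
explain_curl_exit() {
  case "$1" in
    0)  echo "connected: something answered on that port" ;;
    6)  echo "could not resolve host: check the endpoint name" ;;
    7)  echo "failed to connect: host reached, but probably nothing listening on that port" ;;
    28) echo "timed out: packets may be dropped on the way, e.g. by a firewall" ;;
    *)  echo "other curl error ($1): see the EXIT CODES section of the curl manual" ;;
  esac
}

# Usage (replace <COLLECTOR_ENDPOINT> with your own address):
# curl --silent --output /dev/null --max-time 5 "http://<COLLECTOR_ENDPOINT>:8080/health"
# explain_curl_exit "$?"
```

The "(7)" in the error above is that exit code. A firewall that silently drops packets more often shows up as a timeout (28) than as a refusal.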

I’m a beginner, by the way. Technical knowledge is not great, just learning as I go.

Thanks,
Sam

Hi @samvdbt,

Sounds like you’re attempting to check connection to the collector - correct me if I’m wrong, but I’ll proceed to answer on that assumption.

The collector should be publicly available, so SSHing in shouldn’t be necessary.

Looking at the guide, I assume you used something along these lines:

curl -d "&e=pv&page=curl-test&url=http%3A%2F%2Fjust-testing.com&aid=snowplow-test" -X POST http://<COLLECTOR_ENDPOINT>:8080/com.snowplowanalytics.iglu/v1

Have you tried removing the port? i.e.:

curl -d "&e=pv&page=curl-test&url=http%3A%2F%2Fjust-testing.com&aid=snowplow-test" -X POST http://<COLLECTOR_ENDPOINT>/com.snowplowanalytics.iglu/v1

It’s also worth checking the health of the collector:

curl http://<COLLECTOR_ENDPOINT>/health

Also, check that your collector configuration allows accepting insecure requests, or try https://.

If none of those succeed, the networking rules may be causing the issue. Let us know what the response is from those requests and we might figure out what else could be going on.

Best,

Hi Colm,

Thanks for the reply. Unfortunately, none of these seem to work. I keep getting the same “connection refused” error. If I remove the port, I get the same error for port 80 instead of 8080. Checking the health also refuses connection.

Every secure option in the config file is set to false.

Thanks,
Sam

In that case, my bet is your network rules are blocking external connections. I would check your ingress rules on the instance, and any VPC/network it’s in.

You might want to try setting one up without any firewall/network restrictions, test it, then work from there having narrowed down the issue with more confidence.

Apologies for the delay. Couldn’t find anything that caused the issue, so decided to start with a fresh project in GCP. Set everything up, sent my test request, and got the “ok” message that I needed.

This time I followed the guide from the Snowplow site. I noticed that this one did not mention setting up a subscription for the good topic, as I did the first time. So decided to see if this caused the issue. Set up a subscription, and got the connection refused message. Removed the subscription, and now I’m still getting the connection refused message. So I’m still not sure what’s causing this. Since it was working the first try, I’m assuming it’s not the network or firewall settings…

Could this be related to the subscription in any way? Is there anything I need to reset when creating/deleting a subscription?

Thanks again.

Assuming the connection refused response comes from a request to the collector, then no, the Pub/Sub topic shouldn’t make a difference. Once the collector is set up, it should return a 200 response regardless of what exists downstream of it.

> Apologies for the delay. Couldn’t find anything that caused the issue, so decided to start with a fresh project in GCP. Set everything up, sent my test request, and got the “ok” message that I needed.

That’s progress!

> Set up a subscription, and got the connection refused message. Removed the subscription, and now I’m still getting the connection refused message.

That’s a bit puzzling. Did you change anything in the collector configuration when you set up the topic?

Connection refused suggests to me that a good place to start is actually the firewall settings. Note that each individual rule on GCP can be for ingress or egress, not both. Also note that, for incoming traffic, the default is to deny everything. So if you have changed something in the process of setting up Pub/Sub, perhaps you have accidentally reverted to the default deny rule.

In case it helps pin down the issue: the collector should have an endpoint that is publicly open to the web (in a production setup this endpoint might sit on a load balancer rather than on the collector itself, but we can ignore that for now). So the public endpoint should have at least an ingress rule of ‘allow all traffic’ (0.0.0.0/0). When you set up an output to Pub/Sub, it should separately have an egress rule that allows output just to that topic.

So, I would probably start by looking into what has changed in the collector configuration and any rules associated with it. If you’re still stumped, maybe remove all rules, and attach only a new one that allows all traffic, and see if you can get a response.
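As a sketch of that last suggestion, the function below only prints a `gcloud` command for a single allow-all ingress rule on one TCP port, so it can be reviewed before being run by hand. The function name, the rule name `allow-collector-test`, the `default` network and port 8080 are assumptions for illustration; adjust them to your project:

```shell
# Print (do not run) a gcloud command that would create an ingress rule
# opening one TCP port to all sources.
# Arguments: RULE_NAME NETWORK PORT (all placeholders for this sketch).
print_allow_ingress_cmd() {
  local rule_name="$1"
  local network="$2"
  local port="$3"
  echo "gcloud compute firewall-rules create ${rule_name}" \
       "--network=${network} --direction=INGRESS --action=ALLOW" \
       "--rules=tcp:${port} --source-ranges=0.0.0.0/0"
}

# Example: prints the command for review instead of executing it.
print_allow_ingress_cmd allow-collector-test default 8080
```

Running `gcloud compute firewall-rules list` before and after shows which rules are actually attached to the network. A rule this open is only meant for narrowing down the problem.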

I hope that helps you debug!

Nothing else was changed, honestly, so I’m a bit lost :slight_smile:

You mentioned an egress rule to the topic. Could you elaborate on that? The setup guide only mentions ingress rules.

Ah - maybe an egress rule isn’t necessary, I just assumed it was.

Something has to have changed. Machines don’t just start behaving differently without something changing. It’s a matter of finding what you haven’t yet noticed is different.

Double check the endpoint you’re using, the one you used to send the successful request, and if all else fails, start again and compare a new collector which accepts requests to the current one.
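Since the same error shows up from inside the instance too, one more check that might help is whether anything is listening on the port at all. A minimal sketch, assuming a Linux instance with `ss` and `awk` available; `port_is_listening` is a name invented here:

```shell
# Reads `ss -tln` output on stdin and succeeds if any TCP socket
# is listening on the given port.
port_is_listening() {
  awk -v port="$1" '
    NR > 1 {                     # skip the header line
      n = split($4, parts, ":")  # column 4 is "Local Address:Port"
      if (parts[n] == port) found = 1
    }
    END { exit !found }
  '
}

# Usage, run on the instance itself:
# ss -tln | port_is_listening 8080 && echo "listening" || echo "nothing listening on 8080"
```

If nothing is listening, the collector process is probably not running, and no firewall change will fix that.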

Okaaay, so I think I’ve found the issue, and I’m thinking it’s not really an issue as much as it is due to this being the first time I’m doing this.

I SSHed into the instance using the SSH button next to it. When I run the setup commands in that window, I get “REST interface bound to /0:0:0:0:0:0:0:0:8080”. I then switch to my local terminal for the curl request. My test requests work as long as the SSH window is still open. When I close the SSH window, the connection is refused. So I’m guessing the collector only runs while the SSH window is open and the commands are still running. When I re-open it and run the last command (java -jar etc.) it works again (with the window open).

When setting this up for real, the commands are added in the startup script, so they are executed there…

Glad you found a way forwards!
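For anyone else testing by hand over SSH: a process started in the SSH window is normally stopped when that window closes, which matches what you saw. A minimal sketch for keeping it alive while testing, using `nohup`. The helper name `run_detached` is invented here, and the jar and config file names in the usage comment are placeholders, not the real ones from the guide:

```shell
# Start a command so that it keeps running after the SSH session closes.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
run_detached() {
  local logfile="$1"
  shift
  # nohup makes the command ignore the hangup signal sent when the terminal
  # closes; output goes to the log file, stdin is detached, & backgrounds it.
  nohup "$@" > "$logfile" 2>&1 < /dev/null &
  echo "$!"   # print the PID so the process can be checked or stopped later
}

# Hypothetical jar and config names; substitute the ones from your guide:
# pid=$(run_detached collector.log java -jar snowplow-stream-collector.jar --config config.hocon)
# kill -0 "$pid" && echo "collector still running"
```

This is only a convenience for manual testing. As noted above, in the real setup the startup script runs the commands, so no SSH window is involved.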