Have triple-checked security groups (OK) - allowing traffic from 0.0.0.0/0 on both the EC2 and ELB security groups has no effect - no network ACL issues either
certificate is valid
enabling/disabling HTTP/2 at the ELB gives the same 502 result
ELB access logs show no response back from the target host for requests coming in as HTTPS
The same request is being made from the ELB to the target host on port 8000 for both HTTP and HTTPS, but the HTTP request receives a 200 status back from the target whereas the HTTPS request receives nothing, resulting in a 502 response from the load balancer.
stdout/stderr logs for the collector don’t show anything when an HTTPS request is made, but do for HTTP (e.g. INFO com.snowplowanalytics.snowplow.collectors.scalastream.sinks.KinesisSink - Successfully wrote 1 out of 1 records)
forceSecureTracker in the JavaScript tracker does not change the results - still 502s
changed config to interface="0.0.0.0"
INFO com.snowplowanalytics.snowplow.collectors.scalastream.KinesisCollector$ - REST interface bound to /0:0:0:0:0:0:0:0:8000
same results - http OK, https 502
At this point we’re pretty much stumped because everything appears to be set up correctly. Our working theory is that the ELB is not terminating SSL and is instead sending an encrypted request to the collector, but we’re not sure how to prove or disprove this - are there additional logs somewhere where we can see the incoming request to the collector?
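One way to test that theory without any collector logs is to look at the first bytes the ELB sends to the target. This is only a rough sketch in Python, not anything from the collector itself: `looks_like_tls` and `sniff_once` are made-up helper names, and it assumes you can either stop the collector briefly or point a target at a spare port. A TLS connection opens with a handshake record (first byte 0x16, second byte 0x03), whereas a plain HTTP request starts with readable text such as `GET`.

```python
import socket


def looks_like_tls(first_bytes: bytes) -> bool:
    """Return True if the bytes look like the start of a TLS handshake record."""
    # A TLS record begins with content type 0x16 (handshake) followed by
    # the protocol major version byte 0x03.
    return len(first_bytes) >= 2 and first_bytes[0] == 0x16 and first_bytes[1] == 0x03


def sniff_once(port: int, host: str = "0.0.0.0") -> str:
    """Accept a single connection and report how the client opened it.

    Hypothetical helper: run it on the port the target group points at
    (the collector must not be bound to that port at the same time).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, _addr = srv.accept()
        with conn:
            first = conn.recv(16)
    return "TLS" if looks_like_tls(first) else "plaintext"
```

Calling `print(sniff_once(8000))` and then sending one request through the HTTPS listener should print `TLS` if the ELB is encrypting traffic to the target, and `plaintext` if it is terminating SSL as expected. Note that a health check may well be the first connection to arrive, so it can take a couple of tries.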
Any input/questions/comments are greatly appreciated.
Interesting - I might be wrong but your description of what’s going on (HTTP succeeding and HTTPS failing 502) would make me check the following:
Is your health check, on the ELB, the same configuration for both? Is it using HTTP for both? Is your Instance Port the same for both?
My initial guess would be that your ELB’s HTTPS health check is trying to use HTTPS (443 instead of 80) against the Scala stream collector and failing. Thus it thinks there are no healthy instances and can’t direct your request.
Great suggestion, so I dug into the health checks… Currently both the HTTP and HTTPS health checks on the ELB targets are failing, but HTTP requests to the collector are succeeding… To the best of my understanding, when there is no healthy target the request just gets routed to all targets: None of these Availability Zones contains a healthy target. Requests are being routed to all targets
The health checks are set to use the traffic port, but overriding to 80 or 8000 seems to have no effect (if the collector is set up to bind to port 8000, should that be the port used?). Also, not sure exactly what to use for a path… Currently using: /com.snowplowanalytics.snowplow/tp2
Did some digging and saw that there is a /health endpoint for health checks!
Updated the target health checks to use that endpoint on port 8000, but they still report unhealthy. Can hit /health by direct IP:port (bypassing the ELB) and by subdomain over HTTP. Hitting it over HTTPS still 502s.
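For anyone repeating these checks, the three probes above (direct IP:port, subdomain over HTTP, subdomain over HTTPS) can be scripted so they are easy to re-run after each config change. A minimal sketch using only the Python standard library; `probe_health` is a hypothetical helper name, and the only thing taken from the collector is the /health path mentioned above.

```python
import urllib.error
import urllib.request


def probe_health(base_url, path="/health", timeout=5.0):
    """Return the HTTP status code for base_url + path, or None if unreachable."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a 2xx (e.g. a 502 from the ELB).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, TLS failure, timeout, and so on.
        return None
```

With placeholder addresses (both are made up here), `probe_health("http://10.0.0.12:8000")`, `probe_health("http://collector.example.com")` and `probe_health("https://collector.example.com")` would correspond to the three cases; in the situation described in this thread you would expect 200, 200 and 502 respectively.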
Whoops, looks like there is a big delay between health checks - all are set to ping /health on 8000 and all are now coming back healthy!
Unfortunately, HTTPS requests are still coming back 502, with the ELB not receiving a response from the collector. Going direct to the collector via IP, and HTTP via the subdomain through the ELB, both still work. This is odd…
Do you know if there are any collector webserver logs or something else that can be looked at to see the inbound request (if any)?
I knew it was going to be something stupid. 3 of us were looking at it and missed the fact that we set up a target group using HTTP and 8000, then set up another one using HTTPS and 8000… We then created a listener on 80 that points to HTTP:8000, and a listener on 443 that points to HTTPS:8000…
Resolution: Create one target group that uses HTTP and port 8000, but point both listeners at the same target group so that the target communicates via HTTP on port 8000.
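In case it helps anyone scripting this rather than clicking through the console, here is a rough sketch of that layout as plain Python dicts. The assumption (not from this thread) is that they would be passed as keyword arguments to boto3's `elbv2` client calls `create_target_group` and `create_listener`; the function names, the target group name and all ARNs are placeholders.

```python
def target_group_params(vpc_id):
    """Settings for the single target group: plain HTTP to the collector on 8000."""
    return {
        "Name": "collector-http-8000",  # placeholder name
        "Protocol": "HTTP",             # ELB -> collector is unencrypted
        "Port": 8000,
        "VpcId": vpc_id,
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPort": "8000",
        "HealthCheckPath": "/health",
    }


def listener_params(lb_arn, target_group_arn, certificate_arn):
    """Settings for both listeners; each forwards to the SAME target group."""
    forward = [{"Type": "forward", "TargetGroupArn": target_group_arn}]
    http_listener = {
        "LoadBalancerArn": lb_arn,
        "Protocol": "HTTP",
        "Port": 80,
        "DefaultActions": forward,
    }
    https_listener = {
        "LoadBalancerArn": lb_arn,
        "Protocol": "HTTPS",  # SSL is terminated here, at the ELB
        "Port": 443,
        "Certificates": [{"CertificateArn": certificate_arn}],
        "DefaultActions": forward,
    }
    return [http_listener, https_listener]
```

The point to notice is that only the listener on 443 mentions HTTPS; the target group protocol stays HTTP for both.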
It was staring back at us the entire time. It all looked legit, but specifying a target group using HTTPS means that the requests the ELB sends to the collector will be encrypted, while the collector is only serving plain HTTP on port 8000.
Really appreciate the help and eyes on this one @fingerco and apologies for wasting your time!!