This is a warning just for the user agent (it can be ignored) so this shouldn’t impact anything around IP address. I would look at the raw collector payloads (in PubSub) which should have an IP address in the header - it’s possible that GKE may be manipulating this header but in general it’s quite unusual to have IP address as null.
Ok - so that rules out the enrichment process doing anything odd.
At the collector level there are really only two cases where ‘unknown’ is returned for the IP address:
If you have SP-Anonymous enabled (and it is being sent in the header) or
Snowplow can’t find a header (e.g., Remote-Address, X-Forwarded-For) to extract the IP address so it will return ‘unknown’
If it’s the first option then you probably have that enabled for a reason, and if it’s the second unfortunately there’s not too much you can do if this information hasn’t been sent in the headers. I’d be tempted to check sending from the JS tracker straight to an external load balancer (rather than GKE directly) to see if you get additional headers that might be getting removed.
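To make the two cases above concrete, here is a rough sketch of the decision in Python. It is not the collector's actual code: the function name `resolve_ip` and the header lookup order are assumptions for illustration, and only the header names (SP-Anonymous, X-Forwarded-For, Remote-Address) come from this thread.

```python
from typing import Mapping

# Headers that may carry the client IP, as named in this thread.
# The order in which they are checked is an assumption.
IP_HEADERS = ("X-Forwarded-For", "Remote-Address")


def resolve_ip(headers: Mapping[str, str]) -> str:
    """Hypothetical helper: return the client IP or 'unknown'."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Case 1: anonymous tracking is requested via the SP-Anonymous header.
    if "sp-anonymous" in normalized:
        return "unknown"

    # Case 2: look for a header that carries the IP address.
    for name in IP_HEADERS:
        value = normalized.get(name.lower(), "").strip()
        if value:
            # X-Forwarded-For can hold a comma-separated chain of
            # addresses; the first entry is the original client.
            return value.split(",")[0].strip()

    # No usable header was found.
    return "unknown"
```

With this sketch, a request carrying SP-Anonymous returns ‘unknown’ even when X-Forwarded-For is present, which is the behaviour worth ruling in or out first.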
I was going through the GCP external load balancer logs, and this is a sample POST request to the collector:
"userAgent": "Mozilla/5.0 (iPhone; CPU iPhone OS 15_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Mobile/15E148 Safari/604.1",
It looks like remoteIp is part of the header information sent to the collector. I also queried all the LB logs to check whether remoteIp is empty for any POST request, but it has a value in every one of them.
So basically all events that reach the collector have a remoteIp, but when the events come out of the collector, some of them have user_ipaddress = “unknown”.
I hope my analysis is correct, not sure if i am missing any step here.
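For anyone repeating this check, a minimal sketch of the log scan might look like the following. It assumes the load balancer logs have been exported as one JSON object per line, with the request details under an `httpRequest` key holding `requestMethod` and `remoteIp`; the function name `posts_missing_remote_ip` is made up for this example.

```python
import json
from typing import Iterable, List


def posts_missing_remote_ip(log_lines: Iterable[str]) -> List[dict]:
    """Return the POST log entries whose remoteIp is missing or empty."""
    missing = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export

        entry = json.loads(line)
        # Assumed layout: request fields live under "httpRequest".
        request = entry.get("httpRequest", {})

        if request.get("requestMethod") != "POST":
            continue  # only POST requests to the collector matter here
        if not request.get("remoteIp"):
            missing.append(entry)
    return missing
```

An empty result means every POST reached the collector with a remoteIp, so any ‘unknown’ value has to be introduced at the collector or later.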
Thanks mike and paulBoocock for helping me pinpoint the issue. As you both mentioned, the issue was with SP-Anonymous tracking.
The tracker was sending a GET request to the collector on the first visit to the webpage, and the collector was able to get the IP address and other info. On subsequent visits, however, the tracker was sending a POST request with the SP-Anonymous* header, which, as you mentioned, makes the collector set user_ipaddress to unknown.
This was the reason we were seeing events with and without user_ipaddress in the good_events table.
Thanks a lot for helping me out, I really appreciate it.