We are proud to announce a new release of Snowplow:
This release is one of the most community-driven releases we’ve ever done; a huge thanks to everyone involved:
If you’re looking to contribute, we’ve significantly revamped our contributing guide.
Apart from containing great community contributions, this release revolves around new capabilities for the real-time pipeline!
Hi @BenFradet,
thanks for releasing yet another great increment to the real-time pipeline.
However, I would like to challenge the decision to truncate the X-Forwarded-For header field. We rely heavily on the whole list of IP addresses or domain names present in that header field.
- Corporate and private users can be proxied from different networks but still have the same internal IP (X-Forwarded-For: 192.168.123.45,cache.acme.com,gw.acme.com,cache.acme-s-isp.com vs X-Forwarded-For: 192.168.123.45,home-isp-cache.foo.com)
- Traffic compression servers like Opera Mini can put different kinds of values into the header field. For example, you can see 127.0.0.1 as the first entry.
I think it should be up to the Snowplow stack user to decide what to do with the header, perhaps by using the header enrichment, which we contributed a few years back, to read selectively from the X-Forwarded-For header and truncate it to their liking.
What do you think?
Hey @christoph-buente,
The only truncation we do surfaces in user_ipaddress; the headers are left intact. As such, you can still use the header enrichment if you want to keep the whole chain of IP addresses.
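To make the distinction concrete, here is a minimal Python sketch. It is not the collector's or the enrichment's actual code: it assumes that "truncation" means keeping only the first entry of the chain, and the names `split_forwarded_for` and `first_hop` are hypothetical. It reuses the two header values from the question above.

```python
from typing import Dict, List, Optional


def split_forwarded_for(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its individual hops."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def first_hop(headers: Dict[str, str]) -> Optional[str]:
    """Return the first hop only (assumed stand-in for the truncated value).

    The headers dict itself is never modified.
    """
    value = headers.get("X-Forwarded-For")
    if value is None:
        return None
    chain = split_forwarded_for(value)
    return chain[0] if chain else None


corporate = {
    "X-Forwarded-For": "192.168.123.45,cache.acme.com,gw.acme.com,cache.acme-s-isp.com"
}
home = {"X-Forwarded-For": "192.168.123.45,home-isp-cache.foo.com"}

# Both requests collapse to the same single address...
print(first_hop(corporate), first_hop(home))

# ...but the untouched headers still distinguish the two proxy chains.
print(split_forwarded_for(corporate["X-Forwarded-For"]))
print(split_forwarded_for(home["X-Forwarded-For"]))
```

The single-address field loses the difference between the two networks, while anything that reads the full header downstream can still recover it.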
Thanks @BenFradet for clarifying. I was under the impression that the collector was changing the header field and that there was no longer any way to access the original headers during the enrichment phase.
Then please ignore my concerns.
No worries, sorry if the explanation wasn’t clear enough.