This release adds new features to the Scala Stream Collector that will help users struggling with overzealous ad-blocking and the implications of WebKit’s recent ITP changes:
Setting first-party cookies server-side against multiple domains on the same collector
Configuring Secure, HttpOnly and SameSite attributes for the cookie
Custom request paths
It also includes other small upgrades to the Scala Stream Collector and EmrEtlRunner.
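To make the cookie attributes above concrete, here is a minimal sketch of how a Set-Cookie header value with Secure, HttpOnly and SameSite might be assembled. This is not the collector's actual code: CookieHeaderSketch, CookieAttrs and setCookieValue are hypothetical names used only for illustration.

```scala
// Hypothetical helper, not taken from the collector's source.
object CookieHeaderSketch {
  final case class CookieAttrs(
    secure: Boolean,
    httpOnly: Boolean,
    sameSite: Option[String] // "None", "Lax" or "Strict"; left out of the header when empty
  )

  // Builds the value of a Set-Cookie header from the cookie and its attributes.
  def setCookieValue(name: String, value: String, domain: String, attrs: CookieAttrs): String = {
    val parts = Seq(
      Some(s"$name=$value"),
      Some(s"Domain=$domain"),
      if (attrs.secure) Some("Secure") else None,
      if (attrs.httpOnly) Some("HttpOnly") else None,
      attrs.sameSite.map(s => s"SameSite=$s")
    )
    parts.flatten.mkString("; ")
  }
}
```

Note that Chrome 80 and later only accept SameSite=None when the cookie is also marked Secure, so in practice the two attributes are likely to be set together.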
Hi, just FYI… If someone decides to enable the SameSite settings as specced by Google (SameSite=None) for third-party cookies, those will not work on older versions of iOS (check here).
Maybe it would be wise to add this as relevant info somewhere, as simply applying those settings will result in the loss of existing third-party cookies on iOS devices prior to iOS 13.
I was thinking of potentially contributing my changes to the collector, which use a regex on the User-Agent string to determine whether the configured SameSite flag should be set.
Now, I know… It’s a trap, but I have no idea how to make sure that this new feature doesn’t cause degradation in all the third-party cookies people have gathered over time. With the current state of affairs, you’re either going to lose people with older iOS versions (and looking at the data, there’s currently a lot less traffic on 13 than on earlier versions), or you’ll lose all Chrome cookies.
One might argue that we don’t care about third-party cookies on iOS, but there’s still some percentage of cases where those cookies actually land on a device.
What do you think about a User-Agent-based regex to determine when to set those SameSite flags during a transitional period? I wouldn’t want to submit a PR unless you guys feel like the approach is OK.
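For reference, the idea could be sketched roughly as follows. This is not the change I'd actually submit, just an illustration: SameSiteByUserAgent and sameSiteFor are hypothetical names, the regex only targets iPhone/iPad/iPod user agents, and the "major version below 13" cut-off is an assumption taken from this discussion.

```scala
// Hypothetical sketch: drop the SameSite attribute for iOS versions before 13.
object SameSiteByUserAgent {
  // Captures the major iOS version from user agents such as
  // "Mozilla/5.0 (iPhone; CPU iPhone OS 12_4 like Mac OS X) AppleWebKit/..."
  private val IosMajor = """\(iP.+; CPU .*OS (\d{1,3})[_\d]*.*\) AppleWebKit/""".r.unanchored

  // Returns the SameSite value to send, or None to omit the attribute.
  def sameSiteFor(userAgent: String, configured: Option[String]): Option[String] =
    userAgent match {
      case IosMajor(major) if major.toInt < 13 => None // assumed incompatible client
      case _                                   => configured
    }
}
```

Any user agent that doesn't match the pattern just gets the configured value, so an unrecognised or obfuscated string behaves exactly as if no sniffing were in place.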
Hi @jankoulaga, I think there are two distinct points:
Is upgrading to 0.16.0 going to cause degradation in the data?
Is there a better way to set the SameSite attribute?
I’ll address them in turn.
Is upgrading to 0.16.0 going to cause degradation in the data?
You are correct that setting SameSite=None might lead to the loss of third-party cookies on iOS devices before iOS 13. As the bug report you linked to suggests, the attribute might be treated as SameSite=Strict.
However, let’s look at the alternative. If you do not upgrade, and stay on 0.15.0, you will have no control over the SameSite attribute. On Chrome 80 and later, a missing SameSite attribute will be interpreted as SameSite=Lax by default, which again will result in the loss of third-party cookies.
So either way, because of the current state of the browser world, there’s this tradeoff, which is independent from whether users upgrade to 0.16.0 or not. They’ll have to decide which is likely to impact them more: no third-party cookies in iOS <13 or no third-party cookies on Chrome 80+.
Is there a better way to set the SameSite attribute?
We always welcome PRs from the community! But the approach with user agent sniffing is likely to be very hard. Almost all browsers engage in some sort of user agent obfuscation, so that is a very unreliable way of detecting the browser. And even if we can come up with something that works in the current moment, there are no guarantees that browsers won’t make arbitrary breaking changes.
In the long term, a better option that’s worth exploring might be an update to the Snowplow JavaScript tracker, to identify the browser in the request.
As I said, it’s a trap, and there’s no easy way to make it work deterministically on the collector. However, it’s the collector that sets that cookie, so it’s the collector’s concern to set it correctly.
I don’t think client side code can fix this problem because:
a) any third-party library adds additional overhead to the tracker size
b) we’d probably need to update the tracker’s dependencies on those UA parsers regularly, and the release cycle for the JS tracker is currently very long
c) we’d be duplicating the user agent info, which is sent as a header anyway and can be processed more efficiently on the server side than inside the browser.
I guess the only thing we can do is wait and see how fast people upgrade their OS, and we have until Feb 2020 to guesstimate what will happen.
This is interesting. @dilyan am I reading you correctly in that the idea here is to:
Instrument a JS tracker method that detects the browser (and version?)
Add that information to request headers
Have the collector’s cookie setting behaviour act according to this value
If the JS tracker doesn’t currently have a method to detect browser & version (I haven’t had a chance to check), then a quick google tells me that there are options: Bowser and Platform.js can both do it. We’d need to see what the impact on tracker size is, and whether there’s a more lightweight option that’s well maintained.
@Colm, yes that was my idea. If there is a reliable way to get the browser info via JS, the tracker can send it as a custom header. Then it would be much easier to implement @jankoulaga’s idea for the collector setting SameSite based on the browser.
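On the collector side, that could look something like the sketch below. Everything here is an assumption for the sake of illustration: the header name "x-browser-hint", the "name/majorVersion" value format (e.g. "ios/12") and the BrowserHintSketch object don't exist in the tracker or the collector today.

```scala
// Hypothetical sketch: read a tracker-supplied hint and pick the SameSite value from it.
object BrowserHintSketch {
  // Assumed header name; header lookup is case-insensitive.
  val HintHeader = "x-browser-hint"

  final case class Browser(name: String, major: Int)

  // Parses a value like "ios/12" into Browser("ios", 12); anything else yields None.
  def parseHint(headers: Map[String, String]): Option[Browser] =
    headers.collectFirst {
      case (k, v) if k.equalsIgnoreCase(HintHeader) => v
    }.flatMap { v =>
      v.split("/", 2) match {
        case Array(name, major)
            if major.nonEmpty && major.length <= 4 && major.forall(_.isDigit) =>
          Some(Browser(name.trim.toLowerCase, major.toInt))
        case _ => None
      }
    }

  // Omits SameSite for iOS before 13 (assumed rule); otherwise uses the configured value.
  def sameSiteFor(headers: Map[String, String], configured: Option[String]): Option[String] =
    parseHint(headers) match {
      case Some(Browser("ios", major)) if major < 13 => None
      case _                                         => configured
    }
}
```

If the header is missing or malformed, the collector would fall back to the configured value, so older trackers that don't send the hint would be unaffected.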
On the JS side, there are the libraries I’ve linked, but they’re sitting at about 4-5kb, which is a ~5% increase in tracker size. For something so small, ideally we’d have a lighter-weight option.
It’s possible to instrument the code manually, but that approach doesn’t feel terribly reliable or maintainable. That’s only with about 10 mins of research though, so I’d be optimistic about there being some compromise or alternative approach.
Come to think of it, what I didn’t do is check whether the tracker can already retrieve this information without adding a new dependent lib. (I can’t remember off the top of my head, and like I said, it might be a little while before I can dig into it.)
@jankoulaga On further thought, why would one worry about third-party cookies on Safari, where they are blocked by default anyway? Are you specifically thinking of users who have explicitly enabled them?
I think in the long run Chrome users on the default settings will vastly outnumber Safari users who have changed the defaults.