Starting from February 25th, we noticed a sharp increase in unique domain_userids generated by our JavaScript tracker v2.14.0.
It appears that the increase comes from events with “googleweblight” in their useragent string.
I haven’t found any fresh information about how Web Light handles cookies and whether it has changed its behaviour.
Have any of you guys noticed the same? How would you address this? Thinking about opting out of it following their instructions here: https://support.google.com/webmasters/answer/6211428
About 5% of our events are sent from Web Light users so it does have an impact.
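For anyone who wants to check their own data, here is a rough sketch of how the Web Light share could be measured on a sample of events. It assumes each event is a plain object with the `useragent` and `domain_userid` fields; the function names `isWebLight` and `summarise` are made up for illustration and are not part of any Snowplow library.

```javascript
// Returns true when the user agent string identifies Google Web Light.
function isWebLight(useragent) {
  return /googleweblight/i.test(useragent || '');
}

// Summarise a sample of events: the share of Web Light events and the
// number of unique domain_userids seen with and without Web Light.
function summarise(events) {
  const light = events.filter(e => isWebLight(e.useragent));
  const other = events.filter(e => !isWebLight(e.useragent));
  const uniq = list => new Set(list.map(e => e.domain_userid)).size;
  return {
    webLightShare: events.length ? light.length / events.length : 0,
    webLightDuids: uniq(light),
    otherDuids: uniq(other)
  };
}
```

If Web Light is dropping cookies, you would expect `webLightDuids` to be close to the number of Web Light events, while `otherDuids` stays well below the number of other events.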
I haven’t come across this behaviour before, so thanks for raising! We’ll need to think about whether the tracker needs a feature to handle this case. I’ve created an issue for it.
A cursory glance at the docs for Web Light suggests that it doesn’t support cookies. So in the short term, the simplest option is probably to opt out. If you are averse to that, I think there’s a slightly awkward workaround.
You could instrument some logic to get the domain userid, and pass it via the querystring as follows:
if [querystring parameter exists]
then [append to next querystring]
else [get duid and pass to next querystring]
Then you could track this value as a custom context - using the querystring param if it exists, or the duid from the above linked callback method if it doesn’t.
I’m not sure, however, that this will cover identifying users across sessions. That’s a topic which requires some thought I think.
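To make the workaround above a bit more concrete, here is a minimal sketch. It is not tested against Web Light itself. The parameter name `sp_duid`, the helper names and the schema URI are all placeholders I made up, and the duid is passed in as an argument so the logic stays independent of how you read it from the tracker.

```javascript
// Hypothetical query string parameter name; not part of the Snowplow tracker.
const DUID_PARAM = 'sp_duid';

// Use the duid carried in the current URL if present; otherwise fall back
// to the one supplied by the tracker (e.g. read inside a tracker callback).
function resolveDuid(currentUrl, trackerDuid) {
  const fromQuery = new URL(currentUrl).searchParams.get(DUID_PARAM);
  return fromQuery || trackerDuid;
}

// Append the resolved duid to an outgoing link so the next page can read it.
function decorateLink(href, currentUrl, duid) {
  const url = new URL(href, currentUrl);
  url.searchParams.set(DUID_PARAM, duid);
  return url.toString();
}

// Build a custom context carrying the resolved duid.
// The schema URI is a placeholder; you would need to define your own schema.
function duidContext(duid) {
  return {
    schema: 'iglu:com.example/carried_duid/jsonschema/1-0-0',
    data: { duid: duid }
  };
}
```

On each page you would call `resolveDuid` with `window.location.href` and the tracker's duid, decorate internal links with `decorateLink`, and attach the result of `duidContext` to the events you track. Whether Web Light preserves the query string on rewritten links is something I have not verified.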
By the way, you mentioned that the increase started on February 25th with JavaScript tracker v2.14.0. Are you sure about the version? I don’t believe 2.14 had been released by Feb 25th.
Indeed, on the 25th, we were not yet using 2.14. We thought that the reason we were getting so many more duids was that Chrome had started to enforce its new SameSite and Secure cookie policy.
So we updated to 2.14 yesterday, but it did not help. I then investigated further and found out about Google Web Light.
Ah, OK, got it. I only asked because we were naturally alarmed at the possibility that it might be related to changes in the new version! Thanks for explaining.
Out of curiosity, how did you implement tracking on the site you are asking about? When I tried to load a few sites through Web Light that use Snowplow, Google, Adobe, Chartbeat and a few others, I was not able to see any analytics network requests being generated at all.
I have not noticed this behavior in our data yet but I will look into it.
@mbondarenko sorry if I’m misreading your comment but I’m curious what Google site you’re alluding to that uses Snowplow? I find it interesting to explore the different ways that others have instrumented their sites with custom events/contexts, etc.
@Boris thanks for the heads-up. I checked on our end and we have also started receiving traffic from this browser. It is less than 1% of our traffic for now, but if it starts increasing it can quickly become a problem for overall stats, as the unique domain_userid count will grow disproportionately to the traffic. I agree that opting out would be the easiest solution, unless one expects a significant performance degradation and traffic loss for this new client as a result.
The sites I tested were not Google’s sites. There are various products that use Snowplow (none of them are owned by Google as far as I know). Kewee and Mather Analytics are a few examples.
As for a specific example I looked at, one of them was LA Times, i.e.:
Interesting. When I tried the test suggested by the Web Light documentation (i.e. see the link above for LA Times), I did not see any requests being sent out. When I look at our own Snowplow data, I do see some visits from Web Light (right now less than 0.0001%). I am just curious how to test it, as adding https://googleweblight.com/i?u= in front of the URL doesn’t seem to work well for checking the actual requests being sent out.