Is Google hiding referer data?

28 November 2011 by Bob North

When someone visits your website using a link from another site, you can usually tell where they came from, and if they came from a search engine, this will usually tell you what search words they typed in to find your site. This is, clearly, really useful information for your marketing.
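As a rough illustration of how those search words are recovered, here is a minimal sketch that pulls them out of a Referer value. It assumes the search engine carries the phrase in a `q` query parameter (as Google's result URLs have done); the function name `search_terms_from_referer` and the sample URL are made up for this example.

```python
from urllib.parse import urlparse, parse_qs


def search_terms_from_referer(referer):
    """Return the search phrase found in a referer URL, or None if there isn't one."""
    if not referer:
        return None
    query = urlparse(referer).query
    # Assumption: the phrase lives in the "q" parameter. parse_qs decodes
    # "+" and %-escapes, and drops parameters with blank values.
    terms = parse_qs(query).get("q")
    return terms[0] if terms else None


# Hypothetical referer, as a visitor arriving from a search might send it.
print(search_terms_from_referer("http://www.google.com/search?q=blue+widgets&hl=en"))
```

When the search terms have been stripped out, the same function simply returns `None`, which is how the missing data tends to show up in your stats.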

Recently, however, Google have been stripping the search terms out of the information they send over in the Referer (sic) header.

This was first brought to my attention by @charlottebritto and was later covered in The Register, but two questions remained:

1. What exactly is going on?
2. How can we get our referer data back?

So what's going on? Not all traffic from Google is being stripped of this data. It seems to be mostly in the US (i.e. people using google.com, rather than one of the local Google domains elsewhere in the world), and even then it's not affecting all traffic, just a large minority of it.

There are, of course, various conspiracy theories that Google are deliberately hiding this information in order to further their corporate aims. This doesn't really hold water, though: for them to succeed they need their customers to succeed as well, and this information is helpful to everyone. There's just no advantage in hiding it.

It turns out that Google have recently changed to providing their US customers with an encrypted connection to their search engine by default. So when you search on Google.com, you are working over an https connection.

This matters because when you then follow a link to a site, the chances are that the site will not be https, but plain old http. Browsers are not supposed to 'leak' information from secure connections to insecure ones, which sounds quite reasonable, but this also includes referer data.

So, if someone searches on an https version of Google, and follows a link to an http site, regardless of what Google do, the browser should not pass the referer string on to the http site. That they often do is a failing in the browsers, which could (and should) be fixed at any time.
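The principle can be written down as a small rule. This is only a sketch of the "don't leak from secure to insecure" idea described above, not a copy of any particular browser's logic, and the function name `should_send_referer` and the URLs are invented for illustration.

```python
from urllib.parse import urlparse


def should_send_referer(from_url, to_url):
    """Return False only when going from an https page to a non-https page."""
    from_scheme = urlparse(from_url).scheme
    to_scheme = urlparse(to_url).scheme
    # The one case to block: the visitor is leaving a secure page for an insecure one.
    leaving_secure_for_insecure = from_scheme == "https" and to_scheme != "https"
    return not leaving_secure_for_insecure


# Hypothetical example: an https search page linking to a plain http site.
print(should_send_referer("https://www.google.com/search?q=widgets", "http://www.example.com/"))
```

Note that under this rule an https search page linking to an https site still passes the referer on, which is what the fix further down relies on.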

It looks like Google, aware of this principle, are simply not passing the referer data in https > http link situations, because they feel it is wrong in principle (even if the browsers aren't acting correctly).

So how do we get our referer data back? That seems quite simple: convert your site over to being https rather than http. A few years ago that idea would have been met with concern that it would slow down the server, but recent research (not least by Google themselves) suggests there will be no such slowdown: Google say that when they switched over to https themselves, they didn't need to add any extra servers.

What you will need to be aware of is that for each https certificate you need a separate IP address. That won't be an issue if you are only running one site on your server, but if you are converting multiple sites on the same server over to https then you'll need to have extra IP addresses allocated.

Naturally, if lots of sites do this, it will exacerbate the global shortage of traditional IP addresses, and might hasten the switchover to IPv6 addresses.

 
