Google can’t track every single click of your web surfing. Only most of them.

Posted by Matthias Gelbmann on 27 February 2012 in News, Blogger, DoubleClick, Google +1, Google AdSense, Google Analytics, Google Servers

Summary:

If you don’t trust Google, you may want to avoid it while surfing the web. Good luck to you.

If you are anything like me, you love a lot of what Google offers. As soon as I fire up my Google Chrome browser, I head over to Google Search, Google Maps, Gmail, Google Calendar, Google Docs or Picasa. And whenever I stop wasting my time on Google+, I continue doing so on YouTube. These services are mostly free and reliable, so why should I think twice about using them?

There is a reason. Google most likely has more data about people in its databases than any other organization in the world. More than the former Soviet KGB could have hoped to get in its wildest dreams. If you have teenage kids with Android phones, then Google almost certainly knows quite a few things about them that you don’t. Google may know where they are at any moment via Google Latitude, who all their friends and acquaintances are via their synchronized contact list, what they did last night via their uploaded pictures, and what they say about you via Google Talk.

Now, one might say: if you are worried about this, simply stop using these Google services and you are off the hook.

Really?

If you don’t go near the Internet, then that’s probably the case. But if you happen to live in the 21st century, Google will still collect data from your website visits via services it provides for webmasters. We collect statistics about a number of such services for our surveys. These services are:

| Service | Percentage of websites using it |
|---|---|
| Google Analytics | 55.6% |
| AdSense | 18.3% |
| DoubleClick | 1.6% |
| Teracent | < 0.1% |
| Google Web Servers | 1.0% |
| Blogger | 0.9% |
| Google Sites | < 0.1% |
| Google +1 (incl. the old Google Buzz) | 11.3% |
| Google Library API | soon to be published |

Taking these figures, we investigated how many sites are not using any of these services. We had to take overlaps into account: some sites use both Analytics and AdSense, for example, so we cannot simply add up the usage figures. This is what we found out: the percentage of websites that use any of these Google services is 63.5%. In other words:

Only 36.5% of the web is Google-free.
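To illustrate why the usage figures cannot simply be added, here is a minimal Python sketch. The site names and the assignment of services to sites are made up for illustration and are not survey data; only the idea of counting each site once comes from the article. For comparison, adding the numeric figures from the table above gives 88.7%, well above the 63.5% we found once overlaps are removed.

```python
# Hypothetical toy data: ten made-up sites and the Google services they use.
# These are NOT survey figures; they only illustrate the overlap problem.
all_sites = {f"{letter}.example" for letter in "abcdefghij"}

usage = {
    "Google Analytics": {"a.example", "b.example", "c.example", "d.example", "e.example"},
    "AdSense": {"b.example", "c.example"},
    "Google +1": {"c.example", "f.example"},
}

# Naive approach: add the per-service counts. Sites using several
# services are counted more than once.
naive_count = sum(len(sites) for sites in usage.values())

# Correct approach: take the union, so every site is counted once.
using_any = set().union(*usage.values())
google_free = all_sites - using_any

print(f"Naive sum:   {naive_count / len(all_sites):.0%}")
print(f"Using any:   {len(using_any) / len(all_sites):.0%}")
print(f"Google-free: {len(google_free) / len(all_sites):.0%}")
```

In this toy example the naive sum overstates the share of sites that use at least one service, because sites such as the hypothetical `c.example` appear in several sets.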

This is a very conservative estimate, because there are several popular Google services that we don’t monitor: embedded YouTube videos, embedded Google Maps, Google Site Search, Google Checkout and Feedburner are some examples. However, the services that we left out tend to be used on individual web pages only, whereas the services from our surveys are typically used on all or on most pages of a site. Therefore, the percentage of web pages that are Google-free is almost certainly even lower than 36.5%, but probably not much lower.

What does that mean? Suppose somebody wants to stay away from Google out of concern for privacy or for any other reason. Suppose that person does some research on the web and visits any 5 websites that are not owned by Google. Then the chance that none of these sites uses any Google service, so that no traces are left on any Google server, is 0.365 to the power of 5, or about 0.65%.

The probability of providing data to Google when visiting 5 random websites, without actively using any Google service, is 99.35%.
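The arithmetic behind these two figures can be reproduced with a short Python sketch. It assumes that the 5 sites are picked independently and that each one is Google-free with probability 36.5%; the function names are hypothetical and chosen only for this illustration.

```python
def p_all_google_free(free_share: float, visits: int) -> float:
    """Chance that every one of `visits` independently chosen sites is Google-free."""
    return free_share ** visits


def p_any_google(free_share: float, visits: int) -> float:
    """Chance that at least one of the visited sites uses a Google service."""
    return 1 - p_all_google_free(free_share, visits)


GOOGLE_FREE_SHARE = 0.365  # 36.5% of websites, from the survey figure above
VISITS = 5

print(f"No Google contact:   {p_all_google_free(GOOGLE_FREE_SHARE, VISITS):.2%}")
print(f"Some Google contact: {p_any_google(GOOGLE_FREE_SHARE, VISITS):.2%}")
```

Real browsing is not a random draw, so the independence assumption is a simplification: popular sites may be more or less likely to use these services than the average site.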

There are a few things one could discuss concerning that figure; I will try to address some of them:

  • The various Google services run on separate servers, so it is not possible to combine all these data.

    While it is technically not possible to have something like a super-cookie covering all Google properties and thus readily identifying a visitor along the way, techniques such as browser fingerprinting, combined with all the other data a website visitor leaves behind, can achieve pretty much the same. I think of this like a jigsaw puzzle, where Google tries to bring all these little data points together. They will never find and properly locate all the pieces, but it’s sufficient to have plenty of them in place in order to recognize the picture. I’m quite confident that the smart guys at Google know a thing or two about digging into large amounts of data.

     

  • You can turn off JavaScript and use Ad Blockers, so that you are not affected.

    Disabling some (but not all) of the Google data collection is possible. Google itself provides tools such as the Analytics Opt-out Browser Add-on, and there are any number of third-party tools. However, selecting, configuring and updating these tools on several browsing platforms such as PCs, smartphones and tablets is more effort than most people and most companies’ IT administrators are willing to spend.

     

  • Who cares?

    Some people do, others don’t. I personally must say that I trust Google more than I trust any government in the world, including my own, but that’s a low bar. Call me naive, but I don’t believe terrible misuse of the data is planned at the Googleplex at this moment. I think that Google knows best that one can lose people’s confidence only once, and as soon as a Google ad is generally perceived as a severe privacy issue, that would pretty much be the end of the company.

    But that doesn’t mean that things can’t go wrong at some stage. Certainly, seeing all those mountains of data in one place does leave a nervous feeling. Mistakes are made, even at Google, as has been known to happen again and again. There could be data leaks, or outright criminal conduct, or a change of Google’s policies at any time. And while on the subject of governments of the world, all that data being available may well, under certain circumstances, give them more information about me than I want them to have.

Whatever your personal conclusions are, I hope that this little investigation will contribute to making data collectors, surfers, webmasters and lawmakers alike aware of the magnitude of the problem. We have reached a critical point where it’s next to impossible for an individual to decide where and when he or she wants to give away some data to the biggest data collector. It all happens with or without you.

_________________
Please note that all trends and figures mentioned in this article are valid at the time of writing. Our surveys are updated frequently, and these trends and figures are likely to change over time.

4 comments

Andrew Schwartzmeyer on 28 February 2012

And then you also have people who use Google Voice (me included): which gives them all of your SMS and phone call communication. They give you the option of recording calls, so they certainly have the ability to as well. 

Stefan on 28 February 2012

> I personally must say that I trust Google more than I trust any government in the world, including my own

Don’t forget that Google has to obey US legislation, and the legislation of the other countries it operates in. So, even if you trust Google, do you trust the governments that can make Google comply with user data requests? I don’t 🙂

http://www.google.com/transparencyreport/governmentrequests/userdata/

Jamie S. on 28 February 2012

Is there a list of the domains in that chart given in the article?

Reply by author Matthias Gelbmann on 28 February 2012

@Jamie: We have that list, it is the basis of our surveys, but you can’t download it, if that’s what you mean.

You can, however, check the technology usage of any site via our site info page: http://w3techs.com/sites
