I'll be speaking at IS' Internetix 2010 conference and this was originally posted there. I was asked to put a blog post together as a teaser for my talk.
Privacy is dead, or so the common wisdom says. But that can't be true. Centuries of philosophy tell us that it's vital for our development and existence as human beings. As a trite example, try to imagine having a truly intimate conversation with your partner while knowing someone else is listening. But that's not what I want to talk about here. If you want to have that conversation, start with this paper.
What I do want to talk about is how much privacy invasion we allow in our daily online activities. But first let's talk about Google. Google is a hugely successful corporation. What's more, people *think* it is a hugely successful corporation, and so attempt to copy its methods and business model. A quick look on Amazon for business books about Google shows 1660 books, while a search for the same on Yahoo shows 635. If that's not enough for you, then try to imagine a way of monetising online content other than through advertising (unless you're Rupert Murdoch). Google is so exemplary of the online business model that the next best example, Facebook, provides little meaningful differentiation when it comes to privacy invasion. So what is this miraculous, often copied business model? Wholesale collection, correlation and aggregation of personal data, used to better target ads.
You don't have to think very hard to realise that Google's services aren't free. Sure, they don't cost you money, but Google needs to make money. They do that by collecting data about you, and using it to better target advertising at you. This doesn't worry most people, as long as that data isn't handed over to creepy government agencies or personal stalkers, or individually perused by Google employees. While all of those things are possible, and warrant enough worry in themselves, the truth is you don't really know what data is being collected, where it's being exposed, in what form and to whom. Take Acxiom, a company whose sole purpose, until recently, was to buy data about people and sell it back to marketers. How much do they know about you, who are they selling it to and with what controls?
So how does the average website leverage this world of advertising-based monetary rewards? It just includes a few pieces of code in its pages. This code can do all sorts of things, from tracking you around the web to build a behavioural profile, to interrogating your browser and computer for information, to just keeping a record of who and where you are. The problem is that sliding in these third-party web sources is easy to do, and there are many rewards to be had, both monetary and functional. The former is the primary driver, the filthy lucre of ad-click monetisation, while the latter gives you all sorts of ways to increase the loot (think analytics). Let's take an example site: memeburn.com. I've chosen it at random, not to single them out, because everyone is doing it. To view the kif content at memeburn, your browser only needs to communicate with the http://memeburn.com/ webserver. However, when we hit the front page, before loading anything fancy like JavaScript, content is pulled from two other domains: afrigator.com (from the unsubtly named /track/ directory) and myscoop.co.za. After loading JavaScript, content is pulled from 34 domains in total (6 appear to belong to memeburn, 8 belong to Google, 6 to Facebook and 6 to Twitter with 10 others distributed among others). By way of comparison, a load of techcrunch.com hits 39 domains, so this certainly isn't something only memeburn is engaging in. Just by visiting the site, before you've even moved the mouse or read an article, your browser has contacted, and been poked, prodded and queried by, dozens of services, none of which actually present you with the content you're there for, and with which, for the most part, neither you nor the site has any contractual relationship. Sure, their privacy policies will state that they only give your information to business partners, aka anyone who will give them money for it.
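If you'd like to repeat this kind of count yourself, here is a rough sketch in Python. It assumes you have already captured the list of URLs your browser requested for a page (for example, exported from your browser's network panel); the function name `third_party_hosts` and the example paths are made up for illustration, and only the domain names come from the memeburn example above. It treats a host as first-party if it is the site's own hostname or a subdomain of it, which is a simplification: a site's own content served from a differently named domain will be counted as third-party.

```python
from collections import Counter
from urllib.parse import urlsplit


def third_party_hosts(page_url, request_urls):
    """Count requests per host that is not the page's own host or a subdomain of it."""
    first_party = urlsplit(page_url).hostname or ""
    # Treat www.example.com and example.com as the same site.
    if first_party.startswith("www."):
        first_party = first_party[4:]

    counts = Counter()
    for url in request_urls:
        host = urlsplit(url).hostname
        if host is None:
            # Relative URLs have no host, so they are first-party by definition.
            continue
        if host == first_party or host.endswith("." + first_party):
            continue
        counts[host] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical request log; the paths are invented for this example.
    requests_seen = [
        "http://memeburn.com/",
        "http://memeburn.com/style.css",
        "http://afrigator.com/track/pixel.gif",
        "http://myscoop.co.za/badge.js",
    ]
    for host, hits in sorted(third_party_hosts("http://memeburn.com/", requests_seen).items()):
        print(host, hits)
```

The length of the returned counter is the number of distinct third-party hosts, which is the kind of figure quoted above.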
As we move up the stack and start using the web applications, the number of services and amount of information collected only increases. Come to the talk to see how something as simple as your search data speaks volumes about you. Now multiply that by every page you visit, every day you use the internet, over a lifetime; that's a lot of data. If you don't think it says anything about you, come to the talk to have your opinion changed.
The big problem is finding solutions. Individually protecting yourself against the multiple methods of data collection is currently a huge burden. If you ever want to see just how big, come and check out my browser setup. The balance needs to be tipped, with companies bearing more of the costs of privacy, instead of it all resting on the consumer. If you're a web developer, start thinking about whether you really need to hand so much of your users' data over to third parties. At the very least, it will result in faster page loads. In the meantime, while we consumers wait for privacy legislation to catch up, there is some help in the form of browser add-ons. For example, AdBlock (Chrome, Firefox) will cut out a lot of the third parties without impacting your ability to see the content (i.e. no cost to you). In fact, things look cleaner and load faster. This is the only way we can vote with our money and attempt to force a change in just how much privacy invasion needs to occur for something as irrelevant to the world's problems as targeting advertising.