HOW DOES IT WORK?
Search is based on keywords. The engine refines this by also looking at correlations between multiple keywords, so it can target results more specifically. If you type “burns” into the search field, you’ll get very different results than if you type “It burns when I pee.”
The engine ranks the results using an algorithm called PageRank. Google won’t tell anyone exactly how PageRank works, but the simple version is that it ranks a site or individual page based on how many other pages link to it. Each incoming link is considered a vote for a page. Some “votes” count more than others: if the New York Times links to your homepage, that counts more than the link your aunt put on her cat’s blog.
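The “votes” idea can be sketched in a few lines of Python. This is only a toy version of the link-counting scheme as it is generally described, not Google’s actual code; the page names, the damping value of 0.85, and the fixed number of iterations are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy link-based ranking. `links` maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: two portals link to the newspaper, nobody links to the cat blog.
links = {
    "portal_a": ["nytimes"],
    "portal_b": ["nytimes"],
    "nytimes": ["your_homepage"],
    "cat_blog": ["neighbor_homepage"],
    "your_homepage": [],
    "neighbor_homepage": [],
}
ranks = pagerank(links)
```

Both homepages receive exactly one incoming link, but in this sketch your_homepage ends up with the higher score, because the page voting for it is itself well linked.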
The ads on Google’s search results page are served by a program called DoubleClick. DoubleClick thinks about your keywords in a different way than the search engine. When an advertiser signs up with Google, he selects keywords and concepts with which he would like to be associated. DoubleClick looks for those keywords in search queries and serves ads based on that. Pretty simple, and not at all scary, right?
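A rough sketch of that kind of keyword matching might look like the following. The function name `match_ads`, the advertisers, and their keyword lists are invented for illustration; a real ad system would also weigh bids, budgets, and much else that is not modeled here.

```python
def match_ads(query, ads):
    """Return the names of ads whose keywords appear in the query, best match first."""
    words = set(query.lower().split())
    scored = []
    for name, keywords in ads.items():
        overlap = len(words & keywords)  # how many of the ad's keywords were typed
        if overlap:
            scored.append((overlap, name))
    # More shared keywords first; ties are broken alphabetically by ad name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Hypothetical advertisers and the keywords they signed up for.
ads = {
    "burn_cream": {"burns", "sunburn"},
    "urology_clinic": {"burns", "pee"},
    "car_insurance": {"car", "insurance"},
}

short_query = match_ads("burns", ads)
long_query = match_ads("It burns when I pee", ads)
```

With these made-up advertisers, the one-word query matches both health ads equally, while the longer query pushes the clinic to the top because it shares two keywords instead of one.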
Right. Except Google is in the business of improving the targeting of their search results and the relevance of their ads. The best way to do that is to pay very careful attention to who is doing what with their services. Whenever you perform a search through Google’s engine, they track your search terms, your IP address (which can be mapped to an approximate geographic location, much as a postal address can), what web browser you’re using, and the date and time. If you click on a link, that information is saved. If you’re signed into a Google account at the time of your search, even more information is captured and saved.
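To make the list concrete, here is one way a single logged search could be represented. The class and field names are hypothetical and do not describe Google’s real log format; they simply mirror the items listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class SearchLogEntry:
    """Hypothetical record of one search, holding the items listed above."""
    search_terms: str
    ip_address: str
    browser: str
    timestamp: datetime
    clicked_links: List[str] = field(default_factory=list)
    account_id: Optional[str] = None  # filled in only when the user is signed in


# Example values are made up; 203.0.113.7 is an address reserved for documentation.
entry = SearchLogEntry(
    search_terms="it burns when i pee",
    ip_address="203.0.113.7",
    browser="ExampleBrowser/1.0",
    timestamp=datetime(2010, 1, 1, 12, 0, tzinfo=timezone.utc),
)
entry.clicked_links.append("http://example.com/clinic")
```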
The ads served up by DoubleClick drop cookies on your machine that track much of the same data. DoubleClick keeps a log of any ads you click on and remembers you so that similar ads can be targeted at you in the future. This is all done without your explicit permission.
To become one of the nearly 200 million folks with a Gmail account, you have to give Google a name and birth date, and you are offered the option to use something called Web History. Web History, if you opt in, can track every web page you visit and store a detailed history on the Google servers. This data is used to target search results and ads to your apparent preferences.
DoubleClick scans e-mails the same way it does search terms—every e-mail that moves through the Gmail servers gets scanned for keywords. Ads are served based on those keywords to anyone using Gmail. DoubleClick remembers those keywords, establishes trends, and targets ads to individual users based on what they have a tendency to write about.
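The scan-and-remember loop can be imagined along these lines. This is a guess at the general shape of such a system, not a description of the real one; `update_profile` and the watched keywords are assumptions made for the example.

```python
from collections import Counter


def update_profile(profile, email_text, watched_keywords):
    """Count how often each watched keyword appears in one e-mail."""
    for word in email_text.lower().split():
        word = word.strip(".,!?;:\"'")  # drop punctuation stuck to the word
        if word in watched_keywords:
            profile[word] += 1
    return profile


# Hypothetical advertiser keywords and two made-up e-mails from one user.
watched = {"hiking", "tent", "mortgage"}
profile = Counter()
update_profile(profile, "Bought a new tent for the hiking trip!", watched)
update_profile(profile, "Hiking again this weekend, bring the tent.", watched)
```

After two e-mails the running tally already leans toward camping gear and away from mortgages, which is the kind of trend the ads could then be aimed at.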
Taken in pieces, none of this seems terribly sinister. Google’s entire corporate persona is based on the ideas that:
• You can make money without doing evil
• You can generate billions by decreasing the sum total of ignorance in the world
• The power conferred by knowledge should be shared with everyone
• Business does not require exploitation or violence
These are revolutionary ideas, and because of them we respond to Google with affection.
But the fact remains: Google wants to know everything about you. They have the capability to build a detailed dossier on every single person who uses their services, a dossier that could do far more damage to many individuals than a single embarrassing e-mail or misplaced instant message. Google is a vast, vertically integrated corporate behemoth—the biggest Internet company in the world. It is a giant that sees everything and never, ever forgets.