MotherJones.com / News / Feature
Is Google Evil?
Internet privacy? Google already knows more about you than the National Security Agency ever will. And don’t assume for a minute it can keep a secret. YouTube fans -- and everybody else -- beware.
Adam L. Penenberg
October 10, 2006
Google Larry Page and Sergey Brin, the two former Stanford geeks who founded the company that has become synonymous with Internet searching, and you’ll find more than a million entries each. But amid the inevitable dump of press clippings, corporate bios, and conference appearances, there’s very little about Page’s and Brin’s personal lives; it’s as if the pair had known all along that Google would change the way we acquire information, and had carefully insulated their lives—putting their homes under other people’s names, choosing unlisted numbers, abstaining from posting anything personal on web pages.
That obsession with privacy may explain Google’s puzzling reaction last year, when Elinor Mills, a reporter with the tech news service CNET, ran a search on Google CEO Eric Schmidt and published the results: Schmidt lived with his wife in Atherton, California, was worth about $1.5 billion, had dumped about $140 million in Google shares that year, was an amateur pilot, and had been to the Burning Man festival. Google threw a fit, claimed that the information was a security threat, and announced it was blacklisting CNET’s reporters for a year. (The company eventually backed down.) It was a peculiar response, especially given that the information Mills published was far less intimate than the details easily found online on every one of us. But then, this is something of a pattern with Google: When it comes to information, it knows what’s best.
From the start, Google’s informal motto has been “Don’t Be Evil,” and the company earned cred early on by going toe-to-toe with Microsoft over desktop software and other issues. But make no mistake. Faced with doing the right thing or doing what is in its best interests, Google has almost always chosen expediency. In 2002, it removed links to an anti-Scientology site after the Church of Scientology claimed copyright infringement. Scores of website operators have complained that Google pulls ads if it discovers words on a page that it apparently has flagged, although it will not say what those words are. In September, Google handed over the records of some users of its social-networking service, Orkut, to the Brazilian government, which was investigating alleged racist, homophobic, and pornographic content.
Google’s stated mission may be to provide “unbiased, accurate, and free access to information,” but that didn’t stop it from censoring its Chinese search engine to gain access to a lucrative market (prompting Bill Gates to crack that perhaps the motto should be “Do Less Evil”). Now that the company is publicly traded, it has a legal responsibility to its shareholders and bottom line that overrides any higher calling.
So the question is not whether Google will always do the right thing—it hasn’t, and it won’t. It’s whether Google, with its insatiable thirst for your personal data, has become the greatest threat to privacy ever known, a vast informational honey pot that attracts hackers, crackers, online thieves, and—perhaps most worrisome of all—a government intent on finding convenient ways to spy on its own citizenry.
It doesn’t take a conspiracy theorist to worry about such a threat. “I always thought it was fertile ground for the government to snoop,” CEO Schmidt told a search engine conference in San Jose, California, in August. While Google earned praise from civil libertarians earlier this year when it resisted a Justice Department subpoena for millions of search queries in connection with a child pornography case, don’t expect it will stand up to the government every time: On its website, Google asserts that it “does comply with valid legal process, such as search warrants, court orders, or subpoenas seeking personal information.”
What’s at stake? Over the years, Google has collected a staggering amount of data, and the company cheerfully admits that in nine years of operation, it has never knowingly erased a single search query. It’s the biggest data pack rat west of the NSA, and for good reason: 99 percent of its revenue comes from selling ads that are specifically targeted to a user’s interests. “Google’s entire value proposition is to figure out what people want,” says Eric Goldman, a professor at Silicon Valley’s Santa Clara School of Law and director of the High Tech Law Institute. “But to read our minds, they need to know a lot about us.”
Every search engine gathers information about its users—primarily by sending us “cookies,” or text files that track our online movements. Most cookies expire within a few months or years. Google’s, though, don’t expire until 2038. Until then, when you use the company’s search engine or visit any of myriad affiliated sites, it will record what you search for and when, which links you click on, which ads you access. Google’s cookies can’t identify you by name, but they log your computer’s IP address; by way of metaphor, Google doesn’t have your driver’s license number, but it knows the license plate number of the car you are driving. And search queries are windows into our souls, as 658,000 AOL users learned when their search profiles were mistakenly posted on the Internet: Would user 1997374 have searched for information on better erections or cunnilingus if he’d known that AOL was recording every keystroke? Would user 22155378 have keyed in “marijuana detox” over and over knowing someone could play it all back for the world to see? If you’ve ever been seized by a morbid curiosity after a night of hard drinking, a search engine knows—and chances are it’s Google, which owns roughly half of the entire search market and processes more than 3 billion queries a month.
And Google knows far more than that. If you are a Gmail user, Google stashes copies of every email you send and receive. If you use any of its other products—Google Maps, Froogle, Google Book Search, Google Earth, Google Scholar, Talk, Images, Video, and News—it will keep track of which directions you seek, which products you shop for, which phrases you research in a book, which satellite photos and news stories you view, and on and on. Served up à la carte, this is probably no big deal. Many websites stow snippets of your data. The problem is that there’s nothing to prevent Google from combining all of this information to create detailed dossiers on its customers, something the company admits is possible in principle. Soon Google may even be able to keep track of users in the real world: Its latest move is into free wifi, which will require it to know your whereabouts (i.e., which router you are closest to).
Google insists that it uses individual data only to provide targeted advertising. But history shows that information seldom remains limited to the purpose for which it was collected. Accordingly, some privacy advocates suggest that Google and other search companies should stop hoarding user queries altogether: Internet searches, argues Lillie Coney of the Electronic Privacy Information Center, are part of your protected personal space just like your physical home. In February, Rep. Edward Markey (D-Mass.) introduced legislation to this effect, but Republicans have kept it stalled in committee. Google, which only recently retained a lobbying firm in Washington, is among the tech companies fighting the measure.
When I first contacted Google for this story, a company publicist insisted I provide a list of detailed questions, in writing; when I said that I had a problem with a source dictating the terms for an interview, he claimed that everyone who covers Google—including the New York Times and the Wall Street Journal—submits advance questions. (A Times spokeswoman told me the paper sees no ethical problems with such a procedure, though individual reporters’ decisions may vary; an editor in charge of editorial standards at the Journal said the same thing.) The Google flack assured me that this was so he could find the best person for me to talk to—more information for Google, so that Google could better serve me.
Eventually he agreed to put me in touch, sans scripted questions, with Nicole Wong, Google’s associate corporate counsel. I asked her if the company had ever been subpoenaed for user records, and whether it had complied. She said yes, but wouldn’t comment on how many times. Google’s website says that as a matter of policy the company does “not publicly discuss the nature, number or specifics of law enforcement requests.”
So can you trust Google only as far as you can trust the Bush administration? “I don’t know,” Wong replied. “I’ve never been asked that question before.”
And then there were four
Google is one of about four search engines that matter. There are many more than four engines, but only about four have the technology to crawl much of the web on a regular basis. As of July 2003, Yahoo owned Overture, Alltheweb, AltaVista, and Inktomi, and finally dumped Google in February 2004. Everything needed to turn Yahoo into a major search engine was now under Yahoo's roof. It is still possible that Yahoo will shoot themselves in the foot with all of this firepower -- their desire to monetize everything appears to be high on their agenda. But so far, after only a year, Yahoo has shown that their main index search results are on a par with Google's. This is true despite the fact that Yahoo has infiltrated some pay-per-click links into the main index. One reason for Yahoo's success is that Google's main index, though free from paid results, has declined considerably since early 2003. Amazingly, there is on average only a 20 percent overlap between Yahoo's first 100 results and Google's first 100 results for the same search -- and still, Yahoo is just as good as Google. These days there is so little room at the top of the search results heap that any combination of algorithms will produce acceptable results. The main difference now is in the depth of the crawl.
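For readers who want to check an overlap figure like the 20 percent one themselves, here is a minimal sketch of the arithmetic. The function name and the made-up URL lists are assumptions for illustration only; nothing here reflects how the original comparison was actually carried out.

```python
def overlap_fraction(results_a, results_b, top_n=100):
    """Share of the top_n results that two engines have in common.

    The denominator is the size of the smaller truncated list, so two
    full lists of 100 results sharing 20 URLs give 0.2.
    """
    top_a = set(results_a[:top_n])
    top_b = set(results_b[:top_n])
    if not top_a or not top_b:
        return 0.0
    return len(top_a & top_b) / min(len(top_a), len(top_b))


# Made-up result lists: 100 URLs each, of which 20 appear in both.
engine_a = [f"http://example.com/page{i}" for i in range(0, 100)]
engine_b = [f"http://example.com/page{i}" for i in range(80, 180)]

print(overlap_fraction(engine_a, engine_b))
```

An average over many searches would simply repeat this for each query and take the mean.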
Microsoft recently developed their own engine because they found themselves squeezed between the advertising engine of Overture and the search engine Inktomi -- both of which became Yahoo property. In 2003 Microsoft began experimenting with their own crawler. Their new engine was launched in early 2005. If Microsoft puts their greed on a back burner for a few years, by doing deep crawls and presenting a clean interface, they could do to Google what they did to Netscape. There is no "secret sauce" at Google -- we now believe it was all hype from the very beginning. (To the extent that there ever was a secret sauce, the recipe is now known by countless ecommerce spammers, which makes it a liability rather than an asset.) Thousands of engineers in hundreds of companies know how to design search engines. The only real questions are whether you can commit the resources for a deep, consistent crawl of the web, and how aggressively you want to use your search engine to make money.
That gives us Google, Yahoo, and Microsoft. The last one worth watching is Teoma/AskJeeves. Their search technology is good, and they seem serious about expanding their crawl. It remains to be seen how deeply and consistently they will be able to crawl websites with thousands of pages.
Google is easily top dog. They provide about 75 percent of the external referrals for most websites. There is no point in putting up a website apart from Google. It's do or die with Google. If we're all very lucky, one of the other three will soon offer some serious competition. If we're not lucky, we will be uploading our websites to Google's servers by then, much like the bloggers do at blogger.com (which was bought by Google in 2003). It would mean the end of the web as we know it.
It is worthwhile to understand the pressures that the average, independent webmaster is under. And given that Google is so dominant, it's important to understand the pressures that are being brought to bear on Google, Inc. It does not take too much imagination to recognize that there's a struggle going on for the soul of the web, and the focal point of this struggle is Google itself.
At one level, it's a struggle for advertising revenue. The pundits look at only this level, and are unanimous that the only advertising model on the web with any sort of future is one where little ads appear after being triggered by keyword searches, or by the non-ad content of a web page. For example, a search for Google Watch may show some ads on the right side of the screen for wrist watches. While the technique doesn't work for this example, often it serves its purpose. There is only so much pixeled real estate that the average user can be expected to survey for a given search. Today up to half of each screen is dedicated to paid ads on Google, as compared to the ad-free original Google. Everyone wants a piece of this new wave in web advertising, and Google is making a lot of money.
Unfortunately, early evidence suggests that Yahoo is less interested in pure search algorithms than in acquiring market share in a pay-for-placement and/or pay-for-inclusion revenue stream. The same may be true for Microsoft. Even Google, dazzled by the sudden income from advertising, must be wondering why they go to all that trouble and expense to crawl the noncommercial sector. Those public-sector sites, such as the org, edu and gov domains, do not provide direct income, even though the web would be unattractive without them. All the excitement over a revived online ad market, pushed by pundits hoping for another dot-com gold rush, is beginning to look like the days when AltaVista decided that portals were the Next Big Thing. That notion caused AltaVista to lose interest in improving their crawling and searching -- which is how Google succeeded in the first place.
There has been almost no interest in establishing search engines that specialize in public-sector websites. Where is the Library of Congress? Where are the millions of dollars doled out by the Ford Foundation? How about the United Nations? Why can't some enlightened European entity pick up the slack? Everyone is asleep, while the Internet is getting spammed to death.
At another level, it's a struggle over who will have the predominant influence over the massive amounts of user data that Google collects. In the past, discussions about privacy issues and the web have been about consumer protection. That continues to be of interest, but since 9/11 there is a new threat to privacy -- the federal government. Google has not shown any inclination to declare for the rights of its users across the globe, as opposed to the rights of the spies in Washington who would love to have access to Google's user data.
Much of the struggle at this new level is unarticulated. For one thing, the spies in Washington don't talk about it. Congress has given them new powers, without debating the issues. Google, Inc. itself never comments about things that matter. The struggle recognized by Google Watch has to do with the clash of real forces, but right now all we can say is that potentially this struggle could manifest itself in Google's boardroom.
The privacy struggle, which includes both the old issue of consumer protection and this new issue of government surveillance, means that the question of how Google treats the data it collects from users becomes critical. Given that Google is so central to the web, whatever attitude it takes toward privacy has massive implications for the rest of the web in general, and for other search engines in particular.
Call it class warfare, if you like. Because that brings up the other major gripe that Google Watch has with Google. That's the PageRank problem -- the fact that Google's primary ranking algorithm has less to do with the quality of web pages than with the "power popularity" of web pages. Their approach to ranking is anti-democratic, in that already-powerful pages are mathematically granted extra power to anoint other pages as powerful.
It's not that we believe Google is evil. What we believe is that Google, Inc. is at a fork in the road, and they have some big decisions to make. This Google Watch site is trying to articulate and publicize the situation at Google, and encourage more scrutiny of their operations. By doing this, we hope to play a small part in maintaining the web as an information tool that is more useful for the masses than it is for the elites.
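The "power popularity" effect can be seen in a toy version of the textbook PageRank calculation. This is only a sketch: the damping value of 0.85 is the commonly cited figure, not something stated here, the page names are invented, and Google's production ranking is not public and surely differs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to; every
    linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its own rank among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank


# Invented example: "chosen" and "ignored" each have exactly one inbound
# link, but only "chosen" is linked from a page that is itself well linked.
links = {
    "popular": ["chosen"],
    "fan1": ["popular"],
    "fan2": ["popular"],
    "fan3": ["popular"],
    "obscure": ["ignored"],
    "chosen": [],
    "ignored": [],
}
ranks = pagerank(links)
print(ranks["chosen"] > ranks["ignored"])
```

Both target pages receive a single link, yet the one endorsed by the already-popular page ends up ranked higher, which is the complaint in a nutshell.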
That's why we and over 500 others nominated Google for a Big Brother award in 2003. The nine points we raised in connection with this nomination necessarily focused on privacy issues:
1. Google's immortal cookie:
Google was the first search engine to use a cookie that expires in 2038. This was at a time when federal websites were prohibited from using persistent cookies altogether. Now it's years later, and immortal cookies are commonplace among search engines; Google set the standard because no one bothered to challenge them. This cookie places a unique ID number on your hard disk. Anytime you land on a Google page, you get a Google cookie if you don't already have one. If you have one, they read and record your unique ID number.
2. Google records everything they can:
For all searches they record the cookie ID, your Internet IP address, the time and date, your search terms, and your browser configuration. Increasingly, Google is customizing results based on your IP number. This is referred to in the industry as "IP delivery based on geolocation."
3. Google retains all data indefinitely:
Google has no data retention policies. There is evidence that they are able to easily access all the user information they collect and save.
4. Google won't say why they need this data:
Inquiries to Google about their privacy policies are ignored. When the New York Times (2002-11-28) asked Sergey Brin about whether Google ever gets subpoenaed for this information, he had no comment.
5. Google hires spooks:
Matt Cutts, a key Google engineer, used to work for the National Security Agency. Google wants to hire more people with security clearances, so that they can peddle their corporate assets to the spooks in Washington.
6. Google's toolbar is spyware:
With the advanced features enabled, Google's free toolbar for Explorer phones home with every page you surf, and yes, it reads your cookie too. Their privacy policy confesses this, but that's only because Alexa lost a class-action lawsuit when their toolbar did the same thing, and their privacy policy failed to explain this. Worse yet, Google's toolbar updates to new versions quietly, and without asking. This means that if you have the toolbar installed, Google essentially has complete access to your hard disk every time you connect to Google (which is many times a day). Most software vendors, and even Microsoft, ask if you'd like an updated version. But not Google. Any software that updates automatically presents a massive security risk.
7. Google's cache copy is illegal:
Judging from Ninth Circuit precedent on the application of U.S. copyright laws to the Internet, Google's cache copy appears to be illegal. The only way a webmaster can avoid having his site cached on Google is to put a "noarchive" meta in the header of every page on his site. Surfers like the cache, but webmasters don't. Many webmasters have deleted questionable material from their sites, only to discover later that the problem pages live merrily on in Google's cache. The cache copy should be "opt-in" for webmasters, not "opt-out."
8. Google is not your friend:
By now Google enjoys a 75 percent monopoly for all external referrals to most websites. Webmasters cannot avoid seeking Google's approval these days, assuming they want to increase traffic to their site. If they try to take advantage of some of the known weaknesses in Google's semi-secret algorithms, they may find themselves penalized by Google, and their traffic disappears. There are no detailed, published standards issued by Google, and there is no appeal process for penalized sites. Google is completely unaccountable. Most of the time Google doesn't even answer email from webmasters.
9. Google is a privacy time bomb:
With 200 million searches per day, most from outside the U.S., Google amounts to a privacy disaster waiting to happen. Those newly-commissioned data-mining bureaucrats in Washington can only dream about the sort of slick efficiency that Google has already achieved.
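Points 1 and 2 above describe a mechanism that is easy to sketch. The following is a hedged illustration only, not Google's code: the cookie name, the exact expiry date within 2038, the ID format, and all function names are hypothetical, and the log record simply carries the fields listed in point 2.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from http.cookies import SimpleCookie
from typing import Optional, Tuple

# Hypothetical values: the text says only that the cookie "expires in 2038".
COOKIE_NAME = "visitor_id"
COOKIE_EXPIRES = "Fri, 01 Jan 2038 00:00:00 GMT"


@dataclass
class SearchLogEntry:
    """One logged search, with the fields listed in point 2."""
    cookie_id: str
    ip_address: str
    timestamp: datetime
    search_terms: str
    browser: str


def get_or_issue_id(cookie_header: str) -> Tuple[str, Optional[str]]:
    """Read the unique ID from the request, or mint one if it is missing.

    Returns the ID and, for a new visitor, the Set-Cookie value to send.
    """
    jar = SimpleCookie()
    jar.load(cookie_header)
    if COOKIE_NAME in jar:
        return jar[COOKIE_NAME].value, None
    new_id = secrets.token_hex(8)
    out = SimpleCookie()
    out[COOKIE_NAME] = new_id
    out[COOKIE_NAME]["expires"] = COOKIE_EXPIRES
    out[COOKIE_NAME]["path"] = "/"
    return new_id, out[COOKIE_NAME].OutputString()


def log_search(cookie_header, ip_address, search_terms, browser):
    """Build the log record for one search and any cookie to set."""
    cookie_id, set_cookie = get_or_issue_id(cookie_header)
    entry = SearchLogEntry(cookie_id, ip_address,
                           datetime.now(timezone.utc), search_terms, browser)
    return entry, set_cookie


# First visit: no cookie yet, so one is issued. Second visit: it is read back.
first, header = log_search("", "203.0.113.7", "example query", "ExampleBrowser/1.0")
second, _ = log_search(f"{COOKIE_NAME}={first.cookie_id}", "203.0.113.7",
                       "another query", "ExampleBrowser/1.0")
print(first.cookie_id == second.cookie_id)
```

Because the same ID comes back on every later visit, separate searches can be tied to one browser for as long as the cookie survives.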