A computer security researcher has launched a project designed to give people greater privacy when using Google, as the company expands the scope of data it collects about its users.
The project, called GoogleSharing, is a Firefox add-on that uses an anonymizing proxy service to give Google false information when someone uses services that don’t require an account, such as its search, news, and images services, said Moxie Marlinspike, a security consultant and penetration tester with the Institute of Disruptive Studies.
Google collects a vast amount of information about its users, said Marlinspike, who gave a presentation at last week’s Black Hat conference. The company collects IP (Internet protocol) addresses, search requests, browser type and more.
Google and other major search companies have taken steps to allay concerns over data collection, such as anonymizing parts of the IP addresses held in their records after certain periods of time. But Google alone decides how it anonymizes information that could potentially be collated later to profile a user, Marlinspike said.
With IP addresses, for example, Google anonymizes the last octet of the address after nine months, Marlinspike said. Some privacy advocates argue that does not go far enough. Google also uses cookies, or data files stored by a browser, to associate search queries with a particular installation of a browser on a given computer.
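To make the last-octet step concrete, here is a minimal sketch in Python. It assumes that "anonymizing" means zeroing the final octet of an IPv4 address; the article does not say exactly how Google does it, and the function name is made up for illustration.

```python
import ipaddress


def anonymize_last_octet(ip: str) -> str:
    """Zero the final octet of an IPv4 address (an assumed anonymization scheme)."""
    addr = ipaddress.IPv4Address(ip)
    # Mask off the low 8 bits, keeping the first three octets intact.
    masked = int(addr) & 0xFFFFFF00
    return str(ipaddress.IPv4Address(masked))


print(anonymize_last_octet("203.0.113.77"))  # prints 203.0.113.0
```

Under this assumption, the stored address still narrows a user down to one block of 256 addresses, which helps explain why some advocates consider the measure too weak.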
“The main problem is that they [Google] have a lot of data,” Marlinspike said. “They do record everything. Forever. In many ways, the information they have probably paints a more complete picture of you than even your best friend would know.”
So Marlinspike built GoogleSharing, an add-on for Firefox. When it is enabled, GoogleSharing detects when someone is using a Google service that doesn’t need a login.
If it’s a search request, for example, GoogleSharing then strips the request of its cookie. The search request is encrypted and sent to a customized proxy server.
“You get SSL [secure sockets layer] protection on your local area network for Google services that normally don’t provide https:// access,” Marlinspike said.
The proxy server then assigns a different yet valid Google cookie to the request and strips the request of its original HTTP headers. The request is then sent to Google, Marlinspike said. Google returns the response to the proxy server, which passes it on to the client.
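The header-scrubbing step described above can be sketched roughly as follows. This is not GoogleSharing's actual code: the cookie pool, the list of headers treated as identifying, and the generic user-agent string are all placeholder assumptions chosen for illustration.

```python
import random

# Hypothetical pool of valid cookies held by the proxy (placeholder values).
COOKIE_POOL = ["PREF=ID=aaaa", "PREF=ID=bbbb", "PREF=ID=cccc"]

# Headers assumed to identify the user; the article does not list them.
IDENTIFYING_HEADERS = {"cookie", "user-agent", "referer", "accept-language"}

# Assumed generic replacement value for the browser type.
GENERIC_USER_AGENT = "Mozilla/5.0"


def sanitize_request(headers):
    """Return a copy of the headers with identifying fields replaced."""
    # Drop every header that could tie the request to one browser.
    clean = {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    # Attach a different cookie drawn from the shared pool.
    clean["Cookie"] = random.choice(COOKIE_POOL)
    clean["User-Agent"] = GENERIC_USER_AGENT
    return clean


original = {
    "Host": "www.google.com",
    "Cookie": "PREF=ID=mine",
    "User-Agent": "Firefox/3.5",
}
print(sanitize_request(original))
```

The point of the sketch is that Google still receives a well-formed request with a valid cookie, just not the cookie or browser details of the person who made the search.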
Other anonymizing services, such as The Onion Router (TOR), provide a greater degree of privacy protection, Marlinspike said. TOR should be used for high-value searches, he said.
But TOR is “painfully slow,” Marlinspike said. Since TOR also strips out HTTP headers, Google may treat the request as an attempt to abuse its services, displaying a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) that a person must solve and respond to before it will deliver search results. That can disrupt a person’s productivity.
GoogleSharing is fast, transparent and doesn’t interrupt someone’s workflow, Marlinspike said.
With GoogleSharing, “what you’re trying to protect are your searches … that are only valuable in aggregate that paint a really big picture of who you are,” Marlinspike said.
Anyone can run a GoogleSharing proxy server. Although the operator of such a proxy would have access to the same information Google would have received, “those that are running a GoogleSharing proxy server are in a much worse position than Google to make use of that information,” Marlinspike said.
If enough people run GoogleSharing proxies, queries could be distributed among all of the proxies, further diluting the information pool. The add-on can also be configured to use a specific proxy, he said.
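One simple way to picture that distribution is a client that picks a proxy at random for each query unless the user has pinned one. The sketch below is only an illustration: the proxy host names are invented, and the article does not describe how the add-on actually chooses among proxies.

```python
import random

# Hypothetical proxy hosts; real GoogleSharing proxy addresses are not given.
PROXIES = ["proxy-a.example.net", "proxy-b.example.net", "proxy-c.example.net"]


def pick_proxy(proxies, pinned=None):
    """Return the pinned proxy if one is configured, else a random one."""
    if pinned is not None:
        return pinned
    return random.choice(proxies)


# Each query may land on a different proxy, so no single operator sees them all.
for query in ["weather", "news", "maps"]:
    print(query, "->", pick_proxy(PROXIES))

# A user who trusts one operator can pin that proxy instead.
print(pick_proxy(PROXIES, pinned="proxy-b.example.net"))
```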
“Your intent in using Google is not actually to share information with them,” Marlinspike said. “When you’re using Google you’re not actually trying to give them your personal information. You’re just trying to make use of the services.”
Google did not have an immediate comment, although the company explains on its privacy pages that it keeps search engine data such as queries in order to improve the service and for the security of its systems.
Google introduced a dashboard in November 2009 that lets users see and manage some of the data that the company holds. But users must have an account to access that panel, and it doesn’t show other information the company may have collected.