May 27, 2011

35 million Google Profiles collected into private database

If you are one of the people who created a Google Profile, chances are you knew and agreed that the information you included in it would be available to anyone who searches for it online.

But maybe you haven't thought about the possibility of this information being harvested and indexed to make mining it easier. Whether you have or not is ultimately irrelevant - you have shared the information with Google, and Google does not forbid indexing of the list of profiles.

Nor does it limit the amount of data that can be extracted. According to Matthijs Koot, a Ph.D. student at the University of Amsterdam who attempted this feat, Google didn't attempt to throttle, block, CAPTCHA or in any other way make his mass-downloading more difficult.

The result is that, over the course of one month, he was able to create a database containing all Google Profiles - some 35 million of them. Stored in it are Twitter conversations, names, aliases, past education and employment information, links to Picasa photo albums and - in 15 million cases - the username, which is easily translated into a valid Gmail address.

"My activities are directed at inciting, or poking up, debate about privacy -- NOT to create DISTRUST but to achieve REALISTIC trust -- and the meaning of 'informed consent'," points out Koot. "How can a user possibly be considered to be 'informed' when they're not made aware about the fact that it does not seem to bother Google that profiles can be mass-downloaded and about misuse value - or hopefully the lack of it - of their social data to criminals and certain types of marketeers?"

According to The Register, Google isn't worried about Koot's project and its implications. "Public profiles are usually discovered when people use search engines, and sitemap information makes it possible for search engines to index these public profiles so that people can find them. The sitemap does not reveal any information that is not already designated to be public," said a company spokesman.

Users can also adjust their profile settings so that their profiles are not indexed by search engines. I guess that in Google's mind, its hands are clean.

And while I do believe that informed consent is definitely something the company should consider and work on, I can't help but think that people should accept part of the responsibility for their privacy and simply stop putting so much personal information online.
