Public Facebook Information Is Just Part of Larger Online Data Gray Market
Whether through inadvertent security problems or malicious, aggressive actions by third-party applications, Facebook has sometimes struggled to completely maintain the privacy of user data. The latest example: third-party developers have been passing unique user ID numbers contained in URLs to advertising networks, as The Wall Street Journal reported last night as part of an ongoing series on online privacy.
There’s more to the meta-story, too. The article mentioned LOLapps as one of the guilty developers; we covered how Facebook blocked its applications on Friday. Those applications are now live again, for unstated reasons.
First, what is the Journal concerned about? The URLs in question can be used by third parties to figure out user names, but as with profile photos, friend lists and other information, all of this is already public. Facebook changed its terms of service last December to require all of that information to be public, after having previously offered a setting, and terms to match, that allowed users to set all data to be private. When it introduced the Graph API, a way for developers to access this public user data, back in April, it made all interests public, too.
Although Facebook says it is working to control the URL ID problem, the fact that users’ real names are already public makes the particular news less interesting than it might seem.
That earlier series of decisions is what made user names public in the first place, and it provoked many privacy activists and a portion of the user base. For those who wanted complete privacy on Facebook, the move amounted to the company breaking its word. Facebook has argued back that its service needs to be more open to be valuable to users, so it can do things like offer a developer platform that allows for socially rich applications.
Most users around the world either don’t care, or have so far continued using Facebook even with the more open approach that it has moved to. The site’s traffic numbers have continued to surge, for the most part, regardless of privacy changes, or redesigns, or other negative press, for that matter.
The real story isn’t just about user IDs being aggregated and associated with public information by third parties. It’s about this happening to all user data provided by Facebook, and by any other web service. Over the years we’ve heard reports of unethical application developers, often in countries where enforcement is difficult or impossible, quietly building applications that get users to willingly share more private data like photos, videos, and anything else accessible on the platform.
A black, or at least gray, market exists on the web for user data from any number of sources. Between security holes, readily public information, scams (including deceptive medical surveys and other privacy-infringing advertising offers), scraping and other methods, the problem transcends Facebook and affects every social network and online data service. The market for data about people, no matter how it has been obtained, includes a range of above-ground companies: performance marketing service providers, advertising networks and, we’ve heard, even credit card and insurance companies. The scope of how data is being used, for anything from targeting ads to setting insurance rates, is currently unknown.
Facebook data, combined with all of the other possible sources out there, could be aggregated to create detailed, unauthorized and secret profiles of users.
One of the more intriguing points of The Journal’s article concerns Rapleaf, a company that collects and distributes personal data about people. It has been quietly building a business buying and selling user data over the past couple of years, and it has previously come under heavy criticism for how it has used data.
It has become more public recently, doing things like writing blog posts that describe how it is handling technical issues. It also claims on its web site that it keeps all sensitive user data private.
As we wrote in early May:
The bigger issue, as we’ve mentioned before, is that there are increasing reports of rogue applications and others who scrape and store Facebook user information then resell it on the black market for any number of purposes, from online lead generation to phishing and other scams. The extent of the problem is not well-understood, but Facebook appears to lack means to control third party redistribution of its data beyond doing things like suing companies or kicking them off of its platform.
Developers in the Facebook community have widely confirmed the existence of these practices, although most larger and more reputable developers have stayed away from this type of business. Facebook has had large teams of people working in security for years, and it says it has a number of steps in place to fight back — like ways of detecting third parties that are abusing user data, we’ve heard.
Congress has begun to examine the more general problems of online privacy and user data control. The ad industry, fearing impending regulation, is already trying to self-police. Given the unclear scope of the problem and the ongoing privacy and security problems at web companies, the government is likely to examine the entire industry even more closely.