A River of Facebook Profiles
23rd May, 2018 | Data Security | Entropic
In our previous article, we introduced a model that describes the lifecycle of information after an individual or organization has disclosed it to an organization. In this article we focus on another phase of this lifecycle, the Third Party Sharing phase, and describe how it applies to the 2014 mass extraction of more than 87 million users' Facebook profiles by Global Science Research (GSR), and the subsequent sale of that data to SCL Elections and Cambridge Analytica, a subsidiary.
Historically, the term data breach has been used to refer to the "intentional or unintentional release of secure or private/confidential information to an untrusted environment". A data breach might occur due to the negligence or malicious intentions of an employee. For example, an employee might have their notebook stolen while travelling, or intentionally steal a USB stick loaded with sensitive customer information from their work environment.
A data breach might also be instigated by a cybercriminal, criminal group or nation state that targets an organization with the intention of exfiltrating information. That information might be sold, used to extract money from or compete with the victim organization, or used to undermine the security of another country. Some examples of this are Targeted Attacks, Advanced Persistent Threats and Business Process Compromise (BPC) attacks.
Whether lost due to negligence or criminal activity, once information has travelled across an organization's demarcation of control, it becomes difficult to quantify which individuals, organizations, software and hardware had access to the information, and subsequently where it might have flowed afterwards.
Our belief is that this problem can be solved through technologies that decentralize information. We can do this by ensuring that the information is never in a holistic state at any point during its storage and handling.
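To make the idea of information that is "never in a holistic state" more concrete, here is a minimal sketch of one way it could work. It assumes simple XOR-based secret splitting, which is our illustrative choice for this example and not a description of any particular product. The function names (split_record, combine_shares) and the sample record are hypothetical.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_record(record: bytes, n_shares: int) -> list[bytes]:
    """Split a record into n_shares pieces; all of them are needed to rebuild it."""
    if n_shares < 2:
        raise ValueError("need at least two shares")
    # Every share except the last is pure random data.
    random_shares = [secrets.token_bytes(len(record)) for _ in range(n_shares - 1)]
    # The last share is the record XORed with all of the random shares.
    final_share = reduce(xor_bytes, random_shares, record)
    return random_shares + [final_share]


def combine_shares(shares: list[bytes]) -> bytes:
    """Rebuild the original record by XORing every share together."""
    return reduce(xor_bytes, shares)


# Hypothetical profile record, split across three independent stores.
record = b"name=Jane Doe;city=Lisbon"
shares = split_record(record, 3)
restored = combine_shares(shares)
```

In this sketch, any single share (or any incomplete set of shares) is statistically random on its own, so a store that leaks one share does not leak the record. A real system would also need to address key management, availability and access control, which are out of scope here.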
In the case of the recently publicized Facebook incident, it was possible for Global Science Research (GSR) to create a simple Facebook App called ThisIsYourDigitalLife that requested access to a user's Facebook profile. Once the user granted access to the App, it was able to use the publicly available Facebook Platform to siphon profile information about each Facebook user's identity, location, likes, friend networks and more. More importantly, the implementation of the Facebook Graph API at the time allowed the App to traverse each user's friend network, empowering GSR to glean an additional river of millions of user profiles beyond the few hundred thousand users who had installed the App.
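The amplification effect of friend traversal can be illustrated with a toy model. This sketch does not call the Facebook Graph API. It uses a small, made-up in-memory friend graph, and the function name reachable_profiles and all user names are hypothetical.

```python
def reachable_profiles(friends: dict[str, set[str]], installers: set[str]) -> set[str]:
    """Return every profile an app could see: its installers plus their direct friends."""
    collected = set(installers)
    for user in installers:
        # One hop of traversal: add each installer's friends.
        collected |= friends.get(user, set())
    return collected


# Made-up friend graph: each key maps a user to their set of friends.
friends = {
    "alice": {"bob", "carol"},
    "dave": {"carol", "erin"},
}

# Only these users installed the app and granted it access.
installers = {"alice", "dave"}

collected = reachable_profiles(friends, installers)
print(f"{len(installers)} installers exposed {len(collected)} profiles")
```

In this toy graph, two consenting users expose five profiles in total, three of which belong to people who never interacted with the app. The same one-hop expansion, applied to a few hundred thousand installers with typical friend counts, is how the collected data set could grow into the tens of millions.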
After refining the gathered personal information, Global Science Research (GSR) sold it to SCL Elections/Cambridge Analytica, a move that violated the Facebook Platform Policy. Once Cambridge Analytica had this information, it used it to create timely, micro-targeted ads on Facebook and other advertising platforms to sway the opinions of individuals and groups in select regions, affecting the outcomes of elections in countries around the globe.
While the flaw in the Facebook Graph API that allowed this river of profiles to be extracted is important, it's also important to show how the information survived and was replicated after it was extracted. For this reason, we have created a simplified illustration based on the Information Disclosure Lifecycle, to show how the information was subsequently disclosed, refined and sold after its initial extraction.
The level of access to Facebook accounts used by GSR was not unique, being made available to tens of thousands of other Facebook developers, representing individuals, companies, organizations, and nation states. This means that there is a high possibility that others may have exploited the Facebook Platform in a similar manner, over a period of many years.
As Facebook continues to suspend developer apps and improve the security of the Facebook Platform as part of its prioritized investigation into abuse of its platform data, we'll likely also learn from other sources about additional entities that have been exploiting the Facebook Platform in similar ways.
Prior to commencing insolvency proceedings on the 2nd of May, SCL Elections and Cambridge Analytica had plans for controlling access to the personal data they collected using blockchain technologies, discussed further in this article by Cole Gibson originally posted on CoinCentral.com.
If you have any feedback, questions, or suggestions, please let us know.