A new year has arrived, along with the usual air of optimism. Yet the 21st century is already shaping up to be a challenging one. From climate change to terrorism, the difficulties confronting policy makers are unprecedented not only in their variety but also in their complexity. Our existing policy tool kit seems stale and outdated. Increasingly, it is clear, we require not only new solutions but also new methods for arriving at solutions.
Data, and new methods for organizations to collaborate in order to extract insights from data, is likely to become more central to meeting these challenges. We live in a quantified era. It is estimated that 90% of the world’s data was generated in the last two years, and from that data entirely new inferences can be extracted and applied to help address some of today’s most vexing problems.
In particular, the vast streams of data generated through social media platforms, when analyzed responsibly, can offer insights into societal patterns and behaviors. These types of insights are difficult to generate with existing social science methods. All this information poses its own problems, of complexity and noise, of risks to privacy and security, but it also represents tremendous potential for mobilizing new forms of intelligence.
In a recent report, we examine ways to harness this potential while limiting and addressing the challenges. Developed in collaboration with Facebook, the report seeks to understand how public and private organizations can join forces to use social media data, through data collaboratives, to mitigate and perhaps solve some of our most intractable policy dilemmas.
Data Collaboratives: Public-Private Partnerships for Our Data Age
For all of data’s potential to address public challenges, most data generated today is collected by the private sector. Typically ensconced in corporate databases, and tightly held in order to maintain competitive advantage, this data contains tremendous possible insights and avenues for policy innovation. But because the analytical expertise brought to bear on it is narrow, and limited by private ownership and access restrictions, its vast potential often goes untapped.
Data collaboratives offer a way around this limitation. They represent an emerging public-private partnership model, in which participants from different areas, including the private sector, government, and civil society, can come together to exchange data and pool analytical expertise in order to create new public value. While still an emerging practice, examples of such partnerships now exist around the world, across sectors and public policy domains.
Through our research, we have identified six types of data collaboratives enabling the exchange of data and/or data science expertise across sectors to create public value.
First, data cooperatives or pooling involve corporations and other entities joining together to create shared data resources. Second, corporations can make data available, through prizes and challenges, to qualified applicants who compete to develop new apps or discover innovative uses for the data. Third, data collaboratives can take the form of research partnerships, with corporations sharing data with universities, academics, or other researchers to enable the generation of new insights. Fourth, shared (often aggregated) corporate data can be used to create intelligence products such as tools, dashboards, reports, apps, and other technical devices to support public or humanitarian objectives. Fifth, application programming interfaces (APIs) that offer direct access to corporate data streams, as well as data that can be manually scraped, are enabling researchers and practitioners to access data for research, testing, and data analytics. Finally, corporations can share data with a limited number of trusted intermediaries, such as the UK’s Consumer Data Research Centre and the international development nonprofit NetHope, to enable data analysis and modeling, as well as other value chain activities.
The practice of using data collaboratives tends to be global and cross-sectoral. For example, the California Data Collaborative is a data pooling effort among a coalition of water utilities, cities, and water retailers to create an integrated, California-wide platform to provide accurate technical analysis and better inform water policy and operational decision making. Recently, we announced the creation of a data collaborative with UNICEF, Universidad del Desarrollo, Telefónica R&D Center, ISI Foundation, and DigitalGlobe, supported by Data2X, to leverage mobile phone data and satellite imagery to increase our understanding of how megacities like Santiago, Chile, can create safer, more-efficient mobility solutions for women and girls.
How the Exchange of Data Can Help Solve Public Problems
At a broad level, data collaboratives can help to unlock insights from vast, untapped stores of private-sector data. But to what end? Our research, applied to social media data and viewed more generally, indicates five public-value propositions. These include:
Situational awareness and response. Social media data, including shares, tweets, updates, and search data, can help NGOs, humanitarian organizations, and others better understand demographic trends, public sentiment, and the geographic distribution of various phenomena (for example, the spread of disease).
Consider Facebook’s Disaster Maps initiative, which seeks to fill any gaps in traditional data sources and to inform more-targeted relief efforts from responders. Following natural disasters, Facebook shares aggregated location, movement, and self-reported safety data collected through its platform with partner organizations, including the International Federation of Red Cross and Red Crescent Societies, UNICEF, and the World Food Program.
Knowledge creation and transfer. Data collaboratives can bring together (or “join”) widely dispersed data sets, in the process creating a better understanding of possible correlations and causalities as well as what variables make a difference for what type of problem.
For example, MIT’s Laboratory for Social Machines’ Electome Project analyzed massive Twitter activity data sets to improve reporting around the 2016 U.S. presidential election; and LinkedIn’s Economic Graph Research Program is enabling new knowledge on economic conditions and job-seeking behaviors.
Public service design and delivery. Private data sets often contain a wealth of information that can enable more-accurate modeling of public services and help guide service delivery in a targeted, evidence-based manner.
As an example, through its Connected Citizens Program, Waze partners with over 60 cities to share its crowdsourced traffic data to improve urban planning and provide insight into approaches for easing urban congestion.
Prediction and forecasting. Richer, more-complete information from data collaboratives enables new predictive capabilities for policy makers and others, allowing them to be more proactive.
This was the approach taken by researchers aiming to predict the likelihood of floods using Flickr metatags, and to predict adverse drug events using social media and forum postings.
Impact assessment and evaluation. Finally, data collaboratives can aid in monitoring, evaluation, and improvement. By leveraging social media data, public-interest actors can rapidly assess the results of their actions to iterate on products and programs when necessary.
This is what Sport England did, for instance, when it used data from Twitter to understand women’s views on exercise and exercise messaging to inform and assess its #ThisGirlCan campaign aimed at improving women’s and girls’ health and physical activity.
Professionalizing the Responsible Use of Private Data for Public Good
For all its promise, the practice of data collaboratives remains ad hoc and limited. In part, this is a result of the lack of a well-defined, professionalized concept of data stewardship within corporations. Today, each effort to establish a cross-sector partnership built on the analysis of social media data requires significant and time-consuming efforts, and businesses rarely have personnel tasked with undertaking such efforts and making relevant decisions.
As a consequence, the process of establishing data collaboratives and leveraging privately held data for evidence-based policy making and service delivery is onerous, generally one-off, not informed by best practices or any shared knowledge base, and prone to dissolution when the champions involved move on to other functions.
By establishing data stewardship as a corporate function, recognized within corporations as a valued responsibility, and by creating the methods and tools needed for responsible data-sharing, the practice of data collaboratives can become regularized, predictable, and de-risked.
If early efforts toward this end, from initiatives such as Facebook’s Data for Good efforts in the social media space and MasterCard’s Data Philanthropy approach around finance data, are meaningfully scaled and expanded, data stewards across the private sector can act as change agents responsible for determining what data to share and when, how to protect data, and how to act on insights gathered from the data.
Still, many companies (and others) continue to balk at the prospect of sharing “their” data, which is an understandable response given the reflex to guard corporate interests. But our research has indicated that many benefits can accrue not only to data recipients but also to those who share it. Data collaboration is not a zero-sum game.
With support from the Hewlett Foundation, we are embarking on a two-year project toward professionalizing data stewardship (and the use of data collaboratives) and establishing well-defined data responsibility approaches. We invite others to join us in working to transform this practice into a widespread, impactful means of leveraging private-sector assets, including social media data, to create positive public-sector outcomes around the world.