Can we fix access to platform data? Europe’s Digital Services Act and the long quest for platform accountability and transparency

Svea Windwehr, Center for User Rights, Gesellschaft für Freiheitsrechte/Society for Civil Rights, Berlin, Germany, svea.windwehr@freiheitsrechte.org
Joschka Selinger, Center for User Rights, Gesellschaft für Freiheitsrechte/ Society for Civil Rights, Berlin, Germany, joschka.selinger@freiheitsrechte.org

PUBLISHED ON: 27 Mar 2024

From negative impacts on teenagers’ mental health to the abuse of data collection for political microtargeting and potentially abetting genocide against the Rohingya: in the past decade, online platforms like Instagram, TikTok and YouTube have been accused of contributing to — in some cases even driving — a host of real-life harms with significant impacts on individuals and communities across the world. Yet even after decades of research, our understanding of platforms’ implications remains limited. Companies tightly control access to their vast amounts of data, leaving researchers dependent on whatever access platforms are willing to provide — which may change at a whim (Keller & Leerssen, 2020). The Digital Services Act (DSA), Europe’s comprehensive new platform law, aims to address this issue by introducing a new right for researchers to access platforms’ data (Nonnecke & Carlton, 2022). But there are still many hurdles to clear before researchers can access platform data. To help enforce this critical aspect of the DSA, we call on researchers to make their voices heard and contribute to the upcoming European Commission’s consultation on Article 40 of the DSA. Now more than ever, your input is needed to help deliver on the DSA’s promises.

Locked out: Restrictions to platform data

The DSA comes at a crucial time, as access to platform data has drastically deteriorated over the past years. Just a few months after Elon Musk’s takeover of Twitter/X, access to Twitter’s API was drastically tightened. Previously, Twitter’s generous, unbureaucratic and, most importantly, free access to research data had made it a treasure trove for scholars studying online harms like disinformation or hate speech. This not only contributed to significant research on Twitter’s impact on the spread of false information, for instance, but even resulted in an overrepresentation of Twitter data in social media research: for a moment in time, it seemed like Twitter was the internet.

However, with Twitter’s introduction of paid tiers of API access, many previous use cases became financially prohibitive; researchers, but also a host of useful services, were cut off. Following suit just a few months later, Reddit implemented changes to its API aimed at preventing AI companies like Google or OpenAI from using Reddit’s data to train their large language models, thus also curtailing access to data for researchers and non-profits. And only a few weeks ago, Meta announced the discontinuation of CrowdTangle, a popular tool that had allowed researchers and newsrooms to monitor and analyse content trends across Facebook and Instagram. During a global super election year, the timing of Meta’s announcement is particularly sensitive.

These examples highlight the one-sided nature of access to platform data: researchers are entirely subject to the platforms’ approaches to research and transparency, which frequently change in response to regulation (or fear thereof), management transitions or industry trends. Additionally, companies’ existing research programmes or APIs are rarely comparable, making it especially challenging to track issues across platforms or to compare different approaches to content moderation. And even during the heyday of Twitter’s free API or Meta’s CrowdTangle, the extent of access to research data was limited, as the platforms’ tools only gave insights into subsets of public data. Crucially, information such as deleted content, reasons for deletion, data over time, and information on internal policies and practices has historically been off limits. Moreover, platforms have implemented legal measures to control researcher data access, including prohibitions on data scraping in their terms of service. Citing privacy concerns or negative impacts on their advertising business, platforms are also resorting to legal action to enforce these terms, as evidenced by Twitter’s recent lawsuit against the nonprofit organisation Center for Countering Digital Hate for scraping data for research purposes. After decades of research on online platforms and social media, our understanding of the societal implications of products used by millions remains limited. And while the relationship between online platforms and the scholars studying them has always been asymmetrical at best, these recent developments indicate a severe further decline.

Enter the Digital Services Act

When European lawmakers set out to update and extend the legal framework governing online platforms across the European Union, academics and civil society alike saw a chance to advocate for a robust framework to enable critical research. Getting data access right was also a key concern for policymakers, many of whom understood that effective data access would be critical to assessing companies’ compliance and ensuring enforcement of the DSA. The resulting Article 40 on the one hand ensures that regulators will be able to request from service providers that meet the DSA’s thresholds for very large online platforms and search engines the data necessary to monitor their compliance.

On the other hand, Article 40 establishes a mechanism to grant so-called “vetted researchers” access to data necessary to study systemic risks posed by platforms’ services. Systemic risks as defined by the DSA cover a wide range of (potential) negative impacts of platform practices, including any actual or foreseeable negative effects on the exercise of fundamental rights or civic discourse. Researchers can apply for “vetted researcher” status with their local Digital Services Coordinator, the supervisory authority charged with DSA oversight in the Member State in which they live, or with the Digital Services Coordinator in the country of establishment of the platform. To be considered, researchers need to fulfil a number of requirements, such as being affiliated with a research organisation (which may be a university, but can also be a civil society organisation), being independent of commercial interests, and being able to guarantee the security and confidentiality of the data requested. Once researchers are awarded “vetted” status, their data request is forwarded to the platform in question. Platforms are in turn obliged to give access within a reasonable period, specified in the request (Vermeulen, 2022). In addition, platforms must grant researchers that fulfil the above-mentioned conditions access to publicly available data “without undue delay” — including data in real time. Article 40 thus not only covers requests for non-public data but seems to establish a right to API-like access to public data — possibly allowing researchers to build on previous work based on Twitter’s API or CrowdTangle.

Article 40 has the potential to revolutionise research on online platforms and could lay the groundwork for effective enforcement of the Digital Services Act. But unfortunately, we’re not quite there yet.

A call to action

Article 40 lacks detailed guidance on how authorities should determine which researchers will be awarded “vetted” status, how platforms should provide data, and under which conditions. Many of these details will be outlined in a delegated act prepared by the European Commission, expected in 2024, which will be crucial for fulfilling the DSA’s promises on data access. As part of its preparatory work, the Commission will conduct a public consultation that will help shape the delegated act. To ensure that it reflects the needs and experiences of those it’s meant to serve, it will be crucial to include as many researchers’ perspectives as possible in responses to this consultation.

Beyond the delegated act, access to research data will depend on enforcement that centres academic freedom and researchers’ needs and experiences. Many platforms will likely not hand over data voluntarily, or may resist disclosing certain information. Supervisory authorities may support platforms in protecting information deemed a trade secret. If you or your colleagues are affected by such decisions, it’s important to fight back. Challenge negative decisions, appeal them, and reach out to organisations that offer legal support. The DSA offers a pivotal chance to conduct groundbreaking research, illuminate the impact of platforms’ practices and business models on societies and communities across the world, and hold tech companies to account — let’s use it.

The Gesellschaft für Freiheitsrechte/ Society for Civil Rights is a strategic litigation NGO based in Germany. Its Center for User Rights focuses on the enforcement of user rights under the DSA through litigation and policy work. As part of this work, the Center supports researchers to apply for and enforce their rights to data access.

References

Keller, D., & Leerssen, P. (2020). Facts and where to find them: Empirical research on internet platforms and content moderation. In N. Persily & J. A. Tucker (Eds.), Social media and democracy: The state of the field and prospects for reform (pp. 220–251). Cambridge University Press. https://doi.org/10.1017/9781108890960

Nonnecke, B., & Carlton, C. (2022). EU and US legislation seek to open up digital platform data. Science, 375(6581), 610–612. https://doi.org/10.1126/science.abl8537

Vermeulen, M. (2022). Researcher access to platform data: European developments. Journal of Online Trust and Safety, 1(4). https://doi.org/10.54501/jots.v1i4.84
