Data justice

Data justice has emerged as a key framework for engaging with the intersection of datafication and society in a way that privileges an explicit concern with social justice. Engaging with justice concerns in the analysis of information and communication systems is not in itself new, but the concept of data justice has been used to denote a shift in understanding of what is at stake with datafication beyond digital rights. In this essay, we trace the lineage and outline some of the different traditions and approaches through which the concept is currently finding expression. We argue that in doing so, we are confronted with tensions that denote a politics of data justice both in terms of what is at stake with datafication and what might be suitable responses.


Introduction
The growing reliance on data-driven technologies across social life, commonly referred to as datafication, is widely seen to propel transformations across areas of science, government, business and civil society. These transformations are often touted as enhancing efficiency and decision-making while at the same time presenting significant societal challenges. Data justice has emerged as a key framework for engaging with such challenges in a way that privileges an explicit concern for social justice. Privileging social justice concerns in the analysis of information and communication systems is not in itself new, but the concept of data justice has been used to pave the way for a shift in understanding of what is at stake with datafication beyond digital rights. In particular, Dencik et al. (2019: 875) argue that we have seen the concept of data justice 'used to denote an analysis of data that pays particular attention to structural inequality, highlighting the unevenness of implications and experiences of data across different groups and communities in society.' In this brief essay, we look at how this focus has manifested across different traditions and disciplines and point to a continued politics of data justice that illustrates the unsettled nature of this concept.
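We argue that how we understand the 'grammar' of justice (Fraser, 2008) will do much to inform what we mean by data justice, both in terms of what is at stake with datafication and what might be suitable responses.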

Data justice in context
The concept of data justice draws from a range of long-standing traditions that have concerned themselves with the social justice implications of the nature of information and communication systems, ranging from debates on ethics and human rights to the orientation of activism and social movements. While these earlier discussions provide foundational insights, data justice has predominantly emerged in the dual context of the growing focus on so-called big data (and the more recent iterations of machine learning and artificial intelligence), and the perceived limitations in how such developments have been framed and approached. In particular, the revelations from the Snowden leaks, first published in 2013, pushed the societal significance of 'big data' into a more mainstream and public view (Lyon, 2015) but often in terms of a simple binary between enhanced efficiency and (state) security on the one hand and concerns with surveillance and privacy on the other (Hintz, Dencik, & Wahl-Jorgensen, 2018). This provided notable impetus for engaging with the implications of emerging technologies, which included the mainstreaming of privacy-enhancing technologies and encryption as well as the significant prominence of digital rights and anti-surveillance campaigning in the public realm, but it also privileged particular responses that struggled to account for the implications of datafication in relation to broader social justice agendas (Dencik, Hintz, & Cable, 2016).
As Andrejevic (2015) has outlined, the nature of the surveillance programmes revealed in the Snowden leaks is intimately linked to a model of economics and state-corporate interests in detecting and predicting patterns, profiling and categorising populations rather than individual people. Data-centric information systems are instrumental as systems of control, not just by increasing the potential for monitoring, but as sorting mechanisms (Gandy, 1993). Data justice debates tend to understand how these sorting mechanisms work and what their relationship is to historical contexts, social structures and dominant agendas as not just a question of individual privacy, but one of justice. This focus is significant because although it is clear that how we make sense of the social world is central for how we also make claims about it, systems of communication and information infrastructures have tended to be neglected in prevalent theories of justice, often in favour of a focus on political institutions and moral ethics dating back to Aristotle through to Rawls (Bruhn Jensen, 2021). Whilst such a focus continues to be important for ideas of justice, the nature of institutions and the parameters for moral ethics are increasingly bound up with the nature of our information and communication systems. To speak of data justice is thus to recognise not only how data, its collection and use, increasingly impacts on society, but also that datafication is enabled by particular forms of political and economic organisation that advance a normative vision of how social issues should be understood and resolved. That is, data is both a matter in and of justice; datafication embodies not only processes and outcomes of (in)justice, but also its own justifications.
In this sense, data justice as a concept and focus speaks closely to the sorts of concerns that inform critical data studies and related fields in that it seeks to examine data issues in the context of existing power dynamics, ideology and social practices, rather than as technical developments in the interactions between information systems and users (boyd and Crawford, 2012; Van Dijck, 2014; Kitchin & Lauriault, 2014). The premise is that developments in data cannot be considered separately from social justice concerns and agendas, but need to be integrated as part of them (Dencik, Hintz, & Cable, 2016). However, what this means as an approach is varied, and we have seen a range of different perspectives engage with data justice, often across disciplines and traditions. Whilst these different approaches unite around a need to foreground justice in understandings of data, or to foreground data in understandings of justice, as we shall see they also elicit areas of tension in the meaning of data justice in important ways. As Fraser (2008) has argued, despite the many theories of justice that inform the architecture of institutions and laws to uphold justice, we rarely share a common 'grammar' of justice, such as the three 'nodes' of the what (ontology), the who (scope) and the how (procedure) of justice. This condition of 'abnormal justice', she argues, is apparent with disruptive developments such as globalisation that highlight conflicts over what we want to make claims to, when we make claims to justice, who those claims apply to, and the processes through which they may be realised.
Datafication is often touted as a form of disruption, but only rarely in the context of justice. Drawing on Fraser's notion of abnormal justice can be fruitful for elucidating this relationship (Cinnamon, 2017; Dencik, Jansen, & Metcalfe, 2018). For example, as Couldry (2019) has argued, datafication significantly shapes what comes to count as social knowledge and the very terms upon which we come to reason about values as choice is automated and regulated by what legal scholar Karen Yeung (2017) describes as the 'hypernudge'. At the same time, our understanding of data itself is not clearly defined and so when we want to make justice claims about it, it is unclear whether this is about its distribution as a good or resource, the inferences made from it and how people come to be recognised, or the nature of how it is generated and attributed meaning. Similarly, the nature of data flows has dislocated any clear relationship between the loci of decision-making and the subject of such decision-making as well as any bounded polity of who can make claims to data justice. As Andrejevic (2014) has argued, datafication brings about particular social stratifications between different data classes whilst the notion of any individual data subject struggles to account for how data about an individual is bound up with population-level effects (Viljoen, 2020). Finally, the criteria or procedure through which disputes about the 'what' and 'who' of data justice should be resolved continues to be a source of tension. At one level, Pasquale (2017) has argued that we are moving from territorial sovereignty to 'functional sovereignty' in which technology companies increasingly take on governance functions and disrupt procedures for how decision-making might be challenged or held to account. At the same time, it is unclear what institutions should be the arbiters of justice claims about data, whether traditional avenues such as governments or courts are still adequate, and what role there is for computational or design mechanisms to uphold justice claims.

Approaches to data justice
There are therefore notable tensions around the what, who and how of data justice that speak to a particular politics around how to engage with the broader implications of datafication for society. This is perhaps unsurprising considering the inherently trans-disciplinary nature of datafication, and the many stakeholders that shape its development. However, it also points to the way different interests and perspectives manifest in not only the analysis of societal implications but also responses to them.
In policy and data governance debates, for example, several on-going concerns about digital rights became elevated in the aftermath of the Snowden leaks and, with a renewed focus on big data, were translated into regulation. Most notable in Europe was the development of a new General Data Protection Regulation (GDPR), adopted in 2016 and applicable from 2018, on the premise that individuals should be able to claim some rights with regard to information collected about their person, and that collecting such information requires some form of consent. Although broad in its conception of data protection, questions remain about both its scope and enforceability. Perhaps in part as a response, much attention and many resources have been dedicated to advancing 'data ethics' (and its most recent iteration as 'AI ethics') as alternative and complementary frameworks. This field has engaged a range of different streams of thought and practice, some of which continue a long-standing tradition of computer ethics while changing the level of abstraction of ethical enquiries from an information-centric to a data-centric one (Floridi & Taddeo, 2016). That is, the focus shifts from a concern with how to treat information as an input and output of computing to a focus on how people access, analyse and manage data in particular, not necessarily engaging any specific technology, but what digital technology manipulates.
Data ethics foregrounds key challenges with datafication, including transparency, bias and accountability, but has also been criticised for containing such challenges within individualistic moral assessments or as procedural safeguards that do little to challenge existing power structures (D'Ignazio & Klein, 2019; Taylor & Dencik, 2020). However, traditionally there continues to be a close connection between ethics and justice. For example, in her engagement with data justice, Taylor (2017) puts forward a framework for determining ethical paths through a datafying world that can underpin data governance. This framework considers three central pillars, namely (in)visibility, (dis)engagement with technology, and antidiscrimination, that can form the basis of international data justice. These pillars collectively inform fairness in the way people are made visible, represented and treated as a result of their production of digital data. Importantly, they take into account the novelty and complexity of the ways in which data systems can discriminate, discipline and control. This builds on work on information justice put forward by Johnson (2016) in which he outlines how data systems have a disciplinary function because the way data is collected and structured constitutes a form of normative coercion. The task, therefore, is to make this politics of data technologies explicit and to consider both the right to be seen and represented as well as the right to withdraw from a database. In this sense, Taylor's framework for data justice accounts for both the positive and negative potential of new data technologies to facilitate human flourishing (Taylor, 2017).
More recently, we have seen some of the pillars outlined in Taylor's framework for data justice migrate into discussions on data governance that seek to broaden the scope for what such governance entails. A prominent focus has been on data stewardship, for example, such as the establishment of 'data trusts' that would provide a legal mechanism to 'empower' data subjects to 'take the reins' of personal data by introducing an independent intermediary between data subjects and data collectors (Delacroix & Lawrence, 2019). A related but different take on the control over data has been expressed in terms of 'data commons' that enable people to share their data for specific purposes or social benefit (Grossman et al., 2016; Morozov, 2018; Nesta, 2021). The premise is that data is a public good and that people should have some say in what data is collected, how it is used and who benefits. Viljoen (2020) has articulated some of these ideas within a framework she describes as 'democratic data governance' that shifts the lens away from a focus on the handling and processing of data towards the institutional reforms needed to facilitate democratic participation in determining the population-level effects of datafication.
These governance debates have also been significant for changing the perception of computer scientists and engineers and their role within society (Connolly, 2020). However, it is not always clear how, for example, the proliferation of guidelines for ethical and responsible AI and automation has actually translated into practice, and how data justice concerns might be addressed. In their review, Jobin et al. (2019) identify justice as a principle in the advancement of data-driven technology as being predominantly expressed in terms of fairness and the monitoring and mitigation of so-called algorithmic 'bias', which is often equated with discrimination (Balayn & Gürses, 2021). Predominantly, discrimination by algorithms is understood as the result of existing discrimination patterns present in the training data (using demographic categories such as gender, age, ethnicity, or disability), but more comprehensive engagements with this issue also consider biases introduced via assumptions in labels or biases brought about in particular contexts of use (Hallensleben et al., 2020). Less common is the reference to justice in terms of diversity and the possibility to understand and challenge algorithmic decisions, although some frameworks do address such principles with reference to human rights (Fjeld et al., 2020).
The translation of social justice into fairness understood in computational terms has paved the way for different principles to guide the development of data-driven technologies. In some respects, it advances on the longer standing tradition of 'privacy-by-design' in computer science towards a commitment to 'fairness-by-design'.
However, as Gürses et al. (2015) have pointed out, the abstract nature of privacy can lead to very different systems as a result of choosing one or several particular privacy design patterns and privacy enhancing technologies. With a notion such as fairness, there is even less of a shared set of criteria for what this might mean for computational systems, and what the guiding principles of fairness actually are (Friedler et al., 2021). Moreover, as the community of computer scientists and engineers dedicated to establishing such fairness criteria has grown, especially through a focus on 'de-biasing' and algorithmic discrimination, prominent questions have been asked about the limits of this interpretation of data justice and the legitimacy of technologists to define and be the arbiters of justice claims (Gangadharan & Niklas, 2019).
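The lack of shared criteria can be made concrete with a minimal sketch. The Python example below is our own illustration, not drawn from any of the works cited: the groups, records and function names are hypothetical, and the two measures shown (a gap in selection rates, often called demographic parity, and a gap in true positive rates, often called equal opportunity) are only two of the many formalisations of fairness that circulate in the computational literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    group: str      # hypothetical demographic group label
    actual: int     # 1 if the person in fact qualifies, else 0
    predicted: int  # 1 if the system selects the person, else 0


# Invented toy data for illustration only.
RECORDS = [
    Record("A", 1, 1), Record("A", 1, 1), Record("A", 0, 0), Record("A", 0, 0),
    Record("B", 1, 1), Record("B", 1, 1), Record("B", 1, 0), Record("B", 1, 0),
]


def selection_rate(records, group):
    """Share of a group's members that the system selects."""
    members = [r for r in records if r.group == group]
    return sum(r.predicted for r in members) / len(members)


def true_positive_rate(records, group):
    """Share of a group's qualifying members that the system selects.

    Assumes the group has at least one qualifying member.
    """
    qualified = [r for r in records if r.group == group and r.actual == 1]
    return sum(r.predicted for r in qualified) / len(qualified)


def demographic_parity_gap(records, a, b):
    """Absolute difference in selection rates between two groups."""
    return abs(selection_rate(records, a) - selection_rate(records, b))


def equal_opportunity_gap(records, a, b):
    """Absolute difference in true positive rates between two groups."""
    return abs(true_positive_rate(records, a) - true_positive_rate(records, b))


if __name__ == "__main__":
    print("demographic parity gap:", demographic_parity_gap(RECORDS, "A", "B"))
    print("equal opportunity gap:", equal_opportunity_gap(RECORDS, "A", "B"))
```

On this toy data both groups are selected at the same rate, so the first gap is zero, while qualifying members of group B are selected only half as often as those of group A, so the second gap is 0.5. The same set of decisions is therefore 'fair' under one criterion and 'unfair' under the other, and choosing between them is a normative decision rather than a technical one.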
Justice as a value is conditional on a range of principles that go beyond bias and that cannot be limited to technical components of a system. As outlined in the framework Algorithmic Ecology (Stop LAPD Spying Coalition and Free Radicals, 2020, n.p.), an 'algorithm is designed to operationalize the ideologies of the institutions of power to produce intended community impact'. As such, a value of justice applies not only to the many abstraction layers in which a system operates but also how justice is experienced. In this sense, the universal scope of a system often assumed in computational definitions of fairness in order to also accommodate population-level optimisation falls short in accounting for the way systems are often used to target specific groups. Furthermore, principles need to be incorporated into not just the system, but the design process itself and the role and relation of technologists towards other stakeholders (Costanza-Chock, 2020). Such understandings invite more holistic views of computer science and software engineering methodologies as decidedly socio-technical (Connolly, 2020; Selbst et al., 2019).
Calls have therefore been made to focus justice concerns in computer science less on the input and output data and more explicitly on the connection of the optimisation process with the real-world task (Hooker, 2021; Lipton, 2018). Whilst the optimisation task of a system can be more or less explicit, the issue of misalignment between optimisation tasks and performance metrics and real-world problems is gaining traction within the field. It points to the limitations of fairness claims without an understanding of the effect of data collection, designer world views, and embedded values (Friedler et al., 2021). As McQuillan (2019) has argued, the optimisation process tends to implement societal structures and logics and secure the 'institution in the loop' in any system. At a technical level, such structures and logics can be challenged by moving from process optimisation to community well-being (Musikanski et al., 2020) or by counter-optimising a system to protect impacted communities that might be harmed by institutional optimisation logics (Kulynych et al., 2020).
These resistance strategies play an important role in how we might think of data justice in terms of political and social mobilisation. They point to the importance of situating technological developments in social, economic, political and cultural context and of considering data issues in relation to historical struggles for justice around issues such as equality, oppression and domination (Dencik, Jansen, & Metcalfe, 2018). That is, data justice as a way to inform mobilisation needs to be directed at system-level critique in which the parameters of the debate do not begin and end with the technology itself but rather with how datafication features in on-going negotiations of social relations and power dynamics within society. On this reading, the asymmetries between different data classes point to the entrenchment of social stratifications and the growing concentration of power in private hands, whilst shifting decision-making away from the public realm. Issues of 'bias' or discrimination in data-driven tools are not bugs in the system, but rather a structural feature informed by the historical social sorting of populations based on stigmatisation, marginalisation and exclusion. And the operationalism of data systems speaks to a prevalent rationality that has long dominated many parts of the world in terms of privileging individualism, market logics and bureaucratic control (Gandy, 1993; Fourcade & Healy, 2017; Benjamin, 2019; Andrejevic, 2019).
Approached from the perspective of political and social mobilisation, data justice draws from critical traditions in media studies that have been oriented toward 'media justice', which have explicitly sought to situate media as a social justice issue. The aim is not necessarily to focus on media reform per se, but to bring together media scholars and activists and social justice scholars and activists as a way to identify synergies between the two fields and advance a better understanding of the role of media and communication in struggles for social justice (Jansen, 2011).
In particular, the media justice frame has sought to privilege the insights and experiences of historically marginalised communities and the long tradition of social justice activism around the world to inform media reform debates. As such, a key contribution of the media justice approach is to draw attention to whose voices are heard and what concerns are foregrounded in efforts towards media and social change. It highlights how the nature of media systems is intricately linked to social justice struggles, calling for different media representations and alternative ownership and governance structures in addressing injustices. Moreover, it calls for different movements and groups, across communication rights and socio-economic rights, to unite and find common ground.
Similarly, mobilisation under a data justice frame starts with a recognition that the burdens of datafication overwhelmingly fall on resource-poor and marginalised groups in society (Eubanks, 2018; Benjamin, 2019; Metcalfe & Dencik, 2020). This is important, as it cuts through the all-too-comfortable narrative that emerged out of the emphasis on mass data collection, particularly prominent in the aftermath of the Snowden leaks, that suggests we are all equally implicated in the datafied society. Instead, data justice debates have to contend with the way the development, advancement and impact of datafication are contingent upon deep historical social and economic inequalities, both domestically and globally. As a starting point, this shifts the focus of what voices need to be centred in any understanding of what is at stake with datafication and challenges the current constitution of the decision-making table as to how datafication can and should be negotiated. As an approach, it explicitly undermines the assertion that the technology industry should be able to dictate the scope of problems and solutions, let alone that a decision on what constitutes 'fairness' should be confined to what can be computationally determined. Perhaps more contentiously, it also asserts the need to move mobilisation on data beyond the domain of communication and digital rights groups.
Instead, Gangadharan and Niklas (2019) argue that there is a need to 'decentre' technology in data justice debates, and situate technology within systemic forms of oppression in which the harms that emerge from data-driven systems are articulated by those who are predominantly impacted and those who have a history of struggle against such oppression. That is, the concern with data needs to be part of an integrated social justice agenda, one in which definitions of problems and solutions may not actually be about data. As Hoffmann (2019) has argued, we cannot afford to continuously fail to address the logics that produce advantaged and disadvantaged subjects and the underlying structural conditions against which we come to understand data harms and injustice. In taking such an approach, we are invited to turn our attention to what function datafication, as a discourse and practice, serves in different contexts, the social and political organisation that enables it, and who benefits.

Relevance of data justice
Importantly, therefore, the intersection between data and justice encompasses more than just technological questions and instead forces us to ask how society should be organised and what role technology might play in it. We see this also in the way that data justice debates are being shaped by activism and campaigning. The Center for Media Justice in the United States, for example, has created a Data Justice Lab dedicated to thinking through ways to bridge research, data, and movement work relating to issues like surveillance, carceral tools, internet rights, and censorship. The Detroit Digital Justice Coalition has worked with local residents to identify harms that emerge through the collection of data by public institutions, situating these in the context of on-going criminalisation and surveillance of low-income communities, people of colour and other targeted groups. In some instances, these activities have foregrounded a politics of refusal (Gangadharan, 2019) that advances an abolitionist agenda as articulated by groups such as the Stop LAPD Spying Coalition and the Data for Black Lives initiative. Here, the focus is not to make technologies more efficient, but rather to recognise how technology has meaning and impact in relation to the inequalities manifest in capitalist exploitation and a history of state violence. The call is to divest resources from oppressive data systems and to 'abolish big data' that is used to measure and profile people, and instead reinvest in communities (Benjamin, 2019; Crooks, 2019).
In Europe, meanwhile, we have seen a growing mobilisation around social and economic rights in the context of datafication that has been particularly evident in the use of strategic litigation amongst non-governmental organisations against algorithmic systems and platforms. In the area of welfare, for example, coalitions between welfare and digital rights groups have successfully challenged the use of some algorithmic systems, such as SyRI in the Netherlands and an algorithm in the Department for Work and Pensions targeting disabled people in the UK (Toh, 2020; Savage, 2021). Similarly, in the context of the labour movement, there is growing engagement with the intersection of data and workers' rights that stretches beyond the issue of potential job losses in the face of automation and considers also the quality of work and the position of labour in relation to capital in datafied societies (De Stefano, 2018; Moore et al., 2017). This includes, for example, establishing workers' data rights as suggested by UNI Global Union, or the 'right to disconnect' that is the subject of significant union campaigns across Europe. Indeed, calls for 'data justice unionism' that would seek to explicitly connect digital rights with socio-economic rights and to build coalitions across social movements might provide an avenue through which the labour movement can play a role in connecting transformations relating to datafication in work to broader questions of society (Dencik, 2021).
In the context of environmentalism, the Environmental Data & Governance Initiative (EDGI) has preserved vulnerable scientific data in the aftermath of the US election of Trump in 2016, and in the process developed an 'environmental data justice' framework that considers the politics, generation, ownership and uses of environmental data (Vera et al., 2019). Similar concerns inform an increasing emphasis on 'sovereignty' in relation to data, particularly amongst indigenous communities, evident in the agenda set out by the growing Indigenous Data Sovereignty movement made up of a network of alliances and groups around the world that asserts that indigenous peoples need to be decision-makers around how data about them is collected and used. This orientation builds on long-standing struggles over the on-going extraction and exploitation of indigenous peoples and their knowledge systems, customs and territories (Kukutai & Taylor, 2016).
These different actions and struggles unite around a need to tackle the actual conditions that lead to experiences of injustice as they exist on the ground rather than necessarily pouring efforts into appealing to ideal formations of data and technology in contemporary society. Moreover, mobilisation in this sense is nurtured through solidarity, the aim of which is not simply the creation of just institutions that enact justice 'from above' but the manifestation of justice within and through social relations as they currently exist (Cohen, 2008). Holding on to the possibility of solidarity in determining how society should be organised and the role of technology within it has never been more relevant (Fenton et al., 2020). As Gandy (2020) has argued, such political mobilisation is precisely what is needed but also what is directly under threat with the advancement of datafication. As behaviours and activities are abstracted and reduced for the purposes of optimisation, people's shared experiences, and with that their political capability, are undermined as algorithmically-defined groups come to dictate the basis of social positioning. A call for data justice is therefore also a call for the continued relevance of social relations through which people can identify with each other and through which mobilisation for struggles can be formed.

Conclusion
The concept of data justice borrows from many long-standing traditions, but it is also relatively nascent in its advancement and use. Although it has emerged out of pressing issues that arise from contemporary developments in digital technologies, it has found expressions in many diverse areas and fields. These expressions are not always aligned and speak to different interpretations of the ontology of data justice, who it applies to, and how it should be upheld. That is, they are expressions of the struggle over not only ideal formations of justice but the very grammar of justice that datafication disrupts. This is important as it alerts us to a politics of data justice that is currently played out across disciplines and practices. In this sense, we might say that the meaning of data justice is still up for grabs, and as with justice in general will continue to be interpreted and shaped by different interests and perspectives. However, in its current formation it holds significance for shifting our understanding of what is at stake with datafication and what might be possible responses. In particular, it alerts us to the need to consider issues of data not as siloed and abstracted technical issues, but as an embedded part of how we might think of social justice. As datafication continues to advance in different iterations, and under different modes of crisis, this need has never been more relevant.