The present article argues that the fact that personal data holds great value, in combination with a lack of transparency in its commercial use, leads to a need for consumer policy that strengthens consumer protection. The widespread practice of user agreements and consent-based regulation of personal data collection is not satisfactory for balancing these information-asymmetric markets. The lack of transparency deriving from the complex and massive datafication of consumers – where consumers are profiled, data is brokered and the algorithmically automated decision-making is opaque – speaks to the need for improved supervision at a more structural level above and beyond the individual consumer’s choices, preferably by more active consumer protection authorities.
This paper is part of
The Cambridge Analytica case, where third-party app developers gained access to a large amount of Facebook users’ data and used it for political campaigning, has not only spurred much debate on the need for algorithmic governance of platforms in our current state of networked publics, but also stresses the need for consumer empowerment in data-driven markets overall. The case highlights the lack of transparency that the ecology of actors collecting, handling and sharing personal data for various purposes ultimately entails for consumers, and thereby also the difficulties in assessing the true value of the data collected. However, this lack of transparency, and the fact that consumers are poorly informed about how their data is handled, is not an anomaly confined to specific cases, but rather the norm in a data-driven economy. The main argument here therefore addresses the need for consumer empowerment in terms of transparency and ill-functioning notions of consent, in general, and the methodological capabilities of consumer protection agencies, in particular.
In other words, from a consumer protection perspective, the data-driven economy poses great challenges in terms of the application of consumer regulations to information-asymmetric relations – where one party has more or better information than the other – and the use of personalised services that include transactions of personal data (Larsson, 2017a; 2017c; Rhoen, 2016). Part of this regulatory challenge is arguably of a conceptual nature; that is, the practice of and supervision by consumer protection authorities is likely dependent on how the transactions, relations and conditions of the market are understood (cf. Larsson, 2017b). Given that the role, use and transactions of personal data are both opaque and part of an increasingly complex setting in what law professor Frank Pasquale has described as an “era of runaway data” (Pasquale, 2015), a clarified description and understanding of the transactional character of personal data in the digital economy is called for. The main reason, here, is to be able to point out weaknesses in consumer protection, specifically with regard to imbalances or asymmetries in both information and power between consumers and digital service providers, the question of how to deal with calls for transparency, and the lack of consumer awareness when taking part in consent-based data collection.
There is little doubt that personal data do indeed hold significant value in the digital economy, and can therefore be understood as a sort of currency for services that are free at the consumer level (cf. Schwartz, 2004; Spiekermann & Korunovska; Larsson & Ledendal, 2017). The notion of personal data
For example, at the end of January 2012, the European Commission presented the proposal for the comprehensive reform of EU data protection provisions, which resulted in, among other things, the General Data Protection Regulation (GDPR), which applies from May 2018. In connection with this, EU Commissioner Viviane Reding described personal data as “the currency of today’s digital market”.
This approach – that personal data has a central value in the digital economy and in practice can function as a currency – has become a fairly common view of many analysts of the computer-driven economy. For example, this notion is developed in a report from 2012 –
A problematic imbalance pointed out by Sarah Spiekermann and Jana Korunovska (2016), who research social and ethical problems of computer systems, is the big difference between how individuals perceive the value of their personal information, on the one hand, and how industrial players utilise personal data as a central source of value, on the other:
The value of personal data is further
The article develops the argument of consumer empowerment and algorithmic governance in data-driven markets, and asks what this means for consumer protection policies and for the role of consumer protection authorities in terms of their supervision:
The article describes the current state of knowledge on digital consumer profiling from a forward-looking, if critical and consumer-based, perspective. This is then related to findings on consumer attitudes and sentiment: to what extent do consumers find the collection of personal data problematic or worrying? This means pointing out some of the more important operators and more significant emerging markets in order to then analyse, from a policy-relevant perspective, the most important aspects of consumer protection – a key issue being the wide use of user agreements to regulate what has become a strong information asymmetry in some of the data-driven markets.
One of the reasons for collecting large amounts of consumer data is to improve consumer profiling, that is, the practice of obtaining an understanding of consumers to form an underlying data basis for strategic decisions and, for example, marketing or product design. It is part of a development that can be described as industries’ attempt to create a “seamless, personalised digital customer journey” (cf. Edelman & Singer, 2015). This means combining information linked to an individual using methods that match specific consumer behaviour, demographic or psychographic characteristics (Harrison & Ti Gray, 2012; Hildebrandt, 2008). Profiling has become important not least in the marketing industry where this “new” kind of advertising can be described as “consumer-centric”, meaning it focuses on individuals (cf. Brown et al., 2016). In order to accomplish this, it is data-driven, i.e., effectuated by monitoring consumers’ actual internet-mediated behaviour – possibly in real time – in combination with collected data of previous behaviour, with the purpose of predicting future behaviour.
Profiles are used to categorise customers or customer segments in order to separate, for instance, the most profitable from the least; this information then forms the strategic basis for marketing and other decisions. Consumers are therefore routinely studied, registered, analysed and ranked, and may be offered both different prices and, to some extent, different services, depending on the individually associated information (“the data”) and their place of residence (Kitchin & Lauriault, 2014).
The field involved in collecting individual consumer information and profiling has also been described, using somewhat more negative connotations, as a growing “surveillance economy” (cf. Singh & Lyon, 2013; Teknologirådet & Datatilsynet, 2016) that also may lead to a misuse of consumer data. In an Australian and American context, Harrison and Ti Gray (2012) demonstrate how credit companies and banks use individual consumer profiling not only to identify the needs of individuals but also their weaknesses. This means among other things that they can specifically focus on consumers who will not be able to manage their credit payments during the interest-free period. This type of credit card user is also more profitable than users who do not incur credit card related interest costs. This entails, in other words, the identification of profitable customers that other operators might rate as being economically vulnerable (Stone, 2008). Others have shown a link between the increase in consumer credit and financial institutions’ access to consumer information (Sanchez, 2009), which emphasises the need for further research on digital consumption, credit and risks of over-indebtedness (cf. Larsson et al., 2016).
Studies conducted in an American market context show that consumers may be resigned about being able to influence traders’ use of their personal information rather than satisfied with the discounts they receive in exchange (Turow et al., 2015). A number of studies show that users are concerned about not having control over their internet-generated data, as well as about the fact that their information could be used in situations quite different from those in which it was originally collected or shared (Lilley et al., 2012; Pew, 2014; cf. Halbert & Larsson, 2015). According to a Swedish study, 60% of the Swedish population is opposed to news companies collecting data to enhance the user experience (Appelgren & Leckner, 2016). Other studies conclude that consumers are concerned that third parties such as advertisers or other commercial operators may be able to access their personal information (for example, Kshetri, 2014; Narayanaswamy & McGrath, 2014; Pew Research Center, 2014). Overall, this indicates that consumer data is a key issue in many of the current market changes, and that this area and these relationships are complex and need further study.
The main model utilised by data-driven services for the regulation of how to collect and handle consumers’ personal data is through user agreements based on the notion of informed consent. Formally, the users agree to the collection of their data. Critics, however, argue that this kind of “privacy self-management” does not provide meaningful control and that there is a need to move beyond relying too heavily on it (Solove, 2013). At least three main critical aspects can be put forward here.
Firstly, part of the challenge – as this model has become so common for our everyday digital practices – lies in what can be described as an
Secondly, part of the challenge likely lies in the fact that there are incentives for data-collecting companies to be unclear about how much data is collected and how it is used. For example, Cranor et al. (2014) have studied 75 privacy policies from companies that store data on behaviour in digital contexts. They conclude that many of the policies lack important consumer-relevant information on data management, including information on the collection and use of sensitive information and of tracking data that can be used to identify individuals. Similarly, a study on privacy agreement texts and cookie consent information collected from 60 news sites in three countries (US, UK, and Sweden) shows that news sites “paternalistically” infer a wider consent from users than can reasonably be expected, by relying on “passive” consent. The reasons for collecting data can, according to Appelgren, therefore be said to be paternalistic in both a positive sense (i.e., beneficial to users) and a negative sense, as choices may be imposed on users who have not actively agreed, potentially resulting in an undesired outcome.
Thirdly, part of the challenge likely also lies in the fact that emerging personal data-driven markets are complex, automated and swift – and thereby non-transparent in practice. For example, the Norwegian data protection authority, Datatilsynet, conducted a study in 2015 on the amount of data collected when visiting the front page of six Norwegian newspapers (Datatilsynet, 2015). On average, the study found, between 100 and 200 web cookies were placed on any computer used to visit these homepages, information about the visitor’s IP address was sent to 356 servers, and an average of 46 third parties were “present” during each visit. One of the reasons for the presence of so many parties was the ad exchange taking place behind the web page in so-called programmatic advertising (cf. Busch, 2016), which increasingly involves real-time bidding for advertising space that depends on profiling and targeting the individual visitor. However, none of the six newspapers provided their audience with any information relating to the presence of this large selection of third-party companies (Datatilsynet, 2015; Larsson, 2017c).
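To make the kind of measurement reported by Datatilsynet more concrete, the following is a minimal sketch of how “third parties present” could be counted from a log of the network requests made when a page loads. It is not the method used in the Datatilsynet study; the function names, the example URLs and the two-label approximation of a site’s own domain are all assumptions made for illustration, and a real audit would rather rely on the Public Suffix List and on requests recorded by an actual browser.

```python
from urllib.parse import urlsplit


def registrable_domain(host: str) -> str:
    """Crude approximation (assumption): keep the last two labels of a hostname.
    A real audit would consult the Public Suffix List instead."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def third_party_domains(page_url: str, request_urls: list[str]) -> set[str]:
    """Return the domains contacted during a page load that differ from the page's own."""
    first_party = registrable_domain(urlsplit(page_url).hostname or "")
    domains = set()
    for url in request_urls:
        host = urlsplit(url).hostname
        if host is None:
            continue  # skip entries without a hostname, e.g. data: URLs
        domain = registrable_domain(host)
        if domain != first_party:
            domains.add(domain)
    return domains


# Hypothetical request log for a single visit to a news front page.
requests_log = [
    "https://www.example-news.no/index.html",
    "https://static.example-news.no/app.js",
    "https://ads.example-exchange.com/bid?id=1",
    "https://cdn.example-exchange.com/pixel.gif",
    "https://tracker.example-analytics.net/collect",
]

found = third_party_domains("https://www.example-news.no/", requests_log)
print(len(found), sorted(found))
```

In this hypothetical log, two distinct third-party domains are contacted even though the visitor only requested one front page, which is the kind of presence that remains invisible to the reader unless it is disclosed.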
Each of these three examples points to the flawed notion of the individual consumer being able to make informed choices, in a meaningful way, with regard to the multitude of user agreements in play for an average digital consumer.
Media scholar and digital sociologist Anja Bechmann subsequently posits that “the consent culture of the internet has turned into a blind non-informed consent culture” (Bechmann, 2014, p. 21; cf. Joergensen, 2014). The fact remains that user agreements play a central role in regulating the handling of personal customer data between commercial parties and individuals, and that this striving for awareness is further emphasised by the GDPR. This leads to questions of how active consumer protection authorities preferably should be in empowering the “non-informed” but
A challenge from a consumer protection perspective concerns the increasing complexity of data-driven markets, fuelled by both a lack of transparency – often behind proprietary software – and the fact that the data is traded and brokered. Media scholar Mark Andrejevic has commented on “the spreading of prediction markets” (2013, pp. 68–70) in
The complexity of how data travels thereby leads to a fundamental challenge for consumer and data protection. As “prediction markets” spread, more types of industries will develop a more refined, personalised relationship to consumers, which can be both to the consumers’ benefit and to their detriment. Big data sets that can be complemented in real time to analyse the specific consumer’s conditions are increasingly being used for anything from purchase predictions by retail stores, to credit scoring by lenders, to death predictions by insurers (Siegel, 2016). Data brokers provide for profiling – as in the Acxiom example above – in partnerships with all kinds of companies, ranging from Facebook, Google and Twitter to banks, insurance and airline companies (Christl, 2017). One specific problem relates to data being erroneous, which does happen. Legal scholars Mikella Hurley and Julius Adebayo (2017) have argued, in relation to credit scoring based on large amounts of collected and analysed data:
So, the complexity of the market, the “ecosystem” of “runaway” data, in essence describes what Nancy King and Jay Forder point out in a study on data analytics and consumer profiling, i.e., that many of the companies dealing with consumers’ personal data gain access through secondary sources and use the information for purposes not known at the time of original collection (King & Forder, 2016). This further stresses the lack of possibilities for consumers to be informed about the uses of their data. Consequently, as consumer services – including the credit scoring addressed by Hurley and Adebayo (2017) – become algorithmically mediated and automated, there is little chance for the individual consumer to assess whether the outcome is reasonable, to contest it if it is based on erroneous data, or even to clearly outline the inherent assumptions of the designed decision-making at hand. The black box of algorithmic decisions (cf. Pasquale, 2015), utilising secondary sources of data in consumer markets, is a clear challenge to consumer protection and the authorities representing it. How are they to detect whether individual targeting – be it for ads or services – is based on illegal discriminatory grounds or exploits particularly vulnerable groups?
Rhoen (2016), mentioned above, presents a socio-legally based analysis of how legal instruments can become more effective at improving consumer protection and the collection and use of consumer data (cf. Helveston, 2016). Rhoen (2016, pp. 6–8) argues, in a review of consumer protection and data protection legislation at the EU level, that a broader application of consumer protection regulation to user agreements may increase accountability for operators who collect and manage personal data, and by extension lead to increased codetermination for consumers. These consequences would, in that case, reduce the institutionalised power of the data-managing parties in favour of the consumer. At the same time, however, Rhoen (2016, p. 8) points out that this can only be achieved if consumer protection legislation is applied pragmatically, which is partly the responsibility of the supervisory authorities concerned.
The European Data Protection Supervisor, EDPS, also points out the need for supervisory authorities – such as data protection and consumer protection authorities – to gain better insights into how data collection and covert profiling occurs (EDPS, 2015, p. 10), i.e., to study “the black box” (Pasquale, 2015). EDPS emphasises the lack of transparency involved and the challenges this entails also for governmental supervision; it is difficult to distinguish between advantages and intrusions when the data collection process and uses thereof are not visible (cf. King & Forder, 2016).
As shown, when it comes to the widespread practice of user agreements as a means to regulate the collection, use and trade of personal data, the model seems flawed, particularly with regard to the notion of consumers making informed decisions. A wide array of studies show consumers’ concerns when it comes to the collection of their data, as well as their resignation or powerlessness to counter or take control over it. This relates to a widespread datafication (Larsson, 2017c) and quantification (Larsson, 2017d) of consumers, leading to a lack of transparency in data-driven markets, clouded by proprietary software and complex automated decision-making as the data travels, mediated by data brokers and others. This speaks to the need for an implementation of consumer policy that helps consumers recognise the perils of the new information landscape without being overwhelmed with information. Furthermore, and this is perhaps more important to point out, it speaks to the need to regulate consumer rights at a level that is not as strongly dependent on the consumers’ individual awareness. Pasquale, for example, also bears witness to this in relation to data brokers, stating that it is “unrealistic to expect individuals to inquire, broker by broker, about their files. Instead, we need to require brokers to make targeted disclosures to consumers. Uncovering problems in Big Data (or decision models based on that data) should not be a burden we expect individuals to solve on their own” (Pasquale, 2017).
Thus, given the overlapping character of personal data in the digital economy, there are a number of reasons why the data protection authorities and consumer-oriented authorities need to interact on a continuous basis. Not least among these is the fact that personal data holds much of the value in a data-driven economy, combined with the fact that it is inherently hard for consumers to assess the bargain between data sharing and service access. This speaks for more structural solutions rather than depending on the consumers’ abilities to make informed choices about their personal data.
A recommendation for consumer protection authorities is therefore to develop synergies with, in particular, data protection authorities, to provide expertise on consumer protection. Transparency would likely have to include audits or control of how data-driven and targeting software operates, in order for consumer protection authorities to develop the ability to assess – in-house or perhaps through outsourced expertise – what the combination of algorithms and use of big data sources is leading to, and to discover the use of erroneous data (cf. King & Forder, 2016). This would be a way to propose a “qualified transparency” (Pasquale, 2015, pp. 160–165) that may work in line with the need to “equalize the surveillance that is now being aimed disproportionally at the vulnerable” (Pasquale, 2015, p. 57). This could be a way forward to keep the proprietary software and the specific design of algorithms as the business secrets they may need to be, while at the same time providing a necessary protective mechanism against the cases most detrimental to consumers.
In the context of fintech firms, Pasquale (2017) testified before the United States Senate on the need for regulators to be able to audit machine learning processes to understand, at a minimum, whether suspect sources of data are influencing the decisions affecting consumers, such as credit scores. This would likely require data-driven and digital methods developed by the entities implementing the consumer protection supervision. In order to study the outcomes of automated services based on pattern recognition and to address accountability for these outcomes, a combination of legal and computer science expertise would be required. Or, put in a more general manner, in the European context, the methods operating in consumer markets have always called for scrutiny in order to secure the rights of weaker consumer parties. This was the case with traditional marketing and traditional credit scoring, and needs to be the case also for increasingly complex data-driven practices utilising increasingly sophisticated – and opaque – tools for the quantification of consumer preferences and automated responses to consumer interaction.
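As an illustration of what such an audit could look like at its very simplest, the sketch below probes a decision procedure as a black box: it alters only one suspect field per applicant and measures how often the decision changes. This is an assumed, simplified technique and not a method described by Pasquale or used by any authority; the function `decision_flip_rate`, the toy credit model and the field name `postcode_risk` are hypothetical and exist only for the example.

```python
from typing import Callable

Applicant = dict[str, float]


def decision_flip_rate(
    decide: Callable[[Applicant], bool],
    applicants: list[Applicant],
    suspect_field: str,
    alternative_values: list[float],
) -> float:
    """Share of applicants whose decision changes when only the suspect field is altered."""
    if not applicants:
        return 0.0
    flipped = 0
    for applicant in applicants:
        baseline = decide(applicant)
        for value in alternative_values:
            # Copy the applicant and change nothing but the suspect field.
            probe = {**applicant, suspect_field: value}
            if decide(probe) != baseline:
                flipped += 1
                break  # count each applicant at most once
    return flipped / len(applicants)


def toy_credit_decision(a: Applicant) -> bool:
    """Hypothetical stand-in for an opaque scoring model (assumption)."""
    score = 0.5 * a["income"] - 20.0 * a["postcode_risk"]
    return score >= 10.0


applicants = [
    {"income": 30.0, "postcode_risk": 0.0},
    {"income": 30.0, "postcode_risk": 1.0},
    {"income": 100.0, "postcode_risk": 1.0},
    {"income": 10.0, "postcode_risk": 0.0},
]

rate_postcode = decision_flip_rate(
    toy_credit_decision, applicants, "postcode_risk", [0.0, 1.0]
)
print(rate_postcode)
```

A non-zero rate only indicates that the field influences outcomes for some applicants; whether that influence is unlawful or unfair remains a legal assessment, which is why the combination of legal and technical expertise matters.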
This article has focused on the collection and use of large sets of data in relation to consumers and their protection. It is therefore based on the assumption that consumer-focused activities in data-driven markets contain just that – data – which in theory can be scrutinised with regard to its origin, its analysis and its application, the last of which often means an algorithmically mediated automation. This is a field where contemporary consumer protection authorities need to have satisfactory supervisory methods.
In addition, as more and more consumer-related activities in the digital economy come to rely on artificial intelligence (AI) and machine learning, supervisory methodologies will increasingly face challenges relating to the lack of transparency and the autonomous agency of consumer-oriented products and services. They may even encounter computation involved in decision-making that amounts to a form of cognition which is hard to explain and understand even for those who design the processes. As a response, perhaps future consumer protection authorities will find ways to utilise not only machine learning but also increasingly intelligent artificial agents to find and counteract market behaviour that is inappropriate from a consumer protection point of view.