Digital Technology and the Resurrection of Trust Contents

Chapter 4: Transparency

149.In the previous Chapter we made proposals for how technology platforms should be held to account. This requires greater transparency. Transparency helps effective regulation by enabling regulators to understand what they are trying to regulate and empowering civil society to spot deficiencies. At the same time, transparency cannot be considered in isolation: it must be accompanied by measures to secure redress and accountability.

150.Transparency lies at the heart of trust in a democratic society. For the public to believe that individuals and organisations with power are not abusing their position, they must be able to understand how that power is being used. For this same reason, openness and honesty are two of the tenets in the Nolan principles of public life.249 Technology platforms wield enormous power over our public conversation, and transparency should be a condition for permitting that to continue.

151.We are under no illusions about transparency being a panacea, and we acknowledge its efficacy depends on accountability mechanisms outlined in the previous Chapter. Baroness O’Neill of Bengarve warned us that transparency was not always effective and can result in data confusing those individuals who cannot understand it.250 This frames our general approach to this issue. Alex Krasodomski-Jones from Demos told us that it was important to establish who transparency is for. As well as transparency for the public, maximum transparency must be given to researchers in academia, civil society and news media, as well as to Government.251

152.Transparency should look different for different audiences. The act of publishing information, by itself, will not solve all problems for all audiences. Platforms should tailor the information they release to fit the needs of the audience it is intended for. We believe that transparency for the public should be based on empowering them to act by providing them with the information they need at the time it is most useful to them. We address the public’s needs in Chapter 7 on active digital citizens.

153.In this Chapter we focus on the transparency of the processes undertaken by technology platforms. It is not realistic to expect individual members of the public to read detailed disclosure documents or conduct data analysis. The primary audience for this will be independent researchers, civil society, regulators and Government. However, these recommendations will empower civil society and regulators to interrogate platforms’ activities and provide independent explanations to the public. This is vital to ensure public trust in digital platforms.

154.Tony Close, then Director of Content Standards, Licensing and Enforcement at Ofcom, told us that part of creating effective codes of practice lies in having a comprehensive evidence base.252 Too often, a high-quality evidence base about online platforms does not exist. As Professor Rasmus Kleis Nielsen, Director of the Reuters Institute, told us, this area is filled with nuggets of information that have developed a life of their own and are treated as independent facts, separate from the anecdotal or poor-quality research from which they originate.253

155.A more informed public debate is in the long-term interests of all parties as it encourages sensible, well-evidenced regulation and helps enrich democracy. By failing to co-operate with researchers to develop a better understanding of how they work, platforms foster mistrust and risk incentivising low-quality regulation focused on the wrong issues. In this Chapter we look at some of the areas where evidence is lacking and outline practical ways to improve platform transparency.

Do platforms cause polarisation and degrade democratic discourse?

156.One of the key questions our inquiry has looked at is the extent to which platforms have structural features which contribute to political polarisation and damage political discourse by promoting discord and disharmony. Whilst we have heard indications that this might be the case, the evidence base at present is far from strong enough to support effective regulation in this area. Below we explore possible mechanisms and establish the need for more research in each area.

Targeted advertising

157.One of the most commonly suggested mechanisms through which platforms could dramatically alter democratic discourse is in facilitating the micro-targeting of political advertising. Dr Martin Moore of King’s College London told us Facebook allows political campaigners to use an extraordinary amount of personal data to target political messages at small groups of people. He stated that it also gives campaigners the opportunity to engage in A/B testing of adverts, where two different versions are tested to see which performs better, which can be achieved at a remarkable scale with campaigns putting out 50,000 to 100,000 adverts per day.254
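The A/B testing described above is, at its core, a statistical comparison of two advert variants. As an illustration only (the figures and the function below are our invention, not drawn from evidence we received), a two-proportion z-test is one standard way to judge whether one variant genuinely outperforms another:

```python
import math

def two_proportion_z(clicks_a, shown_a, clicks_b, shown_b):
    """Compare the click-through rates of two advert variants (an A/B test)."""
    p_a = clicks_a / shown_a
    p_b = clicks_b / shown_b
    # Pooled rate under the null hypothesis that both variants perform equally
    p = (clicks_a + clicks_b) / (shown_a + shown_b)
    se = math.sqrt(p * (1 - p) * (1 / shown_a + 1 / shown_b))
    return (p_a - p_b) / se

# Hypothetical campaign data: variant B is clicked more often than variant A.
# A z-score beyond roughly +/-1.96 suggests the difference is unlikely to be
# chance at the conventional 5% level.
z = two_proportion_z(clicks_a=120, shown_a=10_000, clicks_b=180, shown_b=10_000)
```

At campaign scale this comparison is simply repeated across thousands of advert pairs, which is what makes testing 50,000 to 100,000 adverts per day feasible.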

158.It is not clear whether advertising on such a scale is persuasive and changes the course of democratic events. Dr Moore highlighted the fact that social media can elicit a strong behavioural response, influencing people to action rather than necessarily having strong persuasive results.255 Paul Bainsfair, Director General at the Institute of Practitioners in Advertising, told us that from a commercial perspective this behavioural activation was the advantage of online advertising. He suggested that online advertising was effective to book a holiday if you had already searched online for one but that other forms of advertising like TV were better at creating a long-lasting impression about a brand.256 Dr Luke Temple and Dr Ana Lager told us that the evidence is not clear on the effect of online political advertising as to whether it has any persuasive effect or if it simply reminds people to vote for the candidate they already preferred.257 Eric Salama, then Chief Executive at Kantar, argued that the strength of online targeting was in reinforcing perceptions.258 This could mean that weaker opinions that are loosely held are transformed into more strongly held beliefs. He made it clear that the types of data available to political parties were similar to those available to commercial advertisers.259

159.Eric Salama also stressed the difficulty of evaluating the efficacy of political advertising. He noted that whilst commercial advertisers can look at sales, ultimately political adverts aim to influence votes, which happen infrequently, and there is no way to verify how someone voted.260

160.Paddy McGuinness, former Deputy National Security Adviser, told us that there was a conspiracy of silence around the lack of effect that it was possible to have through networks like Facebook. He suggested that this was because it was not in the advertising industry’s interest to admit how little effect online advertising has.261

161.Ben Scott, Director of Policy and Advocacy at Luminate, presented a different picture. He noted that it was a long-term mystery of communication studies that research could not find a specific effect of advertising, but they did know that if brands stopped advertising their market share declined.262 He suggested that the effects of advertising should be thought of as systemic rather than having a simple causal relationship. Mr Scott also stated that AI researchers believed that targeted messaging combined with AI could be used to persuade people and alter their behaviours.

162.In all this evidence it is hard to identify the effect that online microtargeting is thought to have, and in what way it could be harmful. Microtargeting happens in political campaigns offline as well as online. Parties use large datasets of personal information to target leaflets in a similar manner to which they target online adverts.263 Consultancies like Cambridge Analytica made bold claims suggesting a quasi-magical power to control voters, but they have produced no robust evidence of their impact.264 Despite the lack of research showing the extraordinary effects claimed by these practitioners there are legitimate concerns about their malign impact on democracy. Whistle-blowers and independent journalists have highlighted the unethical nature of this activity. The work of Cambridge Analytica and others has increased the perception of technology as a tool to subvert democracy. In this way micro-targeted advertising has undermined trust in our democratic system.

163.The Institute of Practitioners in Advertising felt there could be regulation to suggest minimum audience thresholds for targeting in order to protect open, collective debate.265 In a similar vein, during the course of our inquiry, Google changed their policy to ban advertising from being targeted at any more granular level than age, gender and postcode.266 At this stage, there is insufficient evidence as to whether this sort of change would have a meaningful effect or whether AI and A/B testing could have the same suggested persuasive effect at the proposed adjusted scale. Nor is the evidence currently sufficient to justify a different approach for online microtargeted advertising to that of offline microtargeted leaflets.

164.Keith Weed, President of the Advertising Association, argued that from a commercial perspective microtargeting was beneficial to the user as they were not forced to see irrelevant advertising. The same argument could be made about political advertising. Facebook also made the point that, whilst less targeted advertising is cheaper per person reached, microtargeted advertising allows smaller campaigns to get their message out to specific audiences where they would struggle to afford other types of advertising.267

165.Immediate safeguards are needed to prevent political campaigns from misusing this technology. We look at these in detail in Chapter 6 on free and fair elections. Beyond these immediate safeguards, before we can craft effective wider regulation, greater co-operation from platforms is needed in order for research to establish the extent to which looser targeting criteria change the persuasive effects of advertising, or whether there are other elements, specific to online targeted advertising, that are a cause for particular concern. This will facilitate future regulation in this area.

Foreign interference

166.A more specific worry that we heard was that platforms offer a new frontier for hostile governments to interfere with our democratic process. The Government told us that there has not been any successful interference in UK elections.268 However, whether this statement is true or not depends on how success is defined. Presumably the Government's criterion for success is changing the outcome of an election. Yet this is not the only criterion to consider. Foreign actors could affect the margin of an election result, or merely be perceived as having done so, and thereby undermine trust in democracy. Elisabeth Braw, Director of the Modern Deterrence Project at the Royal United Services Institute (RUSI), used the example of the US 2016 presidential elections to show the problems with the Government’s statement. She told us that whilst it was hotly debated to what extent Russia influenced that election, the important fact was that a large proportion of the American population thought that Russia had.269 Lisa-Maria Neudert, Commission Secretary at the Oxford Technology and Elections Commission, stated that interference was not aimed at spreading a specific message but was instead focused on sowing mistrust in the political system in general.270 We heard from Siim Kumpas, Adviser to the Government Office of Estonia, that no politician in Estonia doubts the importance of tackling foreign interference from Russia. However, it is unclear what level of threat it presents in the UK.

167.Ben Scott of Luminate argued that while it was difficult to determine the effect of foreign interference, it was having an effect. He explained that misinformation already exists and circulates online, and foreign states such as Russia only have to use their online networks to nudge the conversation in one direction or another.271 Mr Scott told us that we currently lack the information we would need to know to what extent this action is decisive, or to what extent it represents a small part of an already large misinformation system. Paddy McGuinness disagreed about the level of threat foreign interference presents but agreed on the same basic fact that foreign states were attempting to influence UK democracy using social media.272 He argued that the criteria for success for states like Russia were very low as they just have to add a little instability at a time when a country is vulnerable. Mr McGuinness stated that we do the job of hostile governments for them by talking up the threat they pose.

168.There is a need for greater research into the scale of misinformation put out by foreign governments. Sir Julian King, former EU Security Commissioner, told us that there was a need for independent research scrutiny of alleged pieces of disinformation in order to map patterns of activity and better prepare for dealing with such activity in the future.273

169.For now, addressing this interference should be done through ensuring platforms can better tackle misinformation as we examine in Chapter 2, and by better preparing our citizens to understand the information we see online as discussed in Chapter 7.

Filter bubbles

170.Another way it has been suggested that technology platforms undermine democratic discourse is through the creation of filter bubbles. The Government told us, and stated in its Online Harms White Paper, that social media platforms use algorithms which can lead to filter bubbles where a user is presented with only one type of content instead of seeing a range of voices or opinions.274 Given the Government’s prominent endorsement of this theory in this major policy programme, it could be thought that there was strong evidence to suggest that this is a widespread phenomenon and represents a particular problem on online platforms. However, this does not appear to be the case.

171.Professor Helen Margetts, Professor of Society and the Internet at the University of Oxford and Director of the Public Policy Programme at The Alan Turing Institute, told us that human beings naturally prefer echo chambers.275 Echo chambers can be a naturally occurring phenomenon where people speak to others who have similar opinions, as opposed to filter bubbles, which are driven by platforms’ algorithms. She said that the most perfect echo chamber would be to rely on a single news source like CNN, Fox News or the Daily Express. Professor Margetts told us that the research on the subject suggested the idea of online echo chambers has been exaggerated, with most social media users seeing a wider variety of news sources than non-users. Professor Cristian Vaccari from Loughborough University stated that his research from across the globe found that social media users were more likely to encounter views they disagreed with online and that echo chambers were more likely to exist in face-to-face conversation, where people are likely to talk with those with whom they already agree.276 Previous research that suggested the existence of echo chambers has been criticised for not acknowledging the breadth of different media individuals use, and the amount of media choice available online.277

172.Dr Martin Moore told us that whilst there has never been a single public sphere, previously it was more constrained, and online platforms, particularly smaller platforms, allowed for a more atomised public sphere.278 This leads to a situation where some elements become more and more extreme. Professor Helen Margetts stated that what evidence there was of filter bubbles developing was in groups of older people and parts of the American right.279 The Oxford Internet Institute suggested that only a small portion of the population online were likely to find themselves in political echo chambers.280

173.The Online Harms White Paper suggests there might be a need to ensure that social media platforms increase the range of views that individuals encounter to counter the rise of echo chambers.281 Given the evidence suggests that social media users already encounter a more diverse variety of news than non-users, this does not appear to be the right approach. What appears to be significant is the design decisions taken by social media platforms in determining the content that users see. This suggests that the Online Harms White Paper should focus not on a specific action designed to counter a problem for which evidence is contested but should instead explore how design decisions taken by platforms influence user experiences. More research is needed on the relationship between the media people consume, social media recommendation systems and polarisation.

174.There is also a need for a greater understanding of the risk factors for the minority of users who could end up in echo chambers and ways in which these echo chambers can be better identified. Furthermore, research is needed to understand whether these echo chambers can normalise extremist views for those within them and the behaviours this may create.

Algorithmic design and outrage factories

175.A more general version of this concern is that the structure of platforms incentivises divisive content and that they effectively act as ‘outrage factories’. However, it is not clear that platforms are unique in functioning this way. Dr Ysabel Gerrard told us that the defining feature of social media platforms was showing people what they wanted to see, favouring extremism and powerful emotion over measured rational expression.282 As discussed in the previous Chapter, Mark Zuckerberg argues that people have an innate tendency toward more shocking and outrageous content which applies just as much to cable news and tabloids as it does to social media platforms.283 Professor Cristian Vaccari told us that we should think of social media’s connection with traditional media.284 He argued that the biggest outrage factory in the US was Fox News and that there was a connection with Facebook in that Fox News was the most shared news source on Facebook. Professor Vaccari also highlighted the fact that tabloids are outrage factories and that they are shared widely on social media. He noted that people who shared tabloid stories were also more likely to share misinformation. Professor Rasmus Kleis Nielsen of the Reuters Institute told us that high media literacy was associated with reading more upmarket newspapers and being less likely to share disinformation. This suggests that the amount of outrageous content on social media partly represents people’s pre-existing media habits being shared online.

176.The alternative argument is that social media companies’ business models are based on promoting outrageous content. Professor Safiya Noble of UCLA told us that we should think of these platforms primarily as advertising platforms which are designed to optimise performance for those advertisers who pay for them.285 Dr Jennifer Cobbe from the University of Cambridge argued that this meant that platforms are designed around keeping people on the platform longer so that they can be served more advertising.286 Christoph Schott of Avaaz also told us that the goal of platforms is to keep users for as long as possible, which might make them draw on more simple emotions such as outrage and hatred as these are what is popular.287 However, both Dr Cobbe and Mr Schott stressed that this ends up with advertisers appearing alongside content that they would not wish to be associated with, and when informed about it, removing their advertising from that content.288 We also heard from advertisers that they would not want to advertise next to dangerous content.289 Facebook has made the argument that it is not in its business interest to encourage this type of contentious content as an ugly, emotional atmosphere does not make people click adverts.290

177.Targeted advertising, foreign interference and echo chambers can contribute to platforms’ role in polarisation. Targeted advertising can spread further if it includes outrageous content and is shared organically. Foreign interference also uses outrageous content to further its spread. Similarly, extreme content created in echo chambers can spread widely and become normalised.

178.Given that advertisers do not wish to appear adjacent to outrageous content, it is not in the ultimate business interest of social media platforms to create or spread such content. The priority, therefore, is to study its causes and prevalence. This could include looking more closely at user behaviour to see when and why people share this type of content; what elements of platforms’ design increase or decrease its spread; and how the most negative aspects of this situation can be improved.

Access for independent researchers

179.In each of the cases identified above, more research is needed, which requires independent researchers to have greater access to data from technology platforms. Whilst on occasion the experts we heard from disagreed, their clear consensus was that there was not enough data because platforms did not allow independent researchers to audit their performance. Ben Scott from Luminate told us that the anecdotal data he had seen suggested targeted advertising and foreign interference were platform problems but that we are forced to rely on incident reports rather than comprehensive data due to platforms not sharing their data.291 He suggested that we need an independent review of the effects of platforms to ensure we understand exactly what is going on. Paddy McGuinness disagreed about what he thought the anecdotal data showed but agreed that it was essential for independent non-state organisations to look at the data and explain to the public what was happening.292 Both stressed that in order to trust technology platforms we must be able to verify their activities and effects.

180.Researchers told us that it was very difficult to study the effects of the large platforms. Alex Krasodomski-Jones from Demos told us that over the past five or six years many of the tools that he used to monitor these spaces have stopped functioning, meaning that it is harder to understand what is happening on these platforms.293 Professor Helen Margetts told us that no one had the types of data needed to do the necessary research.294 A recent review of the data sharing policies across the major platforms found major issues on each platform with the exception of Reddit.295 Professor Margetts stated that it was very difficult to measure the effects of echo chambers, misinformation or hate speech without access to this data.296 Professor Cristian Vaccari similarly told us that in many areas there were legitimate concerns but no data, and without that one cannot determine the size of the effect or the number of people it applies to.297 Professor Vaccari told us that initiatives he had worked with had failed to secure voluntary collaboration from Facebook to release more data and that in order for there to be progress it would be necessary for countries to mandate that platforms release data to researchers as a condition of operating in that country.298

181.In the period between us hearing from academic researchers and taking evidence from the technology platforms, there was a breakthrough in data sharing with Facebook. In February, Facebook gave Social Science One, a collaboration between Facebook and researchers like Professor Vaccari, a dataset including more than a billion gigabytes of data about URLs299 that had been shared on Facebook.300 This will be a useful dataset that will help researchers better understand some of the issues we have identified. Whilst revealing the potential for collaboration between platforms and researchers, improvements are needed as access remains limited to only select researchers and progress in granting access to data has been slow.

182.We heard that Social Science One have found Facebook’s interpretation of the General Data Protection Regulation (GDPR) to be quite restrictive, making data sharing difficult. Overcoming this barrier has proven especially difficult as Facebook has so far refused to release its legal analysis. We asked Karim Palant from Facebook if they would consider publishing this legal analysis and he stated that their legal assessments are subject to legal privilege and remain confidential.301 It is worth noting that Facebook does not appear to have such a restrictive interpretation of the GDPR when it comes to sharing data with its commercial partners. Facebook’s commercial partners have greater data access in some areas than external researchers.302

183.The Centre for Data Ethics and Innovation (CDEI) have suggested that the best way forward would be for the Online Harms regulator to consult with the ICO to develop a model that ensures all access to data is provided in full compliance with the GDPR, with the ICO producing a statutory code of practice for researcher access to platform data. The GDPR allows data sharing for research purposes under Article 89 and allows states to create codes of conduct under Article 40; this is reflected in Section 128 of the Data Protection Act 2018.303

184.There are some quick wins that can be achieved without requiring additional innovation to protect users’ privacy.304 Platforms have Application Programming Interfaces (APIs) that allow commercial partners to access data from their platforms. Access to these APIs would help researchers. Beyond this, there is a need for deep partnership between platforms and independent researchers to collaborate and publish research into matters of public interest. There is also a need for innovation from platforms to provide researchers with access to sensitive data in controlled environments. The practice of research ‘clean rooms’ allows researchers to access and manipulate data within an environment controlled by the data provider. Facebook has previously established similar protocols for advertisers.305 There is a need to develop and more widely facilitate this clean rooms model to enable future high-quality research.
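By way of illustration, one simple safeguard a clean room might apply before results leave the controlled environment is a minimum group size on any aggregate output, so that small, potentially identifiable groups of users are never disclosed. The threshold, data and function below are hypothetical:

```python
# Illustrative sketch of an output check a research 'clean room' might
# enforce: aggregate results leave the controlled environment only if every
# reported group is large enough to protect individual users.
MIN_GROUP_SIZE = 100  # assumed disclosure threshold, chosen for illustration

def release_aggregates(rows, key):
    """Count rows per group, suppressing any group below the threshold."""
    counts = {}
    for row in rows:
        counts[row[key]] = counts.get(row[key], 0) + 1
    return {group: n for group, n in counts.items() if n >= MIN_GROUP_SIZE}

# A researcher queries shares of a URL by region inside the clean room;
# only sufficiently large groups are returned.
shares = [{"region": "London"}] * 250 + [{"region": "Small Town"}] * 7
safe = release_aggregates(shares, "region")
# The 7 identifiable 'Small Town' users are suppressed from the output.
```

Real clean-room designs add further controls (vetted queries, audited environments, noise addition), but the principle is the same: the data provider, not the researcher, controls what leaves the room.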

185.Ben Scott told us that the best empirical analysis of what is happening on digital disinformation lies within research universities across the world but that they need resources and more data to pursue this work. This should be a priority.306

186.Vint Cerf from Google told us that Google works with and funds researchers that have gone through their internal peer review process.307 However, this cannot be the extent of research access to these important platforms. For this research to be trustworthy it must be truly independent of these platforms and the research topic must be selected by the regulator and the academic rather than the platforms themselves. In practice, this means that Ofcom should have the power to compel companies to facilitate research on topics that are in the public interest. This could be done by instructing Ofcom to work with UKRI and the research councils to run a rolling funding call whereby researchers could propose projects that require currently unavailable data access. Ofcom would form part of the assessment panel, and act to facilitate successful research calls. In line with established research requirements, this publicly funded research would then need to be published on an open access platform to ensure its use for the public good. If platforms fail to comply then they should be seen to be failing to fulfil their duty of care and be sanctioned for it.

187.Ofcom should be given the power to compel companies to facilitate research on topics that are in the public interest. The ICO should, in consultation with Ofcom, prepare statutory guidance under Section 128 of the Data Protection Act 2018 on data sharing between researchers and the technology platforms. Once this guidance is completed, Ofcom should require platforms to:

(a)Provide at least equivalent access for researchers to APIs as that provided to commercial partners;

(b)Establish direct partnerships with researchers to undertake user surveys and experiments with user informed consent on matters of substantial public interest;

(c)Develop, for sensitive personal information, physical or virtual ‘clean rooms’ where researchers can analyse data.

Algorithmic transparency

188.Alongside this call for data transparency, we heard of the need for additional transparency about how platforms use algorithmic recommendation systems. Caroline Elsom of the Centre for Policy Studies told us that there was a need to compel social media platforms to be more transparent about how their algorithms work.308 Matthew d’Ancona, Editor and Partner at Tortoise Media, told us that platforms’ algorithms are black boxes that they do not want to open but that are used to reinforce prejudice and to shut down debate.309

189.Platforms have been very reluctant to provide additional transparency. Vint Cerf argued that additional transparency was not necessary because Google’s ranking criteria and properties are public, so researchers can conduct experiments to determine whether Google’s intent in its criteria is being realised.310 However, this does not accurately describe the reality of how YouTube works.

Box 4: How Google’s algorithms work

Vint Cerf provided a short explanation of how Google Search weights different pages using its algorithm:

“The amount of information on the world wide web is extraordinarily large. There are billions of pages. We have no ability to manually evaluate all that content, but we have about 10,000 people, as part of our Google family, who evaluate websites. We have perhaps as many as nine opinions of selected pages. In the case of search, we have a 168-page document given over to how you determine the quality of a website.

… Once we have samples of webpages that have been evaluated by those evaluators, we can take what they have done and the webpages their evaluations apply to, and make a machine-learning neural network that reflects the quality they have been able to assert for the webpages. Those webpages become the training set for a machine-learning system. The machine-learning system is then applied to all the webpages we index in the world wide web. Once that application has been done, we use that information and other indicators to rank-order the responses that come back from a web search.

There is a two-step process. There is a manual process to establish criteria and a good-quality training set, and then a machine-learning system to scale up to the size of the world wide web, which we index.”311
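The two-step process described in this evidence can be sketched in highly simplified form. The following is an illustrative reconstruction only: the feature names, ratings and simple nearest-centroid model are our invented stand-ins for Google’s actual, undisclosed features and neural network.

```python
# Step 1: human evaluators rate a small sample of pages.
# Each page: (features such as [expertise, citations, readability], label).
# All values here are invented for illustration.
rated = [
    ([0.9, 0.8, 0.7], "high"),
    ([0.8, 0.9, 0.6], "high"),
    ([0.2, 0.1, 0.3], "low"),
    ([0.1, 0.2, 0.2], "low"),
]

def centroid(rows):
    """Mean feature vector of a list of feature vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

high = centroid([f for f, label in rated if label == "high"])
low = centroid([f for f, label in rated if label == "low"])

# Step 2: the model learned from the rated sample is applied to pages no
# evaluator has seen, scoring each by its distance to the two centroids.
def quality_score(features):
    d_high = sum((a - b) ** 2 for a, b in zip(features, high))
    d_low = sum((a - b) ** 2 for a, b in zip(features, low))
    return d_low - d_high  # positive means closer to the "high quality" pages

unseen = {"page_a": [0.85, 0.7, 0.8], "page_b": [0.15, 0.3, 0.1]}
ranked = sorted(unseen, key=lambda p: quality_score(unseen[p]), reverse=True)
# page_a ranks above page_b
```

The point of the sketch is the scaling step: a few thousand human judgments, generalised by a model, determine the ordering of billions of pages the humans never saw, which is why the content of the evaluators’ guidance matters so much.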

190.It is not possible to assess how Google produces the training datasets for YouTube. Google has published the evaluation criteria that it uses to assess Google Search results and claims that these are the same used to evaluate YouTube results.312 However, the document published solely uses examples from Google Search to explain what quality looks like and does not indicate what quality looks like in a YouTube video. This document is used to rate web pages from high to low quality to determine search weightings for Google Search. YouTube’s borderline content programme does not rely on a rating from high to low quality but instead produces a binary decision of whether to include a YouTube video within its recommendation system or not. This would strongly suggest a quite different document exists for evaluating YouTube content. When we asked Google how it determines borderline content on YouTube, we were directed to its community guidelines.313 These do not explain this process. Despite repeated questioning of Google on this important point we failed to achieve greater transparency on how this programme works.

191.Google has sought to contrast its algorithmic ratings with subjective determinations of the truth by humans.314 However, these ratings ultimately boil down to the subjective judgements of the humans who evaluate web pages. This is presumably why Google collects up to nine reviews of each page it rates.315 Katie O’Donovan, Head of UK Government Affairs and Public Policy at Google, told us that Google works with external experts to ensure its machine learning tools correctly determine what is an authoritative response, but Google has not told us who these experts are.316

192.At the other end of the algorithm, it is not possible to determine what YouTube is recommending to users of its website. YouTube recommends videos to people based on the content they have seen before: its recommendation system is personalised. This means that external attempts to study YouTube’s algorithm are limited either to using anonymous recommendations (based on a person with no viewing history) or to creating artificial profiles. The lack of data available here has led to disputes in the research community about whether or not YouTube’s algorithm can have a radicalising effect.317 Researchers on both sides of the dispute agree that all studies are limited by the lack of access to data on what personalised recommendations YouTube is making.318
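The limitation researchers face can be shown with an entirely hypothetical toy recommender: because recommendations depend on each user’s viewing history, an anonymous audit (no history) reveals only a generic ordering, not what any real user is shown.

```python
# Invented catalogue mapping video IDs to topics.
CATALOGUE = {
    "cooking-101": "cooking", "cooking-advanced": "cooking",
    "news-daily": "news", "news-opinion": "news",
}

def recommend(history):
    """Recommend unwatched videos, preferring the user's dominant topic."""
    topic_counts = {}
    for video in history:
        topic = CATALOGUE[video]
        topic_counts[topic] = topic_counts.get(topic, 0) + 1
    unwatched = [v for v in CATALOGUE if v not in history]
    if not topic_counts:
        # no viewing history: a generic, non-personalised ordering
        return sorted(unwatched)
    return sorted(unwatched, key=lambda v: -topic_counts.get(CATALOGUE[v], 0))

anonymous_view = recommend([])                 # what an external audit can see
personalised_view = recommend(["news-daily"])  # what a real user would see
print(anonymous_view, personalised_view)
```

The two orderings differ, so studying the anonymous feed alone says little about the personalised recommendations at the centre of the radicalisation dispute.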

193.Katie O’Donovan told us that Google’s reticence to be more transparent was based on negative previous experience. She explained that in Google’s early days it published a paper on its search algorithm, and quite quickly and systematically websites began paying to game the system.319 Whilst this is an argument against publishing additional details, it does not apply to providing regulators with additional access. Dr Jennifer Cobbe told us that transparency to oversight bodies carried less risk of people using that information to game the system, as they would not have access to it.320 She explained that it was reasonable for platforms to have concerns around commercial secrecy, but they can be confident that regulators will not pass on the information. Alaphia Zoyab from Avaaz reiterated that regulators routinely see commercially sensitive information and that social media companies should not be exempt from that degree of scrutiny.321

Algorithmic bias

194.One particular problem with algorithmic recommendation systems is that they can be biased against certain groups. Roger Taylor, Chair of the CDEI, told us that there was clear evidence of bias in these systems, although it is often unintentional. He cited the example that in many scenarios it is cheaper to target advertising online at men than at women.322 Dr Jennifer Cobbe told us that it is exceptionally difficult, if not impossible, to fully remove bias from machine-learning systems.323 This can be because the dataset the model is trained on is not large enough, so that the system de-prioritises things it has not been trained on. Dr Cobbe also stated that bias can arise when the system is trained on historical datasets that encode into the algorithm the structural issues that existed in society when the dataset was collected. She explained that if the designer has not tested or audited the system widely enough then potential biases will materialise.
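Dr Cobbe’s point about unrepresentative training data can be sketched minimally (all data and names below are invented): a toy “engagement predictor” fitted on historical ad-click data in which one group barely appears falls back to an uninformative default for that group, de-prioritising it.

```python
# Hypothetical historical data: (group, clicked-or-not) pairs.
historical_clicks = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1),                  # group_b is barely represented
]

def train(data, min_examples=3):
    """Per-group click rates; groups with too little data get a low default."""
    totals, counts = {}, {}
    for group, clicked in data:
        totals[group] = totals.get(group, 0) + clicked
        counts[group] = counts.get(group, 0) + 1
    model = {}
    for group in counts:
        if counts[group] >= min_examples:
            model[group] = totals[group] / counts[group]
        else:
            model[group] = 0.0   # rarely-seen groups are de-prioritised
    return model

model = train(historical_clicks)
print(model)  # group_b scores 0.0 even though its only example was a click
```

No one set out to disadvantage `group_b`; the skewed dataset alone produces the biased outcome, which is why Dr Cobbe argues for broad auditing.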

195.One disputed real-life example of this phenomenon comes from YouTube. A number of LGBT+ YouTube creators are suing YouTube in the US because they claim that YouTube is disproportionately removing advertising from LGBT+ creators’ videos.324 This removes their ability to make money from their videos and reduces the viability of LGBT+ media outlets using YouTube as a platform. One study of YouTube’s algorithm found that an otherwise identical video would be demonetised if it used words related to LGBT+ people.325 Conversely, when words like “gay” and “lesbian” were changed to random words like “happy”, the status of the video changed to being advertiser friendly. When we asked representatives from Google about this phenomenon, we were told that Google could not comment on live cases.326 Whilst it is understandable that Google did not wish to comment on something subject to current litigation, the lack of transparency here makes it difficult to know what went wrong in this situation and how it could be prevented in the future.
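The cited study’s word-substitution method can be sketched as a black-box audit. YouTube’s real classifier is not public, so a toy keyword blocklist (invented here) stands in for it; the audit submits otherwise-identical metadata and swaps single words to see which ones flip the advertiser-friendly status.

```python
# Hypothetical stand-in for the platform's demonetisation classifier:
# a blocklist that over-generalises in the way the study alleges.
BLOCKLIST = {"gay", "lesbian"}

def advertiser_friendly(title):
    return not any(word in BLOCKLIST for word in title.lower().split())

def audit_word(base_title, word, replacement):
    """Compare monetisation status of two otherwise-identical titles."""
    with_word = advertiser_friendly(base_title.replace("WORD", word))
    without_word = advertiser_friendly(base_title.replace("WORD", replacement))
    return with_word, without_word

status = audit_word("My WORD wedding vlog", "gay", "happy")
print(status)  # (False, True): only the single swapped word changed the outcome
```

Because everything else is held constant, a (False, True) result isolates the word itself as the cause, which is what made the study’s finding persuasive despite the lack of access to the real system.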

196.One informed guess for why this might have happened is that the data that YouTube used to train its algorithm included more LGBT+ content in its non-advertiser friendly example data (for example LGBT+ sex education videos are not advertiser friendly) than in its advertiser friendly example data.327 This may be because content on YouTube that explicitly states it is LGBT+ in its title is more likely to be ‘non-advertiser friendly’ than content that does not. However, this does not justify removing the ability of all LGBT+ content to receive income from advertisers.

197.Dr Jennifer Cobbe told us that platforms could prevent their algorithms from discriminating against specific groups by auditing and testing those algorithms as broadly as possible and by making sure that the datasets used to train them were as representative as possible.328 However, she warned that legal and regulatory incentives would be needed to persuade companies to undertake this testing.329 In other areas this sort of testing is being done by regulators. Guy Parker, CEO of the ASA, told us that the ASA was conducting avatar monitoring: creating online profiles to resemble children of different ages and testing whether they received adverts inappropriate for their age group.330 Had YouTube undertaken this sort of auditing, it is unlikely that it would have discriminated against LGBT+ creators in the way that is alleged.
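The avatar-monitoring approach Guy Parker describes can be sketched as a simple audit loop (the ad server, age limits and profile names below are all hypothetical; a real audit would observe live adverts served to the profiles):

```python
# Invented age limits for ad categories under audit.
AGE_LIMITS = {"toy-ad": 0, "gambling-ad": 18, "alcohol-ad": 18}

def hypothetical_ad_server(profile):
    # Stand-in for the platform under audit.
    return ["toy-ad", "gambling-ad"] if profile["age"] < 13 else ["alcohol-ad"]

def audit(profiles):
    """Record every (avatar, advert) pairing that breaches an age limit."""
    violations = []
    for profile in profiles:
        for ad in hypothetical_ad_server(profile):
            if profile["age"] < AGE_LIMITS[ad]:
                violations.append((profile["name"], ad))
    return violations

avatars = [{"name": "child-avatar-9", "age": 9},
           {"name": "adult-avatar-30", "age": 30}]
print(audit(avatars))  # [('child-avatar-9', 'gambling-ad')]
```

The same loop generalises to other protected characteristics: construct profiles that differ only in the audited attribute, compare what the system serves each, and flag the differences.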

198.There is already relevant law in this area. The Equality Act 2010 prevents direct discrimination against people with certain protected characteristics (including sexual orientation and gender reassignment). It also prevents indirect discrimination. Indirect discrimination occurs when a provision, criterion or practice which applies in the same way for everybody has an effect which disadvantages people with a protected characteristic. Where a particular group is disadvantaged in this way, a person in that group is indirectly discriminated against if he or she is put at that disadvantage, unless the person applying the policy can justify it as a proportionate means of achieving a legitimate aim. In the absence of further evidence, it is unclear whether the alleged algorithmic demonetisation is direct or indirect discrimination and to what extent it is a proportionate means of achieving a legitimate aim. There is, however, a case to be made that, in order to comply with existing equality law, platforms should already be conducting algorithmic audits to guard against discrimination on the basis of characteristics protected in the Act. This may be an area that the Equality and Human Rights Commission should investigate.

199.The Equality Act 2010 includes the characteristics of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex and sexual orientation.331 There may be a case for platforms to audit their algorithms for characteristics beyond this. One potential area is political bias. However, we acknowledge that there are clear challenges to auditing for this. Some political opinions are banned on online platforms: Facebook has an explicit ban on praising, supporting and representing white nationalism and white separatism.332 Policy aiming to reduce political bias in algorithms should not seek to reverse this. It would therefore be difficult to establish a spectrum of acceptable political viewpoints against which platforms should be audited, especially at the global scale at which these platforms operate. Another possible area is auditing to guard against bias on the basis of socio-economic class. This would be less straightforward than auditing for characteristics under the Equality Act, for which practices are more established, partly because those characteristics have existed in legislation for a decade. However, it is possible that methods could be developed.

200.Ofcom should issue a code of practice on algorithmic recommending. This should require platforms to conduct audits on all substantial changes to their algorithmic recommending facilities for their effects on users with characteristics protected under the Equality Act 2010. Ofcom should work with platforms to establish audits on other relevant and appropriate characteristics. Platforms should be required to share the results of these audits with Ofcom and the Equality and Human Rights Commission if requested.

201.Ofcom should be given the powers and be properly resourced in order to undertake periodic audits of the algorithmic recommending systems used by technology platforms, including accessing the training data used to train the systems and comprehensive information from the platforms on what content is being recommended.

202.There is a common thread between the need for transparency of algorithmic processes and researchers’ access to platforms. Platforms must be entirely open to regulators to ensure proper oversight. Ofcom can only ensure that platforms are meeting their duty of care if it has access to all data from these platforms and the ability to use additional research expertise to better understand what that data means. The exact details of what data Ofcom will need will change as technology develops; these powers must therefore be suitably broad.

203.Ofcom should have the power to request any data relevant to ensure that platforms are acting in accordance with their duty of care.

Transparency in content moderation

204.In the previous Chapter we recommended accountability measures to improve content moderation; however, for these to be effective there must be more transparency in the moderation processes they seek to improve. Professor Daniel Kreiss, Associate Professor at the University of North Carolina, argued that accountable moderation decisions require a clear justification framework. This would need to include a moral argument for the democratic case for removing content and evidence of how this is done in practice.333 He told us that, in practice, platforms have been highly reactive to negative press coverage and public pressure and have changed their approaches to ameliorate bad news coverage. This has resulted in an ongoing and confusing series of changes in the content moderation approaches of the major platforms. Professor Kreiss explained that it was difficult to find clear explanations of changes in policies or rationales for content takedowns, or even to confirm whether changes in policy had taken place. He also told us that individual instances of content moderation often required external pressure to ensure platforms honour and enforce their own policies.

205.Katie O’Donovan from Google suggested that Google and YouTube were transparent in their moderation policy and practice. She told us that they made it clear to their creator community when they changed community guidelines and that these guidelines set out in detail and in plain English what their policy was. However, it is unclear how this statement can be reconciled with reality. For example, last summer, Steven Crowder, a comedian with 3.8 million subscribers, was reported to YouTube for repeated racist and homophobic abuse of Carlos Maza, a gay Cuban-American journalist.334 This was followed by large quantities of abusive messages across social media platforms and phone messages on Mr Maza’s personal mobile from Mr Crowder’s supporters. YouTube decided that Mr Crowder’s videos should not be removed from its platform because inciting hatred was not their primary purpose.335 However, until YouTube published this blog post, the company’s community guidelines made no mention of whether incitement was the primary purpose of a video, stating only that incitement was against the guidelines.336 The guidelines have since been updated, but it is unclear whether this was a change in policy or a change in how YouTube’s previous policy was explained to the public.337 YouTube employees have anonymously spoken to the press to indicate that they are prevented from enforcing the rules consistently and that more senior employees stop sanctions from being applied to high-profile creators.338

206.This lack of clarity and consistency in moderation policies and practices has consequences for public trust in those systems. Professor Sarah Roberts told us that research with users who had had their content removed by platforms found that almost everyone surveyed believed that they were being personally targeted and persecuted due to their political beliefs.339 It is unlikely that platforms are persecuting all of the different political beliefs of those covered in the study but a lack of clarity over platforms’ moderation activity helps create this perception.

207.Additional transparency could improve this. Caroline Elsom of the Centre for Policy Studies told us that it is important for content that has been taken down to be kept somewhere and be publicly available so that independent researchers can examine platforms’ moderation practices.340 Researchers and concerned civil society organisations have asked platforms to keep a database of misinformation they have removed about COVID-19 to help individuals working in public health and human rights to understand the effect of online information on health outcomes.341

208.There is a model for more transparent content moderation in Facebook’s Third-Party Fact Checking Network. Fact checkers in this programme publish an article explaining which parts of the content that they moderate are misinformation and why this is the case.342 This means that it is possible to go to the fact checkers’ websites and find out what misinformation they have marked. However, this programme does not mandate fact checkers to include a copy of the misinformation itself and as a result is only of limited use for this purpose. We discuss this programme in more detail in Chapter 2 which focuses on misinformation.

209.The need for additional transparency in this area has been highlighted by the actions of platforms in response to the COVID-19 crisis. Multiple platforms have removed content posted by Jair Bolsonaro, the current president of Brazil.343 This content broke the platforms’ policies by promoting misinformation about a possible cure for COVID-19. However, the platforms have not published a prominent fact check with detailed reasoning for why they removed this content. Instead they have given statements to the press indicating that the content broke their terms of service. There are worrying questions about transparency and accountability when a platform removes the content of a nation’s leader without explaining in detail why it has done so.

Box 5: President Trump and content moderation study

On 26 May 2020 Twitter took the decision to add a link giving additional context to a Tweet from President Trump on the subject of postal voting. President Trump in his Tweet asserted that postal voting forms would be sent to all individuals living in California and that it was an attempt to rig the 2020 presidential election through fraud. This was untrue. The proposal in California was to send them to individuals who were registered to vote in the state. Twitter’s response gave accurate details of the scheme alongside citing political journalists stating that evidence does not suggest that postal voting will be used for fraud. Twitter did this as its civic integrity policy states that it will take action against misleading claims about electoral processes.344

The response from Twitter was not a clear fact check and did not fully meet the standards set by the IFCN. The IFCN’s code of principles requires that fact checkers use the best available primary source or if that is not available, that they explain the use of a secondary source. By quoting from political journalists rather than citing the actual research on postal voting, Twitter failed to live up to this standard. Twitter’s response is also not transparent about its methodology nor does it have a clear corrections policy, both of which are required by the IFCN.

President Trump responded to Twitter by criticising it for relying on “Fake News CNN and the Amazon Washington Post” and argued that Twitter was stifling his free speech. As we set out in Chapter 3 on accountability, freedom of expression is not unjustly infringed by reducing the spread of harmful speech and as we argue in Chapter 2, fact checking, when done to a high standard and transparently, only adds to the quality of public debate.

On 29 May 2020 Twitter took action again against President Trump for a Tweet about protests in the US that they believed broke their rules on glorifying violence. Twitter placed the Tweet behind a warning stating it broke the rules for glorifying violence but that it was in the public interest for the Tweet to stay on their service.345 This is an effective approach in keeping with the principles we have set out.

President Trump posted the same content on mail fraud and US protests to his Facebook page; however, Facebook chose not to take action against either post because it believed that neither breached its relevant policies. Facebook treats threats of state use of force differently from non-state force and only removes threats from non-state actors. Facebook also has different policies on electoral misinformation from Twitter: it excludes elected officials from its third-party fact-checking initiative but will still remove content that it views as voter suppression. In response to criticism from inside and outside the organisation, Facebook has committed to reviewing its policies on voter suppression and state use of force.346

Facebook’s response encapsulates the failures that we have outlined in the previous three Chapters. Its restriction of its fact-checking programme to exclude elected individuals creates the impression of unequal treatment. Although both have been discussed publicly by Facebook, neither its voter suppression policy nor its state force policy is clearly articulated within the community standards, including examples of what would and would not count.347 This has led individuals inside and outside the company to believe that President Trump broke these rules, whilst Facebook’s official judgement is that he did not. The overall effect is that a critical decision about public debate is made by unaccountable individuals on the basis of rules that are not transparent.

210.Mackenzie Common, an academic at LSE, has suggested that content moderation systems could be improved by platforms publishing a collection of decisions that act as precedent.348 Terms and conditions would be given shape by expanding on the various categories of prohibited content and indicating how borderline cases are decided. Those examples could be anonymised and could include a short explanation of why each decision was made. Crucially, the database of previous decisions could be used to ensure consistency in content moderation decisions and improve accountability for these decisions. Users or civil society organisations could identify problematic individual decisions or trends in decisions and challenge them.
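Mackenzie Common’s proposal can be sketched as a data structure: an anonymised database of precedent decisions, searchable by category, against which a new decision can be checked for consistency. All records below are invented for illustration.

```python
# Hypothetical anonymised precedent records a platform might publish.
PRECEDENTS = [
    {"category": "hate speech", "summary": "wishes illness on a named user",
     "decision": "remove", "rationale": "direct attack on a person"},
    {"category": "misinformation", "summary": "fake cure claim",
     "decision": "label", "rationale": "harmful but newsworthy context"},
]

def find_precedents(category):
    """Look up published examples for a category of prohibited content."""
    return [p for p in PRECEDENTS if p["category"] == category]

def consistent(category, decision):
    """Flag a new decision that departs from all published precedents."""
    prior = {p["decision"] for p in find_precedents(category)}
    return (not prior) or (decision in prior)

# A civil society group checking takedowns against the published examples:
print(consistent("hate speech", "remove"))  # True: matches precedent
print(consistent("hate speech", "keep"))    # False: departs from precedent
```

Under the proposal, a departure like the second case would oblige the platform to explain the decision and publish a new anonymised example, growing the database over time.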

211.Such a database would fit well with the ombudsman system recommended in the previous Chapter. Civil society groups could raise specific cases that they believe pose a problem with the content ombudsman either because that case does not resemble previously published examples or because the published examples are problematic. If there were a broader issue identified with moderation policy, then civil society groups could raise it with Ofcom. The system would be more meaningful and effective with more representative example decisions and if platforms included high profile decisions without anonymisation.

212.The proposed database should not include all types of content moderation decisions. Karim Palant of Facebook told us that Facebook contributed to shared databases of inappropriate child abuse and terrorism material and that these were rightly only used by law enforcement agencies and other platforms.349 As discussed in the previous Chapter, the key decisions that should be publicly available for democratic discussion are those taken around impersonation, misinformation, hate speech and abuse and these are the decisions that any database should focus on.

213.Katy Minshall of Twitter told us that their rules already contained hypothetical examples of Tweets that would not be allowed on their platform.350 An example of this is their hateful conduct policy, which forbids conduct that promotes violence against, directly attacks, or threatens another person on the basis of race, ethnicity, national origin, caste, sexual orientation, gender, gender identity, religious affiliation, age, disability or serious disease. Their policy includes the case of “Hoping that someone dies as a result of a serious disease, for example, ‘I hope you get cancer and die.’”351 Whilst this is more helpful than a vague statement forbidding hateful content, it does not provide the detail needed for civil society to understand what is ‘hateful’ to the point that it breaks their community rules and would be taken down. This prevents an informed debate on whether the line is in the right place to preserve freedom of expression whilst also ensuring that violence is not incited against people with protected characteristics. Critically, no platform provided an example of the types of content that would not be removed. This means that, when something has not been taken down, it is difficult to know whether this is because it does not break the platform’s rules or because it has simply not been seen by a content moderator. In turn, it is difficult to assess the quality or consistency of decision making.

214.When questioned on this subject, representatives from Facebook, Twitter and Google did not comment on the feasibility of this approach. However, previous research on the subject has suggested that scalability is key to improving moderation policies and processes. An anonymous Facebook employee explained to Dr Tarleton Gillespie that unless a policy is repeatable at scale it is not really a policy, and all that remains are good aspirations and chaos.352 However, the model of a database of previous examples guiding future decisions can fit with existing processes at these platforms. As explained in Box 4, platforms train their content moderation algorithms on a sample dataset of moderation decisions. That sample dataset could form the basis of a public database of decisions. Similarly, leaked slides from the internal training of human moderators show extensive use of examples.353 Twitter told us that it uses anonymised example Tweets to train its content moderators.354 From reporting, we know that Facebook moderators make their decisions on the basis of a known questions document and a mixture of constantly changing pieces of contradictory advice.355 Facebook views its moderators as making simple binary decisions based on policy: a senior lawyer at Facebook explained it to the Harvard Law Review as moderators being asked to tell the difference between red and blue rather than deciding between beautiful and ugly.356 A searchable database of previous decisions, showing what content should be removed, no longer recommended or kept as is, could improve internal procedures and scale better than existing workflows, as well as having the benefits for democracy we have outlined.

215.Throughout this section we have referred to content in the singular; however, not all moderation decisions are based on a single piece of content. Some of the examples used could cover a pattern of content showing abuse or other misuse of the platform rather than focusing solely on a single piece of content.

216.Ofcom should issue a code of practice on content moderation. This should require companies to clearly state what they do not allow on their platforms and give useful examples of how this applies in practice. These policies should also make clear how individual decisions can be appealed. Platforms should be obligated to ensure that their content moderation decisions are consistent with their published terms and conditions, community standards and privacy rules.

217.The code of practice on content moderation should also include the requirement that all technology platforms publish an anonymised database of archetypes of content moderation decisions on impersonation, misinformation, hate speech and abuse. Where decisions differ from existing published examples the platform should be obliged to explain the decision to the individuals affected and to create a new anonymised decision. Failure to ensure consistency between content moderation practices and published examples should be seen as a failure in the duty of care and result in sanctions against the platforms. An archive of removed content should be made available to researchers for analysis.

249 Committee on Standards in Public Life, ‘The 7 principles of public life’ (May 1995): [accessed 13 May 2020]

250 Q 8 (Baroness O’Neill of Bengarve)

251 Q 37 (Alex Krasodomski-Jones)

252 Q 285 (Tony Close)

253 Q 143 (Professor Rasmus Kleis Nielsen)

254 Q 49 (Dr Martin Moore)

255 Q 48 (Dr Martin Moore)

256 Q 71 (Paul Bainsfair)

257 Written evidence from Dr Ana Langer and Dr Luke Temple (DAD0048)

258 Q 71 (Eric Salama)

259 Q 75 (Eric Salama)

260 Q 74 (Eric Salama)

261 Q 116 (Paddy McGuinness)

262 Q 116 (Ben Scott)

263 International Institute for Democracy and Electoral Assistance, Digital Microtargeting: Political Party Innovation Primer 1, (June 2018): [accessed 13 May 2020]

264 Felix Simon, ‘“We power democracy”: Exploring the promises of the political data analytics industry’, The Information Society, vol 35, (2019): [accessed 13 May 2020]

265 Written evidence from the Institute for Practitioners in Advertising (DAD0026)

266 Google The Keyword, ‘An update on our political ads policy’ (November 2019): [accessed 13 May 2020]

267 Written evidence from Facebook (DAD0081)

268 Written evidence from HM Government (DAD0034)

269 Q 116 (Elisabeth Braw)

270 Q 116 (Lisa-Maria Neudert)

271 Q 115 (Ben Scott)

272 Q 115 (Paddy McGuinness)

273 Q 234 (Sir Julian King)

274 Written evidence from HM Government (DAD0034)

275 Q 47 (Professor Helen Margetts)

276 Q 47 (Professor Cristian Vaccari)

277 Elizabeth Dubois and Grant Blank, ‘The echo chamber is overstated: the moderating effect of political interest and diverse media’, Information, Communication & Society, vol 21 (2018): [accessed 13 May 2020]

278 Q 47 (Dr Martin Moore)

279 Q 47 (Professor Helen Margetts)

280 Written evidence from the Oxford Internet Institute (DAD0060)

281 DCMS and Home Office, Online Harms White Paper, CP57, April 2019, p 71: [accessed 27 May 2020]

282 Written evidence from Dr Ysabel Gerrard (DAD0093)

283 Mark Zuckerberg, ‘A Blueprint for Content Governance and Enforcement’, (November 2018): [accessed 13 May 2020]

284 Q 47 (Professor Cristian Vaccari)

285 Q 170 (Professor Safiya Noble)

286 Q 163 (Dr Jennifer Cobbe)

287 Q 163 (Christoph Schott)

288 Q 166 (Dr Jennifer Cobbe, Christoph Schott)

289 Q 80 (Keith Weed, Paul Bainsfair)

290 Hertie School of Governance, ‘How to regulate the internet? Nick Clegg speech’, (June 2019): [accessed 13 May 2020]

291 Q 115 (Ben Scott)

292 Q 115 (Paddy McGuinness)

293 Q 37 (Alex Krasodomski-Jones)

294 Q 48 (Professor Helen Margetts)

295 Lynge Asbjørn Møller and Anja Bechmann, Research Data Exchange Solution (2019): [accessed 13 May 2020]

296 Q 51 (Professor Helen Margetts)

297 Q 54 (Professor Cristian Vaccari)

298 Q 54 (Professor Cristian Vaccari)

299 Uniform Resource Locator, a unique web address for online content.

300 Gary King and Nathaniel Persily, ‘Unprecedented Facebook URLs Dataset now Available for Academic Research through Social Science One’, Social Science One, Harvard University, (February 2020): [accessed 13 May 2020]

301 Q 298 (Karim Palant), supplementary written evidence from Facebook (DAD0108)

302 Written evidence from Dr Rebekah Tromble (DAD0104)

303 Data Protection Act 2018, section 128

304 Written evidence from Dr Rebekah Tromble (DAD0104)

305 ad exchanger, ‘Facebook Shares More audience Data Via Carefully Controlled ‘Clean Rooms’ (18 July 2017): [accessed 13 May 2020]

306 Q 121 (Ben Scott)

307 Q 251 (Vint Cerf)

308 Q 38 (Caroline Elsom)

309 Q 110 (Matthew D’Ancona)

310 Q 246 (Vint Cerf)

311 Q 241 (Vint Cerf)

312 YouTube Help, ‘External evaluators and recommendations’: [accessed 13 May 2020]

313 Q 255 (Vint Cerf)

314 Written evidence from Google (DAD0086)

315 Q 241 (Vint Cerf)

316 Q 241 (Katie O’Donovan)

317 FFWD, ‘YouTube’s Deradicalisation Argument is Really a Fight about Transparency’ (December 2019): [accessed 13 May 2020]

319 Q 246 (Katie O’Donovan)

320 Q 166 (Dr Jennifer Cobbe)

321 Q 166 (Alaphia Zoyab)

322 Q 192 (Roger Taylor)

323 Q 167 (Dr Jennifer Cobbe)

324 FFWD, ‘Creators are forcing YouTube’s LGBTQ problem out into the open’ (14 August 2019): [accessed 13 May 2020]

325 Beurling, ‘Demonetization report’, [accessed 13 May 2020]

326 Q 245 (Katie O’Donovan)

327 ‘There is no Algorithm for Truth – with Tom Scott’ (24 October 2019), YouTube video, added by The Royal Institution, [accessed 13 May 2020]

328 Q 167 (Dr Jennifer Cobbe)

329 Q 164 (Dr Jennifer Cobbe)

330 Q 58 (Guy Parker)

331 Equality Act 2010, section 4

332 Facebook, ‘Standing against hate’ (27 March 2019): [accessed 13 May 2020]

333 Written evidence from Professor Daniel Kreiss (DAD0098)

334 ‘A right-wing YouTuber hurled racist, homophobic taunts at a gay reporter. The company did nothing’, The Washington Post (5 June 2019): [accessed 13 May 2020]

335 YouTube Official Blog, ‘Taking a harder look at harassment’ (5 June 2019): [accessed 13 May 2020]

337 Ibid.

338 ‘YouTube’s arbitrary standards mean stars keep making money even after breaking the rules’, The Washington Post (9 August 2019): [accessed 13 May 2020]

339 Q 172 (Professor Sarah Roberts)

340 Q 43 (Caroline Elsom)

341 Center for Democracy and Technology, ‘Covid-19 Content Moderation Research Letter’ (April 2020): [accessed 11 June 2020]

342 Full Fact, Report on the Facebook Third-Party Fact Checking Programme Jan-June 2019 (July 2019): [accessed 13 May 2020]

343 The Verge, ‘Twitter removes tweets by Brazil, Venezuela presidents for violating Covid-19 content rules’ (30 March 2020): [accessed 13 May 2020]

344 Twitter, ‘Trump makes unsubstantiated claim that mail-in ballots will lead to voter fraud’ (26 May 2020): [accessed 3 June 2020]

347 Facebook ‘Community Standards’: [accessed 10 June 2020]

348 Mackenzie Common, ‘Fear the Reaper: How Content Moderation Rules are Enforced on Social Media’, (January 2019): [accessed 13 May 2020]

349 Q 306 (Karim Palant)

350 Q 314 (Katy Minshall)

351 Twitter Help Centre, ‘Hateful conduct policy’: [accessed 13 May 2020]

352 Dr Tarleton Gillespie, Custodians of the Internet: platforms, content moderation, and the hidden decisions that shape social media, (New Haven: Yale University Press 2018) p 138

353 ‘Hate speech and anti-migrant posts: Facebook’s rules’, The Guardian (24 May 2017) [accessed 13 May 2020]

354 Supplementary Written Evidence from Twitter (DAD0103)

355 The Verge, ‘The Trauma Floor: The secret lives of Facebook moderators in America’ (25 February 2019): [accessed 13 May 2020]

356 Marvin Ammori, ‘The ‘New’ New York Times: Free speech lawyering in the age of Google and Twitter’, Harvard Law Review, vol 127:2259 p 2278: [accessed 13 May 2020]

© Parliamentary copyright 2018