148.Aggregate IQ is a Canadian digital advertising, web and software development company incorporated in 2012 by its owners Jeff Silvester and Zack Massingham. Jeff Silvester told us that he had known Christopher Wylie, the Cambridge Analytica whistleblower, since 2005, and met Alexander Nix, the then CEO of Cambridge Analytica, “around the beginning of 2014.”161
149.AIQ worked for SCL to “create a political customer relationship management software tool” for the Trinidad and Tobago election campaign in 2014, and then went on to develop a software tool, the Ripon tool, commissioned and owned by SCL.162 According to the ICO, in early 2014, SCL Elections approached AIQ to “help it build a new political Customer Relations Management (CRM) tool for use during the American 2014 midterm elections”.163 The AIQ repository files contain a substantial amount of development work, with vast amounts of personal data, in plain text, of the residents of Trinidad and Tobago.
150.The Ripon tool was described by Jeff Silvester as “a political customer relationship management tool focused on the US market”164 and it was described by Christopher Wylie as “the software that utilised the algorithms from the Facebook data”.165 As a result of developing the Ripon tool, so that voters could be sent micro-targeted adverts, AIQ also worked on political campaigns in the US.166 This work was still ongoing when AIQ also became involved in Brexit-related campaigns in the UK’s EU Referendum. According to Facebook, “AIQ ran 1,390 ads on behalf of the pages linked to the referendum campaign between February 2016 and 23 June 2016 inclusive”.167
151.Chris Vickery, Director, Cyber Risk Research, at the UpGuard consultancy, works as a data breach hunter, locating exposed data and finding common threads. After The Observer, Channel 4 and New York Times coverage of Cambridge Analytica (and associated companies), UpGuard published four papers that explained connections between AIQ, Cambridge Analytica, and SCL, and AIQ’s work during the UK Referendum.168 These papers were based on the data that Chris Vickery had found through the insecure AIQ website. When he appeared before the Committee, he presented Gitlab169 data containing over 20,000 folders and 113,000 files, which he had downloaded from the insecure AIQ website.170 The Committee has made the full data set available to the ICO.
152.Mr Vickery described the evidence that he submitted to the inquiry:
This repository is a set of sophisticated applications, data management programs, advertising trackers, and information databases that collectively could be used to target and influence individuals through a variety of methods, including automated phone calls, emails, political websites, volunteer canvassing, and Facebook ads.171
153.The following chart supplied in evidence by Chris Vickery highlights the relationship that AIQ had with Cambridge Analytica, SCL and other clients. The full list of AIQ-repository-present projects known to involve UK entities is:
154.According to Chris Vickery, “the 15 nodes shown below are corroborated with documentation and credible testimony. This is not an exhaustive list of every data gateway and relevant flow, but I do remain confident in stating that this is a reasonable depiction of what has transpired”:172
Source: Chris Vickery173
155.In its July 2018 report, the ICO confirmed that AIQ had access to the personal data of UK voters, given by the Vote Leave campaign, and that AIQ “held UK data that they should not have continued to hold”.174 This Chapter will explore the AIQ unsecured data discovered by Chris Vickery, studying: the relationship between AIQ, SCL and Cambridge Analytica; the work that AIQ carried out for the EU referendum; and the capabilities that were open to AIQ, by the types of tools that were exposed in the repository. We commissioned the communications agency, 89up,175 to carry out analysis of this data and have also used the expertise of Chris Vickery in our work.
156.According to Jeff Silvester, CEO of AggregateIQ, roughly 80% of AIQ’s revenue came from SCL, from 2013 until mid-2015.176 It would appear from Chris Vickery’s files that, where AIQ and SCL worked together, staff had mutual access to some of the same databases.177 Alexander Nix told us that “after they had built the software platform for us, AIQ continued to work with us as consultants, not least to help some of our clients to interface with the product that they had built and to teach them how to use it”.178
157.In our Interim Report, we described the Ripon software—a political customer relationship management software tool—developed by AIQ, which was commissioned and owned by SCL.179 The files obtained by Chris Vickery illustrate clear collaboration between Cambridge Analytica and AIQ, with the importing of the original Ripon development project from a Cambridge Analytica-controlled domain to the AIQ repository. AIQ’s involvement with the Ripon software came from a source repository located at “scl.ripon.us”. The domain was registered to the then CEO of Cambridge Analytica, Alexander Nix. The ICO discovered financial transactions and contacts between the organisations, and also concluded that it was purely a contractual relationship:
We found no evidence of unlawful activity in relation to the personal data of UK citizens and AIQ’s work with SCLE. To date, we have no evidence that SCLE [SCL Elections] and CA [Cambridge Analytica] were involved in any data analytics work with the EU Referendum campaigns.180
158.AIQ’s lawyers, Borden Ladner Gervais, wrote to the Committee to state: “AggregateIQ is not an associated company of Cambridge Analytica, SCL, or any other company for that matter. AggregateIQ is 100% Canadian owned and operated. AggregateIQ wrote software for SCL. AggregateIQ did not manipulate micro-targeting, nor facilitate its manipulation.”181
159.According to the files we obtained, there was certainly data exchanged between AIQ and SCL, as well as between AIQ and Cambridge Analytica. The repository files include stray ‘debug’ logs, which document the importing of data, including OCEAN psychographic scores, which Jeff Silvester openly stated, in his second appearance before the Canadian Parliament, had come from Cambridge Analytica and had been used in the AIQ-developed side of the Ripon software.182 After being asked by Nathaniel Erskine-Smith MP, Vice-Chair of the Canadian Standing Committee on Access to Information, Privacy and Ethics, and member of the ‘International Grand Committee’, whether AIQ should have exercised more due diligence in taking information from SCL and converting it into advertising and targeting, Jeff Silvester said:
We did ask questions about where it came from, but the information we got was that it was from public data sources, and there are tons of them in the United States. We were unaware they were obtaining information improperly at the time. […] With respect to everything that’s transpired after having worked with SCL, would I do it again? I probably wouldn’t.183
160.Mr Vickery was able to find AIQ’s repository only after an SCL developer left a software script open on his own private Github account.184 The script file has a header stating that it originated from an AIQ developer. SCL staff had access to AIQ data and the two businesses seemed unusually closely linked. According to Chris Vickery, the available evidence would weigh heavily towards there being more to the AIQ, Cambridge Analytica, SCL relationship than is usually seen in an arm’s length relationship.
161.Within the AIQ repository are references to “The Database of Truth”, a system that obtains and integrates data from disparate sources, collating information on hundreds of thousands, and potentially millions, of voters.185 Some of this came from the RNC database (the Republican National Committee Data Trust is the Republican party’s primary voter file provider) and some came from the Ted Cruz campaign. This database can be interrogated using a number of parameters, including, but not limited to: first name, last name, birth year, age, age range, registration address, whether they were Trump supporters and whether they would vote.
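The kind of parameter-based interrogation described above can be sketched in a few lines. This is a minimal illustration only, using an in-memory SQLite table: the table name, column names and records are invented for the example and are not taken from the AIQ repository or from any real voter file.

```python
import sqlite3

# Hypothetical schema and fictitious records, for illustration only.
ROWS = [
    ("Ann", "Example", 1950, "1 High St"),
    ("Bob", "Sample", 1985, "2 Low Rd"),
    ("Cat", "Example", 1992, "3 Mid Ave"),
]


def build_db():
    """Create an in-memory database holding the fictitious records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE voters ("
        "first_name TEXT, last_name TEXT, "
        "birth_year INTEGER, registration_address TEXT)"
    )
    conn.executemany("INSERT INTO voters VALUES (?, ?, ?, ?)", ROWS)
    return conn


def search(conn, last_name=None, min_birth_year=None, max_birth_year=None):
    """Return (first_name, last_name) rows matching whichever filters are given."""
    clauses, params = [], []
    if last_name is not None:
        clauses.append("last_name = ?")
        params.append(last_name)
    if min_birth_year is not None:
        clauses.append("birth_year >= ?")
        params.append(min_birth_year)
    if max_birth_year is not None:
        clauses.append("birth_year <= ?")
        params.append(max_birth_year)
    sql = "SELECT first_name, last_name FROM voters"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY first_name"
    # Values are passed as bound parameters rather than pasted into the SQL text.
    return conn.execute(sql, params).fetchall()
```

Each optional argument narrows the result set, which is the sense in which a database of this kind can be "interrogated" by name, birth year or age range.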
162.The full voter data stores were held separately from the code repository, although the repository did include the means through which anyone could have accessed the full voter data stores. The information included in the ‘Database of Truth’ could have been used to target specific users on Facebook, using its demographic targeting feature when creating adverts on the Facebook platform. According to Chris Vickery, the credentials contained within the ‘Database of Truth’ could have been used by anyone finding them. In other words, anyone could find exposed passwords on the site and then access millions of individuals’ private details.
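The exposure described above, credentials left readable inside a code repository, is something that can be checked for defensively. The following is a rough sketch of such a check, not a description of any tool used in the evidence: the pattern, the key names it looks for and the sample text are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: a key name that suggests a secret, followed by an
# assigned value. Real scanners use far larger rule sets.
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|api_key|token)\s*[:=]\s*['\"]?([^'\"\s]+)",
    re.IGNORECASE,
)


def find_hardcoded_secrets(text):
    """Return (line_number, key_name) pairs for lines that look like credentials."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            findings.append((line_number, match.group(1).lower()))
    return findings


# Invented sample of a settings file; the values are placeholders.
SAMPLE = """db_host = example.invalid
db_user = app
db_password = "not-a-real-password"
timeout = 30
"""
```

Running `find_hardcoded_secrets(SAMPLE)` flags the third line only. The function reports where a credential appears without returning the value itself, so that the scan output does not become a second copy of the secret.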
163.References in the repository explain how the ‘Database of Truth’ was used by WPAi, a company which describes itself as “a leading provider of political intelligence for campaigns from President to Governor and U.S. Senate to Mayor and City Council in all 50 states and several foreign countries”.186 The repository also shows that WPAi worked with AIQ for the Osnova party in Ukraine.187 WPAi was described as a partner of AIQ.
164.With detailed information about voters available to AIQ, the company would have been able to create highly targeted ads on Facebook to reach potential voters. More specifically, they would have been able to use this information to target users by: age; gender; location, within a designated one-mile radius (using Facebook’s hyperlocal targeting); and race (in 2018, Facebook removed over 5,000 options that could have been used to exclude certain religious and ethnic minority groups).188
165.Chris Vickery uncovered a “config” file, which illustrated the interplay between AIQ, Cambridge Analytica, and right-wing news website Breitbart, run by the ultra-conservative campaigner Steve Bannon. A config file is a collection of settings that software refers to during execution, in order to fill in variables. It is a file that describes the preferences of the user on how a programme should run, but it can only ask for things that the programme knows how to do. As we said in the Interim Report, Steve Bannon served as White House chief strategist at the start of President Trump’s term, having previously served as Chief Executive of Trump’s election campaign. He was the Executive Chairman of Breitbart News, a website he described as ‘the platform of the alt-right’. He was also the former Vice President of Cambridge Analytica.189
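The general description of a config file given above can be illustrated with a short sketch. The section name, setting names and values below are hypothetical and bear no relation to the file found in the repository; the point is only that a program reads settings to fill in its variables and can act solely on the settings it already knows about.

```python
import configparser

# Settings that this hypothetical program knows how to use, with defaults.
KNOWN_SETTINGS = {"output_dir": "out", "batch_size": "100"}


def load_settings(config_text):
    """Fill in known settings from the config text and list any that are ignored."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    settings = dict(KNOWN_SETTINGS)
    ignored = []
    if parser.has_section("app"):
        for key, value in parser.items("app"):
            if key in settings:
                settings[key] = value  # the user's preference overrides the default
            else:
                ignored.append(key)  # the program has no behaviour for this key
    return settings, ignored


# Invented config text: one setting the program understands, one it does not.
EXAMPLE_CONFIG = """
[app]
batch_size = 250
make_coffee = yes
"""
```

With this input, `batch_size` is taken from the file, `output_dir` keeps its default, and `make_coffee` is reported as ignored because the program has no code that acts on it.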
166.There is clear evidence that there was a close working relationship between Cambridge Analytica, SCL and AIQ. There was certainly a contractual relationship, but we believe that the information revealed from the repository would imply something closer, with data exchanged between both AIQ and SCL, as well as between AIQ and Cambridge Analytica.
167.AIQ carried out online advertising work for Brexit-supporting organisations Vote Leave, Veterans for Britain, Be Leave and DUP Vote to Leave, who all, according to Jeff Silvester, approached AIQ independently of each other.190 The majority of the adverts—2,529 out of a total of 2,823—were created by AIQ on behalf of Vote Leave.191 The value of this advertising carried out by and paid to AIQ for the Brexit campaigns included:
168.AIQ’s written evidence denies that AIQ held individuals’ information relating to the EU referendum:
The information accessed by the security researcher is primarily software code, but also included contact information for supporters and voters from a few of our past clients. None of these files contained any individual financial, password or other sensitive information, and none of the personal information came from the Brexit campaign.193
169.This concurs with the work of the Federal Office of the Privacy Commissioner of Canada (OPC) and the Office of the Information and Privacy Commissioner of British Columbia (OIPC), which are conducting a joint investigation and, according to the ICO “have not yet made findings. […] they have advised us that they have not located any UK personal data, other than that identified within the scope of our enforcement notice.”194
170.We believe that AIQ handled, collected, stored and shared UK citizen data, in the context of their work on the EU referendum. There is an entire AIQ project area, “Brexit Sync”, devoted to synchronising UK and Brexit-relevant data, including individuals’ personal information, from multiple pro-Brexit client entities. The processing scripts contained in the AIQ repository also show that Facebook Account IDs were being harvested and attached to voter profiles for people living in the UK.
171.Mr Vickery told us that:
In the time since my testimony before the Committee, I have located a spreadsheet in the AIQ repository files containing the first name, last name, and email addresses for 1,438 apparently UK citizens. I am making the nationality assumption based upon the email address domain names (examples: yahoo.co.uk, btinternet.com, hotmail.co.uk, sky.com).195
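The domain-based assumption that Mr Vickery describes can be expressed as a simple check. The sketch below uses the four domains he gives as examples; treating any other address ending in “.uk” as apparently British is an additional assumption made here for illustration, and an email domain is in any case only a rough indicator of where a person lives.

```python
# Domains quoted in the evidence as examples.
UK_EXAMPLE_DOMAINS = {"yahoo.co.uk", "btinternet.com", "hotmail.co.uk", "sky.com"}


def looks_uk(email):
    """Apply the heuristic: a listed domain, or (assumption) any '.uk' domain."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return domain in UK_EXAMPLE_DOMAINS or domain.endswith(".uk")


def count_apparent_uk(emails):
    """Count addresses that the heuristic treats as apparently UK-based."""
    return sum(1 for email in emails if looks_uk(email))
```

For instance, `count_apparent_uk(["a@yahoo.co.uk", "b@example.org", "c@example.ac.uk"])` counts the first and third addresses and not the second.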
172.As we stated previously, AIQ used data scraped by Aleksandr Kogan to target voters in the US election. Therefore, AIQ had the capability to email potential voters during the EU referendum and also to target people via Facebook. By uploading the emails to Facebook to a “custom audience”, all the users whose emails were uploaded and matched the emails used to register accounts on Facebook could be precisely targeted via the platform.
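The matching step described above, in which uploaded email addresses are compared with those used to register accounts, can be sketched as follows. This is an illustration under stated assumptions rather than a description of Facebook’s internal process, which is not set out in the evidence: it assumes that addresses are trimmed, lower-cased and hashed with SHA-256 before comparison, and the account identifiers are invented.

```python
import hashlib


def normalise_and_hash(email):
    """Trim and lower-case an address, then return its SHA-256 hex digest."""
    cleaned = email.strip().lower()
    return hashlib.sha256(cleaned.encode("utf-8")).hexdigest()


def match_audience(uploaded_emails, registered_accounts):
    """Return the account IDs whose registered email hash appears in the upload.

    registered_accounts is a hypothetical mapping of email hash -> account ID.
    """
    uploaded_hashes = {normalise_and_hash(email) for email in uploaded_emails}
    return sorted(
        account_id
        for email_hash, account_id in registered_accounts.items()
        if email_hash in uploaded_hashes
    )
```

Only addresses that correspond to a registered account produce a match; an uploaded address with no matching account simply drops out of the resulting audience.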
173.In response to the Electoral Commission’s request for information concerning Vote Leave, Darren Grimes and Veterans for Britain, Facebook told the Electoral Commission in May 2018 that AIQ had made use of data file custom audiences—enabling AIQ to reach existing customers on Facebook or to reach users on Facebook who were not existing customers—website custom audiences and lookalike audiences.196 AIQ stated that it was an administrative error, which was quickly corrected.197
174.Furthermore, Facebook wrote to the Electoral Commission in May 2018, in response to a request for information connected with pro-Brexit campaign groups. The letter states that “SCL Elections is listed as the contact for at least one AIQ Facebook ad account”. The provided email address belongs to an SCL employee.198 No explanation has been given as to why this should be the case.
175.James Dipple-Johnstone, Deputy Information Commissioner, told us that the email addresses in the repository “came from other work that the company had done for UK companies and organisations and it had been retained by them following those other contracts that it had”.199 In July 2018, the ICO confirmed that AIQ had access to the personal data of UK voters, given by the Vote Leave campaign, and had established “that [AIQ] hold UK data which they should not continue to hold.”200 This data was discovered in the AIQ Gitlab repository that was presented to the ICO by the Committee. In October 2018, the ICO issued an Enforcement Notice, stating that “the Commissioner is satisfied that the controller has failed to comply with Articles 5(1)(a)-(c) and Article 6 of the GDPR”.201 Mr Dipple-Johnstone told us that the ICO “have asked them to delete that data as part of the enforcement notice”.202 AIQ had the capability to use the data scraped by Dr Kogan. We know that they did this during the US elections in 2014. Dr Kogan’s data also included UK citizens’ data and the question arises whether this was used during the EU referendum. We know from Facebook that data matching Dr Kogan’s was found in the data used by AIQ’s leave campaign audience files. Facebook believe that this is a coincidence, or, in the words of Mike Schroepfer, CTO of Facebook, an “effectively random chance”.203 It is not known whether the Kogan data was destroyed by AIQ.
176.Among the AggregateIQ repositories exposed are those relating to four pro-leave EU referendum campaign groups: ChangeBritain; Vote Leave; DUP; and VeteransForBritain:204
In July 2018, the Committee published Facebook adverts that had been run by AIQ during the EU referendum, illustrating that multiple adverts were being run and targeted by AIQ for different audiences. The series of PDFs highlighted adverts run by AIQ during the referendum on behalf of Vote Leave and the ‘50 Million’ prediction competition.
177.The £50 million competition was a data-harvesting initiative run for Vote Leave, which offered football fans the chance to win £50m. To enter the competition, fans had to input their name, address, email and telephone number, and also how they intended to vote in the Referendum. But working out what message to send to which audience was absolutely crucial. Screenshots published on our website prove that AIQ processed all the data from the £50 million football prediction contest that they hosted, and harvested Facebook IDs and emails from signups for the contest.205
178.Furthermore, a blog written by Dominic Cummings admitted that the competition was a data-harvesting exercise: “Data flowed in on the ground and was then analysed by the data science team and integrated with all the other data streaming in. This was the point of our £50m prize for predicting the results of the European football championships, which gathered data from people who usually ignore politics.”206 If people engaged with the quiz, their data was harvested. There is no evidence to show that this was fraudulent, but one could question whether data gathered in this way was ethical. Furthermore, the odds of winning the £50 million prize were estimated as one in 9.2 quintillion (billion, billion).207
179.The repository data submitted by Chris Vickery highlights the capabilities that AIQ had built for obtaining and using people’s personal data. It is unclear how extensively these tools were actually used, but they were obviously developed with the potential to extract and manipulate data. The inclusion of debugging logs within the repository shows that the tools were used, although the full extent of their use is not known.
180.Three machine learning pipelines were used to process both text and images. The software could be used to read photographs of people on websites, match them to their Facebook profiles, and then target advertising at these individual profiles.
181.The Facebook Pixel is a piece of code placed on websites. The Pixel can be used to register when Facebook users visit the site. Facebook can use the information gathered by the Pixel to allow advertisers to target Facebook users who have visited that given site. AIQ definitely utilised Pixels and other tools to help in data collection and targeting efforts.
182.For example, if a user visited a website during the referendum campaign that was using a Facebook tracking pixel, placed there by Vote Leave/AIQ, then that user could unknowingly be served adverts by that campaign through Facebook. The same user could also be served adverts by other leave campaigns if those campaigns had access to the same pixel data. This would be possible if all social adverts were being managed by the same entity for all campaigns (it is easy to share pixel information between different Facebook advertising campaigns). From the repository, it is clear that AIQ staff had access to more than one campaign.
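The sharing mechanism described above can be modelled in simplified form. The sketch below is a toy illustration and does not reproduce Facebook’s systems or any code from the repository: the class, the pixel identifiers, the campaign names and the user identifiers are all hypothetical.

```python
from collections import defaultdict


class PixelLog:
    """Hypothetical store of which users each tracking pixel has registered."""

    def __init__(self):
        self._visitors = defaultdict(set)

    def record_visit(self, pixel_id, user_id):
        """Register that a user loaded a page carrying the given pixel."""
        self._visitors[pixel_id].add(user_id)

    def audience(self, pixel_id):
        """Return a copy of the set of users seen by the given pixel."""
        return set(self._visitors.get(pixel_id, set()))


def targetable_users(log, campaign_pixels, campaign):
    """Return every user reachable by a campaign through the pixels it can access."""
    users = set()
    for pixel_id in campaign_pixels.get(campaign, []):
        users |= log.audience(pixel_id)
    return users
```

If two campaigns are both given access to the same pixel, each can target every visitor that pixel registered, even where a visitor only ever went to one campaign’s website. A campaign with access to no shared pixel reaches none of them.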
183.Chris Vickery, based on his analysis of the contents of the Gitlab repository, believes that AIQ’s capabilities went much further:
“AIQ harvested, and attached to profiles, the Facebook IDs of registered UK voters who had a Facebook account. This data was all synchronized (matched together from many data sources) and collected through the platform offered by Nationbuilder.com. […] I have the synchronization scripts showing the aggregation on that platform as well as the names of databases controlled directly by AggregateIQ containing this data (along with the usernames and passwords to access the databases)”.208
184.Facebook told us of the number of different tools that it provides that third parties can choose to integrate into their websites or other products. We asked Mike Schroepfer, Chief Technology Officer for Facebook, for the percentage of sites on the internet on which Facebook tracks users.209 He did not provide an adequate answer, so we asked again in writing.
185.In its subsequent letter, Facebook again failed to give a figure, but did give examples of other tools that it uses to track users, for example, social plugins (software that enables a customised service) such as the Like button and Share button. Facebook told us that these plugins “enrich users’ experience of Facebook by allowing them to see what their Facebook friends have liked, shared, or commented on across the Web”.210 Such plugins also benefit Facebook, as it receives information when a site with the plugin is visited. Its servers log: that a device visited a website or app; and any additional information about the person’s activities on that site that the website chooses to share with Facebook. Facebook told us that, between 9 April and 16 April 2018, the Facebook Like button appeared on 8.4 million websites, the Share button appeared on 931,000 websites, and there were 2.2 million Facebook Pixels installed on websites.211
186.Given that AIQ maintained several British-interest websites, it would have been easy to install Facebook pixels on each of the sites that AIQ built and then to share the information from one campaign with another. Beyond the building of websites for UK-based campaigns, there is evidence of wider campaign tools being built and deployed to target UK voters. For example, a folder called ‘ChangeBritain-MailSend-master’ contains a library of applications as well as a folder called ‘test’. Within this folder is a document that appears to be a template letter for voters to send to their MPs, encouraging them to vote for the triggering of Article 50 if there were a parliamentary vote.
187.There is a data scraper tool, within the repository, which has the capacity to extract data from LinkedIn. There is a folder called ‘LinkedIn-person-fondler-master’, which is an application that scrapes LinkedIn user data. Within the repository is a file containing information on 92,000 individuals on LinkedIn. These names could then have been used to gather each user’s location, position and place of work via the LinkedIn scraper tool. Using Facebook’s ad targeting, AIQ would then have been able to reach these users by targeting locations, places of work and job positions.
188.The ‘LInbot.py’ (LinkedIn bot) script contains commentary from whoever wrote it explaining that it scrapes LinkedIn accounts. There is even a stray log in the same directory suggesting that this bot was run at least from October 8th, 2017 to October 19th, 2017. Scraping data from LinkedIn in this manner violates LinkedIn’s User Agreement, which states: “we don’t permit the use of any third party software, including “crawlers”, bots, browser plug-ins, or browser extensions (also called “add-ons”), that scrape”.212
189.AIQ’s lawyers, Borden Ladner Gervais, wrote to the Committee on 20 September 2018, stating that “AggregateIQ developed a tool to search for users on LinkedIn and open their profile in such a way as to appear as though a candidate in their local election looked at their profile. This was not a scraping tool. This tool was never deployed”.213
190.However, according to Chris Vickery, the commentary within the tool explicitly claims to scrape data: “It is not even nuanced”. Sophisticated data matching between LinkedIn and Facebook, when combined with a detailed database of scraped contacts, could have been used by AIQ to give their political clients a major edge in running highly targeted political adverts, when the targets of those adverts had not consented to their data being used in this way. We believe that AIQ certainly developed a tool on LinkedIn that was intended to scrape data from the social network.
191.The ICO Report, “Investigation into the use of data analytics in political campaigns”, highlighted its work on investigating the relationship between Cambridge Analytica, SCLE and AIQ,214 describing “a permeability between the companies above and beyond what would normally be expected to be seen”.215 The ICO states that broader concerns about the close collaboration of the companies are understandable and cites the following financial transactions and contacts:
On 24 October 2014, SCLE Elections Limited made payments to Facebook of approximately $270,000 for an AIQ ad account; on 4 November 2014, SCLE made a payment of $14,000 for the same AIQ ad account; A refund for unused AIQ ads was later made to SCLE, with the explanation that SCLE had made pre-payments for its campaigns under AIQ. SCLE was listed as one of the main contacts for at least one of the AIQ Facebook accounts, and the email address for that contact belonged to an SCLE employee who was also involved in a number of payments.216
However, the ICO’s investigations showed that, while there was a close working relationship, “we have no evidence that AIQ has been anything other than a separate legal entity”.217
192.From the files obtained by Chris Vickery, and from evidence we received, there seems to be more to the AIQ/Cambridge Analytica/SCL relationship than is usually seen in a strictly contractual relationship. AIQ worked on both the US Presidential primaries and for Brexit-related organisations, including the designated Vote Leave group, during the EU Referendum. The work of AIQ highlights the fact that data has been and is still being used extensively by private companies to target people, often in a political context, in order to influence their decisions. It is far more common than people think. The next chapter highlights the widespread nature of this targeting.
162 Disinformation and ‘fake news’: Interim Report, DCMS Committee, Fifth Report of Session 2017–19, HC 363, 29 July 2018, para 117.
163 Investigation into the use of data analytics in political campaigns, A report to Parliament, ICO, 6 November 2018.
166 Q2784. As we said in para 110 of the Interim Report, in August 2014, Dr Kogan worked with SCL to provide data on individual voters to support US candidates being promoted by the John Bolton Super Pac in the mid-term elections in November of that year. Psychographic profiling was used to micro-target adverts at voters across five distinct personality groups.
167 Letter from Rebecca Stimson, Facebook to Louise Edwards, The Electoral Commission, 14 May 2018, p1.
168 The Aggregate IQ Files: Part One: How a political engineering firm exposed their code base, UpGuard, 30 April 2018; Part Two: The Brexit Connection, UpGuard, 30 April 2018; Part Three: A Monarch, a Peasant, and a Saga, 30 April 2018; Part Four: Northwest passage, 1 November 2018.
169 Gitlab is an online platform on which developers write and share code.
170 Chris Vickery oral evidence session, 2 May 2018.
171 The Aggregate IQ Files, Part One: How a political engineering firm exposed their code base, 30 April 2018.
174 Investigation into data analytics for political purposes: investigation update, ICO, July 2018, p4.
175 89up website.
179 Disinformation and ‘fake news’: Interim Report, DCMS Committee, Fifth Report of Session 2017–19, HC 363, para 117.
180 As above, p42.
181 Letter from Borden Ladner Gervais LLP to Damian Collins MP, 20 September 2018.
182 Evidence on Tuesday 12 June 2018, Standing Committee on access to information, privacy and ethics, Parliament of Canada, Q1045.
183 Same as above.
184 Github is a web-based hosting service.
185 In the repository, there is access to the search results only, so the number of users is unknown.
186 WPAi website, accessed 1 February 2019.
187 The Osnova party will be discussed further in Chapter 7.
188 Keeping advertising safe and civil, Facebook blog post, 21 August 2018.
189 Disinformation and ‘fake news’: Interim Report, DCMS Committee, Fifth Report of Session 2017–19, HC 363, para 96.
191 Investigation into the use of data analytics in political campaigns, a report to Parliament, 6 November 2018, para 3.6.
194 ICO, p42.
195 Correspondence between the Committee and Chris Vickery, 22 January 2019.
196 Facebook’s explanations of the different custom audiences can be found here: Letter from Rebecca Stimson, Facebook to Louise Edwards, The Electoral Commission, 14 May 2018, p2.
197 Letter from Borden Ladner Gervais to the Committee, re testimony of AggregateIQ Data Services Limited before the DCMS Committee, 20 September 2018.
198 Letter from Rebecca Stimson, Facebook to Louise Edwards, The Electoral Commission, 14 May 2018, p4.
200 Disinformation and ‘fake news’: Interim Report, DCMS Committee, Fifth Report of Session 2017–19, HC 363, 29 July 2018, para 119.
204 Change Britain was founded as a successor to the Vote Leave campaign.
205 Relevant screenshots of AIQ repository, submitted by Chris Vickery
207 Vote Leave launches £50m football prediction competition, Andrew Sparrow, The Guardian, 27 May 2016.
208 Written evidence, submitted by Chris Vickery
211 Same as above.
212 Prohibitive software and extensions, LinkedIn’s terms of agreement, accessed 2 December 2018.
213 Letter from Borden Ladner Gervais LLP to Damian Collins MP, 20 September 2018.
214 Investigation into the use of data analytics in political campaigning: a report to Parliament, ICO, 6 November 2018, p40–43.
215 Investigation into the use of data analytics in political campaigning: a report to Parliament, ICO, 6 November 2018, p40.
216 Same as above, p41.
217 Investigation into the use of data analytics in political campaigning: a report to Parliament, ICO, 6 November 2018, p41.
Published: 18 February 2019