108.The nature, likelihood and impact of risks arising from large language models (LLMs) remains subject to much debate. The complexity stems in part from the extensive literature,173 lack of agreed definitions, hype around rapid developments,174 and the possibility that some organisations may have interests in emphasising or downplaying risk.175
109.This chapter examines a selection of security and societal risks.176 We sought to distinguish hype from reality and provide some reference points to ground our review. We found credible evidence of both immediate and longer-term risks from LLMs to security, financial stability and societal values.
110.The first section of this chapter sets out our understanding of risk categories. The next section sets out near-term security risks that require immediate attention, followed by a discussion on longer-term concerns around catastrophic risk and then existential risk. Near-term societal risks such as bias and discrimination are discussed at the end of the chapter.
111.There are numerous frameworks for evaluating risk used by domestic and international authorities.177 We found little consistency in terms or methods across the literature.178 We adopt the framework from the Government’s National Risk Register (NRR), set out in the table below, to help describe impacts of LLM-related security risks. Our categorisation is approximate only and we do not attempt to replicate the full National Security Risk Assessment process. It nevertheless provides a helpful yardstick to anchor discussion using a recognised framework.179 This table does not cover existential risk, which we describe as a separate category later in this chapter.
Risk Level |
Fatalities |
Casualties |
Economic impact |
Minor |
1–8 |
1–17 |
£ millions |
Limited |
9–40 |
18–80 |
£ tens of millions |
Moderate |
41-200 |
81-400 |
£ hundreds of millions |
Significant |
201–1000 |
400–2000 |
£ billions |
Catastrophic |
More than 1,000 |
More than 2,000 |
£ tens of billions |
Source: HM Government, National Risk Register (2023): https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1175834/2023_NATIONAL_RISK_REGISTER_NRR.pdf [accessed 20 December 2023]
112.There are also various ways of categorising societal risk and conducting impact assessments.180 We draw on these to inform our review of societal risk, noting that the issues are highly context-dependent.
113.Risks may arise from both open and closed models, for example through:
114.Our evidence was clear that LLMs will act as a force multiplier enhancing malicious capabilities in the first instance, rather than introducing qualitatively new risks.182 Most models have some safeguards but these are not robust and can be circumvented.183 We believe the most immediate security risks over the next three years are likely to include the (non-exhaustive) list below, with indicative impacts ranging from minor to moderate, rather than catastrophic.
115.Cyber: LLMs are likely to be of interest to hostile states, organised crime, and low-sophistication actors.184 Some LLMs are reportedly being developed to create code for cyber attacks at increased scale and pace.185 LLMs and multi-modal models will make it easier to create phishing campaigns, fraudulent websites and voice cloning to bypass security protocols.186 Malicious actors may use prompt injection attacks to obtain sensitive information, or target models themselves to influence the outputs, poison training data or induce system malfunction.187 Current security standards are unlikely to withstand attacks from sophisticated threat actors.188
116.Tools to mass produce high quality and openly available destructive cyber weapons appear limited at present. Chris Anley, Chief Scientist at the cyber security firm NCC Group, said LLMs currently provide efficiency and lower barriers to entry, rather than game-changing capability leaps.189 Even moderate gains could however prove costly when deployed against under-prepared systems, as previous attacks on the NHS have shown.190
117.A reasonable worst case scenario might involve malicious actors using LLMs to produce attacks achieving higher cyber infection rates in critical public services or national infrastructure.191
118.Terrorism: A recent report by Europol found that LLM capabilities are useful for terrorism and propaganda.192 Options include generating and automating multilingual translation of propaganda, and instructions for committing acts of terror.193 In future, openly available models might be fine-tuned to provide more specific hate speech or terrorist content capabilities, perhaps using archives of propaganda and instruction manuals.194 The leak of Meta’s model (called LLaMa) on 4chan, a controversial online platform, is instructive. Users reportedly customised it within two weeks to produce hate speech chatbots, and evaded take-down notices.195
119.National Statistics data show 93 victim deaths due to terrorism in England and Wales between April 2003 and 31 March 2021.196 A reasonable worst case scenario might involve a rise in attacks directly attributable to LLM-generated propaganda or made possible through LLM-generated instructions for building weapons.197
120.Synthetic child sexual abuse material: Image generation models are already being used to generate realistic child sexual abuse material (CSAM).198 The Stanford Internet Observatory predicts that in under a year “it will become significantly easier to generate adult images that are indistinguishable from actual images”.199 The Internet Watch Foundation has confirmed this is “happening right now”,200 and stated legal software can be downloaded and used offline to produce illegal content “with no opportunity for detection”.201 This suggests more abuse imagery will be in circulation, law enforcement agencies may find it more difficult to identify and help real-world victims, and opportunities to groom and coerce vulnerable individuals will grow.202
121.AI CSAM currently represents a small proportion of the total amount of CSAM (reportedly 255,000 webpages last year with potentially millions of images).203 A reasonable worst case scenario might involve widespread availability of illegal materials which overwhelms law enforcement ability to respond.204
122.Mis/disinformation: LLMs are well placed to generate text-based disinformation at previously unfeasible scale, while multi-modal models can create audio and visual deepfakes which even experts find increasingly difficult to identify.205 LLMs’ propensity to hallucinate also means they can unintentionally misinform users.206 The National Cyber Security Centre assesses that large language models will “almost certainly be used to generate fabricated content; that hyper-realistic bots will make the spread of disinformation easier; and that deepfake campaigns are likely to become more advanced in the run up to the next nationwide vote, scheduled to take place by January 2025”.207
123.Professor Dame Angela McLean, Government Chief Scientific Adviser, said she was “extremely worried” and called for a public awareness campaign.208 Dr Jean Innes, CEO of the Alan Turing Institute, similarly warned about “mass disinformation”.209 Professor Phil Blunsom, Chief Scientist at Cohere, likewise highlighted “disinformation [and] election security” as issues of concern.210
124.Lyric Jain, CEO of the counter-disinformation firm Logically, said one of the main impacts of generative AI was increased efficiency and lower costs. He estimated the Internet Research Agency’s disinformation campaign targeting the US 2016 election cost at least $10 million,211 whereas generating comparable disinformation materials could now be done for $1,000 by private individuals. He further noted that model safeguards were preventing only 15 per cent of disinformation-related prompts.212
125.A reasonable worst case scenario might involve state and non-state interference undermining confidence in the integrity of a national election, and long-term disagreement about the validity of the result.213
126.A range of mitigation work is underway across Government and industry. The main issue remains scale and speed: malicious actors enjoy first-mover advantages whereas it will take time to upgrade public and private sector mitigations, including public awareness.214 And as the Government’s AI Safety Summit paper noted, there are limited market incentives to provide safety guardrails and no standardised safety benchmarks.215
127.We wrote to the Government seeking more information. It declined to provide details on whether mitigations were being expanded. But it did confirm workstreams included:
128.The most immediate security concerns from LLMs come from making existing malicious activities easier, rather than qualitatively new risks. The Government should work with industry at pace to scale existing mitigations in the areas of cyber security (including systems vulnerable to voice cloning), child sexual abuse material, counter-terror, and counter-disinformation. It should set out progress and future plans in response to this report, with a particular focus on disinformation in the context of upcoming elections.
129.The Government has made welcome progress on understanding AI risks and catalysing international co-operation. There is however no publicly agreed assessment framework and shared terminology is limited. It is therefore difficult to judge the magnitude of the issues and priorities. The Government should publish an AI risk taxonomy and risk register. It would be helpful for this to be aligned with the National Security Risk Assessment.
130.Catastrophic risks might arise from the deployment of a model with highly advanced capabilities without sufficient safeguards.217 As outlined in the previous table, indicative impacts might involve over 1,000 fatalities, 2,000 casualties and/or financial damages exceeding £10 billion.
131.There are threat models of varying plausibility.218 The majority of our evidence suggests these are less likely within the next three years but should not be ruled out—particularly as the capabilities of next-generation models become clearer and open access models more widespread.219 We outline some of the most plausible risks below.
132.Biological or chemical release: A model might be used to lower the barriers to malicious actors creating and releasing a chemical or biological agent. There is evidence that LLMs can already identify pandemic-class pathogens, explain how to engineer them, and even suggest suppliers who are unlikely to raise security alerts.220 Such capabilities may be attractive to sophisticated terror groups, non-state armed groups, and hostile states. This scenario would still require a degree of expertise, access to requisite materials and, probably, sophisticated facilities.221
133.Destructive cyber tools: Next generation LLMs and more extensive fine tuning may yield models capable of much more advanced malicious activity.222 These may be integrated into systems capable of autonomous self-improvement and a degree of replication.223 Such advances would raise the possibility of advanced language model agents navigating the internet semi-autonomously, performing sophisticated exploits, using resources such as payment systems, and generating snowball effects created by self-improvement techniques.224 Recent research suggests such capabilities do not yet exist, though progress on the component parts of such tools is already underway and capability leaps cannot be ruled out.225
134.Critical infrastructure failure: Models may in time be linked to systems powering critical national infrastructure (CNI) such as water, gas and electricity transmission, or security platforms (for example in military planning or intelligence analysis systems). This might occur either through direct integration of models with the infrastructure platform itself, or through software used in the supply chain.226 In the absence of safeguards, a sudden model failure may trigger a CNI outage or sudden security lapse, and could be extremely difficult to rectify given the black-box nature of LLM processes.
135.Professor Dame Angela McLean, Government Chief Scientific Adviser, confirmed that there were no agreed warning indicators for catastrophic risk. She said warning indicators for pandemics and similar were well understood, but:
“we do not have that spelled out for the more catastrophic versions of these risks. That is part of the work of the AI Safety Institute: to make better descriptions of things that might go wrong, and scientific descriptions of how we would measure that.”227
136.OpenAI told us work was underway to evaluate “dangerous capabilities” and appropriate safety features but noted “science-based measurements of frontier system risks … are still nascent”.228
137.Professor John McDermid OBE, Professor of Safety-Critical Systems at the University of York, said industries like civil aviation designed software with fault-detection in mind so that sudden failures could be fixed with speed and confidence.229 He did not believe such safety-critical system analysis was possible yet for LLMs and believed it should be a research priority.230
138.Professor Stuart Russell OBE, Professor of Computer Science at the University of California, Berkeley, was sceptical that the biggest safety challenges could be addressed without fundamental design changes. He noted that high-stakes industries like nuclear power had to show the likelihood of sudden catastrophic failure rates, which LLM developers could not. He also noted it was straightforward to bypass a model’s safety guardrails by prefixing a harmful question with something unintelligible to confuse it, and maintained that:
“The security methods that exist are ineffective and they come from an approach that is basically trying to make AI systems safe as opposed to trying to make safe AI systems. It just does not work to do it after the fact”.231
139.Ian Hogarth, Chair of the (then) Frontier AI Taskforce, told us that the Government took catastrophic risk very seriously. Viscount Camrose, Minister for AI and Intellectual Property, said the AI Safety Institute was focusing on frontier AI safety and driving “foundational” research.232
140.Catastrophic risks resulting in thousands of UK fatalities and tens of billions in financial damages are not likely within three years, though this cannot be ruled out as next generation capabilities become clearer and open access models more widespread.
141.There are however no warning indicators for a rapid and uncontrollable escalation of capabilities resulting in catastrophic risk. There is no cause for panic, but the implications of this intelligence blind spot deserve sober consideration.
142.The AI Safety Institute should publish an assessment of engineering pathways to catastrophic risk and warning indicators as an immediate priority. It should then set out plans for developing scalable mitigations. (We set out recommendations on powers and take-down requirements in Chapter 7). The Institute should further set out options for encouraging developers to build systems that are safe by design, rather than focusing on retrospective guardrails.
143.There is a clear trend towards faster development, release and customisation of increasingly capable open access models.233 Some can already be trained in just 6 hours and cost a few hundred dollars on public cloud computing platforms.234
144.We heard widespread concern about the ease of customisation leading to a rapid and uncontrollable proliferation of models which may be exploited by malicious actors, or contain safety defects affecting businesses and service users.235
145.Google DeepMind told us that that “once a model is openly available, it is possible to circumvent any safeguards, and the proliferation of capabilities is irreversible.”236 There is no ‘undo’ function if major safety or legal compliance issues subsequently emerge,237 and no central registry to determine model provenance once released. It may be possible to embed identifying features in models to help track them, though such research remains at an early stage.238 The Royal Academy of Engineering emphasised that many models will be hosted overseas, posing major challenges to oversight and regulation.239
146.As we set out in Chapter 3, open access models can provide speedy community-led improvements, including to security issues, but those same characteristics can also drive proliferation in malicious use.240
147.Closed models are not a security panacea, however. Previous breaches from hack and leak operations, espionage and disgruntled employees suggest that even well-protected systems may not remain closed forever.241 The Minister said the AI Safety Institute was working on the issues but believed the risks around open access proliferation remained an “extremely complex problem”.242
148.There is a credible security risk from the rapid and uncontrollable proliferation of highly capable openly available models which may be misused or malfunction. Banning them entirely would be disproportionate and likely ineffective. But a concerted effort is needed to monitor and mitigate the cumulative impacts. The AI Safety Institute should develop new ways to identify and track models once released, standardise expectations of documentation, and review the extent to which it is safe for some types of model to publish the underlying software code, weights and training data.
149.The threat model for existential risk remains highly disputed. A baseline scenario involves the gradual integration of hyper intelligent AI into high-impact systems to achieve political, economic or military advantage, followed by loss of human control. This might occur because humans gradually hand over control to highly capable systems that vastly exceed our understanding; and/or the AI system pursues goals which are not aligned with human welfare and reduce human agency.243 Humans might also increasingly rely on AI evaluations in high-stakes areas such as nuclear strategy, for example.244
150.Long-term indicative impacts have been compared to outcomes in other fields, including pandemics and nuclear.245 At the most extreme end, the first- and second-order consequences of uncontrolled nuclear exchange between superpowers have been variously estimated at 2–5 billion fatalities.246 A biosecurity extinction event might involve above 7 billion fatalities.247
151.Systems capable of posing such risks do not yet exist and there is no consensus about their long-term likelihood. Professor Phil Blunsom, Chief Scientist at the LLM firm Cohere, did “not see a strong existential risk from large language models”.248
152.Professor Stuart Russell OBE argued that “large language models are not on the direct path to the super intelligent system … but they are a piece of the puzzle”. He maintained current systems lacked features including “the ability to construct and execute long-term plans, which seems to be a prerequisite” to overcome human resistance, but “could not say with any certainty that it will take more than 20 years” for researchers to address those shortcomings.249
153.Some surveys of industry respondents predict a 10 per cent chance of human-level intelligence by 2035, while others say such developments are not likely and do not believe it is a concern.250 Researchers at the Oxford Internet Institute emphasised that current capabilities were “meaningfully different” to those required for existential risk.251 Owen Larter, Director of Public Policy at Microsoft’s Office for Responsible AI, anticipated a “further maturation of AI safety” in the coming years.252
154.This indicates a non-zero likelihood (remote chance) of existential risks materialising, though it is almost certain that these will not occur within the next three years and it seems highly likely that they will not materialise within the next decade. We note the possibility and (longer-term) timing remains a matter of debate and concern for some in the expert community.253 Several stakeholders suggested concerns about existential risk were distracting from efforts to address limited but more immediate risks,254 as well as from the opportunities LLMs may provide.255
155.It is almost certain existential risks will not manifest within three years and highly likely not within the next decade. As our understanding of this technology grows and responsible development increases, we hope concerns about existential risk will decline. The Government retains a duty to monitor all eventualities. But this must not distract it from capitalising on opportunities and addressing more limited immediate risks.
156.LLMs may amplify any number of existing societal problems, including inequality, environmental harm, declining human agency and routes for redress, digital divides, loss of privacy, economic displacement, and growing concentrations of power.256
157.Bias and discrimination are particular concerns, as LLM training data is likely to reflect either direct biases or underlying inequalities.257 Depending on the use, this might entrench discrimination (for example in recruitment practices, credit scoring or predictive policing); sway political opinion (if using a system to identify and rank news stories); or lead to casualties (if AI systematically misdiagnoses healthcare patients from minority groups).258 Professor Neil Lawrence cautioned that emergent societal risks could arise in unforeseen ways from mass deployment, as has been the case with social media.259
158.Such issues predate LLMs but, as Sense About Science warned, economic logic is driving competition for early adoption of LLMs before adequate guardrails are in place.260 The Post Office Horizon scandal provides a cautionary tale about the risks of relying on faulty technology systems.261
159.We heard that longstanding recommendations remain pertinent: educate developers and users, and embed explainability, transparency, accuracy and accountability throughout the AI lifecycle.262 This appears particularly difficult for LLMs. They are very complex and poorly understood; operate black-box decision-making; datasets are so large that meaningful transparency is difficult; hallucinations are common;263 and accountability remains highly disputed.264
160.Irene Solaiman, Head of Global Policy at Hugging Face, said efforts to improve model design and post-deployment practices were underway, but emphasised “how difficult, and frankly impossible, complex social issues are to quantify or to robustly evaluate”.265 Dr Koshiyama, CEO of the audit firm Holistic AI, noted there were limited market incentives to prioritise ethics, and said many earlier AI systems had well-known bias problems but remained in widespread use.266 Some jurisdictions are introducing mandatory ethics impact assessments.267 Sam Cannicott, Deputy Director of AI Enablers and Institutions at DSIT, said the AI Safety Institute would examine “societal harms” and would engage professional ethicists in its work.268
161.LLMs may amplify numerous existing societal problems and are particularly prone to discrimination and bias. The economic impetus to use them before adequate guardrails have been developed risks deepening inequality.
162.The AI Safety Institute should develop robust techniques to identify and mitigate societal risks. The Government’s AI risk register should include a range of societal risks, developed in consultation with civil society. DSIT should also use its White Paper response to propose market-oriented measures which incentivise ethical development from the outset, rather than retrospective guardrails. Options include using Government procurement and accredited standards, as set out in Chapter 7.
163.LLMs may have personal data in their training sets, drawn from proprietary sources or information online. Safeguards to prevent inappropriate regurgitation are being developed but are not robust.269
164.Arnav Joshi, Senior Associate at Clifford Chance, did not believe there was currently widespread non-compliance with data protection legislation but thought “that might happen [ … without] sufficient guardrails”.270 He said the General Data Protection Regulation (GDPR) provided “an incredibly powerful tool” to guide responsible innovation, but noted measures in the Data Protection and Digital Information Bill would, if enacted, have a “dilutive effect on rightsholders”, for example around rights to contest decisions made by AI.271
165.Data protection in healthcare will attract particular scrutiny. Some firms are already using the technology on NHS data, which may yield major benefits.272 But equally, models cannot easily unlearn data, including protected personal data.273 There may be concerns about these businesses being acquired by large overseas corporations involved in related areas, for example insurance or credit scoring.274
166.Stephen Almond, Executive Director at the Information Commissioner’s Office, told us data protection was complex and much depended on who was doing the processing, why, how and where. He said the ICO would “clarify our rules on this and our interpretation of the law to ensure that it is crystal clear”.275
167.Further clarity on data protection law is needed. The Information Commissioner’s Office should work with DSIT to provide clear guidance on how data protection law applies to the complexity of LLM processes, including the extent to which individuals can seek redress if a model has already been trained on their data and released.
168.The Department for Health and Social Care should work with NHS bodies to ensure future proof data protection provisions are embedded in licensing terms. This would help reassure patients given the possibility of LLM businesses working with NHS data being acquired by overseas corporations.
173 Our analysis draws on evidence submitted to this inquiry alongside Government publications, industry assessments, academic reviews and stakeholder engagements.
174 MIT Technology Review, ‘AI hype is built on high test scores’ (30 August 2023): https://www.technologyreview.com/2023/08/30/1078670/large-language-models-arent-people-lets-stop-testing-them-like-they-were/ [accessed 20 December 2023]
175 ‘How the UK’s emphasis on apocalyptic AI risk helps business’, The Guardian (31 October 2023): https://www.theguardian.com/technology/2023/oct/31/uk-ai-summit-tech-regulation [accessed 20 December 2023]
176 The distinction is made here for ease of analysis, noting that many of the risks and outcomes overlap. We describe bias as a societal risk, though a biased LLM used for defence-related decision-making might introduce security risks. Similarly a poorly calibrated LLM used in healthcare might result in fatalities. Our assessments are indicative only.
177 For a discussion on determining acceptable fatality rates see written evidence from Matthew Feeney (LLM047). For frameworks on risk see for example the US National Institute of Standards and Technology, Artificial Intelligence Risk Management Framework (January 2023): https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf [accessed 20 December 2023] and European Commission, ‘Regulatory framework proposal on artificial intelligence’: https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai [accessed 20 December 2023]. See also National Cyber Security Centre, ‘Guidelines for secure AI System development’ (November 2023): https://www.ncsc.gov.uk/collection/guidelines-secure-ai-system-development [accessed 8 January 2024].
178 See for example the AI Safety Summit discussion paper, alongside Annex A and Annex B, available at DSIT, ‘Frontier AI’ (25 October 2023): https://www.gov.uk/government/publications/frontier-ai-capabilities-and-risks-discussion-paper [accessed 8 January 2024], ‘The Bletchley Declaration by Countries Attending the AI Safety Summit’ (1 November 2023): https://www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration/the-bletchley-declaration-by-countries-attending-the-ai-safety-summit-1-2-november-2023 [accessed 8 January 2024], ‘Introducing the AI Safety Institute’ (2 November 2023): https://www.gov.uk/government/publications/ai-safety-institute-overview/introducing-the-ai-safety-institute [accessed 8 January 2024], Department for Digital, Culture, Media and Sport, National AI Strategy, Cp 525 (September 2021): https://assets.publishing.service.gov.uk/media/614db4d1e90e077a2cbdf3c4/National_AI_Strategy_-_PDF_version.pdf [accessed 20 December 2023] and National Institute of Standards and Technology, Artificial Intelligence Risk Management Framework (January 2023): https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf [accessed 8 January 2023].
179 Note the NRR evaluation timeframe is assessed over two years for malicious risks and five years for non-malicious risks. We acknowledge AI may be treated as both a chronic and acute risk.
180 See for example Cabinet Office, ‘Ethics, Transparency and Accountability Framework for Automated Decision-Making’ (November 2023): https://www.gov.uk/government/publications/ethics-transparency-and-accountability-framework-for-automated-decision-making/ethics-transparency-and-accountability-framework-for-automated-decision-making [accessed 20 December 2023], Central Digital and Data Office, ‘Data Ethics Framework’ (September 2020): https://www.gov.uk/government/publications/data-ethics-framework/data-ethics-framework-2020 [accessed 20 December 2023], CDEI, ‘Review into bias in algorithmic decision-making’ (November 2020): https://www.gov.uk/government/publications/cdei-publishes-review-into-bias-in-algorithmic-decision-making [accessed 20 December 2023], Information Commissioner’s Office, ‘Data protection impact assessments’: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/accountability-and-governance/guide-to-accountability-and-governance/accountability-and-governance/data-protection-impact-assessments/ [accessed 20 December 2023] and House of Commons Library, ‘The Public Sector Equality Duty and Equality Impact Assessments’, Research Briefing SN06591, July 2020.
181 Written evidence from the Alan Turing Institute (LLM0081), Martin Hosken (LLM0009), Royal Academy of Engineering (LLM0063) and DSIT, ‘Frontier AI’ (25 October 2023): https://www.gov.uk/government/publications/frontier-ai-capabilities-and-risks-discussion-paper [accessed 8 January 2024]
182 Q 27 (Professor Phil Blunsom), Q 24 (Chris Anley), Q 24 (Lyric Jain), written evidence from Ofcom (LLM0104), Competition and Markets Authority (LLM0100), Financial Conduct Authority (LLM0102), Open Data Institute (LLM0083), Alan Turing Institute (LLM0081) and HM Government, Safety and Security Risks of Generative Artificial Intelligence to 2025 (2023): https://assets.publishing.service.gov.uk/media/653932db80884d0013f71b15/generative-ai-safety-security-risks-2025-annex-b.pdf [accessed 21 December 2023]
183 Q 26 (Lyric Jain) and ‘GPT-4 gave advice on planning terrorist attacks when asked in Zulu’, New Scientist (October 2023): https://www.newscientist.com/article/2398656-gpt-4-gave-advice-on-planning-terrorist-attacks-when-asked-in-zulu/ [accessed 20 December 2023]
184 Written evidence from NCC Group (LLM0014), Q 22 (Professor Phil Blunsom) and NCSC, ‘Annual Review 2023’ (2023): https://www.ncsc.gov.uk/collection/annual-review-2023/technology/case-study-cyber-security-ai [accessed 20 December 2023]
185 Check Point Research, ‘OPWNAI: cyber criminals starting to use ChatGPT’ (January 2023): https://research.checkpoint.com/2023/opwnai-cybercriminals-starting-to-use-chatgpt/ [accessed 20 December 2023] and ‘WormGPT: AI tool designed to help cybercriminals will let hackers develop attacks on large scale, experts warn’, Sky (September 2023): https://news.sky.com/story/wormgpt-ai-tool-designed-to-help-cybercriminals-will-let-hackers-develop-attacks-on-large-scale-experts-warn-12964220 [accessed 20 December 2023]
187 A prompt injection involves entering a text prompt into an LLM which then enables the actor to bypass safety protocols. See written evidence from NCC Group (LLM0014), Q 24 (Chris Anley) and National Cyber Security Centre, ‘Exercise caution when building off LLMs’ (August 2023): https://www.ncsc.gov.uk/blog-post/exercise-caution-building-off-llms [accessed 20 December 2023].
188 DSIT, Capabilities and risks from frontier AI (October 2023), p 18: https://assets.publishing.service.gov.uk/media/65395abae6c968000daa9b25/frontier-ai-capabilities-risks-report.pdf [accessed 20 December 2023]
190 The 2017 WannaCry cyber-attack for example affected 30 per cent of NHS Trusts, costing £92 million. See ‘Cost of WannaCry cyber-attack to the NHS revealed’, Sky, 11 October 2018: https://news.sky.com/story/cost-of-wannacry-cyber-attack-to-the-nhs-revealed-11523784 [accessed 20 December 2023].
191 Cabinet Office, ‘National Risk Register’ (2023), p 15: https://www.gov.uk/government/publications/national-risk-register-2023 [accessed 20 December 2023]
192 EUROPOL, ChatGPT—The impact of Large Language Models on Law Enforcement (March 2023): https://www.europol.europa.eu/cms/sites/default/files/documents/Tech%20Watch%20Flash%20-%20The%20Impact%20of%20Large%20Language%20Models%20on%20Law%20Enforcement.pdf [accessed 20 December 2023]
193 Tech Against Terrorism, ‘Early Terrorist Adoption of Generative AI’ (November 2023): https://techagainstterrorism.org/news/early-terrorist-adoption-of-generative-ai [accessed 20 December 2023]
194 Global Network on Extremism and Technology, ‘‘RedPilled AI’: A New Weapon for Online Radicalisation on 4chan’ (June 2023): https://gnet-research.org/2023/06/07/redpilled-ai-a-new-weapon-for-online-radicalisation-on-4chan/ [accessed 20 December 2023]
195 Ibid.
196 House of Commons Library, ‘Terrorism in Great Britain: the statistics’, Research Briefing, CBP7613, 19 July 2022
197 HM Government, National Risk Register 2023 Edition (2023): https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1175834/2023_NATIONAL_RISK_REGISTER_NRR.pdf [accessed 20 December 2023]. See section on terrorism pp 30–54.
199 David Thiel, Melissa Stroebel and Rebecca Portnoff, Generative ML and CSAM: Implications and Mitigations (June 2023): https://stacks.stanford.edu/file/druid:jv206yg3793/20230624-sio-cg-csam-report.pdf [accessed 21 December 2023]
200 Matt O’Brien and Haleluya Hadero, ‘AI-generated child sexual abuse images could flood the internet’, AP (October 2023): https://apnews.com/article/ai-artificial-intelligence-child-sexual-abuse-c8f17de56d41f05f55286eb6177138d2 [accessed 21 December 2023]
201 Internet Watch Foundation, How AI is being abused to create child sexual abuse imagery (October 2023): https://www.iwf.org.uk/media/q4zll2ya/iwf-ai-csam-report_public-oct23v1.pdf [accessed 21 December 2023)
202 David Thiel, Melissa Stroebel and Rebecca Portnoff, Generative ML and CSAM: Implications and Mitigations (June 2023): https://stacks.stanford.edu/file/druid:jv206yg3793/20230624-sio-cg-csam-report.pdf [accessed 21 December 2023]
203 Internet Watch Foundation, How AI is being abused to create child sexual abuse imagery (October 2023): https://www.iwf.org.uk/media/q4zll2ya/iwf-ai-csam-report_public-oct23v1.pdf [accessed 21 December 2023]
204 Ibid.
205 Written evidence from the Alan Turing Institute (LLM0081), Logically AI (LLM0062), Dr Jeffrey Howard et al (LLM0049) and Full Fact (LLM0058)
207 NCSC, ‘NCSC warns of enduring and significant threat to UK’s critical infrastructure’ (14 November 2023): https://www.ncsc.gov.uk/news/ncsc-warns-enduring-significant-threat-to-uks-critical-infrastructure [accessed 21 December 2023]
211 For details of the US intelligence community assessment of activities conducted by the Russian Federation see The Director of National Intelligence, ‘Assessing Russian Activities and Intentions in Recent US Elections’ (January 2017): https://www.dni.gov/files/documents/ICA_2017_01.pdf [accessed 20 December 2023].
213 For further details of disinformation affecting elections and other Government priorities see HM Government, National Risk Register 2020 Edition (2020): https://assets.publishing.service.gov.uk/media/6001b2688fa8f55f6978561a/6.6920_CO_CCS_s_National_Risk_Register_2020_11-1-21-FINAL.pdf [accessed 21 December 2023].
214 Written evidence from NCC Group (LLM0014) and letter from Viscount Camrose, Parliamentary Under Secretary of State Department for Science, Innovation & Technology to Baroness Stowell of Beeston, Chair of the Communications and Digital Committee (16 January 2024): https://committees.parliament.uk/work/7827/large-language-models/publications/3/correspondence/
215 DSIT, ‘Frontier AI’ (25 October 2023): https://www.gov.uk/government/publications/frontier-ai-capabilities-and-risks-discussion-paper [accessed 8 January 2024]
216 Letter from Viscount Camrose, Parliamentary Under Secretary of State Department for Science, Innovation & Technology to Baroness Stowell of Beeston, Chair of the Communications and Digital Committee (16 January 2023): https://committees.parliament.uk/work/7827/large-language-models/publications/3/correspondence/
217 HM Government, Safety and Security Risks of Generative Artificial Intelligence to 2025 (2023): https://assets.publishing.service.gov.uk/media/653932db80884d0013f71b15/generative-ai-safety-security-risks-2025-annex-b.pdf [accessed 21 December 2023]
218 Center for AI Safety, ‘An overview of catastrophic AI risks’: https://www.safe.ai/ai-risk [accessed 20 December 2023]
219 QQ 22–23, written evidence from Royal Academy of Engineering (LLM0063), Microsoft (LLM0087), Google and Google DeepMind (LLM0095), OpenAI (LLM0013) and DSIT (LLM0079)
220 Kevin Esvelt et al, ‘Can large language models democratize access to dual-use biotechnology?’ (June 2023): https://arxiv.org/abs/2306.03809[accessed 21 December 2023]
221 Andrew D White et al, ‘ChemCrow: Augmenting large-language models with chemistry tools’ (April 2023): https://arxiv.org/abs/2304.05376 [accessed 8 January 2024] and Nuclear Threat Initiative, The Convergence of Artificial Intelligence and the Life Sciences (October 2023): https://www.nti.org/wp-content/uploads/2023/10/NTIBIO_AI_FINAL.pdf [accessed 21 December 2023]
222 Effective Altruism Forum, ‘ Possible OpenAI’s Q* breakthrough and DeepMind’s AlphaGo-type systems plus LLMs’ (November 2023): https://forum.effectivealtruism.org/posts/3diski3inLfPrWsDz/possible-openai-s-q-breakthrough-and-deepmind-s-alphago-type [accessed 21 December 2023]
223 Note that the Government assesses generative AI is unlikely to fully automate computer hacking by 2025. See HM Government, Safety and Security Risks of Generative Artificial Intelligence to 2025 (2023): https://assets.publishing.service.gov.uk/media/653932db80884d0013f71b15/generative-ai-safety-security-risks-2025-annex-b.pdf [accessed 21 December 2023].
224 Megan Kinniment et al, Evaluating Language-Model Agents on Realistic Autonomous Tasks: https://evals.alignment.org/Evaluating_LMAs_Realistic_Tasks.pdf [accessed 21 December 2023]
226 See for example Adam C, Dr Richard J. Carter, ‘Large Language Models and Intelligence Analysis’: https://cetas.turing.ac.uk/publications/large-language-models-and-intelligence-analysis [accessed 21 December 2023], War on the Rocks, ‘How large language models can revolutinise military planning (12 April 2023): https://warontherocks.com/2023/04/how-large-language-models-can-revolutionize-military-planning/ [accessed 9 January 2024] and National Cyber Security Centre, ‘NCSC CAF guidance’: https://www.ncsc.gov.uk/collection/caf/cni-introduction [accessed 21 December 2023].
229 The bug responsible for the 2014 UK air traffic control failure was found within 45 minutes, for example. See ‘Flights disrupted after computer failure at UK control centre’, BBC (12 December 2014): https://www.bbc.co.uk/news/uk-30454240 [accessed 20 December 2023].
233 Written evidence from Hugging Face (LLM0019), Advertising Association (LLM0056) and Meta (LLM0093)
234 Xinyang Geng et al, ‘Koala: A Dialogue Model for Academic Research’ (April 2023): https://bair.berkeley.edu/blog/2023/04/03/koala/ [accessed 21 December 2023]
235 Q 10 (Ian Hogarth), written evidence from British Copyright Council (LLM0043), Dr Baoli Zhao (LLM0008), Google DeepMind (LLM0095) and IEEE, ‘Protesters Decry Meta’s “Irreversible Proliferation” of AI’ (October 2023): https://spectrum.ieee.org/meta-ai [accessed 21 December 2023]
237 Centre for the Governance of AI, Open-Sourcing Highly Capable Foundation Models: https://cdn.governance.ai/Open-Sourcing_Highly_Capable_Foundation_Models_2023_GovAI.pdf [accessed 21 December 2023]
238 See for example C2PA, Guidance for Artificial Intelligence and Machine Learning: https://c2pa.org/specifications/specifications/1.3/ai-ml/ai_ml.html#_attestation_for_ai_ml_models [accessed 21 December 2023].
241 See for example Foreign, Commonwealth and Development Office, ‘Russia: UK exposes Russian involvement in SolarWinds cyber compromise’ (April 2021): https://www.gov.uk/government/news/russia-uk-exposes-russian-involvement-in-solarwinds-cyber-compromise [accessed 8 January 2023].
243 Q 22 (Professor Stuart Russell) and DSIT, Capabilities and risks from frontier AI (October 2023): https://assets.publishing.service.gov.uk/media/65395abae6c968000daa9b25/frontier-ai-capabilities-risks-report.pdf [accessed 21 December 2023]
244 AI in Weapon Systems Committee, Proceed with Caution: Artificial Intelligence in Weapon Systems (Report of Session 2023–24, HL Paper 16), paras 157–158
245 Center for AI Safety, ‘Statement on AI risk’: https://www.safe.ai/statement-on-ai-risk [accessed 25 January 2024]
246 See ‘Global food insecurity and famine from reduced crop, marine fishery and livestock production due to climate disruption from nuclear war soot injection’ Nature Food (August 2022): https://www.nature.com/articles/s43016–022-00573-0 [accessed 23 December 2023], Cold War estimates of deaths in nuclear conflict’, Bulletin of the Atomic Scientists (January 2023): https://thebulletin.org/2023/01/cold-war-estimates-of-deaths-in-nuclear-conflict/ [accessed 21 December 2023] and Department of Homeland Security, ‘Nuclear Attack’ : https://www.dhs.gov/publication/nuclear-attack-fact-sheet [accessed 8 January 2024].
247 Piers Millett et al, ‘Existential Risk and Cost-Effective Biosecurity’, Health Security (August 2017): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5576214/ [accessed 8 January 2023]
249 Ibid.
250 DSIT, ‘Frontier AI: capabilities and risks—discussion paper’ (October 2023): https://www.gov.uk/government/publications/frontier-ai-capabilities-and-risks-discussion-paper/frontier-ai-capabilities-and-risks-discussion-paper [accessed 21 December 2023].
253 DSIT, Capabilities and risks from frontier AI (October 2023): https://assets.publishing.service.gov.uk/media/65395abae6c968000daa9b25/frontier-ai-capabilities-risks-report.pdf [accessed 21 December 2023] and Reuters, ‘AI pioneer says its threat to world may be ‘more urgent’ than climate change’ (9 May 2023): https://www.reuters.com/technology/ai-pioneer-says-its-threat-world-may-be-more-urgent-than-climate-change-2023–05-05/ [accessed 24 January 2024]
256 See for example Emily M Bender et al, ‘On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?’ (March 2021): https://dl.acm.org/doi/pdf/10.1145/3442188.3445922 [accessed 21 December 2023] and House of Lords Library, ‘Artificial intelligence: Development, risks and regulation’ (July 2023): https://lordslibrary.parliament.uk/artificial-intelligence-development-risks-and-regulation/ [accessed 8 January 2024].
257 Emily M Bender et al, ‘On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?’ (March 2021): https://dl.acm.org/doi/pdf/10.1145/3442188.3445922 [accessed 21 December 2023]
258 Written evidence from Sense about Science (LLM0046), the Advertising Association (LLM0056), Dr Jeffrey Howard (LLM0049), Society of Authors (LLM0044) and British Copyright Council (LLM0043)
261 BBC, ‘Post Office scandal explained’ (16 January 2024): https://www.bbc.co.uk/news/business-56718036 [accessed 18 January 2024]
262 Written evidence from the Committee on Standards in Public Life (LLM0052), Copyright Clearance Center (LLM0018), Cambridge Language Sciences (LLM0053), DMG Media (LLM0068), Guardian Media Group (LLM0108)
263 Hallucinations refer to the phenomenon of LLMs producing plausible-sounding but inaccurate responses.
264 Written evidence from the Alan Turing Institute (LLM0081) and Royal Society of Statisticians (LLM0055)
267 Written evidence from the Committee on Standards in Public Life (LLM0052) and Oxford Internet Institute (LLM0074)
269 Haoran Li et al, ‘Privacy in Large Language Models: Attacks, Defenses and Future Directions’ (October 2023): https://arxiv.org/abs/2310.10383 [accessed 8 January 2024]
271 Written evidence from Arnav Joshi (LLM0112). We noted further concerns from the Public Law Project about the Bill’s proposals to “weaken” protections around automated decision-making, as well as uncertainty around the extent to which models ‘hold’ personal data and hence how far data protection duties apply. See for example Public Law Project, ‘How the new Data Bill waters down protections’ (November 2023): https://publiclawproject.org.uk/resources/how-the-new-data-bill-waters-down-protections/ [accessed 21 December 2023], and Q 56.
272 Cogstack, ‘Unlock the power of healthcare data with CogStack’: https://cogstack.org/ [accessed 21 December 2023]
274 See recent debates on related topics, for example ‘Palantir NHS contract doubted by public for data privacy’, The Times (November 2023): https://www.thetimes.co.uk/article/palantir-nhs-contract-doubted-by-public-for-data-privacy-q9sccsmln [accessed 8 January 2024].
275 Q 86. The ICO already provides extensive guidance on data protection. See for example: Information Commissioner’s Office, ‘Generative AI: eight questions that developers and users need to ask’ (April 2023): https://ico.org.uk/about-the-ico/media-centre/blog-generative-ai-eight-questions-that-developers-and-users-need-to-ask/ [accessed 21 December 2023].