The big data dilemma

Summary

We are living in the data age. Since Sir Tim Berners-Lee proposed his “vague but exciting” plan for a ‘distributed information system’ at CERN — and in the process inadvertently launched the information revolution — the amount of data we share has exploded.

The Data Centre for the Large Hadron Collider at CERN (the world’s largest and most powerful particle accelerator) processes about one petabyte of data every day, the equivalent of around 210,000 DVDs, and distributes this data across the world via a grid which gives over 8,000 physicists near real-time access to LHC data. In the future, the Square Kilometre Array (the world’s largest radio telescope, run from the UK’s Jodrell Bank Observatory) will require supercomputers faster than any in existence in 2015, and network technology that will generate more data traffic than the entire Internet. The computing power it will need will be about three times that of the most powerful supercomputer available in 2013, equivalent to the processing power of about 100 million 2013-era PCs.

Properly exploited, this data should be transformative, increasing efficiency, unlocking new avenues in life-saving research and creating as yet unimagined opportunities for innovation. But even existing datasets are nowhere near fully exploited. Despite data-driven companies being 10% more productive than those that do not operationalise their data, most companies estimate they are analysing just 12% of their data.

The stakes for the UK economy are massive. Big data is already a UK success story but it has huge unrealised potential, both as a driver of productivity and as a way of offering better products and services to citizens. An analysis in 2012 calculated that big data could create 58,000 new jobs over five years, and contribute £216 billion to the UK economy, or 2.3% of GDP, over that period. In the public sector, big data can increase the operational efficiency and targeting of service delivery.

Big data depends crucially on developing the necessary skills, providing infrastructure and setting parameters for sharing data to ensure valid privacy and security concerns are addressed. It is essential that the Government’s forthcoming Digital Strategy sets a clear course to address these matters not only so that UK plc can capitalise on our world-leading data capabilities but also so our public sector can develop the sustainable solutions promised by big data within a secure regulatory and practical framework.

No Digital Strategy will succeed, however, without immediate action to tackle the crisis of our digital skills shortage. The Government should urgently commit to further supporting the development of ‘data analytics’ skills — a mix of technical skills, analytical and industry knowledge, and the business sense and soft skills to turn data into value for employers — in businesses as well as in Government departments, and promoting more extensively the application of big data at local government level. But the Government must also address the wider context of its policies on apprenticeships and immigration control, including widespread concerns that these could jeopardise the necessary big data skills-base that the UK will increasingly need.

On infrastructure, the Government facilitates industrial access to academic infrastructure for research, and small business access to advanced software and hardware. Together with the Digital Catapult and the Open Data Institute, it also helps to make datasets ‘open’ for researchers and analysts, or available as ‘shared data’. The Government has a key role to play in making its own data ‘open’ and ‘shared’.

Its work in this area has put the UK in a world-leading position, but there is still more to do, particularly in breaking down departmental data silos and improving data quality. The Government should examine how it can build capacity to deliver more datasets, increasingly in real-time, both to decision-makers in Government and to external users. It should map out how the Digital Catapult’s work and the Government’s plans to open and share its own data could be dovetailed. The Government should also consider the scope for giving the Office for National Statistics greater access both to Government departments’ data and to private sector data. It should charge the Government Digital Service, the Office for National Statistics or another expert body with auditing the quality of data within Government departments amenable to big data applications, and with proactively identifying data sharing opportunities to break departmental data silos.

Healthcare interventions can be more precisely tailored to individual patients’ circumstances using big data. The momentum for this was reduced, however, by the experience of bringing patient data together under the ‘care.data’ initiative. After the programme was delayed, the Spending Review has now raised the prospect of progress on this front, but the Government cannot afford a second failure from a re-launched scheme. It should take careful account of the lessons from a similar, successful, scheme in Scotland. In particular, to help bring patients onside and to streamline healthcare across different NHS providers (hospitals, GPs, pharmacists and paramedics), it should give patients easy, online access to their own health records.

There are risks, as well as opportunities, from big data. Personal data is only a small proportion of big data, with huge potential from non-personal datasets for transport and weather forecasting, for example. Given the scale and pace of data gathering and sharing, however, distrust and concerns about privacy and security are often well founded and must be resolved by industry and Government if the full value of big data is to be realised. The benefits therefore have to be weighed against such potential loss of privacy and the risks of our data being lost or misused. Controls are covered by the Data Protection Act 1998, but will need to be overhauled within the next two years or so as a result of the agreement of an EU General Data Protection Regulation in December 2015.

The new Regulation will increase potential fines, but the Government should immediately go further by introducing a criminal penalty — already provided for in existing UK legislation — for serious data protection breaches. The Government and Information Commissioner should also ensure that the UK’s already developed kitemark, to acknowledge and encourage good practice, is adopted as soon as possible along with a campaign to raise public awareness of it.

We do not share the Government’s view that current UK data protections can simply be left until the Data Protection Act has to be revised to take account of the new EU Regulation. Some areas need to be addressed straight away: introducing the Information Commissioner’s kitemark and introducing criminal penalties. And there remain concerns that big data techniques which ‘re-identify’ individuals from previously anonymised data may be outside the scope of the current UK legislation. The way the new EU Regulation is framed appears to leave it open for data to be potentially de-anonymised if “legitimate interests” or “public interest” considerations are invoked. It is particularly important therefore that the Government set out its anonymisation strategy for big data in its upcoming Digital Strategy, including a clear funding commitment, a plan to engage industry with the work of the UK Anonymisation Network and core anonymisation priorities.

The anonymisation and re-use of data is becoming an issue that urgently needs to be addressed as big data becomes increasingly a part of our lives. There are arguments on both sides of this issue: seeking to balance the potential benefits of processing data (some collected many years before and no longer with a clear consent trail) and people’s justified privacy concerns will not be straightforward. It is unsatisfactory, however, for the matter to be left unaddressed by Government and without a clear public-policy position set out. The Government should clarify its interpretation of the EU Regulation on the re-use and de-anonymisation of personal data, and after consultation introduce changes to the 1998 Act as soon as possible to strike a transparent and appropriate balance between those benefits and privacy concerns.

Such clarity is needed to give big data users the confidence they need to drive forward an increasingly big data economy, and to give individuals confidence that their personal data will be respected. The Government should establish a Council of Data Ethics as a means of addressing the growing legal and ethical challenges associated with balancing privacy, anonymisation, security and public benefit. Ensuring that such a Council is established, with appropriate terms of reference, would offer the clarity, stability and direction which have so far been lacking from the European debate on data issues.




© Parliamentary copyright 2015

Prepared 11 February 2016