Statistics and Open Data: Harvesting unused knowledge, empowering citizens and improving public services - Public Administration Committee

2  Improving accountability through open data

The Government's aims

5. One main aim of the Government's policy on open data is to improve the ability of citizens to hold Government to account. The Prime Minister made this clear in 2010:

    We're going to rip off that cloak of secrecy and extend transparency as far and as wide as possible. By bringing information out into the open, you'll be able to hold government and public services to account. You'll be able to see how your taxes are being spent. Judge standards in your local schools and hospitals. Find out just how effective the police are at fighting crime in your community. Now I think that's going to do great things. It's certainly going to save us money. With a whole army of effective armchair auditors looking over the books, ministers in this government are not going to be able to get away with all the waste, the expensive vanity projects and pointless schemes that we've had in the past.[6]

6. Ministers have regularly restated their support for open data as an aid to accountability; in 2012 the Rt Hon Francis Maude MP, Minister for the Cabinet Office, said:

    "These are the first formative years of this new Age of Open Data. [...] the prize is effective personalised 21st century democracy. Transparency will create empowered citizens that can expose corruption, get the best value out of their governments and have equal access to valuable raw data."[7]

7. The Open Data White Paper, 'Unleashing the Potential', published in June 2012, gave more detail on the Government's aspirations, including ways in which it could help in "Building a transparent society".[8] Opening up public service data was creating a "living library of information" to help people hold Government to account.[9] The Government called this "a completely different way of governing", for instance giving anyone in the country the means to challenge public authorities on how public money is being spent.[10] More data would be put into the public domain, there would be significant improvements to the data.gov.uk website, which brings data released by Government together in one searchable site, and amendments to the Freedom of Information Act would make it much easier for citizens to get access to public sector datasets in a useful form.[11]

Good examples

8. To what extent is this transparent vision becoming a practical reality? There are several examples of the use of data on public services to increase the ability of the public to keep a check on how they are performing. Many witnesses mentioned police.uk, which shows reported crimes and their outcomes in detail, down to street level. This has been well-used by the public, with more than 53 million visits from 22 per cent of all family households in England and Wales since its launch in 2011.[12] Public debate about a variety of public services has also been informed by other recent releases of open public sector data. Greater public transparency in health spending has, for instance, been created by Mastodon C, a start-up company whose work was welcomed by several of our witnesses.[13] One of Mastodon C's projects looked at GPs' prescription data and demonstrated that, for example, the NHS could have saved more than £200 million a year if the generic version of statins had been prescribed rather than the patented version.

9. On environmental issues, the greenhouse gas emissions statistics released annually by the Department of Energy and Climate Change were welcomed by Ruth Dixon and Professor Christopher Hood of Oxford University as "a consistent and informative dataset" that allows meaningful comparisons over time.[14] Openly Local is a project which is attempting, it says, "to develop an open and unified way of accessing Local Government information" including numerical data. At the time of our inquiry the project offered access to data on spending, councillors' expenses and planning applications for over 140 local authorities.[15]

Barriers to accountability

10. Along with such promising examples of good practice, we had some evidence that there were barriers to the achievement of greater accountability through open government data. There was some evidence that the Government's original clear focus on accountability as the key goal of open data had recently been diluted by other priorities. Dr Ben Worthy, a lecturer in politics at Birkbeck, University of London, told us that there was uncertainty about "which of the 'economic, social and political' aims [for open data] the Government supports. Observers have noted a shift in emphasis from the democratic aims of government transparency and accountability to the economic aims of encouraging growth."[16]

11. Full Fact, an independent organisation which provides advice and information to help people check the facts against claims made by politicians and the media, acknowledged that the Government's agenda of "open by default" had helped to "set the right ambition for open data generally".[17] But it was critical of much of the Government's actual performance in providing greater accountability: "Open data is a great thing of which we have seen too little, too late, too poorly done. Because it has been poorly done the take up has been limited."[18]

12. The Institute for Government observed that "the accessibility, quality, and presentation of government data varies widely between departments and datasets."[19] There is certainly evidence of accuracy and quality issues with some of the data releases. The UK Data Service (UKDS), a data resource funded by the Economic and Social Research Council to support researchers, teachers and policymakers, told us that there were "countless examples of (avoidable) errors" in government data releases. One example, according to UKDS, was the recent publication by the Treasury of the Bona Vacantia Unclaimed Estates List, which gives details of property which has passed to the Crown because the previous owner has died with no will or known family. UKDS noted that the date of birth of at least 132 people (more than 1% of the total) was "reported as being after their date of death. (And the marital status of some suggests that other ages are wrong too - for example, there is a two month old widow!)". UKDS criticised the "lack of quality control mechanisms".[20] Owen Boswarva, a data consultant and open data activist, who is a non-executive member of the Defra Network Transparency Panel and submitted his evidence to us in a personal capacity, was critical of what he called "the indiscriminate dumping of small, low-value datasets on [data.gov.uk]" which had "created the illusion of progress - 9,000 datasets [at the time of writing] sounds like a lot, but what proportion is that of what total?"[21]
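The kind of quality control check that UKDS found lacking can be illustrated with a minimal sketch. It assumes each record carries a date of birth and a date of death; the field names, the function names and the sample records are all hypothetical and invented for illustration, and none of them comes from the Bona Vacantia list or any other real dataset.

```python
from datetime import date

# Hypothetical records, invented purely for illustration; they are not
# taken from the Bona Vacantia list or any other real dataset.
SAMPLE_RECORDS = [
    {"id": "A1", "date_of_birth": date(1920, 5, 1), "date_of_death": date(2001, 3, 9)},
    {"id": "A2", "date_of_birth": date(1999, 7, 4), "date_of_death": date(1998, 1, 2)},
    {"id": "A3", "date_of_birth": None, "date_of_death": date(2010, 6, 30)},
]


def find_date_errors(records):
    """Return the ids of records whose date of birth falls after the date of death."""
    flagged = []
    for record in records:
        born = record["date_of_birth"]
        died = record["date_of_death"]
        # Records with a missing date cannot be compared, so they are skipped.
        if born is None or died is None:
            continue
        if born > died:
            flagged.append(record["id"])
    return flagged


def error_rate(records):
    """Return the share of all records flagged, as a percentage."""
    if not records:
        return 0.0
    return 100.0 * len(find_date_errors(records)) / len(records)


if __name__ == "__main__":
    print(find_date_errors(SAMPLE_RECORDS))
    print(round(error_rate(SAMPLE_RECORDS), 1))
```

A check of this sort could in principle be run before publication, so that impossible values are corrected or at least flagged to users. It would not catch every error UKDS describes: an implausible marital status for a given age, for instance, would need a separate rule.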

13. Accountability can also be hampered by over-cautious official attitudes, according to Heather Savory, Chair of the Open Data User Group, which exists to help Government understand the requirements of people who are using, or could use, the datasets it collects. Ms Savory told us that there were "perceived risks among civil servants" in relation to open data: officials could be "concerned because their data is not perfect" but, Ms Savory observed, "no data set is perfect."[22] She also identified "a lack of belief that the technical community can deal with this stuff" whereas an outsider keen on making use of open data might say to Government "Just give me big, dirty data. I'll deal with it."[23] Similar points were made by Tom Steinberg, who drew attention to an apparent inconsistency of approach across Government. He noted that the figure for GDP, "which is probably the single number that people in Whitehall care about more than anything else" is frequently published then revised "yet there have been much less important data sets that have not been released because we cannot make a mistake."[24]

14. Stephan Shakespeare believed that these dilemmas could be resolved, and his Review report set out:

    a twin-track policy for data-release, which recognises that the perfect should not be the enemy of the good: a simultaneous "publish early even if imperfect" imperative AND a commitment to a "high quality core". This twin-track policy will maximise the benefit within practical constraints. It will reduce the excuses for poor or slow delivery; it says "get it all out and then improve".[25]

15. The intention, the Review continued, "is that as much as possible is published to a high quality standard, with departments and wider public sector bodies taking pride in moving their data from track 1 to track 2."[26] Mr Shakespeare explained in oral evidence to us that this entails identifying:

    the data sets that need to be clean and need to be published to certain high standards, and that should be track [two]. All the rest is published as track [one]—quick and dirty, as one might say, so long as one knows that it is dirty—and left to the data scientists to do what they can.[27]

16. Stephan Shakespeare's proposal that the Government should adopt a "twin-track" approach to data release is a practical and realistic way of maintaining the momentum on open data, which recognises that "the perfect should not be the enemy of the good: a simultaneous 'publish early even if imperfect' imperative AND a commitment to a 'high quality core'". Regular publication of imperfect government data will provide Departments with a powerful incentive to improve it. We recommend that the Government should adopt the twin-track approach to data release advocated by Stephan Shakespeare. Government should 'publish early even if imperfect', as well as being committed to a 'high quality core'. As long as Government is clear about its limitations, there will always be a role for data that is imperfect but improvable.

17. Other witnesses identified limitations with data.gov.uk that made it less accessible, and therefore less useful for the general public. The National Statistician, Jil Matheson, complained that data.gov.uk "does not yet have the functionality that we would like to see for accessing statistics. One of the really important ways of people being able to understand what is there is to be able to visualise it."[28]

18. It is very difficult to assess the performance of Government in enhancing accountability through opening up its data. The concept of open data is poorly defined and there are no accepted measures of what is published. This allows supporters of open data to claim the revolution is well under way and the sceptics to say nothing has changed.

19. It is often pointed out that more than 13,000 datasets can now be found on data.gov.uk, but it is unclear how many of these represent simple republishing of data already published on other government sites. Some data sets are small and others large. And it is possible for departments to get more data out by publishing it in smaller bundles or updating it more frequently, in such a way that there is little or no extra public benefit. In these circumstances, measuring progress on this important agenda is difficult if not impossible. Simply putting data "out there" is not enough to keep Government accountable.

20. We invite the Government to publish a clear list of open data, indicating when each data series became open in each case.

Outsourcing and transparency

21. Several witnesses argued that the principles of open data should be applied consistently to all organisations that provide public services, including those in the private or voluntary sectors. They observed that, in the new world of frequent outsourcing of public service delivery, this was particularly important.

22. The Information Commissioner, Christopher Graham, told us that he was concerned that outsourcing potentially undermined the principles of understanding, accountability and open data, arguing that "it is important that outsourcing does not lead to a decrease in transparency in public services."[29] Dr Rufus Pollock, Director of the Open Knowledge Foundation, also raised this issue, noting with concern that open data principles are not "embedded in [government] procurement rules".[30] He continued: "One of the biggest risks and dangers we have seen evidence of both in the US and here is that you outsource some service and, bam, all your information is gone."[31] Tom Steinberg also saw this as a key issue for the future, telling us that there could be "a real problem to public accountability in situations where companies are used to provide public services, instead of government bodies."[32]

23. More positively, it was suggested to us that open data could help Government at all levels to improve its performance in commissioning private and voluntary sector providers, ODI arguing that "choice and competition would be enhanced if data on the performance of public service providers were published in a consistent fashion and made available to service users and external experts."[33]

24. There was support from the Information Commissioner for the idea of Government releasing detailed data about the performance of private providers in delivering public services. He told us: "Opening this information as fully as possible to public scrutiny would promote efficiency in the use of public funds and also help to build public confidence in outsourcing."[34] The Commissioner welcomed the Government's commitment in the National Action Plan to "take steps to ensure transparency about outsourced services is provided in response to freedom of information requests."[35]

Procurement and Open Data

25. Several witnesses called for the whole range of public sector procurement processes to be reformed to encourage open data, especially in the case of IT contracts. Tom Steinberg for example told us that "open data will only become widespread if its provision is tied to the procurement of information systems."[36] The Information Commissioner made a similar point, telling us that he is encouraging 'transparency by design', advising public authorities to "think about open data requirements when they are procuring and designing new IT systems."[37]

26. The Cabinet Office Minister, Nick Hurd MP, told us that Cabinet Office is working with the Government Digital Service "on a piece of work to include open data clauses in IT procurement"; money has been made available for Departments, agencies and local authorities to "release data where there are short-term technical barriers—i.e. where someone is saying, 'We are going to have to charge you to get those datasets out'."[38]

27. The ODI went further, calling for open data publication to be "written into every government contract", whether for IT or not.[39] Stephan Shakespeare urged that in government procurement, there should always be, for tendering companies,

    a box that says, "What is your open-data strategy?" so they are required to say in advance what their attitude to this is. That could then make them feel that it may be detrimental to their getting the contract if they state that they will not share the data.[40]

Sir Nigel Shadbolt observed that there was "a stronger view that procurement should have a clause that says, 'It shall be produced as open data.'"[41]

28. Open data principles should be applied not only to government departments but also to the private companies with which they make contracts.

29. We recommend that companies contracting with the Government to provide contracted or outsourced goods and services should be required to make all data open on the same terms as the sponsoring department. This stipulation should be included in a universal standard contract clause which should be introduced and enforced across Government from the beginning of the financial year 2015-16.

The right to data?

30. Several witnesses were uneasy with the current position in which the Government and local authorities decide whether to make data available or not. Sir Nigel Shadbolt observed that there are currently "public data principles in the White Paper that are endorsed as Government policy. The question is whether they are being implemented routinely."[42] Sir Nigel told us that "The presumption to publish has some way to go."[43] In these circumstances, he said "People think it is sufficiently difficult and challenging that you might need to legislate for it."[44]

31. There were other suggestions that present arrangements for open data release lacked the necessary clout. Dr Rufus Pollock of the Open Knowledge Foundation was concerned that even the Open Data User Group had to persuade organisations to provide data in an open way: "Heather [Savory] is doing a sterling job, but it was rather interesting that she had to go to persuade the Land Registry to do this or persuade X to do that."[45] He said that in theory "things like that are in FOI, but they should be operationalised more effectively."[46]

32. The Information Commissioner's Office set out its understanding of the current statutory provisions on open data, noting that on 1 September 2013 amendments to FOIA came into force. These are:

    intended to enable open data - giving requesters the right to receive datasets in open, re-usable formats (if reasonably practicable) and under an open licence, though public authorities can use a charged licence in certain circumstances. The amendments also place an obligation on public authorities to publish previously requested datasets proactively, as part of their Freedom of Information publication scheme.[47]

33. But there was some confusion among our witnesses as to what difference this makes to the current legal position of open data. Tom Steinberg raised the issue of whether the Freedom of Information Act should be "expanded ... so that people have similar powers to access data sets to those they have to access paper documents."[48]

34. Heather Savory of the Open Data User Group said that she believed that recent legislation had effectively "established an enhanced right to data because it introduces a statutory duty for public authorities to publish their data for re-use."[49] Although she agreed that "we do not have clean legislation [...] if you look at the complex network of legislation that we have, there is already a presumption to publish data and there are duties for public bodies to make their data available for re-use."[50]

35. The Information Commissioner considered that open data and the right to information were "mutually supportive".[51] This was because "there must be a right for the public to 'pull' information from government as well as a government commitment to 'push' data out proactively."[52] Stephan Shakespeare and Sir Nigel Shadbolt both advocated explicit legislation to set out a right to data; Sir Nigel observed that policies come and go but "legislation has a way of sticking around."[53]

36. Mr Hurd was clear that there was no statutory right to open data, confirming that he was at the moment "against any further legislation in this area other than evolution of the Freedom of Information Act and the transposition of the EU directive [Directive 2003/98/EC which encourages the re-use of public sector information]."[54]

37. There is confusion about the concept of the 'right' to data held by Government. On the one hand, the Minister told us that there is no right to data, but there is evidence to suggest that, in effect, a presumption already exists that government data will be published in an open format.

38. The Government needs to recognise that the public has the inherent 'right to data', like Freedom of Information. The Government should clarify its policy and bring forward the necessary legislation, without delay.

Privacy and open data: managing the risks

39. If there is to be the 'right to data', the 'right to privacy' must also be recognised. We heard substantial evidence of the risks to individual privacy that could be created by ill-considered open data releases. Full Fact noted the risk that the reputation of open data might be vulnerable to public anxiety over privacy and the state. There will soon be "far greater volumes of far more personal information stored by public bodies than we would have thought possible not long ago."[55] Open data would "serve as a constant reminder of this and occasional mistakes will bring it crashing into public debate."[56]

40. As we were completing the inquiry, the potential for data release to cause such public concern was demonstrated by the case of care.data. At the beginning of 2014 there were a number of reports of opposition from campaigners, and in some cases medical practitioners, to the programme in England.[57] The programme, as explained by NHS England:

    will make increased use of information from medical records with the intention of improving healthcare, for example by ensuring that timely and accurate data are made available to NHS commissioners and providers so that they can better design integrated services for patients. In the future, approved researchers may also benefit. The Health and Social Care Information Centre will link personal confidential data (PCD) extracted from GP systems with PCD from other health and social care settings.[58]

41. The main concerns about care.data were said to be the risk that personal medical details would become publicly available, and that data collected for public purposes would be exploited for profit by the private sector. Even strong advocates of open data, such as Stephan Shakespeare, were in no doubt about the sensitivity of medical information. Mr Shakespeare, commenting on the general issue of medical data, told us: "I want to make it quite clear that the revealing of personal medical data could be extremely painful to the person and that it is incredibly important to avoid that."[59]

42. Sir Nigel Shadbolt observed that young people were sometimes said to be "giving up on privacy" with the spread of social media and other digital developments. However he identified a new caution about privacy among young people, with the development of:

    a very nuanced view of what is available to open and what is not. As they grow up—I have seen this process—from no concern at all to a recognition that this will stay with them in their interview process as they go for jobs in the future, they become much more concerned about the issues and limits of privacy.[60]

43. Dr Pollock raised the complex issue of crime statistics. He said that in the UK when crime data was first published "there was this whole debate that I could work out where this had occurred—had a rape happened in a house, I would know something very significant personally about someone. There is clearly going to be ongoing debate."[61]

44. Dr Pollock accepted that "some of the most interesting data will have a relationship with personal information."[62] The default position, he said,

    has to be that we protect privacy in the first instance, but it is important that there are cases where we make public interest tradeoffs. We think that we are entitled to know the directors of public companies; it is not something that is private.[63]

45. We were assured that there were ways of ensuring privacy is maintained in the right cases. Mr Shakespeare referred to "safe-haven technology, which means you can make data available in a way that you cannot take it out of the box, if you like, and you can access it remotely without removing it from the database."[64]

46. Ministers were also confident that a satisfactory balance could be achieved between open data and individual privacy. While re-iterating that "the government's position is that data should be open by default," they made it clear that "by definition open data is not personal data. The government takes the issue of privacy seriously."[65] The Ministers made it clear that "anonymisation techniques mean that data can still be released while providing protection to the individual citizen".[66] They give the example of crime data which is "grouped at the level of a few streets to prevent victims being identified".[67]
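The grouping that Ministers describe for crime data can be sketched as a simple aggregation: individual incidents are counted at the level of a group of streets, and only the grouped counts are published. This is a minimal illustration under stated assumptions, not a description of how the published crime data is actually produced. The street names, the area labels, the mapping and the function name are all hypothetical, and grouping of this kind is only one part of what proper anonymisation would involve.

```python
from collections import Counter

# Hypothetical incident records, invented for illustration only.
INCIDENTS = [
    {"street": "Elm Road", "category": "burglary"},
    {"street": "Oak Lane", "category": "burglary"},
    {"street": "Oak Lane", "category": "vehicle crime"},
    {"street": "Mill Street", "category": "burglary"},
]

# Hypothetical mapping from each street to a larger group of streets.
STREET_GROUPS = {
    "Elm Road": "Area 1",
    "Oak Lane": "Area 1",
    "Mill Street": "Area 2",
}


def counts_by_area(incidents, street_groups):
    """Count incidents per (area, category), discarding the individual street."""
    counts = Counter()
    for incident in incidents:
        area = street_groups[incident["street"]]
        counts[(area, incident["category"])] += 1
    return dict(counts)


if __name__ == "__main__":
    for key, total in sorted(counts_by_area(INCIDENTS, STREET_GROUPS).items()):
        print(key, total)
```

The published output names only the area and the category, so a reader cannot tell from it on which street an incident occurred. Where an area contains very few incidents, grouping alone may not be enough to prevent identification.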

47. When releasing data, it is the responsibility of Government to avoid the risk that individuals may be identified against their will. There has been an effective campaign to highlight unease about the release of anonymised NHS patient data for academic and pharmaceutical research as part of the care.data programme. There is a clear need to reassure the public about personal privacy. However, it is also important to explain what open data can do to make public services more accountable and responsive to the needs of society. The recent controversy over care.data demonstrates the danger that concerns about privacy will unduly undermine the case for open data.

Increasing engagement

48. Some witnesses argued that Government should take bold steps to promote widespread public use of data to hold Government to account. Owen Boswarva for example welcomed the increased availability of spending and performance data in a reusable format, but told us that it was "no substitute for meaningful public consultation and open decision-making."[68]

49. Dr Worthy told us that in order to bring about real accountability and participation the data also needed to be linked to "clear and functioning accountability mechanisms." But what he called the "eye-catching idea" of the "Armchair Auditor" had, he said, failed to become a reality, despite some successes: "there are few signs of a wider 'army'."[69] He says that this is in part because "the information is not yet consistent, so questioning and understanding it is not easy."[70] The armchair auditor also, Dr Worthy observes, "needs to be a particular type of person: engaged and interested in local government, with a good grasp of how government works and motivation and skills to dedicate time to it. To have all these traits in combination is rare."[71]

50. Involve, a body which promotes wider participation in public life through a mixture of research and practical action, made similar points, telling us that

    the public currently do not understand how open data applies to them or what they care about; research into public awareness of open data has found that awareness is low in part because open data is perceived as an abstract issue, with unclear benefits to everyday life.[72]

51. While experts may make extensive use of open data repositories, such as data.gov.uk, such repositories, according to Involve, "are unlikely to be visited by the average citizen."[73] Instead Involve argue that there is "potential for government and civil society to get information to citizens in the places that they already visit - be it online (e.g. paying for their TV licence) or offline (e.g. in a GP surgery waiting room)."[74]

52. Involve also urged Government to promote a set of data engagement guidelines developed by a group led by Tim Davies, co-director of Practical Participation, and open data research coordinator at the Web Foundation. These are known as the 'five stars of open data engagement', a system of rating the usefulness and accessibility of data to the general public.[75] If introduced in Government, the system would be intended to encourage publishers of data to make it "accessible [to] all without discrimination" and to make information and data usable in a wide range of ways.[76] 'One Star' engagement indicates where organisations' releases are driven solely by need and demand, while 'Five Star' engagement indicates that there is close collaboration with users and that the organisation is working with other organisations to integrate data sources.

53. It is clear that using open data to encourage engagement is not a simple matter. What appears to some to be neutral can be seen by others as politically motivated. Dr Worthy warned us that "although technology is often presented as a neutral good", it could be "extremely political."[77] He cited local government spending data, which he saw as "very politicised. It is about local versus central Government."[78]

54. There is no sign of the promised emergence of an army of armchair auditors. There is little or no evidence that the Cabinet Office is succeeding in encouraging greater public engagement in using data to hold the public sector to account.

55. Open data is important and touches people's lives at many points. Yet Government and some of the experts sometimes make too much use of jargon and so can alienate and confuse people who do not have expert knowledge of the technical terms. This can undermine efforts to encourage more people to get involved in holding Government to account.

56. The Government should adopt a star-rating system for engagement, as recommended by Involve, for measuring, and reporting to Parliament on, Departments' progress on increasing accountability through open data. The Government should expect Departments to set out plans to move towards Five Star Engagement for all their data releases.

General conclusions on accountability

57. We welcome the clear lead on open data that has come from successive Governments. There have been some useful moves to improve accountability and engagement in recent years, with positive developments such as the establishment of the Open Data User Group. However there is much still to be done.

58. There should be a presumption that restrictions on government data releases should be abolished. It may be necessary to exempt certain data sets from this presumption, but this should be on a case-by-case basis, to provide for such imperatives as the preservation of national security or the protection of personal privacy.

59. The Cabinet Office must give a much higher priority to ensuring that more interesting and relevant data is made open, and that the release mechanisms encourage people to use it and, where appropriate, hold Government and local authorities to account. Beginning in April 2014, targets should be set for the release of totally new government datasets - not the republishing of existing ones.

6   Podcast by the Prime Minister, 29 May 2010

7   Speech to the World Bank by the Rt Hon Francis Maude MP, 30 January 2012

8   Cabinet Office, Open Data White Paper: Unleashing the Potential CM 8353, June 2012

9   As above

10   As above

11   As above pp 11-12

12   See for instance "The geeky revolution that will change our lives", The Times, 28 October 2013

13   For example Nick Hurd MP and Rt Hon Michael Fallon MP (OD 28); Open Data Institute (OD 09) para 30; Q21 ff

14   Ruth Dixon and Professor Christopher Hood (OD 04) para 8.2

15   Openly Local website

16   Dr Ben Worthy (OD 27)

17   Full Fact (OD 11)

18   As above

19   Institute for Government (OD 17)

20   UK Data Service (OD 08) para 18

21   Owen Boswarva (OD 06) para 7

22   Q51

23   As above

24   Q69

25   Shakespeare Review, p 11

26   As above

27   Q94

28   Q226

29   Information Commissioner's Office (OD 26) para 16

30   Q40

31   As above

32   Tom Steinberg (OD 24)

33   Institute for Government (OD 17)

34   Information Commissioner's Office (OD 26) para 15

35   As above

36   Tom Steinberg (OD 24)

37   Information Commissioner's Office (OD 12) para 16

38   Q174

39   Open Data Institute (OD 25) para 16

40   Q99

41   As above

42   Q103

43   Q103

44   As above

45   Q45

46   As above

47   Information Commissioner's Office (OD 12) para 6

48   Q43

49   Q48

50   As above

51   Information Commissioner's Office (OD 26) para 6

52   As above

53   Q108

54   Q172

55   Full Fact (OD 11)

56   As above

57   For example, "NHS Patient Data to be made available for sale to drug and insurance firms", The Guardian, 20 January 2014 and "Four in 10 GPs to opt out of NHS database", The Telegraph, 24 January 2014

58   NHS England: Guide for GP Practices

59   Q120

60   Q123

61   Q71

62   As above

63   As above

64   Q120

65   Nick Hurd MP and Rt Hon Michael Fallon MP (OD 28)

66   As above

67   As above

68   Owen Boswarva (OD 06) para 4

69   Dr Ben Worthy (OD 03) para 7.1

70   As above, para 7.2

71   As above

72   Involve (OD 10) para 3.9

73   As above para 3.10

74   As above

75   Open Data Engagement website

76   As above

77   Q25

78   As above


© Parliamentary copyright 2014
Prepared 17 March 2014