Education Committee

Written evidence submitted by Andrew Hunt
1. Relevant Expertise of the Submitter
1.1 I am chair of examiners for the AQA A-level “Science in Society”. I have helped to devise the OCR GCSE Twenty First Century Science specifications. I act as a reviser of the papers set for these qualifications.
1.2 In 2010 I carried out a study for SCORE on the assessment of “how science works” in GCSE science specifications and reported to the Awarding Bodies on the findings (http://www.score-education.org/media/7376/finalhsw.pdf).
1.3 My examining experience dates back to the late 1960s when, as a teacher, I helped with the assessments for the new Nuffield Chemistry courses. I was chief examiner for Nuffield O-level Chemistry with the London Board from 1971–78. While still a teacher, I was the chief examiner for the SW Herts area CSE mode 3 in Chemistry from 1980–81. When running the Nuffield Curriculum Centre, from 1992 to 2007, I liaised closely with all three English Awarding Bodies (and their precursors) about Nuffield specifications. I was chief examiner for AS Science in Society, first with OCR and then with AQA.
1.4 I have produced textbooks and multimedia resources for GCSE, vocational and A-level science courses both as a private author and as project director for the Nuffield Foundation and the Association for Science Education. I have worked with most of the leading educational publishers in England.
2. Features of Examining in England
2.1 The ancestors of the current Awarding Bodies were linked to universities, so part of their inheritance is a commitment to high academic standards. The links to universities are now much reduced; nevertheless, the commitment to accurate and up-to-date subject knowledge should still be valued.
2.2 Examining in England has also relied on close links between examiners and teachers in schools. This is important because it means that examiners understand the impact, for good or ill, that the assessment process can have on teaching and learning. The aim is to achieve a positive “backwash effect” so that the examinations encourage good practice in schools.
2.3 Close links between examining and teaching were a feature of the CSE examinations. The dozen or so regional examining boards offering mode 1, mode 2 and mode 3 examinations allowed for innovation by teachers at a time when there was much uncertainty about the nature of education for the population of young people that the Newsom Report had described as “half our future”.1 The diversity of courses on offer recognised that the interests and needs of learners in schools across the country are not all the same.
2.4 The last great flowering of teacher-led innovations in science courses was in the 1980s when the Secondary Science Curriculum Review (SSCR) supported local development groups. The most popular GCSE Science courses today are offered by AQA. The success of these courses can be traced back to the work of an SSCR group in Cumbria. Similarly a group of advisers and teachers in Suffolk developed an approach to teaching and assessment that has evolved into the current OCR Gateway suite of specifications. More recent developments have diminished the contribution that practising teachers can make to curriculum innovation.
2.5 Even now many examiners are also teachers. However, it has become much more difficult for full-time teachers to take on senior examining roles because of the pressure of accountability in schools. Many chief and principal examiners no longer teach full time or have retired from teaching.
2.6 It is important that teachers trust that Awarding Bodies will treat their students fairly. Equally, examiners hope to be able to trust teachers to work fairly within the guidelines when it comes to school-based assessments. Opportunities for teachers to meet senior examiners and attend training courses are necessary for fostering mutual understanding and trust. However, in recent years, it has become increasingly difficult for teachers to get out of school, even for meetings directly related to student exam preparation.
2.7 The growth of forms of accountability linked to exam results has put “trust” under strain in England. Specifications have become much more detailed. Many teachers expect to see that all exam questions are very accurately targeted to assess very precisely the specified knowledge and understanding. The danger is that this narrows the curriculum, makes examination papers more routine and so inhibits imaginative teaching. Another symptom of the decline in trust has been the introduction of “controlled assessment” to limit abuses of the accreditation of coursework.
2.8 The reputation of English examinations for rigour and trustworthiness helps to account for the international work of the Awarding Bodies. They provide examinations in other school systems, and work with local educators in many parts of the world to write specifications, devise assessments and train examiners. These activities in other countries are possible because the Awarding Bodies have fostered the expertise and built up substantial capacity for good examining.
2.9 This section has identified the following long-standing features of examining in England which have value:
Links to research and scholarship in Universities to ensure that courses are up-to-date and in line with current knowledge.
An approach to assessment which encourages good practice in the classroom.
Diversity to meet the interests and needs of different teachers and learners.
Mutual trust between teachers and examiners.
A capacity for innovation and change.
An international reputation for an assessment regime that is resistant to corruption.
The important characteristics of external assessment in schools and colleges have been undervalued in an era of league tables and performance management based on a narrowly defined range of examination results.
3. The Arguments in Favour of and against having a Range of Awarding Bodies for Academic and Applied Qualifications and the Merits of Alternative Arrangements
3.1 In 1997 a spokesman for David Blunkett, then shadow education minister, said that: “The case for a single exam board becomes stronger by the week. Parents, pupils, business and universities want to know exactly what a grade means and they want assurance that all exam papers are being set and marked in the same way.”
3.2 The main case for one national Awarding Organisation is based on the view that this is the only way to avoid standards being debased in a competitive environment. Having one organisation, it is argued, would ensure consistency and fairness.
3.3 Unfortunately the track record of national organisations for central control of curriculum and assessment in England has not been encouraging. There have variously been one or more organisations concerned with these matters in the period from the founding of the National Curriculum Council (NCC) and the School Examinations and Assessment Council (SEAC) to the recent demise of QCDA. These organisations have serially failed to establish their authority when faced with the large scale and diversity of school education in a context where many educational issues are contested politically.
3.4 It seems much easier for a single agency to operate successfully, and to establish a constructive consensus between schools, examination boards, universities, employers and politicians, where the population is much smaller than England's, as is the case in Scotland and, when I visited the country some years ago, in Israel.
3.5 The notion of setting up a single Awarding Body is tempting, but would be mistaken. It would strengthen a dangerous centralising tendency at a time when the school system is becoming more diverse. There would be political dangers too as stated by the Sykes Review2 of qualifications and assessment (2010) in its discussion of GCSE qualifications in English and Mathematics: “The review group considered whether a national examination, set and administered from the centre, would be advisable in mathematics and English. However we believe that any government would be tempted to use that examination to justify its own performance, and confidence in its reliability would suffer as a result.”
3.6 Central control of qualifications by QCDA and Ofqual (and their precursors) has often had harmful effects. Awarding Bodies have been forced to work to inappropriate bureaucratic rules and unrealistic timetables (see paragraph 4.5 below). At the same time, links with users of qualifications such as universities have been weakened. Hence the important recommendations 5, 6 and 7 of the Sykes Review3 which suggest that there should be greater diversity, not less. The report on this Review sums up the discussion of regulation by stating4 that: “the development of examinations and qualifications should be in response to the demands and needs of its end-users. Any role for government regulation should follow from, not precede this”. A single, national Awarding Body would not be able to cope with the diversity implied by the recommendations of this report.
3.7 A complementary commentary in the Wolf Report5 on Vocational Qualifications (2011) makes the point that: “central attempts to impose a neat, uniform and ‘logical’ structure (in a complex modern economy) always fail”. This report emphasizes that “the great strength of the English system of independent awarding bodies is that it allows for multiple direct links between qualification development, the labour market and higher education”. However, this feature has been systematically undermined by government policies and regulatory changes.
3.8 The Wolf Report6 also states that: “First we should take seriously the mass of evidence showing that what really matters is teachers, and stop over-estimating what can be achieved through a written qualification outline. Second, that if an excellent teacher has a strong preference for one qualification over another, that should be respected. And third, that no single centrally defined option is ever likely to suit everyone.”
3.9 Another report that has commented on the possibility of reducing the number of Awarding Bodies is the Walport Report (2010) called “Science and Mathematics Secondary Education for the 21st Century”.7 The group which worked on the report “considered whether to recommend a reduction in the number of awarding bodies—but recognising that the availability of choice is, in principle, important—decided to recommend instead that the planned stronger regulation by Ofqual, the new regulator, is given a chance”.
3.10 The links that once existed between schools, teachers and examiners have become weaker. This too is noted by the Walport Report8 which found that “teachers and lecturers feel disengaged from the design and delivery of the examination system, and disempowered from influencing it”. The group was told repeatedly “that the best teachers who are active in schools, FE colleges and HEIs no longer participate in the design of qualifications or examination processes.” This disengagement would be made worse by centralisation based on one Awarding Body.
3.11 Cutting the number of Awarding Bodies would reduce opportunities for teachers and others to take an active part in the development of specifications and assessments and thus it would also reduce the national capacity and pool of expertise for examining in each main subject area.
3.12 The change would also inhibit innovation. There has been a continuous programme of influential and important innovations in examinations and assessment in England over the last 40 years. In general these have not been funded by Awarding Bodies or government. In science and mathematics much of the work has been supported by charitable institutions (such as Nuffield Foundation, the Gatsby Charitable Trust, the Salters Institute and the Wellcome Trust), professional institutions (including the Royal Society of Chemistry and the Institute of Physics) and industry. The existence of several examining boards (now Awarding Bodies) has helped rather than hindered these innovations which have been taken up and put to the test by enthusiastic teachers before leading to more widespread adoption.
3.13 Setting up a single Awarding Body is not the solution to the problems of consistency and fairness. The challenge is to establish an appropriate approach to quality assurance and regulation, as discussed, for example, in the context of vocational qualifications by the Wolf Report9 as well as in the Walport Report.
4. How to Ensure Accuracy in Setting Papers, Marking Scripts, and Awarding Grades
4.1 Policy-makers and regulators need to take responsibility for the implications of any changes they introduce. The benefits of change need to be balanced against the large costs of implementation which too often seem to be ignored. Errors arise when the demands exceed the resources available to do the job well.
4.2 The system struggles to cope if increases in the scale and scope of external examining mean that there are not enough experienced and skilled examiners to set and mark high-quality papers. It takes time to develop good ideas for examination questions so that quality declines if senior examiners have to set and mark too many papers.
4.3 A shortage of appropriately skilled markers, and the desire to keep costs down, have meant that Awarding Bodies increasingly set up “marker centres” with non-specialist markers. This should not affect reliability, but it can affect the validity10 of examinations if questions have to be contrived to fit formats that can be marked correctly by general markers.
4.4 Crucially there needs to be a manageable timetable for implementation of changes that takes into account not only the regulatory requirements, but also the need for Awarding Bodies to work to a high standard when:
developing specifications and assessment models;
training examiners and then working with them to set specimen and the first operational papers; and
preparing support materials to disseminate the significance of the changes to teachers in schools.
The timetable should also allow for the need to develop new teaching and learning programmes and resources both by teachers in schools and also by commercial publishers.
4.5 The stages in the introduction of the new GCSE Science specifications for first teaching from 2011 illustrate how things can go wrong. The process started with the writing of National Criteria by QCA (later QCDA). The process was badly organised and, thanks to staff changes at QCA, was led by people who had lost sight of the thinking underpinning the changes to the National Curriculum and National Criteria five years earlier. Far too much of the available time was taken up by the drafting of, and consultation on, the National Criteria, so that the Awarding Bodies had too little time to make a good job of preparing new specifications, assessment methods and sample assessment materials. During the process, accreditation was taken over by Ofqual, leading to desirable but challenging alterations to the definition of assessment objectives and other reinterpretations of the criteria at a late stage. As a result the accreditation procedure was drawn out and the dissemination of the changes to schools delayed.
4.6 Fundamentally the processes for setting papers, marking scripts and awarding grades are sound. The Ofqual code of practice is appropriate and summarises the good practice inherent in the English examining regimes that has developed over the last 40 years.
4.7 Errors arise if pressures in the system become excessive as happened in the last twelve months in the lead up to the start of teaching of the new GCSE Science specifications from September 2011. Awarding Organisation staff and senior examiners had to work in parallel on specimen assessment materials for the new courses, legacy papers for the old specifications as well as new operational papers. The pressure was immense with staff working excessive hours.
4.8 A five-year cycle for revising criteria and specifications is too short. It makes it hard to learn from experience, given that papers have to be set two years in advance of the date when they will be sat by candidates. Constant change means that there is very little stability in the system, and this makes errors more likely as all involved have to accommodate new content, new criteria and new objectives too often while continuing to service the previous regime.
4.9 Financial and other pressures push the Awarding Bodies to streamline processes too far and to reduce the number of checks on the stages of the preparation of examination papers. The move from hard-copy to electronic processes (for most stages of examining) is not complete, and the danger is that the checking processes are not appropriately “reinvented” for digital methods.
4.10 In some Awarding Bodies, “subject officers” with expertise in the subject being assessed have been replaced with “qualification managers” who rely on the examining team for subject expertise. This is a mistake because it means that the staff are much less likely to identify errors at key stages when the system is working under pressure.
4.11 Finally, in this section, it is worth making the point that all users of examination results need to have a better understanding that examining is a form of measurement and so, inevitably, subject to a degree of uncertainty. There has to be a compromise between reliability and validity. Overemphasis on reliability makes assessments less “accurate” if they lose validity.
5. The Commercial Activities of Awarding Bodies, including Examination Fees and Textbooks, and their Impact on Schools and Pupils
5.1 The Walport Report11 recommended that the practice of Awarding Bodies endorsing textbooks should be stopped. The group found that Awarding Bodies now produce textbooks aligned closely to their examination specifications. The authors of the report were advised that “these texts were directed at helping candidates to pass exams, rather than to understand the subject in depth and took the view that the alignment of an examination with a textbook business represented a conflict of interests”.
5.2 Particularly undesirable is the increasingly common situation in which an Awarding Body comes to an agreement with one publisher such that this publisher will produce the only endorsed texts and other resources. Thus the resources are endorsed before they have been produced. Other publishers may produce better texts and multimedia resources, but the Awarding Body has contracted not to endorse them.
5.3 Often the tie-up between an Awarding Body and publisher leads to the senior examiners being invited to help write the texts and resources, despite the fact that the skills involved in being an effective examiner are not the same as those needed to devise good curriculum materials.
5.4 Schools have felt obliged to buy the endorsed publications, whatever their quality, because they suspect that the examiners will refer to them when setting examination questions. There is a clear conflict of interest if the senior examiners are also receiving royalties from the sale of texts and other resources.
5.5 Interestingly, in the latest round of publishing for GCSE Sciences, several non-endorsed publishers have produced resources for all the popular specifications. This suggests that publishers are finding that teachers have begun to appreciate the dangers of too close a relationship between an Awarding Body and one publisher, and are considering endorsed publications more critically. Even so, the practice is undesirable.
November 2011
1 Ministry of Education (1963) Half Our Future, London: HMSO (available on line at http://www.educationengland.org.uk/documents/newsom/).
2 The report from the Review chaired by Sir Richard Sykes can be downloaded at: http://goo.gl/8wfmz. See page 23.
3 Sykes Review page 20.
4 Sykes Review page 28.
5 The report from the Review of Vocational Qualifications led by Professor Alison Wolf can be downloaded from http://goo.gl/iY8pr. See pages 57-58.
6 Wolf Report page 116.
7 The Walport Report can be downloaded from: http://goo.gl/zLYXs. See page 48.
8 The Walport Report pages 41-42.
9 Wolf Report page 96ff.
10 Reliability refers to the consistency of a measure. A test is considered reliable if it gives the same result repeatedly. If a test is reliable the results are approximately the same each time the test is administered. Validity is the extent to which a test measures what it claims to measure. It is vital for a test to be valid in order for the results to be accurately applied and interpreted. Validity is crucial to “accuracy”; reliability is not enough.
11 The Walport Report page 49.