Select Committee on Children, Schools and Families Third Report


4  THE CONSEQUENCES OF HIGH-STAKES USES OF TESTING

108. In previous chapters, we have alluded to the concept of high-stakes uses of testing and a variety of unintended consequences, including distortions of the education experience of pupils, which are said to result. In this Chapter, we shall consider, first, what is meant by "high-stakes" and then examine in more detail the claims which are made about the consequences resulting from high-stakes testing.

109. The NAHT argues that the tests themselves are not inherently problematic, but the use of the results of those tests for high-stakes purposes is. NAHT members do not take issue with the principle of testing, but with the emphasis on published performance tables and the links between test results and inspection outcomes.[162] The QCA agrees that, when evaluating an assessment system, there is a need to distinguish "the impacts attributable to testing, per se, and the impacts attributable to the high-stakes uses of results which the testing is designed to support".[163]

110. Ralph Tabberer told us that he questioned the premise that Key Stage tests were high-stakes in conventional terms. In his view, high-stakes tests were those, such as the 11-plus, which determine which school pupils will attend within a selective system. He considers that Key Stage tests are "medium-stakes", allowing pupils to demonstrate their attainment and giving them and their parents a sense of their level of achievement. It is incidental, according to Mr Tabberer, that Key Stage tests "also happen to give us very useful information […] for policy development and accountability".[164]

111. We think that the stakes of the national testing system are particularly high for schools and teachers at all levels and for young people at 16+ who need qualifications. Children need to do well in tests and, later on, 16+ qualifications, in order to move successfully on to the next level of schooling and to get the grades they need for their chosen higher education course or employment. Teachers and headteachers need their pupils to do well in tests in order to demonstrate that they are effective teachers, to win promotion and, perhaps, financial reward. Perceived failure of teachers in this respect may lead to demoralisation, being passed over for promotion, difficulty in finding employment in other schools or leaving the profession altogether.[165] Schools are held accountable according to the test results achieved by their pupils and a poor set of results may result in public humiliation in published performance tables, being perceived as a "failing school", interventions by Ofsted and even closure in extreme cases.[166] Local authorities may also be perceived as "failing" if the schools in their area do not demonstrate results measuring up to the expected standards. Finally, the Government itself, having put in place accountability mechanisms for driving up standards from the centre, stands to lose a considerable amount of political capital if targets are not met and standards are not seen to be improving according to the measures set out in performance tables.[167]

112. When "high-stakes" are mentioned in the evidence we have received in the course of this inquiry, the term most often refers to high-stakes for the school and teachers, rather than children. This is because it is the use of test results for the purpose of school accountability which is blamed for rendering the tests high-stakes in the first place. As should be apparent from the previous Chapter, it is the link between test results, performance targets and performance tables which raises the stakes for schools and, therefore, teachers. Under the current testing regime, pupil performance in tests is inextricably linked with school accountability across the board. It seems relatively clear to us that, even if this link is broken, national tests and public examinations will remain high-stakes for individual children because their own futures are at stake. In fairness to schools and teachers, their objection to the current, high-stakes accountability regime appears to us to be as much because it distorts the education experience of children as out of self-interest. Jerry Jarvis, Managing Director of Edexcel, told us that:

    Unfortunately, there are huge issues at stake in most schools, and teachers are human. Having said that, the huge overwhelming majority of teachers aim to deliver on education—that aim comes across strongly in what they do. However, there is no question that there is pressure.[168]

113. However, we have received considerable evidence of problematic practice, illustrated by this account from a teacher:

    Last year my Headteacher asked me for my reaction to the KS3 Maths results so I started talking about the progress of individual students and how pleased I was for them. He brushed these comments aside, merely wanting to talk about percentages at each level and comparisons with other year groups/other schools.[169]

Clearly, there are serious problems which need to be addressed. The following sections will analyse some of these issues in more detail.

Teaching to the test

114. AQA, an Awarding Body accredited by the QCA, identifies a tension between the use of tests to support pupils' learning and the use of test results for high-stakes accountability purposes.[170] It is this tension which is apparent in any discussion of "teaching to the test". Essentially, teaching to the test amounts to teachers drilling their pupils in a subject on which they will face a test or examination. In extreme cases, a high proportion of teaching time will be given over to test preparation. The focus of lessons will be narrow, with teachers coaching their pupils on examination technique, question spotting, going over sample questions similar to those likely to be set in the test and generally focussing teaching of the substance of a subject in a way best calculated to maximise marks in the test. The IPPR identifies a range of classroom practices which are intuitively of low educational value but which may, nevertheless, improve test results, including narrow learning, where teachers concentrate on aspects of the curriculum likely to be tested; shallow learning, where teachers focus on the way in which a component of the curriculum is likely to be tested; question spotting; and risk-averse teaching with low levels of innovation.[171]

115. The DfES stated in its memorandum its view of what should be happening in classrooms. Children should acquire deep knowledge and understanding of a concept through extended experience and practice. At the same time, they should certainly have an appreciation of what examiners are looking for and how to present their answers in a test. Preparation for tests should be wholly integrated into the classroom experience, with teachers and pupils agreeing targets for the next stage of learning and discussing what the pupil needs to do to reach those targets. Progress should be tracked continuously through assessment, benchmarked periodically with formal tests such as the optional tests provided by the QCA. There should be no cramming in Years 6 and 9 as the Key Stage tests approach.[172] The DfES concluded that:

    The teacher who prepares pupils for a test without developing the deeper understanding or more extended experience required may be fortunate enough to enjoy some short-term success, but will not be likely to maintain that performance over time.[173]

However, as Key Stage 2 (and often Key Stage 1) tests are taken as a child leaves one institution to go to another, any failure to maintain that performance over time is, in practice, not a problem for the teacher who was responsible for teaching the child who took the test. This increases the need for the Government to find alternative measures of the effectiveness of teaching which encourage teachers to ensure that learning achieves more than a short-term ability to pass a test.

116. The Government's statement of what should be happening (paragraph 115) seems to us rather out of touch with what appears to be happening in classrooms according to the evidence we have received. It has been argued by a great number of witnesses to this inquiry that the high stakes attached to national testing increase the pressure to teach to the test. The IPPR noted that it is difficult to prove this causal link[174]; and the QCA noted that there is little systematically documented evidence of the phenomenon, only considerable anecdotal evidence[175]. However, the vast majority of other witnesses have had no such reticence: they are clear that teaching to the test happens, that it is prevalent and that it is caused by the high stakes (for schools) which are attached to the results of the tests.[176] Innovation and creativity in teaching approach are considered "too risky"[177] and teaching to the test is displacing healthy classroom practice, such as informal, formative teacher assessment.[178] The ASCL's view is typical of the evidence we have received:

    Teachers have been criticised for teaching to the test but, if the system is geared to constantly monitoring progress and judging teachers and institutions by outcomes, it is hardly surprising that the focus is on ensuring that students produce the best results.[179]

117. The effect of high stakes on teachers appears to be profound. Mick Brookes gave the example of a hypothetical young headteacher or deputy head with a young family and a mortgage:

    […] a system of fear has been inculcated throughout education […] You do not want to go to headship because you know that you will carry the can. Unfair and unfounded decisions are made on the performance of schools because Ofsted is now relying far too heavily on the data that we have discredited during this presentation. It is having a profound effect not only on the curriculum, but on the recruitment and retention of head teachers, in particular, who carry this can and do not survive being put into a category, on many occasions quite unfairly.

118. Cambridge Assessment identified teaching to the test as "a very serious issue" which may be one significant factor, although not the only one, in the so-called 'plateau-effect' which has been associated with "the majority of innovations such as the Primary Literacy and Numeracy Strategies". As Cambridge Assessment puts it, "A succession of well-intended and seemingly robust initiatives repeatedly run out of steam".[180] The phenomenon is especially prevalent as the time for the test approaches. The NAHT refers to a recent survey which indicates that, for four months of the school year, Year 6 teachers spend nearly half their teaching time preparing pupils for Key Stage 2 tests.[181] It has been argued that in Wales, where testing is no longer high-stakes, healthy practice in classrooms, such as cross-phase moderation, is now being adopted.[182]

SHALLOW LEARNING

119. One serious consequence of teaching to the test is that it tends to lead to shallow learning and short-term retention of knowledge.[183] The Mathematical Association pointed to "a narrow focus on mark-winning behaviours rather than teaching [pupils] a coherent understanding of the subject".[184] Another witness said that the positive changes to the science curriculum have been undermined by "a system which values factual recall and superficial conceptualisation over deeper understanding and engagement".[185] Hampshire County Council stated that it has identified widespread teaching to the test, leading to a reduction in the time spent exploring more imaginative and creative aspects of the curriculum and an emphasis on short-term memorisation and 'test tactics' rather than deep learning and understanding.[186] In an evaluation study of 14-19 mathematics teaching, Ofsted identified:

    A narrow focus on meeting examination requirements by 'teaching to the test', so that, although students are able to pass the examinations, they are not able to apply their knowledge independently to new contexts, and they are not well prepared for further study.[187]

120. The ATL also referred to research which found that teaching to the test led to many pupils not possessing the skills or understanding which the test was designed to assess and that "the focus of teaching in this environment is to teach students to pass tests even where they do not have the skills or understanding".[188] The reason is that tests do not usually test the full range of what is taught. When the tests are high-stakes for schools, the pressure on schools and teachers is to focus on pupils' performance in those areas most likely to be tested. In the context of an overburdened curriculum, those areas will dominate classroom time. Where resources are focussed in this way, the breadth and depth of subject-coverage suffers.[189] The ATL considers that "the purpose of assessment as an aid to the development of learning is shunted into second place", with maximising test results to enhance accountability measures promoted to first place.[190]

121. This is extremely worrying and may provide a partial explanation for the apparent decline in attainment of pupils moving from Year 6 into Year 7.[191] The NAHT states that:

    There are many examples of year 6 students who have obtained high levels, particularly in Science SATs, who are not able to replicate this performance within the secondary curriculum. The results are not wrong. They merely indicate that the students have learned how to pass Science SATs and not developed scientific skills and absorbed scientific content. This can be extremely unhelpful for the receiving secondary school.[192]

122. The QCA reports that schools often mistrust the results from the previous Key Stage and re-test using different measures.[193] The Chartered Institute of Educational Assessors suggests that the Year 6 to Year 7 dip may, indeed, be caused by pupils being hothoused and studying a limited curriculum in the final term of primary education in preparation for the Key Stage 2 tests. On arriving in a new institution, there is no immediate public examination in prospect, so pupils are inclined not to work as efficiently or effectively. There are, however, other factors at play. Pupils are also moving from a regime in which they were taught by the same teacher for all subjects, to a regime in which "they are taught by specialist teachers using specialist equipment in discrete physical locations for each curriculum area". They are with new peers in a new social environment in which they are now the youngest, having been the oldest and top of the hierarchy in their primary schools.[194] It is also worth noting that, in Science, the Key Stage 3 curriculum is much broader than the Key Stage 2 curriculum so that, although a Level 4 at Key Stage 2 might be equivalent in difficulty to a Level 4 at Key Stage 3, it is not equivalent in scope.

QUALIFICATIONS

123. In qualifications, the Awarding Bodies arguably encourage teaching to the test through factors such as increasingly closed questions, the provision of sample questions and answers and of extensive syllabus training to teachers, including comprehensive teaching and learning materials.[195] The AQA prides itself on its extensive programme of "teacher support", including support at the beginning of a new syllabus, regular review opportunities, rapid feedback from examiner reports and the provision of "comprehensive, innovative and motivating teaching and learning materials" to schools selecting AQA as their examination provider. The provision of teaching and learning support is, apparently, a major and growing part of AQA's work.[196] The AQA's Deputy Director General, Dr Andrew Bird, thought that the provision of curriculum support materials and teacher training was important, but emphasised that exams were set by a separate part of the organisation and that it guards against predictability.[197] Greg Watson, Chief Executive of OCR, told us that:

    By being a bit more open, a bit more transparent, and providing a few more clues, we enable people to feel well prepared, and what we are actually [doing] is assessing what they can do not how successfully they have guessed what they are about to do, or their ability to cope with the surprise of what they have been faced with. I think that that has also been a positive development. But I would set against that the fact that there is a challenge and we employ expert people in the field of assessment to make sure that we keep the level of challenge right, and that it does not become too formulaic and too predictable, which obviously would mean beginning to lose some of that effect.[198]

124. Others take a different view of these practices. Warwick Mansell refers in his memorandum to his observation of two senior examiners giving teachers "tricks on how to boost results" and sees this as evidence of "the cynical lengths to which test preparation and the search for shortcuts to improve pass rates can be taken".[199]

125. In addition to the training and materials detailed above, electronic tools from the Government (RAISEonline) and Edexcel (ResultsPlus) make available to schools data on test results which can be broken down by question, by pupil, by teacher, by year group and by school.[200] City & Guilds favours this approach, stating that, "Considering the significant effort that goes into the final examining process by all parties the current under-use of this data is a travesty" and welcomes products such as Edexcel's ResultsPlus.[201] AQA may be piloting a similar system this Summer, although OCR apparently has no plans along these lines.[202]

126. These data are intended to enable schools, teachers, parents and pupils to track the progress of individual pupils and year groups, to evaluate the effectiveness of teaching and to identify areas of strength and weakness.[203] However, there is suspicion that the data will encourage further teaching to the test.[204] Edexcel counters that "the genie [is] out of the bottle" and that trials of the system demonstrate that it can improve teaching and grades quickly and shows headteachers how well syllabuses are being taught.[205] Jerry Jarvis told us that the system actually allows teachers to spend less time on revision and more time on true, personalised learning, tailored to the needs of an individual child.[206] Greg Watson thought that this was nothing new and that teachers have been able, for some time now, to access examination answers and see how their pupils performed.[207]

IS TEACHING TO THE TEST DETRIMENTAL?

127. It has been argued, especially by the Awarding Bodies, that teaching to the test is no bad thing if the tests adequately assess the curriculum and are therefore worth teaching to.[208] The Government has largely avoided this issue and has not provided a definitive statement one way or the other, apart from its statement of what it deems to be proper classroom practice, set out in paragraph 115 above. In his evidence to us, the Minister more or less rehearsed this statement, adding that "the vast swathe of teachers and schools … use tests appropriately". He added that £150 million was being invested over three years "In order to help those who do not [use tests appropriately] and to improve best practice generally".[209] The Minister did not have any statistics on the average amount of time spent in schools on preparing for tests, a situation which we find surprising considering the seriousness of the issues at stake and the strength of his assertions, and those of his officials, that teaching to the test is not a problem. When we asked him about the amount of time being spent by Key Stage 2 pupils on revision for the Key Stage 2 tests, the Minister replied that he saw nothing wrong with children learning what they needed to learn to pass the tests.[210] Ralph Tabberer also thought that revision was not "wasted time" and that it was important that pupils were prepared for Level 4 in order that they could access the secondary curriculum.[211] Mr Tabberer did state, however, that:

    […] we do not want to see children being drilled so that they can just repeat low-level processes accurately and get marks for that—we are all clear that we do not want that.[212]

He went on to say that, when he talks to teachers, he does not hear that they are drilling pupils, but that they are preparing them so that they can do their best in the tests. He thought that there was a "good balance".[213] David Bell and Jon Coles took a similar view.[214] David Bell in particular thought that claims made about teaching to the test were overblown and did not match his experience, having visited hundreds of schools.[215]

128. In reply, other witnesses have repeatedly pointed to the narrow range of knowledge and skills tested and the unreliable nature of test outcomes.[216] Warwick Mansell of the Times Educational Supplement told us that teaching to the test cannot possibly be a positive phenomenon. The Awarding Bodies, for example, are becoming quite explicit about what is going to be in the examinations and publish detailed marking schemes. He states that, as a result, "Pupils are being rewarded for dogmatic rule-following", a situation which will not help to develop them as independent thinkers. Predictability of the examination content is, according to Warwick Mansell, the enemy of in-depth study of a subject. He believes that, although teaching to the test may always have been a feature in education, it is now far more prevalent due to the pressures on schools and teachers to raise results "more or less come what may". He concludes that teaching to the test is fundamental to the learning experience of children and that "it is changing dramatically the character of education in this country".[217]

129. The effects of this approach are felt long after children have finished their school education. Professor Madeleine Atkins, Vice-Chancellor of Coventry University, told us that teaching to the test and a test mentality on the part of students arriving at university leave them unprepared for the rigours of higher education. Students, particularly on vocational courses, arrive at university having learned techniques and how to apply them by rote. The consequent lack of deep understanding of the subjects they have studied at school leaves them unable to solve problems in real-world situations. Professor Atkins said that students find the transition to higher education difficult:

    It does not mean to say that they cannot do it, but it does mean that we have to teach in a rather different way to begin with in order that that synoptic understanding is developed and that understanding of connections between tools, techniques and methodologies is really in place.[218]

Professor Steve Smith, Vice-Chancellor of the University of Exeter, agreed with Professor Atkins that school leavers often came to university unprepared, tending to be unable to think critically or independently:

    The problem we have with A-levels is that students come very assessment-oriented: they mark-hunt; they are reluctant to take risks; they tend not to take a critical stance; and they tend not to take responsibility for their own learning. But the crucial point is the independent thinking. It is common in our institution that students go to the lecture tutor and say, "What is the right answer?" That is creating quite a gap between how they come to us with A-levels and what is needed at university.[219]

130. We received substantial evidence that teaching to the test, to an extent which narrows the curriculum and puts sustained learning at risk, is widespread. Whilst the Government has allocated resources to tackle this phenomenon and improve practice, it fails to accept the extent to which teaching to the test exists and the damage it can do to a child's learning. We have no doubt that teachers generally have the very best intentions in terms of providing the best education they can for their pupils. However, the way that many teachers have responded to the Government's approach to accountability has meant that test results are pursued at the expense of a rounded education for children.

131. We believe that teaching to the test and this inappropriate focus on test results may leave young people unprepared for higher education and employment. We recommend that the Government reconsiders the evidence on teaching to the test and that it commissions systematic and wide-ranging research to discover the nature and full extent of the problem.

Narrowing of the curriculum

132. The phenomenon described as 'narrowing of the curriculum' is strongly related to teaching to the test and many of the same arguments apply. There are essentially two elements to this concept. First, there is evidence that the overall curriculum is narrowed so that the majority of time and resources are directed at those subjects which will be tested, while other subjects in the broader curriculum, such as sport, art and music, are neglected.[220] Second, within those subjects which are tested, the taught curriculum is narrowed to focus on those areas which are most likely to be tested ('narrow learning') and on the manner in which a component of the curriculum is likely to be tested ('shallow learning').[221]

133. Doug French of the University of Hull gave his view of the problem:

    [Narrowing of the curriculum] is observed particularly in year 6 in primary schools and years 9 and 11 in secondary schools when national tests are taken. In year 6 far too little time is spent on subjects other than those being tested and too much teaching time is devoted to a narrow focus on practising test questions. In secondary schools each subject has its own time allocation, but a narrow test-oriented focus within each subject is commonplace. At sixth form level, the situation, if anything, is even worse with module assessments twice a year leading to AS level after one year followed by A-level in the second year.[222]

134. The QCA observed that the focus on the core subjects leads to relative neglect of the full range of the national curriculum. 90% of primary and 79% of secondary schools reported to the QCA that national testing has led to pupils being offered a narrower curriculum.[223] Dr Ken Boston also told us that "all the evidence that I hear in my position is about the narrowing of the curriculum that results from these tests".[224]

135. The Government, however, states that it makes "no apology for the focus on the core subjects of English, maths and science" as mastery of these disciplines is the key to future success. Pupils who arrive in secondary school without a secure grasp of these subjects at Level 4 or better will be hampered in their learning of these and other subjects at the higher levels.[225] Sir Michael Barber echoed this view when he gave evidence to us, as did Sue Hackman.[226] The view of the DfES was that:

    There is nothing that narrows a pupil's experience of the curriculum so quickly as a poor preparation for the level of literacy and numeracy that the subject demands.[227]

136. Whilst the DfES evidence is common sense, it does not really address the concerns raised by other witnesses, including the findings of the QCA reported above. It is true that mastery of the core subjects is vital. However, the evidence we cite above indicates that mastery of the examination is given priority over mastery of the subject and that time taken to prepare for examinations is taken from the broader curriculum; children therefore risk missing out on access to a broader range of skills and knowledge. Mick Brookes said that he understood from his colleagues that national testing had narrowed the curriculum and he endorsed a comment attributed to Anthony Seldon, Master of Wellington College, who is quoted as saying:

    Children are encouraged to develop an attitude that, if it is not in the exam, it doesn't matter. Intellectual curiosity is stifled and young people's deeper cultural, moral, sporting, social and spiritual faculties are marginalised by a system in which all must come second to delivering improving test and exam numbers.[228]

137. The ATL similarly argued that high-stakes testing has had a well-documented narrowing effect on the curriculum and that this has undermined the statutory entitlement of pupils to access to a broad and balanced curriculum, particularly in those schools which fear low scores in the performance tables.[229] Others have deplored the focus on core subjects and related tests, leading to the neglect of other subjects of interest to children; this can be a particular problem in Year 6.[230] The NUT pointed to studies which report that:

    […] high stakes National Curriculum tests had almost wiped out the teaching of some Foundation subjects at Year 6.[231]

As a result, children whose learning styles do not conform to the content and form of the tests are often missing out on areas of the curriculum in which they may have more success.[232] Simply put, the more creative elements of the curriculum are being displaced by the pressure to teach to the test.[233]

138. The method of assessment has also come in for some criticism. The Association of Colleges gives the example of a written test of mechanical skills or musical understanding, which diverts the taught curriculum towards writing about those subjects and away from the practice of mechanics and music.[234] Thus the clear reliance of national tests on the written, externally marked assessment instrument is contributing to the narrowing of the taught curriculum. Important skills and abilities are ignored because the tests emphasise skills and abilities which are more easily measured.[235] The Association of Colleges calls for a range of assessment methods which would allow for more creativity in the curriculum.[236]

139. Given our findings that national tests can only measure a small part of what we might consider valuable in the education of children (Chapter 2) and that teachers are concentrating their efforts in the classroom on teaching what is likely to be tested, it should come as no particular surprise that many witnesses point to narrowing of the taught curriculum as a particular problem.[237]

140. A creative, linked curriculum which addresses the interests, needs and talents of all pupils is the casualty of the narrow focus of teaching which we have identified. Narrowing of the curriculum is problematic in two ways: core subjects are emphasised to the detriment of other, important elements of the broader curriculum; and, for those subjects which are tested in public examinations, the scope and creativity of what is taught is compromised by a focus on the requirements of the test. We are concerned that any efforts the Government makes to introduce more breadth into the school curriculum are likely to be undermined by the enduring imperative for schools, created by the accountability measures, to ensure that their pupils perform well in national tests.

The burden of testing

141. Another theme which manifests strongly in the evidence relates to the quantity of testing,[238] and there is concern that the quantity of national testing is displacing real learning and deep understanding of a subject.[239] English school pupils are amongst the most tested in the world.[240] Over time, formal national assessment has been applied to ever younger children, so that now even children of four are tested through foundation stage profiling.[241] Counting foundation stage assessment, a pupil going on to take A-levels will have been tested in seven of their 13 years of schooling.[242] The GTC stated that:

  • the average pupil in England will take at least 70 tests during a school career;
  • the national testing system employs 54,000 examiners and moderators;
  • they deal with 25 million test scripts annually.[243]

142. In primary schools, testing takes place through teacher observation at the age of 4 (foundation stage), through moderated teacher assessment at 7 (Key Stage 1) and through formal testing at 11 (Key Stage 2). The NAHT considers unhealthy the dominance of Key Stage tests in primary schools.[244] One study estimates that, in Years 5 and 6, the equivalent of three weeks of learning each year is spent on revision and practice tests.[245] Sir Michael Barber did not consider that testing in primary schools is overly burdensome over a six-year period.[246] Professor Peter Tymms agreed that there was not too much testing per se, but that it was preparation for the tests in a high-stakes context which rendered them problematic.[247] Dr Ken Boston of the QCA agreed with this proposition and referred to his concern about the "high stakes put on the assessments because … they carry 14 different functions".[248] OCR stated its belief that "the sustained, unnecessary and inappropriate mass testing of very young people through the key stage national tests … is the single biggest cause of the view that there is too much assessment".[249]

143. The QCA noted that most primary schools prepare pupils extensively for tests. At Key Stage 2:

  • of primary schools employ additional staff;
  • set additional homework;
  • more than 80% have revision classes and use commercial or QCA practice tests;
  • in 80% of primary schools, the amount of time spent on test preparation has increased over the last decade;
  • in the second half of the Spring term, 70% of schools spend more than three hours per week on test preparation.

144. The QCA notes a similar pattern of responses from secondary schools.[250] In addition, Ofsted reports that schools often deploy their most effective teachers to the end of Key Stage year groups—Years 2, 6 and 9—and teachers in other year groups feel less responsibility for assessing pupils' progress.[251] Interestingly, an NUT study found that high-stakes testing causes more concern in the primary sector than in the secondary sector, where long experience of testing and examinations has tended to lead to greater acceptance by teachers and parents.[252] By way of comparison, a study for the Royal Society in 2003 found a substantial difference between the time spent in Scottish and English schools on assessment activities at secondary level. English teachers spent more than twice as much time each year on assessment activities as their Scottish counterparts at the equivalent of Key Stages 3 and 4, and almost seven times as many hours at the equivalent of AS/A2 Level.[253]

145. In secondary schools, formal national testing takes place at the ages of 14 (Key Stage 3 tests), 16 (GCSEs and equivalents), 17 and 18 (A-levels and equivalents) and witnesses have argued that this is excessive.[254] The NAHT is particularly concerned about the dominance of Key Stage 3 tests in secondary schools.[255] At the 14-19 stage, it has been argued in relation to mathematics, for example, that the prevalence of testing in this age group is having a serious, negative effect on maths teaching, reducing it to "little more than sequences of lessons on test preparation".[256] OCR, on the other hand, argued that assessment spread out over a longer period of time and closer to the learning experience is less stressful than a concentrated period of assessment at the end of a two-year course of study.[257]

146. In addition, it is currently possible for AS students to sit retakes in order to maximise their grades at the end of the A-level course. It has been argued that this places too great a burden on pupils, diverting them from study of the course to focus on examinations.[258] Others, however, argue that retakes have been associated with enhanced understanding of a course for pupils whose marks improved.[259]

147. Some witnesses have expressed concern over the balance between teacher assessment on the one hand and national testing on the other.[260] City & Guilds argued that a considerable burden of assessment is placed on 16-18 year-olds, with examinations in each year. This, it argues, could be mitigated if greater use were made of teacher assessment, as is the case with NVQs.[261] The Chartered Institute of Educational Assessors points to PISA results showing that other countries, such as Finland, achieve good standards with little resort to external assessment and far more emphasis on teacher assessment.[262] It has been argued that some national testing should be replaced with moderated teacher assessment or the use of tests drawn from a bank of diagnostic assessments provided centrally by the QCA.[263]

148. Contrary to the vast majority of the evidence we have received, the DfES stated in its memorandum that "the statutory assessment system demands relatively little of the child in the eleven years of compulsory schooling". It is summarised as follows:

  • Key Stage 1 tests should be carried out as part of normal lessons and the child will not necessarily notice the difference between the tests and normal classroom tasks.
  • Key Stage 2 tests involve one week of testing in May, most tests lasting 45 minutes and the total lasting less than six hours.
  • Key Stage 3 tests involve one week of testing, with tests mostly an hour in length and totalling less than eight hours.
  • At GCSE, the Government is responding to criticisms and cutting down on coursework.
  • At A-level, the number of units is being reduced from six to four in most subjects.

The Minister told us that no pupil spends more than 0.2% of their time taking tests and stated that "In the end, I flatly reject the argument that there is too much testing".[264]
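
The Minister's figure refers to time spent sitting tests rather than preparing for them, and some rough arithmetic suggests how a number of that order can be reached. The assumptions in the sketch below (school days per year, taught hours per day, hours of GCSE papers) are ours for illustration, not the Department's.

    # Back-of-envelope check of a figure like 0.2%: hours spent sitting
    # statutory tests as a share of total taught time in compulsory schooling.
    # All figures are illustrative assumptions, not the Department's.
    years = 11                              # years of compulsory schooling
    hours_per_year = 190 * 5                # ~190 school days of ~5 taught hours
    total_hours = years * hours_per_year    # ~10,450 hours in total

    # From the DfES summary above: Key Stage 2 under 6 hours, Key Stage 3
    # under 8 hours; we assume roughly 10 hours of written GCSE papers.
    test_hours = 6 + 8 + 10

    print(f"{test_hours / total_hours:.2%}")    # ~0.23% of taught time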

149. We acknowledge the reforms the Government has made to GCSE and A-level examinations. However, the Government must address the concerns expressed by witnesses, among them Dr Ken Boston of the QCA, who see the burden of assessment more in terms of the amount of time and effort spent in preparation for high-stakes tests than in the time taken to sit the tests themselves. This could be achieved by discouraging some of the most inappropriate forms of preparation and reducing the number of occasions on which a child is tested.

Pupil stress and demotivation

150. Many witnesses argued that testing is stressful for children.[265] Moreover, repeated testing has a negative effect on children, leading to demotivation, reduced learning potential and lower educational outcomes.[266] Testing has even been linked with children's health problems, including mental health problems, and lower self-esteem.[267] The Association of Colleges, for example, stated that borderline students who had been assisted with additional resources to achieve target grades fell victim to false expectations, resulting in a sense of inadequacy when they found that they did not have the skills or knowledge to deal with the demands of the next stage of schooling. This, the AoC thought, might account for the high drop-out rate at 17.[268]

151. Witnesses have expressed concern that the levels of accountability in schools are resulting in the disillusionment of children.[269] Children not reaching the target standard at a given stage have the impression that they have 'failed' whilst they may, in fact, have made perfectly acceptable progress.[270] Whilst some children undoubtedly find tests interesting, challenging and even enjoyable, others do not do their best under test conditions and become very distressed.[271] In particular, those children who are not adept at the kind of utilitarian skills and strategies required to do well in tests and who frequently 'fail' find the experience "demoralising, reducing self-esteem, including their belief in their ability to succeed with other tasks".[272]

152. Professor Peter Tymms pointed out that this is a complicated area. Children are likely to fail at some things and succeed at others as they go through life. In general, they have a certain natural resilience and expect that, if they try harder, they will do better. However, he considered that it was a mistake to label as 'failures' children who did not meet the target standards, even though they may have done very well to get a lower grade or level. He thought it was also an error to label their school as a failure, because children identify with that, too. He concluded that a national monitoring system should examine attitudes, self-esteem, welfare and physical growth, a proposition with which Sir Michael Barber agreed.[273]

153. Teaching to the test and narrowing of the curriculum are also thought to have a negative effect on children. The resulting lack of creativity in teaching impacts on children's enjoyment of a subject and their motivation to learn.[274] In the worst cases, teachers may resort to dull and boring methods of teaching, using the looming threat of examinations to motivate pupils rather than inspiring them to learn.[275] The Royal Society used the example of science teaching. It argued that the current testing system constrains creativity by giving a high priority to what can be easily measured through written, externally marked examinations. The ability of teachers to meet the individual needs of pupils in the classroom is compromised and, at worst, this may lead to negative attitudes to science amongst pupils, reduced motivation and lower self-esteem.[276] An over-emphasis on preparation for national tests at primary level has been found to have a negative effect on children's enjoyment of science.[277]

154. The Government has expressed the view that some children will find examinations stressful, but that effective schools will help anxious children to meet the demands which are made of them.[278] The Minister told us that he did not accept the idea that the amount of time taken to prepare for national tests was too stressful:

    I think that life has its stresses and that it is worth teaching a bit about that in school.

Grade inflation

155. The concept of grade inflation is another phenomenon associated with high-stakes uses of testing. However, before we consider the evidence we have received on this subject, we will clarify the use of some terms. Dr Ken Boston distinguished assessment standards from performance standards. Assessment standards denote the degree of difficulty of a test, in his words, the height of the hurdle to be jumped by the student. Through its regulatory role and through its subsidiary, the NAA (National Assessment Agency) which delivers National Curriculum tests, the QCA attempts to maintain the assessment standard constant year on year. Performance standards relate to the distribution of students' grades or levels according to a target standard, in other words, the number of students who clear the hurdle each year. The QCA's data suggest that the performance standard has been rising.[279]

156. The concept of grade inflation relates specifically to a reduction in assessment standards, thereby making it easier for students to achieve higher grades or levels. For example, there is an annual debate following publication of GCSE and A-level results as to whether the steady increase in the proportion of students getting the higher grades is genuinely evidence of an improvement in performance standards or whether it is explained by a lowering of assessment standards. Similar arguments have been rehearsed in relation to Key Stage tests.[280] However, the debate is really much wider than this because it relates to whether or not national tests are an adequate proxy for the underlying learning and achievement of pupils. To engage with this debate, we need to consider in more detail the concept of 'standards over time'.

STANDARDS OVER TIME

157. The Government notes that the strength and validity of the accountability regime, which is based on performance standards, requires that assessment standards remain consistent over time. The QCA is responsible for this and the Government relates that its processes have been found to be robust and world-class.[281]

158. It is uncontroversial that test scores have improved across the board over time. However, the concept of standards over time is problematic. First, the standards themselves are the result of working groups and various consultations; they embody a series of values, are expressed in everyday language and, as a result of all of this, are open to interpretation by those using them.[282] In addition, the descriptions of the standards themselves have changed over time.[283] Professor Colin Richards states that there is no published evidence on the extent to which national standards (as embodied in the level descriptions) have been reflected in the national tests or in the tests used by researchers to assess children's performance over time. Without such evidence, it is not possible to be certain whether any apparent improvement in performance standards is genuine or an artefact.[284]

159. Second, the tests have changed over time, some of them radically.[285] As Cambridge Assessment put it:

    If you want to measure change, don't change the measure. But the nation does—and should—change/update the National Curriculum regularly. Whenever there is change (and sometimes radical overhaul) the maintenance of test standards becomes a particularly aggressive problem.

160. Research suggests that change in the tests does not necessarily mean that there has been a reduction in the assessment standard. It could just be that the things which are measured and the way they are measured are different. It is not a simple matter to establish whether or not an increase in test scores is evidence of a rise in genuine performance standards.[286] Even where the curriculum has apparently not changed very much, the way it is taught may have changed considerably.[287] In addition, an apparent increase in standards over time according to test scores may be misleading, since the tests arguably measure such a narrow part of the whole curriculum that they are no longer a valid proxy for achievement across the whole of that curriculum.[288] The NFER stated that:

    There are difficulties in maintaining a constant standard for the award of a level in a high stakes system where tests or questions cannot be repeated. We do though believe that the methods used for this currently which include year-on-year equating and the use of a constant reference point through an unchanging "anchor test" are the best available. A second consideration is that the curriculum coverage each year is limited to the content of that year's tests.[289]
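
The "anchor test" to which the NFER refers can be illustrated with a minimal sketch. In a common-anchor design, a small set of unchanging questions is sat by successive cohorts, and any shift in performance on those common items is used to place the new test on the old scale. The mean-sigma linear adjustment and all of the figures below are illustrative assumptions on our part, not the NFER's operational procedure.

    # Minimal sketch of common-anchor linear equating (the mean-sigma method).
    # All numbers are invented for illustration; operational equating is
    # considerably more sophisticated than this.
    from statistics import mean, stdev

    # Scores of last year's and this year's cohorts on the SAME anchor items.
    anchor_last_year = [12, 15, 18, 20, 22, 25, 27]
    anchor_this_year = [11, 14, 16, 19, 21, 24, 26]

    # Map this year's score scale onto last year's, so that equivalent
    # performance on the anchor receives an equivalent reported score.
    slope = stdev(anchor_last_year) / stdev(anchor_this_year)
    intercept = mean(anchor_last_year) - slope * mean(anchor_this_year)

    def equate(raw_score: float) -> float:
        """Express a raw score on this year's test on last year's scale."""
        return slope * raw_score + intercept

    # For example, where a cut-score of 45 (assumed) on this year's test
    # would sit on last year's scale:
    print(round(equate(45), 1))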

161. We are persuaded by the evidence that it is entirely possible to improve test scores through mechanisms such as teaching to the test, narrowing the curriculum and concentrating effort and resources on borderline students. It follows that this apparent improvement may not always be evidence of an underlying enhancement of learning and understanding in pupils.

162. We consider that the measurement of standards across the full curriculum is virtually impossible under the current testing regime because national tests measure only a small sample of pupils' achievements; and because teaching to the test means that pupils may not retain, or may not even possess in the first place, the skills which are supposedly evidenced by their test results.[290]

KEY STAGE TESTS

163. The ATL highlighted evidence that an apparent improvement in standards of performance has less to do with an improvement in underlying achievement and more to do with familiarity amongst teachers and students with test requirements. It points to research which has demonstrated that changes in tests lead to a sudden fall in performance standards, followed by an improvement as teachers begin to understand how to teach to the new test.[291] Evidence from the IPPR appears to support this view. It quotes research demonstrating that dramatic improvements in results at Key Stage 2 are not borne out when independent measures are used, which show a much less marked improvement than is suggested by the Key Stage 2 test results.[292] On the other hand, the IPPR finds that research evidence on changes in the assessment standard at Key Stage 2 over time is inconclusive.[293] The IPPR believes that there has been real progress in each of the three core subjects, but less than is indicated by Key Stage test results. It does not consider that the tests have become systematically easier, but thinks that teaching and learning have focused increasingly narrowly on achieving good test results.[294] Professor Peter Tymms broadly agrees with this assessment. His research has led him to the conclusion that the substantial improvements suggested by the test scores were illusory: there had been some improvement in the underlying attainment in mathematics and writing, but no discernible improvement in reading.[295]

164. Professor Colin Richards submitted a review of a considerable amount of literature relating to performance standards in primary schools. He summarises his findings as follows:

  • The data on performance relate only to three subjects (English, mathematics and science) and only to pupils aged 7 and 11.
  • Key Stage test results show a considerable rise in children's performance in English and mathematics from 1996 to 2001 followed by a general levelling off thereafter.
  • This rise in test scores does not necessarily involve a rise in performance against national standards unless these standards have been embodied in the same way and to the same degree in successive tests. However, there is no evidence that this has been the case.
  • Ofsted has published no inspection evidence on either national standards or performance in relation to those standards. It has simply relied on reporting national test data.
  • A number of major research projects throw doubt on the considerable rise in performance shown in the national test data.[296]

Professor Richards concludes that it is not possible to answer with precision whether standards in primary schools are improving. As other witnesses have suggested, he finds evidence to indicate that there was some rise in performance in the core subjects between 1995 and 2001, but not as great as national test data have suggested.[297]

14-19 QUALIFICATIONS

165. In relation to public examinations, the Government points to evidence that A-level standards have remained consistent for at least 20 years, although an increased "breadth of coverage led to a reduced emphasis on some topics".[298] The Government points to other evidence suggesting that A-levels are the most tightly and carefully managed tests at school or any other level; that strategies for maintaining comparable standards across Awarding Bodies are adequate; that Awarding bodies have "broadly consistent and well-regulated systems for setting question papers, managing marking and awarding grades"; and that the QCA has robust systems for monitoring and regulating the work of the Awarding Bodies.[299]

166. It is highly questionable whether a claim can validly be made that A-levels have remained at a consistent standard over a period as long as 20 years, or indeed anything like it. The DfES itself gave an account of the considerable changes which have been made to this qualification over the years, not the least of which are Curriculum 2000 and the piloting of tougher questions in A-level papers to stretch candidates and aid differentiation for universities.[300] According to the DfES, the standard required to achieve an A grade will remain the same, but stronger candidates will be able to demonstrate attainment meriting a new A* grade.[301] The DfES memorandum states:

    As our response to criticisms about GCSE and A-level assessment shows, the system has constantly evolved to meet changing needs and it will continue to do so.[302]

Without providing some evidence in relation to the underlying assessment standards and the levels against which they are referenced, the Government cannot have it both ways: either standards have been constant over time, or change has been implemented in response to perceived shortcomings in the system. As Edexcel argued:

    The curriculum has changed over time, new elements have been introduced and different approaches rewarded. To accurately measure such progress, the curriculum would need to be stable and the same test used each year.[303]

167. Research suggests that A-levels have not necessarily become easier, but that the examination no longer measures what it used to. Examinations have become more predictable, so that teachers have become more effective at coaching pupils and, correspondingly, pupils have become deskilled. It follows, this research argues, that one can no longer infer from a top grade that the pupil achieving it has the same skills as a pupil achieving a top grade years ago.[304]

168. Professor Peter Tymms told us that his research suggested that assessment standards at GCSE appear to have been relatively stable over a number of years. At A-level, however, they have not been stable, with pupils of a particular ability getting higher grades now than they would have done some years ago. The biggest change has been in mathematics, in which a D grade some years ago would now be the equivalent of a B grade.[305] Again, whereas 30% of A-level candidates used to fail, getting less than an E grade, now just a small percentage fail outright, which Professor Tymms characterises as a "dramatic shift".[306] However, an improvement in overall grades does not necessarily mean that assessment standards have been debased. Professor Tymms' evidence certainly shows that children of the same ability are now getting higher grades than they would have received some years ago, but the reasons for this may be complex. It could, for example, be partially attributed to teaching standards, but the question cannot be answered without an appreciation of equivalent assessment standards.[307] Unfortunately, unlike Key Stage tests, A-levels and GCSEs are not pre-tested by the regulator or Awarding Bodies and no items used in previous years are repeated in following years, so there is no way to reference assessment standards from one year to the next. Standard-setting is done retrospectively on the basis of statistical relationships and judgments.[308]

169. Some witnesses have queried whether a system which allows the existence of multiple Awarding Bodies can ever really ensure that the standards of the assessments produced by those Awarding Bodies are the same.[309] We have received evidence that some schools choose a syllabus from a given exam board on the basis that they consider it easier, and therefore more likely that their pupils will achieve higher grades.[310] In addition, there is some suggestion that some subjects, such as mathematics and the sciences, are 'harder' than others, so that pupils and schools are more likely to choose 'easier' subjects in an effort to maximise grades.[311]

INTERNATIONAL EVIDENCE

170. Witnesses have discussed some of the available international evidence of performance at the secondary stage of education. The essential paradox appears to be that, whilst test scores are improving at home, international rankings are either static or falling. The percentages of pupils achieving the target standards at Key Stages 3 and 4 have risen over time according to domestic test results, yet progress on international attainment measures has stalled. Evidence from TIMSS for Key Stage 3 shows no significant change in performance between 1995 and 2003; and PISA shows that, for a given score at Key Stage 3 or 4, pupils attained on average a higher PISA score in 2000 than in 2003. Although the UK's response rate to the 2003 PISA survey was too low to ensure statistical comparability, the mean score produced was lower than that in 2000 and led to the UK falling down the international ranking.[312] One possible explanation, according to the IPPR, is that Key Stage 3 and 4 test scores are not consistent over time. Its preferred explanation, however, is that "improvements in the key stage results do not accurately mirror improvements in underlying pupil attainment, and that some of the improvement is due to more narrowly focused teaching".[313]

171. It is not possible for us to come to a definitive view on grade inflation in the context of such a wide-ranging inquiry. However, it seems clear to us from the evidence that we have received that the Government has not engaged with the complexity of the technical arguments about grade inflation and standards over time. We recommend that the Government addresses these issues head-on, starting with a mandate to the QCA or the proposed new regulator to undertake a full review of assessment standards.

Accountability through sampling

172. Whilst the use of saturation testing, that is, the testing of each child in a given cohort, is generally agreed to be an appropriate means of ascertaining and certifying individual pupil and, to a certain extent, school achievement[314], there is rather more argument about whether saturation testing is an appropriate method of testing local and national performance and monitoring the effects of changes in policy.[315]

173. Witnesses have argued for the decoupling of measures of pupil attainment from accountability and monitoring measures in order to remove the need for central collection of individual pupil performance data, thereby removing the high stakes for the school.[316] Implicit in this argument is the hope that, once the stakes are removed, the school can get on with the business of teaching children a full and rounded curriculum without fear of recrimination, and the children will benefit from the education to which they should, in any event, be entitled.[317] That the tests would, presumably, remain high-stakes for the individual child has largely been ignored in the evidence we have received.

174. Nevertheless, decoupling accountability and monitoring from a testing system which is primarily designed to measure pupil attainment may have a number of desirable consequences in relation to the issues discussed in this chapter. To summarise the arguments which have been put to us: the incentives for schools and teachers to teach to the test would be reduced considerably.[318] Schools and teachers may be more inclined to move away from the disproportionate focus on the core subjects of English, mathematics and science, important as these are, and give more attention to other subjects, restoring some of the lost variety in the curriculum. Within the core subjects, teachers may feel more at liberty to take a creative approach to their teaching, which may enhance the enjoyment, satisfaction and even attainment of their pupils.[319] Less time spent on test preparation would reduce the perception of the testing system as burdensome and, perhaps, result in reduced stress and demotivation for pupils. Finally, there would be scope for developing a system of accountability which is fairer to schools, teachers and pupils alike and which can give some reassurance to the public about the maintenance of assessment and performance standards over time.[320] The IPPR warned, however, that as long as individual pupils sit national, summative tests (albeit separated from the accountability system), the data will exist and can be compiled and presented in school performance tables, whether or not the Government chooses to collate and publish those tables centrally. Much the same data would be available as before.[321]

175. It has been widely argued that national cohort sample testing would be a less onerous and more appropriate means of testing local and national performance and monitoring the effects of changes in policy.[322] However, sample testing would not necessarily yield the type of data currently used for individual school accountability. If accountability is to be decoupled from national tests designed to measure pupil attainment, different tests or inspections will presumably be required, or the concept of school accountability radically overhauled.

176. Dr Ken Boston said the QCA had given advice to the Government on sample testing, but that the Government was more inclined to go in the direction of single-level tests (as to which, see paragraphs 188-198 below), while setting great store by international sample tests such as PIRLS (Progress in International Reading Literacy Study), PISA (Programme for International Student Assessment) and TIMSS (Trends in International Mathematics and Science Study).[323] He related that he had told the Government that:

    […] there are many purposes that would be served better by different sorts of tests. Indeed, as you know, some time ago I raised the issue of sample testing, on which the Government were not keen for other reasons.[324]

He considered that sample testing, using a standardised test instrument, was the best way of meeting the purpose of discovering national trends in children's performance standards over time. If, on the other hand, the purpose was to compare the performance of school against school, a sample test would not yield the necessary data, but a full cohort test would.[325] He did not believe that Key Stage tests, single-level tests and cohort sampling should be seen as mutually exclusive alternatives: different tests are needed to serve different purposes.[326]

177. The Minister, however, did not agree that alternatives to the current Key Stage tests were workable in practice. He acknowledged that some had argued in favour of sample testing to monitor national performance, but thought that testing should also be able to demonstrate a child's progress against a national comparator, as well as measuring the performance of a particular school. He thought that the use of teacher assessment for these purposes was problematic due to the difficulty of assuring comparability of data. He concluded that:

    When I look at the matter and begin to unravel the alternatives and think about how they would work in practice, I find that the current SATs are much more straightforward—everybody would understand it. They are used for a series of things, and there might be some compromise involved, but the system is straightforward and simple, and it shows what our priorities are and gives us accountability at every level. I do not think that it is a mess at all.[327]

178. The methodology of sample testing is well-established and is used, for example, in international comparison studies such as PISA and TIMSS. It was also used in the UK from the mid-1970s and through the 1980s by the Assessment of Performance Unit ("APU") within the then Department of Education and Science. The APU used light sampling of schools and light sampling of pupils within schools.[328] The GTC sets out a number of advantages of this approach: it reduces the burden of testing; the anonymity of schools and students ensures that the tests are low-stakes; it offers wide curriculum coverage; a range of assessment formats can be employed; test items can be repeated over time; the system is relatively inexpensive; it provides good evidence of performance trends; and it is a tried and tested method. Limitations of the approach include the lack of ratings for individual schools; the lack of feedback for individual schools; and certain technical complexities which make the statistical results difficult to interpret.[329] The NFER also pointed out some possible drawbacks of a sampling system. Low-stakes assessment may not motivate pupils to try hard and show what they can really do, resulting in a potential underestimate of ability. In addition, there may be practical difficulties with a system relying on the voluntary participation of schools and pupils. However, the NFER broadly supports a regular national monitoring programme.[330]
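To illustrate what "light sampling of schools and light sampling of pupils within schools" means in practice, the following sketch draws a two-stage random sample: first a small fraction of schools, then a small fraction of pupils within each selected school. The school sizes and sampling fractions are invented for illustration; the APU's actual sampling frames were more sophisticated.

    # Illustrative two-stage "light" sampling: a small fraction of schools,
    # then a small fraction of pupils within each sampled school.
    # All figures are invented for the purpose of the example.
    import random

    random.seed(42)  # fixed seed so the example is reproducible

    # Hypothetical sampling frame: 1,000 schools with 30-120 pupils in the cohort.
    schools = {f"school_{i:04d}": [f"pupil_{i:04d}_{j:03d}"
                                   for j in range(random.randint(30, 120))]
               for i in range(1000)}

    SCHOOL_FRACTION = 0.05  # test pupils in 5% of schools...
    PUPIL_FRACTION = 0.10   # ...and only 10% of the cohort within each of them

    sampled_schools = random.sample(sorted(schools),
                                    int(len(schools) * SCHOOL_FRACTION))
    sampled_pupils = []
    for school in sampled_schools:
        pupils = schools[school]
        k = max(1, int(len(pupils) * PUPIL_FRACTION))
        sampled_pupils.extend(random.sample(pupils, k))

    total = sum(len(p) for p in schools.values())
    print(f"{len(sampled_schools)} schools and {len(sampled_pupils)} pupils "
          f"tested, out of {total} pupils in the cohort")

On these invented figures, roughly half of one per cent of the cohort sits a test in any year, which is what makes the burden on schools and pupils so much lighter than saturation testing.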

179. The AQA stated that:

    […] a light sampling survey method would enable de-coupling of national assessment from a requirement to deliver robust information on national educational standards. This would enable testing to reflect curriculum change with precision, to optimize the learning-focused functions of testing, and enable constant innovation in the form of tests to optimize accessibility.[331]

Some witnesses have been specific about what they would like to see. The GTC, for example, advocates cohort sampling involving a limited number of pupils in a limited number of schools, using a matrix test structure in which several different tests are distributed across the sample, widening the breadth of the curriculum that is tested. Common questions appearing in any two or more tests would allow pupils taking different tests to be compared on a common scale. The tests would be administered by teachers, with external support where necessary.[332] The NFER proposed a similar, matrix design.[333]
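The matrix structure the GTC and the NFER describe can be made concrete with a small example. The curriculum is divided into blocks of items, each test booklet carries only a subset of the blocks, and the blocks shared between booklets supply the common questions needed to place all pupils on one scale. The block names and design below are invented for illustration and are not taken from either proposal.

    # Illustrative matrix test design: six curriculum blocks, booklets of two
    # blocks each. Shared blocks provide the common questions that link
    # booklets onto a single scale. The block names are invented.
    from itertools import combinations

    blocks = ["number", "algebra", "geometry", "statistics", "measures", "ratio"]

    # Every pair of blocks forms a booklet: 15 short booklets in all.
    booklets = list(combinations(blocks, 2))

    # Any two booklets sharing a block are directly linked through its items.
    linked_pairs = sum(1 for a, b in combinations(booklets, 2) if set(a) & set(b))

    print(f"{len(booklets)} booklets; {linked_pairs} directly linked pairs")
    print(f"'algebra' appears in {sum(1 for b in booklets if 'algebra' in b)} booklets")
    # Each pupil sits one short booklet, yet the sample as a whole covers the
    # entire curriculum - far wider coverage than a single fixed test allows.

In this toy design each pupil answers only a third of the curriculum blocks, yet every block is covered and every booklet is linked, directly or indirectly, to every other.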

180. In this context, restoration of the former APU, or something like it, has been a popular theme in evidence.[334] Cambridge Assessment has, however, pointed out a series of technical and political issues which led to the demise of the original APU, stating that its operation was fraught with difficulty. Whilst Cambridge Assessment is in favour of the development of a light sampling, matrix-based model for national monitoring of standards over time, it counsels that this should be done with close attention to the lessons learned from the former APU and from similar systems used internationally.[335]

181. We do not necessarily see the point in creating a new body (or reinstating an old one) for its own sake, but we do think that the body developing and administering sample testing for national monitoring should be independent from government and, for this reason, the proposed new development agency, for example, would not be appropriate for this task.[336] As Professor Colin Richards said:

    An independent body is needed to keep standards under review and to devise a system for assessing performance in relation to […] standards over time—at a national level, not at the level of the individual school.[337]

182. In summary, the discussion in this Chapter has demonstrated that high-stakes testing, that is, testing where the stakes are high for schools and teachers, can lead to distortion of children's education experience where accountability is linked to the same testing system which is designed to measure pupil attainment:

    The full value of a creative, linked curriculum which addresses the interests, needs and talents of all pupils is not exploited because many schools seem to be afraid to innovate when test scores might be affected (even if evidence shows they might go up).[338]

183. Whilst we do not doubt the Government's intentions when it states that "The National Curriculum sets out a clear, full and statutory entitlement to learning for all pupils, irrespective of background or ability", we are persuaded that in practice many children have not received their entitlement, and many witnesses believe that this is due to the demands of national testing.

184. We are persuaded that the current system of national tests should be reformed in order to decouple the multiple purposes of measuring pupil attainment, school and teacher accountability and national monitoring. The negative impacts of national testing arise more from the targets that schools are expected to achieve and schools' responses to them than from the tests themselves.

185. School accountability should be separated from this system of pupil testing, and we recommend that the Government consult widely on methods of assuring school accountability which do not impact on the right of children to a balanced education.

186. We recommend that national monitoring of the education system, particularly for policy formation, be carried out by sample testing to measure standards over time; cohort testing is neither appropriate nor, in our view, desirable for this purpose. We recommend further that, in the interests of public confidence, such sample testing should be carried out by a body at arm's length from the Government, and suggest that it is a task either for the new regulator or for a body answerable to it.



162   Ev 70

163   Ev 29

164   Q346

165   Ev 55; written evidence from Advisory Committee on Mathematics Education, para 3

166   Ev 234

167   Ev 60

168   Q193

169   Ev 46

170   Ev 102

171   Ev 239

172   Ev 160

173   Ev 160

174   Ev 239

175   Ev 30

176   Ev 48; Ev 57; Ev 59; Ev 60; Ev 68; Ev 69; Ev 73; Ev 83; Ev 115; Ev 199; Ev 200; Ev 217; Ev 224; Ev 227; Ev 269; Ev 271; Q128; Q139; Q169; Q268; written evidence from Heading for Inclusion, Alliance for Inclusive Education, paras 1(c) & (d); written evidence from Portway Infant School; written evidence from Doug French, University of Hull, para 2.1; written evidence from LexiaUK, para 2.8; written evidence from Advisory Committee on Mathematics Education, paras 1 & 26; written evidence from Campaign for Science and Education, para 24; written evidence from Barbara J Cook, Headteacher, Guillemont Junior School, Farnborough, Hants; written evidence from The Royal Society, para 5; written evidence from Association of Science Education, paras 24-25; written evidence from The Mathematical Association; written evidence from S Forrest, Teacher of Mathematics, Wokingham, para 9; written evidence from Science Community Partnership Supporting Education, section 2; written evidence from Warwick Mansell

177   Ev 113; Ev 224; written evidence from Association of Science Education, para 22; written evidence from Doug French, University of Hull, para 1.3

178   Written evidence from Heading for Inclusion, Alliance for Inclusive Education, para 2(c)

179   Ev 48

180   Ev 217

181   Ev 69

182   Written evidence from The Mathematical Association

183   Ev 226; Q128; written evidence from Association of Science Education, para 22; written evidence from Institute of Physics, para 3; written evidence from Campaign for Science and Education, para 23; written evidence from Association of Science Education, para 20

184   Written evidence from The Mathematical Association

185   Written evidence from Science Community Partnership Supporting Education, para 1

186   Ev 271

187   Ofsted, Evaluating Mathematics Provision for 14-19 Year Olds, May 2006, http://www.ofsted.gov.uk/assets/4207.pdf

188   Ev 60

189   Ev 60

190   Ev 60

191   Q139; written evidence from Advisory Committee on Mathematics Education, para 27; written evidence from Association for Achievement and Improvement through Assessment, para 11; written evidence from S Forrest, Teacher of Mathematics, Wokingham, para 9

192   Ev 69

193   Ev 32

194   Ev 227

195   Ev 102; Q228

196   Ev 102

197   Q181

198   Q181

199   Written evidence from Warwick Mansell

200   Ev 117-118; Ev 159

201   Ev 111; Q243

202   Exam details get personal, Times Educational Supplement, 10 August 2007

203   Ev 117-118; Ev 159

204   Comparative data could benefit all, Times Educational Supplement, 10 August 2007

205   This genie of technology can help to boost attainment, Times Educational Supplement, 10 August 2007

206   Q255

207   Q228

208   Ev 115; and, for example, Q224; Q292

209   Q329

210   Q370

211   Q370

212   Q371

213   Q371

214   Q291; Q294; Q296

215   Q296

216   Q128

217   Written evidence from Warwick Mansell

218   Q268

219   Q270

220   Ev 59; Ev 270; Q46; written evidence from Doug French, University of Hull, para 1.3; written evidence from Association for Achievement and Improvement through Assessment, para 3; written evidence from Barbara J Cook, Headteacher, Guillemont Junior School, Farnborough, Hants; written evidence from Lorraine Smith, Headteacher, Western Church of England Primary School, Winchester, para 7

221   Ev 239; written evidence from Doug French, University of Hull, para 3; written evidence from Advisory Committee on Mathematics Education, para 1; written evidence from Campaign for Science and Education, paras 17-19; written evidence from Association of Science Education, paras 5-6; written evidence from The Mathematical Association

222   Written evidence from Doug French, University of Hull, para 1.3

223   Ev 22

224   Q120

225   Ev 160

226   Q46; Q292

227   Ev 160

228   Q128

229   Ev 58

230   Written evidence from Heading for Inclusion, Alliance for Inclusive Education, para 1(f); written evidence from Purbrook Junior School, Waterlooville, para 13

231   Ev 263

232   Written evidence from Purbrook Junior School, Waterlooville, para 13

233   Written evidence from Doug French, University of Hull, section 1.3; see also a research summary submitted by the Wellcome Trust, Ev 269-270

234   Ev 199

235   Ev 271

236   Ev 199

237   Ev 70; Ev 269; written evidence from Doug French, University of Hull, section 1.3

238   Ev 270; written evidence from Doug French, University of Hull, section 1.3

239   Ev 198; Ev 202

240   Ev 46; Ev 51; Ev 73

241   Written evidence from Heading for Inclusion, Alliance for Inclusive Education, para 1(b)

242   Ev 57

243   Ev 74

244   Ev 68

245   Written evidence from Association of Science Education, para 9

246   Q2

247   Q2

248   Q91

249   Ev 121

250   Ev 32

251   Ev 32

252   Ev 263

253   Written evidence from The Royal Society, section 6

254   Ev 202

255   Ev 68

256   Written evidence from Advisory Committee on Mathematics Education, para 2

257   Ev 121-122

258   Ev 72

259   Written evidence from Mathematics in Engineering and Industry

260   Written evidence from Association of Science Education, para 11

261   Ev 112

262   Ev 222

263   Ev 103; written evidence from Association of Science Education, para 17

264   Qq 334 and 368

265   Ev 51; Ev 55; Ev 263; Ev 247; written evidence from Heading for Inclusion, Alliance for Inclusive Education, para 2(a); written evidence from Portway Infant School; written evidence from Purbrook Junior School, Waterlooville, para 8; written evidence from Association of Science Education, para 9

266   Ev 73; Ev 263; Q128; Q172; written evidence from Science Community Partnership Supporting Education, section 2

267   Ev 68; Ev 263

268   Ev 200

269   Q134; written evidence from Heading for Inclusion, Alliance for Inclusive Education, para 1(e)

270   Ev 52

271   Ev 272

272   Ev 59

273   Qq 27 and 28

274   Ev 270; written evidence from Institute of Physics, para 3

275   Written evidence from The Mathematical Association

276   Written evidence from The Royal Society, para 5; see also written evidence from Association of Science Education, para 2

277   Ev 269

278   Ev 160

279   Q55

280   Ev 236

281   Ev 159

282   Written evidence of Professor Colin Richards, Annex A

283   Q27; written evidence of Professor Colin Richards, Annex A

284   Written evidence of Professor Colin Richards, Annex A

285   Ev 226; written evidence from Advisory Committee on Mathematics Education, paras 18-20; written evidence of Professor Colin Richards, Annex A

286   Koretz, D. M., Linn, R. L., Dunbar, S. B., & Shepard, L. A. (1991). The effects of high-stakes testing: preliminary evidence about generalization across tests. Paper presented at the annual meetings of the American Educational Research Association and the National Council on Measurement in Education, Chicago, IL; Cannell, J. J. (1988). Nationally normed elementary achievement testing in America's public schools: how all fifty states are above the national average. Educational Measurement: Issues and Practice, 7(2), 5-9.

287   Ev 60

288   Ev 60

289   Ev 251

290   Ev 60

291   Ev 61

292   Ev 236

293   Ev 236

294   Ev 236

295   Q6

296   Written evidence of Professor Colin Richards, Annex A

297   Written evidence of Professor Colin Richards, Annex A

298   Ev 159

299   Ev 159

300   Ev 161; Ev 162

301   Ev 162

302   Ev 161

303   Ev 114

304   Wiliam, D., Brown, M., Kerslake, D., Martin, S., & Neill, H. (1999). The transition from GCSE to A-level in mathematics: a preliminary study. Advances in Mathematics Education, 1, 41-56.

305   Q33

306   Q34. This "dramatic shift" may be magnified by the move from a norm-referenced to a criterion-referenced grading system for A-levels in 1987. Norm referencing means that a pre-set percentage of candidates is awarded each grade. Criterion referencing sets standards against declared criteria of performance, so that the number of candidates achieving a grade A, for example, is not limited by predetermined quotas. Our predecessors reported on this subject: HC 153, House of Commons Education and Skills Committee, A Level Standards, Third Report of Session 2002-03, paras 4-8.

307   Q38

308   Qq 40 and 45

309   Written evidence from Institute of Physics, 19:7; written evidence from Campaign for Science and Engineering, paras 22-23; The Mathematical Association, Background Paper

310   Q131; Q132

311   Written evidence from Institute of Physics, 19:9; written evidence from Campaign for Science and Education, 21:3

312   Ev 58

313   Ev 237-238

314   Written evidence from Doug French, University of Hull, para 1.1

315   Ev 48; written evidence from Doug French, University of Hull, para 1.1

316   Ev 58; Ev 79-81; Ev 265-266

317   Written evidence from Heading for Inclusion, Alliance for Inclusive Education, para 1

318   Ev 258-259

319   Written evidence from Association of Science Education, para 18

320   Ev 58; Ev 80; Ev 240

321   Ev 240

322   Ev 48; Ev 55; Ev 74; Ev 103; Ev 116; Ev 230; Ev 240; Ev 259; Q3; Q12; Q138; Q152; Q162; written evidence from Doug French, University of Hull, para 1.1; written evidence from Dr A Gardiner, University of Birmingham, para 9; written evidence from Science Community Partnership Supporting Education, section 3

323   Q88

324   Q93

325   Q100

326   Q109

327   Q329

328   Ev 84-86

329   Ev 84-86

330   Ev 259

331   Ev 208

332   Ev 80-81; further detail at Ev 84-86

333   Ev 258-259

334   Ev 48; Ev 266; Ev 267; Q138; written evidence from Doug French, University of Hull, para 1.1

335   Ev 217-218

336   Ev 266; Ev 267

337   Written evidence from Professor Colin Richards, Annex A

338   Written evidence from Association of Science Education, para 18


 