Select Committee on Children, Schools and Families Third Report


2  THE PURPOSES OF TESTING AND FITNESS FOR PURPOSE

27. David Bell, Permanent Secretary at the DCSF, has set out the Department's view of the key purposes of national tests:

    We want them to provide objective, reliable information about every child and young person's progress. We want them to enable parents to make reliable and informative judgments about the quality of schools and colleges. We want to use them at the national level, both to assist and identify where to put our support, and also, we use them to identify the state of the system and how things are moving. As part of that, both with national tests and public examinations, we are very alive to the need to have in place robust processes and procedures to ensure standards over time.[28]

28. The written evidence of the DfES similarly set out a variety of purposes of testing, stating that National Curriculum testing was developed to complement existing public examinations for the 16+ age group and that it is geared towards "securing valid and reliable data about pupil performance, which is used for accountability, planning, resource allocation, policy development and school improvement".[29]

29. The DfES elaborated on the uses to which data derived from examination results are put. National performance data are used to develop government policy and allocate resources. Local performance data are used for target-setting and to identify "areas of particular under-performance". School performance data form the basis for the findings of inspectors and interventions from School Improvement Partners. Parents make use of school data to make choices about their children's education. The DfES considered that school performance data are an important mechanism for improving school performance and for assisting schools to devise their own improvement strategies. Finally, the DfES stated that examination results for each individual child are "clear and widely-understood measures of progress", which support a personalised approach to teaching and learning and the realisation of each child's potential.[30]

Fitness for purpose

30. In coming to a view on the government's use of test results for this wide variety of purposes, we have been assisted by the QCA's paper, setting out a framework for evaluating assessment systems.[31] The QCA highlights the importance of distinguishing the many purposes for which assessment may be used and gives examples of possible interpretations of the term 'purpose of assessment':

    1.  to generate a particular kind of result, such as ranking pupils in terms of end-of-course level of attainment;

    2.  to enable a particular kind of decision, such as deciding whether a pupil has learned enough of a particular subject to allow them to move on to the next level;

    3.  to bring about a particular kind of educational or social impact, for example, to compel pupils to learn a subject thoroughly and to compel teachers to align their teaching with the National Curriculum; or, in the case of GCSE science, to support progression to a higher level of study for some pupils and to equip all pupils with sufficient scientific literacy to function adequately as 21st century citizens.[32]

31. Clearly, interpretations of the purposes of assessment may be very wide or very narrow, but the important point is that there are a large number of possible purposes. The QCA asks us to consider the uses to which assessment results are put (interpretation 2 above) and distinguishes the four uses set out in the classification scheme established by the Task Group on Assessment and Testing in its 1988 report[33], which are described by the QCA in the following manner:

  • formative uses (assessment for learning);
  • summative uses (assessment of learning);
  • evaluative uses (assessment for accountability); and
  • diagnostic uses (assessment for special intervention).[34]

32. This classification scheme has been used widely in evidence submitted to this inquiry and we, likewise, rely on it extensively in our Report. It should be noted that these categories are not necessarily discrete, and the QCA notes many examples of uses to which the results of the national testing system are put which may fall under more than one of the headings of the broad, four-limb classification. The QCA's non-exhaustive list of examples, reproduced at Figure 1, sets out 22 possible uses of assessment results.

Figure 1: Some examples of the uses to which assessment results can be put[35]

[The figure, listing 22 example uses of assessment results, is not reproduced here.]

Source: QCA.

33. Each one of these possible uses of assessment results can, in itself, be seen as a purpose of assessment, depending on the context. Where an assessment instrument is designed and used only for one purpose, the answer to the question "is it fit for purpose" is the result of a relatively straightforward process of evaluation. However, the government's evidence, set out in paragraphs 27-29 above, highlights the fact that national tests are used for a wide variety of purposes at a number of different levels: national, local, school and individual.

34. Each instrument of assessment is (or should be) designed for a specific purpose or related purposes. It will only be fit (or as fit as a test instrument can be) for those purposes for which it is designed. The instrument will not necessarily be fit for any other purposes for which it may be used and, if it is relied upon for these other purposes, then this should be done in the knowledge that the inferences and conclusions drawn may be less justified than inferences and conclusions drawn from an assessment instrument specifically designed for those purposes.[36]

35. The DfES recognised that an assessment system inevitably makes trade-offs between purposes, validity, reliability and manageability. However, the evidence from the DfES and the DCSF has been consistent: the current testing system "equips us with the best data possible to support our education system".[37] David Bell, Permanent Secretary at the DCSF, told us that:

    I think that our tests give a good measure of attainment and the progress that children or young people have made to get to a particular point. It does not seem to be incompatible with that to then aggregate up the performance levels to give a picture of how well the school is doing. Parents can use that information, and it does not seem to be too difficult to say that, on the basis of those school-level results, we get a picture of what is happening across the country as a whole. While I hear the argument that is often put about multiple purposes of testing and assessment, I do not think that it is problematic to expect tests and assessments to do different things.[38]

36. Dr Ken Boston of the QCA told us that the current Key Stage tests were fit for the purpose for which they were designed, that is, "for cohort testing in reading, writing, maths and science for our children at two points in their careers and for reporting on the levels of achievement".[39] The primary purpose of Key Stage tests was "to decide the level that a child has reached at the end of a Key Stage".[40] He explained that Key Stage tests are developed over two and a quarter years, that they are pre-tested and run through teacher panels twice and that the marking scheme is developed over a period of time. He considered that Key Stage tests are as good as they can be and entirely fit for their design purpose.[41] Dr Boston noted, however, that issues arise when a test which is fit for one purpose is then used for other purposes. Figure 1 above lists 22 purposes currently served by assessments and, of those, 14 are served by Key Stage tests.

    My judgment is that, given that there are so many legitimate purposes of testing, and [Figure 1 above] lists 22, it would be absurd to have 22 different sorts of tests in our schools. However, one serving 14 purposes is stretching it too far. Three or four serving three or four purposes each might get the tests closer to what they were designed to do. […] when you put all of these functions on one test, there is the risk that you do not perform any of those functions as perfectly as you might. What we need to do is not to batten on a whole lot of functions to a test, but restrict it to three or four prime functions that we believe are capable of delivering well.[42]

37. Similarly, Hargreaves et al argue that one test instrument cannot serve all the Government's stated purposes of testing because they conflict to a certain extent, so that some must be prioritised over others. According to them, the purpose of assessment for learning has been neglected in favour of the other stated purposes whereas, in their view, it should have priority.[43] The conflicts between the different purposes are not, perhaps, inherent, but arise because of the manner in which people change their behaviour when high stakes are attached to the outcomes of the tests. Many others have raised similar points, claiming that two purposes in particular, school accountability on the one hand and promoting learning and pupil progress on the other, are often incompatible within the present testing system.[44] The practical effects of this phenomenon will be discussed further in Chapter 4. However, we have been struck by the depth of feeling on this subject, particularly from teachers.

38. The GTC (General Teaching Council for England) argues that reliance on a single assessment instrument for too many purposes compromises the reliability and validity of the information obtained. It claims that the testing system creates tensions that "have had a negative impact upon the nature and quality of the education" received by some pupils. It concludes that "These tensions may impede the full realisation of new approaches to education, including more personalised learning".[45]

39. The NUT (National Union of Teachers) stated that successive governments have ignored the teaching profession's concerns about the impact of National Curriculum testing on teaching and learning and it believes that this is "an indictment of Government attitudes to teachers' professional judgment".[46] The NUT argues further that:

    It is the steadfast refusal of the Government to engage with the evidence internationally about the impact of the use of summative test results for institutional evaluation which is so infuriating to the teaching profession.[47]

40. An NUT study, published in 2003, found that the use of test results for the purpose of school accountability had damaging effects on teachers and pupils alike. Teachers felt that the effect was to narrow the curriculum and distort the education experience of pupils. They thought that the "excessive time, workload and stress for children was not justified by the accuracy of the test results on individuals".[48]

41. Others have argued that the use of national testing for the twin aims of pupil learning and school accountability has had damaging effects on children's education experience. Hampshire County Council accepts that tests are valuable in ascertaining pupil achievement but is concerned that their increasingly extensive use for the purposes of accountability "has now become a distraction for teachers, headteachers and governing bodies in their core purpose of educating pupils".[49] The Council continues:

    Schools readily acknowledge the need to monitor pupil progress, provide regular information to parents and use assessment information evaluatively for school improvement. The key issue now is how to balance the need for accountability with the urgent need to develop a fairer and more humane assessment system that genuinely supports good learning and teaching.[50]

42. It is not a necessary corollary of national testing that schools should narrow the curriculum or allow the tests to dominate the learning experience of children, yet, despite evidence that this does not happen in all schools, there was very wide concern that it is common. We return to these concerns in Chapter 4.

43. The NUT highlighted evidence which suggests that teachers feel strongly that test results do not accurately reflect the achievements of either pupils or a school.[51] The NAHT (National Association of Head Teachers) considers that Key Stage tests provide one source of helpful performance data for both students and teachers, but that it is hazardous to draw too many conclusions from those data alone. It argues that "A teacher's professional knowledge of the pupil is vital—statistics are no substitute for professional judgment".[52] On the subject of school performance, the NAHT states that Key Stage test results represent only one measure of performance amongst a wide range, from financial benchmarking through to full Ofsted inspections. It considers that self-evaluation, taken with other professional educational data, "is far more reliable than the one-dimensional picture which is offered by the SATs".[53] The Association of Colleges stated that performance tables constructed from examination results data do not adequately reflect the actual work of a school and that the emphasis on performance tables risks shifting the focus of schools from individual need towards performance table results.[54]

44. The evidence we have received strongly favours the view that national tests do not serve all of the purposes for which they are, in fact, used. The fact that the results of these tests are used for so many purposes, with high stakes attached to the outcomes, creates tensions in the system leading to undesirable consequences, including distortion of the education experience of many children. In addition, the data derived from the testing system do not necessarily provide an accurate or complete picture of the performance of schools and teachers, yet they are relied upon by the Government, the QCA and Ofsted to make important decisions affecting the education system in general and individual schools, teachers and pupils in particular. In short, we consider that the current national testing system is being applied to serve too many purposes.

Validity and reliability

45. If the testing system is to be fit for purpose, it must also be valid and reliable.[55] City and Guilds, an Awarding Body accredited by the QCA, has told us:

    […] there is considerable obligation on the designer of tests or assessments to make them as efficient and meaningful as possible. Assessment opportunities should be seen as rare events during which the assessment tool must be finely tuned, accurate and incisive. To conduct a test that is inaccurate, excessive, unreliable or inappropriate is unpardonable.[56]

46. Although there is no consensus in the evidence on the precise meanings of the terms 'validity' and 'reliability', we have had to come to a working definition for our own purposes. 'Validity' is at the heart of this inquiry and we take it to refer to an overall judgment of the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment. This judgment is based on the premise that the tests in fact measure what it is claimed that they measure or, as the NFER (National Foundation for Educational Research) puts it, "the validation of a test consists of a systematic investigation of the claims being made for it".[57]

47. Our definition of validity is deliberately broad because it includes the concept of reliability: an assessment system cannot be valid without being reliable. 'Reliability' we define as the ability of an assessment to produce the same outcome for learners who reach the same level of performance.

VALIDITY

48. If a valid test is defined as one that actually measures what it is claimed it measures, the NFER considered that Key Stage tests would be valid if they "give an accurate and useful indication of students' English, science or mathematical attainment in terms of National Curriculum levels".[58] The NFER made the following assessment of the coverage of the total curriculum:

    The tests do have limited coverage of the total curriculum: the English tests omit Speaking and Listening, the science tests formally omit the attainment target dealing with scientific enquiry (though questions utilising aspects of this are included) and mathematics formally omits using and applying mathematics. Outside of these the coverage of content is good. The fact that the tests change each year means that the content is varied and differing aspects occur each year.[59]

The NFER stated that the current tests adequately serve the accountability purposes of testing. They may, however, be less successful in meeting the standards of validity necessary for the purpose of national monitoring, although the NFER believed that the tests are as good as they can be for this purpose. The NFER said that, in principle, if there were to be an assessment system with the sole purpose of national monitoring of standards using comparable measures, then a low-stakes, lightly-sampled survey would probably be the most valid form of assessment.

49. The validity of the current testing system has elsewhere been repeatedly challenged in the evidence to this inquiry. Whilst asserting that the Key Stage tests are fit for purpose, the QCA has acknowledged that:

    Like any tests, however well designed, they can measure only a relatively narrow range of achievement in certain subjects on a single occasion and they cannot adequately cover some key aspects of learning.[60]

50. Many witnesses are less content than the NFER with coverage of the National Curriculum and have challenged the validity of national tests on the grounds that they test only a narrow part of the set curriculum and a narrow range of a pupil's wider skills and achievements.[61] It is also argued that existing tests measure recall rather than knowledge[62] and neglect skills which cannot easily be examined by means of an externally-marked, written assessment.[63] Furthermore, to enhance the ability of pupils to recall relevant knowledge in an examination, thereby improving test scores, teachers resort to coaching, or 'teaching to the test',[64] and to teaching only that part of the curriculum which is likely to be tested in an examination.[65] Although the Government does not intend it, it is undeniable that the high stakes associated with achieving test benchmarks have led schools and teachers to deploy inappropriate methods to maximise the achievement of benchmarks. This is examined in Chapter 4. For now, we note that these phenomena affect the validity of the examination system as a whole, not just test instruments in particular, because the education experience of a child is arguably directly affected by the desire of some teachers and schools to enhance their pupils' test results at the expense of a more rounded education.

RELIABILITY

51. Professors Black, Gardner and Wiliam argued that the reliability of national tests and testing systems is limited; that the results of such systems are misused; and that the effects of such misuse would be reduced if test developers were required to inform the public of the margins of error inherent in these testing systems. They stressed that limited reliability of testing systems is systemic and inevitable and does not imply lack of competence or professionalism on the part of test developers.[66] The results of any assessment system are subject to measurement error because they are based on a limited sample of a candidate's attainment. In order that the testing system should be manageable and affordable, only a limited number of questions can be set, to be answered in a limited time and on a given day. Variations in results for a given candidate will arise out of the particular topics and skills tested in the particular test instrument and out of the performance of the candidate on the day. Other evidence has suggested that children aged ten or eleven exhibit increased tension and stress when facing a week of examinations in which they are expected to demonstrate "the full extent of their learning from seven years of education".[67] This may affect examination performance. Black et al stated that the 'true score' of a candidate can never be known because it is practically impossible to test more than a limited sample of his or her abilities.[68] Indeed, their evidence was that up to 30% of candidates in any public examination in the UK will receive the wrong level or grade, a statistical estimate which has also been quoted by others in evidence.[69] Dr Boston of the QCA accepted that error in the system exists, but said he was surprised by a figure as high as 30%.[70] Jon Coles, Director of 14-19 Reform at the DCSF, told us that:

    […] I simply do not accept that there is anything approaching that degree of error in the grading of qualifications, such as GCSEs and A-levels. The OECD has examined the matter at some length and has concluded that we have the most carefully and appropriately regulated exam system in the world.[71]

    […] I can say to you without a shadow of a doubt—I am absolutely convinced—that there is nothing like a 30% error rate in GCSEs and A-levels.[72]

52. We suspect that the strength of this denial stemmed from a misunderstanding of the argument made by Black et al. Their argument assumes that tests are competently developed and that marking errors are minimal.[73] The inherent unreliability of the tests stems from the limited knowledge and skills tested by the assessment instrument and from variations in individuals' performance on the day of the test.[74] This does not impugn the work of the regulator or the test development agencies, and very little can be done to enhance reliability whilst maintaining a manageable and affordable system. The NFER gave similar evidence that the current Key Stage tests:

    […] have good to high levels of internal consistency (a measure of reliability) and parallel form reliability (the correlation between two tests). Some aspects are less reliable, such as the marking of writing, where there are many appeals/reviews. However, even here the levels of marker reliability are as high as those achieved in any other written tests where extended writing is judged by human (or computer) grades. The reliability of the writing tests could be increased but only by reducing their validity. This type of trade off is common in assessment systems with validity, reliability and manageability all in tension.[75]
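To illustrate how this kind of inherent unreliability translates into wrongly awarded levels, the following sketch simulates a cohort sitting a test once. All of the figures in it (the spread of attainment, the size of the error on the day and the position of the level boundaries) are assumed for the purpose of illustration and are not drawn from the evidence; they correspond to a test whose reliability is about 0.85, a level generally regarded as respectable for a written test.

    # Illustrative sketch only: how measurement error near level boundaries can
    # misclassify pupils even when a test is competently developed and marked.
    # All figures are assumed for illustration and are not taken from the evidence.
    import random

    random.seed(1)

    N_PUPILS = 100_000
    TRUE_SD = 10.0               # assumed spread of pupils' 'true scores'
    ERROR_SD = 4.0               # assumed error in any one sitting of the test
    BOUNDARIES = [40, 50, 60]    # assumed marks dividing four levels

    def level(mark):
        """Return the level (0-3) implied by a mark."""
        return sum(mark >= b for b in BOUNDARIES)

    misclassified = 0
    for _ in range(N_PUPILS):
        true_score = random.gauss(50, TRUE_SD)              # unknowable 'true score'
        observed = true_score + random.gauss(0, ERROR_SD)   # mark actually awarded
        if level(observed) != level(true_score):
            misclassified += 1

    print(f"Pupils given a level other than their 'true' level: "
          f"{misclassified / N_PUPILS:.0%}")
    # With these assumed figures roughly 25-30% of pupils receive a level other
    # than their 'true' level, despite a reliability of about 0.85 (100 / (100 + 16)),
    # and without any incompetence in test development or marking.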

53. Black et al suggest that reliability could, in theory, be enhanced in a number of ways:

  • Narrowing the range of question types, topics and skills tested; but the result would be less valid and potentially misleading, in the sense that users of that information would have only a very limited estimate of the candidates' attainments.
  • Increasing the testing time to augment the sample of topics and skills tested; however, reliability increases only marginally with test length[76] (an illustrative calculation of this diminishing return appears after this list). For example, to reduce the proportion of pupils wrongly classified in a Key Stage 2 test to within 10%, it is estimated that 30 hours of testing would be required. (The NFER expressed the view that the present tests provide as reliable a measurement of individuals as is possible in a limited amount of testing time.[77])
  • Collating and using information that teachers have about their pupils. Teachers have evidence of performance on a range of tasks, in many different topics and skills and on many different occasions.
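The second of these points, that reliability rises only slowly as testing time is increased, can be illustrated with the standard Spearman-Brown relationship, which estimates the reliability of a lengthened test from that of the existing one. The starting figure of 0.85 below is assumed for the sketch and is not drawn from the evidence.

    # Illustrative sketch only: diminishing returns from lengthening a test,
    # using the standard Spearman-Brown relationship. The starting reliability
    # is assumed, not taken from the evidence.
    def lengthened_reliability(r, k):
        """Estimated reliability of a test made k times longer, given reliability r."""
        return k * r / (1 + (k - 1) * r)

    current = 0.85   # assumed reliability of the existing test
    for k in (1, 2, 4, 10, 30):
        print(f"test {k:>2} times longer -> estimated reliability "
              f"{lengthened_reliability(current, k):.3f}")
    # The estimate rises from 0.850 to about 0.919, 0.958, 0.983 and 0.994: each
    # increase in testing time buys a progressively smaller gain, which is why
    # very long tests would be needed to reduce misclassification substantially.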

54. Black et al conclude this part of their argument by stating that, when results for a group of pupils are aggregated, the result for the group will be closer to the 'true score' because random errors for individuals—which may result in either higher or lower scores than their individual 'true score'—will average out to a certain extent.[78] The NFER went further, stating that aggregated results over large groups such as reasonably large classes and schools give an "extremely high" level of reliability at the school level.[79] Nevertheless, Black et al argue that not enough is known about the margins of error in the national testing system. Professor Black wrote to the QCA to enquire whether there was any research on reliability of the tests which it develops:

    The reply was that "there is little research into this aspect of the examining process", and [the QCA] drew attention only to the use of borderline reviews and to the reviews arising from the appeals system. We cannot see how these procedures can be of defensible scope if the range of the probable error is not known, and the evidence suggests that if it were known the volume of reviews needed would be insupportable.[80]
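The effect of aggregation described at the start of paragraph 54 can be sketched in the same way. The error figure below is assumed, as before, and is not drawn from the evidence.

    # Illustrative sketch only: random errors for individual pupils largely cancel
    # when results are aggregated over a class or school. The error figure is
    # assumed, as in the earlier sketch, and is not taken from the evidence.
    import random

    random.seed(2)
    ERROR_SD = 4.0    # assumed measurement error for a single pupil's mark

    for group_size in (1, 30, 200):
        simulated = []
        for _ in range(10_000):
            errors = [random.gauss(0, ERROR_SD) for _ in range(group_size)]
            simulated.append(sum(errors) / group_size)   # error in the group average
        typical = (sum(e * e for e in simulated) / len(simulated)) ** 0.5
        print(f"group of {group_size:>3} pupils: typical error in the average mark "
              f"is about {typical:.2f} marks")
    # The typical error falls roughly with the square root of the group size
    # (about 4.0, 0.7 and 0.3 marks here), which is why aggregated school-level
    # figures are much more reliable than any individual pupil's result.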

55. Black et al go on to argue that it is profoundly unsatisfactory that a measure of the error inherent in our testing system is not available, since important decisions are made on the basis of test results, decisions which will be ill-judged if it is assumed that these measures are without error. In particular, they argue that current policy is based on the idea that test results are reliable and teachers' assessments are unreliable. They consider that reliability could, in fact, be considerably enhanced by combining the two effectively and that work leading in this direction should be prioritised.[81] Black et al conclude that:

    […] the above is not an argument against the use of formal tests. It is an argument that they should be used with understanding of their limitations, an understanding which would both inform their appropriate role in an overall policy for assessment, and which would ensure that those using the results may do so with well-informed judgement.[82]

56. Some witnesses have emphasised what they see as a tension between validity and consistency in results. The argument is that, over time, national tests have been narrowed in scope and marking schemes specified in an extremely detailed manner in order to maximise the consistency of the tests. In other words, candidates displaying the same level of achievement in the test are more likely to be awarded the same grade since there is less room for the discretion of the examiner. However, it is argued further that this comes at the expense of validity, in the sense that the scope of the tests is narrowed so much that they test very little of either the curriculum or the candidate's wider skills.[83] Sue Hackman, Chief Adviser on School Standards at the DCSF, recognised this trade-off. However, she also told us that in relation to Key Stage tests the Department, together with the QCA, has tried to include a range of questions in test papers, some very narrow and others rather wider. In this way, she considered that a compromise has been reached between "atomistic and reliable questions, and wide questions that allow pupils with flair and ability to show what they can do more widely".[84]

57. Many witnesses have called for greater emphasis on teacher assessment in order to enhance both the validity and the reliability of the testing system.[85] A move towards a better balance between regular, formative teacher assessment and summative assessments—the latter drawn from a national bank of tests, to be externally moderated—would provide a more rounded view of children's achievements, and many have criticised the reliance on a 'snapshot' examination at a single point in time.[86]

58. We consider that the over-emphasis on the importance of national tests, which address only a limited part of the National Curriculum and a limited range of children's skills and knowledge, has resulted in teachers narrowing their focus. Teachers who feel compelled to focus on that part of the curriculum which is likely to be tested may feel less able to use the full range of their creative abilities in the classroom and find it more difficult to explore the curriculum in an interesting and motivational way. We are concerned that the professional abilities of teachers are, therefore, under-used and that some children may suffer as a result of a limited educational diet focussed on testing. We feel that teacher assessment should form a significant part of a national assessment regime. As the Chartered Institute of Educational Assessors states, "A system of external testing alone is not ideal and government's recent policy initiatives in progress checks and diplomas have made some move towards addressing an imbalance between external testing and internal judgements made by those closest to the students, i.e. the teachers, in line with other European countries".[87]

Information for the public

59. The National Foundation for Educational Research stated that no changes should be made to the national testing system without a clear statement of the purposes of that system in order of priority. The level of requirements for validity and reliability should be elucidated and it should be made clear how these requirements would be balanced against the need for manageability and cost-effectiveness.[88] The NFER commented that Key Stage testing in particular:

    […] is now a complex system, which has developed many different purposes over the years and now meets each to a greater or lesser extent. It is a tenet of current government policy that accountability is a necessary part of publicly provided systems. We accept that accountability must be available within the education system and that the assessment system should provide it. However, the levels of accountability and the information to be provided are open to considerable variation of opinion. It is often the view taken of these issues which determines the nature of the assessment system advocated, rather than the technical quality of the assessments themselves.[89]

60. Cambridge Assessment criticised agencies, departments and Government for exaggerating the technical rigour of national assessment. It continued:

    […] any attempts to more accurately describe its technical character run the risk of undermining both the departments and ministers; '[…] if you're saying this now, how is it that you said that, two years ago […]'. This prevents rational debate of problems and scientifically-founded development of arrangements.[90]

Cambridge Assessment stated further that international best practice dictates that information on the measurement error intrinsic to any testing system should be published alongside test data, and argued that this best practice should be adopted by the Government.[91] Professor Peter Tymms of Durham University similarly argued that:

    […] it would certainly be worth trying providing more information. I think that the Royal Statistical Society's recommendation not to give out numbers unless we include the uncertainties around them is a very proper thing to do, but it is probably a bit late.[92]
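What such a presentation might look like can be sketched very simply. The school, its cohort size and its results below are invented for the purpose of illustration, and only simple sampling uncertainty is shown; the measurement error discussed earlier in this chapter would widen the range further.

    # Illustrative sketch only: reporting a headline figure together with an
    # indication of its uncertainty. The school and its results are invented,
    # and only simple sampling uncertainty is shown; test measurement error
    # would widen the range further.
    import math

    cohort = 30              # assumed size of one school's Key Stage 2 cohort
    at_expected_level = 24   # assumed number reaching the expected level

    p = at_expected_level / cohort
    standard_error = math.sqrt(p * (1 - p) / cohort)
    low = max(0.0, p - 1.96 * standard_error)
    high = min(1.0, p + 1.96 * standard_error)

    print(f"Headline figure: {p:.0%} of pupils at the expected level")
    print(f"Plausible range: roughly {low:.0%} to {high:.0%}")
    # For a cohort of 30, a headline figure of 80% is consistent with anything
    # from the mid-60s to the mid-90s in percentage terms, which is the kind of
    # uncertainty the witnesses above suggest should be published alongside results.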

61. We are concerned about the Government's stance on the merits of the current testing system. We remain unconvinced by the Government's assumption that one set of national tests can serve a range of purposes at the national, local, institutional and individual levels. We recommend that the Government sets out clearly the purposes of national testing in order of priority and, for each purpose, gives an accurate assessment of the fitness of the relevant test instrument for that purpose, taking into account the issues of validity and reliability.

62. We recommend further that estimates of statistical measurement error be published alongside test data and statistics derived from those data to allow users of that information to interpret it in a more informed manner. We urge the Government to consider further the evidence of Dr Ken Boston, that multiple test instruments, each serving fewer purposes, would be a more valid approach to national testing.


28   Q287
29   Ev 157
30   Ev 158-159
31   Ev 21
32   Ev 23
33   DES (1988), Task Group on Assessment and Testing: A Report, London, HMSO
34   Ev 23
35   Ev 24
36   Ev 24-25
37   Ev 157
38   Q290
39   Q79
40   Q84
41   Q79; see also Ev 31
42   Q79
43   "System Redesign-2: assessment redesign", David Hargreaves, Chris Gerry and Tim Oates, November 2007, pp 28-29
44   Ev 261; Ev 264; Ev 198; Ev 273; Ev 75; Ev 47; Q134; Q237; written evidence from Association of Science Education, paras 5-6; written evidence from The Mathematical Association, under headings "General Issues" and "National Key Stage Tests"
45   Ev 75
46   Ev 261
47   Ev 264
48   Ev 263
49   Ev 272
50   Ev 273
51   Ev 263
52   Ev 68
53   Ev 69
54   Ev 198
55   Ev 233
56   Ev 110-111
57   Ev 257
58   Ev 257
59   Ev 257-258
60   Ev 32
61   Ev 56; Ev 71; Q128; written evidence from the Advisory Committee on Mathematics Education, paras 18-20; written evidence from Association for Achievement and Improvement through Assessment, para 4
62   Ev 263; Ev 269; written evidence from Barbara J Cook, Headteacher, Guillemont Junior School, Farnborough, Hants
63   Ev 238; Ev 239; written evidence from Doug French, University of Hull, para 2.2; written evidence from Association for Achievement and Improvement through Assessment, para 4
64   Ev 75; Ev 56; written evidence from Doug French, University of Hull, para 1.3
65   Ev 60; Ev 75; Ev 232; Q139; written evidence from Doug French, University of Hull, para 1.3
66   Ev 202-203
67   Written evidence from Heading for Inclusion, Alliance for Inclusive Education, para 2(a)
68   Ev 203
69   Ev 61; Ev 75; Ev 221-222; Ev 226; Q128
70   Q83
71   Q297
72   Q298
73   Ev 202-203
74   Ev 203
75   Ev 257
76   See also Ev 236
77   Ev 257
78   Ev 203-204; see also Ev 226
79   Ev 257
80   Ev 204
81   Ev 204-205
82   Ev 205
83   Ev 226
84   Q324
85   Ev 112; Ev 204; Ev 205; Ev 223; Ev 239; Ev 271; written evidence from the Association for Achievement and Improvement through Assessment, para 5
86   Ev 49; Ev 68; Ev 75; Ev 112; Ev 223; Ev 225; Ev 271; written evidence from Purbrook Junior School, Waterlooville, para 5; written evidence from Association for Achievement and Improvement through Assessment, paras 4-5
87   Ev 222
88   Ev 251
89   Ev 251
90   Ev 251
91   Ev 214-215
92   Q19


 

© Parliamentary copyright 2008
Prepared 13 May 2008