Memorandum submitted by the Advisory Committee on Mathematics Education (ACME)

 

Summary of key points

 

1. The continual testing and practising for tests has resulted in a narrow and impoverished mathematics curriculum, and poor quality teaching of that curriculum. This seems likely to explain the failure to raise real standards and the reluctance of students to continue with mathematics.

2. Many post-14 mathematics examinations are not fit for purpose; questions are too fragmented and procedural.

3. There is too much external assessment both pre- and post-14 in mathematics. This should be replaced as much as possible by teacher-led assessment, where necessary moderated by centrally set tasks and local agreement of standards.

4. In planning changes to qualifications, national assessment and associated curricula, QCA needs to work much more closely with the relevant subject community from an early stage, rather than conducting superficial consultations when it is too late to avoid problems within a particular subject.

5. QCA must either become a much more effective regulator of awarding bodies, seeking advice from subject communities on specific curriculum and assessment issues, or awarding bodies must be reduced to bidding for the administration and/or assessment of specific qualifications to a nationally agreed design.

6. ACME is concerned that DfES's 'Making Good Progress' proposals could lead to even more pressure on teachers and students to 'teach and learn to the test' unless handled carefully and that QCA's proposed changes to the Secondary Curriculum Mathematics Programme of Study cannot effectively be judged without knowing more about how they will impact on assessment at KS3 and KS4 (and vice versa).

 

What is ACME?

1. The Advisory Committee on Mathematics Education (ACME) is an independent committee, based at the Royal Society and operating under its auspices, which acts as a single voice for the mathematical community on mathematics education issues, seeking to improve the quality of such education in schools and colleges. It advises Government on issues such as the curriculum, assessment and the supply and training of mathematics teachers. ACME was established by the Joint Mathematical Council of the UK and the Royal Society, with the explicit support of all major mathematics organisations, and is funded by the Gatsby Charitable Foundation.

 

 

The ACME report: Assessment in 14-19 Mathematics

2. ACME is delighted to have the opportunity to respond to this call. We published a report[1] on assessment in 14-19 mathematics in 2005, because of a concern that the pre-eminence of testing and assessment in this age group was having a seriously negative effect on the teaching of mathematics. In many cases it reduces that teaching to little more than sequences of lessons on test preparation, constituting fragmented teaching aimed at the answering of short test questions. This concern was confirmed by Making Mathematics Count[2] and by a recent Ofsted report[3] on mathematics teaching 14-19.

3. Apart from doing assessed coursework, which will be abandoned in mathematics GCSE from 2008, in many schools and colleges students rarely get the opportunity to engage with longer problems that require a number of steps and decisions, either set in applied contexts or as pure mathematics. The result is that students find it difficult to apply mathematics they have only practised for examination purposes, and both students and teachers have grown weary of an unchallenging diet. However, many teachers fear to vary it because of constant pressure from headteachers and principals to improve results, especially in low performing schools and colleges. This pressure arises both from publication of annual league tables and the use of these by Ofsted as key inspection evidence. There is some evidence that pressure to increase standards and the type of teaching it generates are a factor contributing to both the departure of mathematics teachers from the profession[4] and the decision of students to stop their study of mathematics at 16[5]. There has been a reluctance on the part of the Government to appreciate that its commendable targets for increased numbers of students studying GCE mathematics (A and AS level), and increasing the percentage of lessons taught by qualified mathematics teachers, will therefore be jeopardised by the assessment regime.

4. The ACME assessment report recommended that:

4.1. The Government should reduce the overall volume and frequency of external assessment in mathematics.

 

4.2. Assessment mechanisms in mathematics should reflect the goals of different 14-19 'pathways' to a qualification. There should be no expectation that the mechanisms adopted for one pathway should be applied to other pathways, nor that mechanisms used in mathematics need coincide with those in other subjects. In particular, standardised modular structures and the use of coursework assessment should be used only as appropriate.

 

4.3. New assessment regimes in mathematics should always be trialled, and the results should be in the public domain. Before wider adoption, they should be analysed carefully to ensure that the changes will achieve the desired aims.

 

4.4. The revision of Application of Number assessment should be more compatible with the learning style and aspirations of the students it is assessing.

 

4.5. Where duplication exists in the assessment of 14-19 mathematics, the Government should rationalise assessment by the different Awarding Bodies.

 

4.6. The Government and its agencies should encourage more development work and research to examine ways in which the appropriate use of computers and calculators in 14-19 mathematics can be assessed.

 

4.7. More research is needed on the use of computer-based assessment in 14-19 mathematics.

 

4.8. Formative assessment in 14-19 mathematics should be strongly promoted, particularly through the initial training and the professional development of teachers of mathematics.

 

4.9. High-quality support material on formative assessment in 14-19 mathematics should be readily available and widely promoted.

 

5. Although the ACME report was focused on the 14-19 age group, it is clear that many of the issues are equally relevant and important to other age groups.

6. In the answers given below to specific questions raised by the Education and Skills Committee it should be appreciated that ACME speaks as a Committee of teachers, lecturers, advisers and educationists who specialise in mathematics. While many points will be common to other subjects we feel that the negative effect of assessment is more acute in mathematics both because of its status in GCSE league tables, as a core and gatekeeper subject, and because many find it a 'difficult' subject to learn. Thus superficial test coaching presents itself as a short cut which avoids the need for the deeper learning and understanding that can only be acquired over time and with a wide variety of experiences.

7. ACME would commend to the Committee the recent report from IPPR[6] which covers some of the same ground as this inquiry, and which we have found to be a balanced and well-informed account with sensible recommendations for change, all of which we would endorse.

 

General issues

Why do we have a centrally run system of testing and assessment?

8. Some purposes of assessment are listed below from the ACME report.

 

(a) Key stakeholders, including employers and universities, need authoritative information about the level of skills and knowledge of applicants.

(b) Students need to know their level of achievement and to acquire qualifications.

(c) Parents need to know how their children are progressing against national standards.

(d) Assessment can guide and motivate learning for both students and teachers.

(e) The Government and its agencies use external assessment results to judge the performance of schools and colleges.

(f) National assessments provide some measure of standards over time.

 

9. All but the fourth of these purposes require a centrally run system, in the sense that there is some common structure and assessment standardisation, but the intrusion of that system into the student experience of education should vary at different ages. A good principle is that it should be as light as possible while remaining compatible with satisfying the purposes above. For example, many parents only need to know broadly how their child is progressing.

What other systems of assessment are in place both internationally and across the UK?

10. ACME does not have detailed knowledge of assessment systems across the UK and internationally but would point out that those countries like Finland and the Netherlands with the highest mathematical standards in international comparisons such as TIMSS and PISA have much less expensive and intrusive assessment regimes, and no league tables. Thus it is not clear that our current system encourages high standards in learning mathematics.

Does a focus on national testing and assessment reduce the scope for creativity in the curriculum?

11. A series of Ofsted reports has shown both that the focus on assessed core subjects at the primary stage has reduced the time spent on more creative subjects, and that creative aspects of mathematics are rare in classrooms, because they are not assessed by external tests[7].

Who is the QCA accountable to and is this accountability effective?

12. While not attempting to answer the whole question, we believe that one set of communities to which the QCA should be responsive is the subject communities. ACME has had some problems interacting with the QCA, at times on behalf of and at other times collectively with the mathematics subject community. QCA has sometimes proved secretive and reluctant to consult, and when consultation happens it is often too late to significantly affect the outcome. The subject community needs to be brought in at the start of a new development to allow genuinely collaborative working. In the past this reluctance to consult has led to serious problems, for example over mathematics in Curriculum 2000, which could have been avoided by earlier and wider discussion with more representative groups. More recent difficulties have been over the development of functional mathematics standards and assessment, and the mathematics in the new diplomas, where those in QCA who are developing these have been reluctant to involve the mathematics community until too late. QCA does not seem to appreciate that such community involvement can have two benefits: it can greatly enhance the quality of a new development; and it can gain the support of the community, an essential asset in managing change. Part of the problem has been due to recent instabilities in QCA's mathematics team, although the difficulties go much further back.

13. A second aspect of this has been weaknesses over the regulation of awarding bodies, at different levels. For example QCA should not have allowed some subjects to be demonstrably more difficult than others in GCSE and GCE; one reason why students do not continue with mathematics is the likelihood of receiving lower grades than they would for other subjects. QCA has also allowed awarding bodies to conduct examinations which do not serve the needs of either students or mathematics. The community again needs to be fully involved in the scrutiny of assessments, and the rejection of unsuitable examinations, just as it is much more successfully involved in the scrutiny of national tests. (This is easier as national test items go through two stages of thorough trialling before they are used; a practice that should also be adopted by awarding bodies.)

What role should exam boards have in testing and assessment?

14. We do not believe that the current post-14 system of 3 competing examination boards/awarding bodies in England and a weak regulator (QCA) has served mathematics education well. The awarding bodies have competed with each other to offer examinations and associated textbooks which are attractive to teachers as they are easy to teach to, and easier for students to pass. They thus lack challenge and a wide coverage of aspects of the curriculum. There has been a reluctance to work with teachers to invest in innovative specifications and forms of assessment.

15. There are three possible ways forward:

(a) to continue with three awarding bodies but to strengthen the regulator and make awarding bodies and QCA more responsive to the subject communities;

(b) to have a central organisation (NAA?) which requests bids from awarding bodies and others for the design and administration of post-14 national examinations to tight specifications, in the same way as national tests are currently organised;

(c) to have a single national awarding body, as in other countries and other parts of the UK.

16. We generally but not unanimously support option (b), which currently seems to work well in mathematics in national tests, which are of much higher quality than GCSE. There are great advantages in having a single assessment for each qualification - it avoids comparability questions, and competition which tends to dumb down examinations and lead to grade inflation, and a monopoly over textbooks. Greater control and a larger candidature mean that e.g. pre-testing and greater scrutiny could be required. While there are dangers in having no choice we tend to think this the lesser of the evils.

17. We further believe that there is a national shortage of expertise in setting good mathematics assessments, so it makes sense for this expertise to be pooled rather than split between different bodies. (We note for example that at present 9 different consortia are each trialling their own version of functional mathematics assessments; it cannot be sensible to divide up assessment expertise in this way or to offer schools and colleges such a wide choice.)

National Key Stage Tests: The current situation

How effective are the current Key Stage Tests?

Do they adequately reflect levels of performance of children and schools, and changes in performance over time?

18. As noted above, the key stage tests in mathematics are of reasonably high quality. However by their nature they cannot in their current form test the whole of the mathematics national curriculum e.g. use of computers to solve problems, more extended problems and investigations, etc. They are therefore lacking some aspects of validity. There are some other sources of lack of validity or reliability e.g. students with English as a second language or who have problems reading find it difficult to access the questions; last minute coaching may allow students to succeed who would not be able to do so two weeks later. It has been estimated that 30-45% of pupils may therefore be assigned to the wrong level[8], for these reasons and due to other sources of lack of reliability or validity. Thus national tests are not very effective in providing reliable results for individual pupils.

19. At the school level there must also be some unreliability due to small numbers of pupils and variations in the pupils from year to year, and to different degrees of coaching. For this reason, secondary teachers rarely rely on KS2 national test levels for e.g. setting students.
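The effect of cohort size alone can be illustrated with a minimal sketch. It assumes a simple binomial model in which each pupil independently reaches the target level with the same underlying probability; the probability of 0.8 and the cohort sizes of 30 and 200 are hypothetical figures chosen for illustration and are not taken from any report cited here.

```python
import math


def standard_error(p: float, cohort_size: int) -> float:
    """Standard error of an observed proportion under a simple binomial model."""
    return math.sqrt(p * (1 - p) / cohort_size)


# Hypothetical underlying proportion of pupils reaching the target level.
P_REACHING_LEVEL = 0.8

# Hypothetical cohort sizes: a one-form-entry primary and a large secondary.
for cohort_size in (30, 200):
    se = standard_error(P_REACHING_LEVEL, cohort_size)
    print(
        f"cohort of {cohort_size}: about +/- {100 * se:.1f} "
        "percentage points (one standard error)"
    )
```

Under these assumptions the year-to-year variation from sampling alone is roughly 7 percentage points for the cohort of 30 and roughly 3 for the cohort of 200, before any effect of coaching or changes in intake is considered.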

20. There must also be questions about the reliability of national monitoring of standards over time, as it is difficult to ensure different tests are exactly comparable in difficulty.

Do they provide assessment for learning (enabling teachers to concentrate on areas of a pupil's performance that need improvement)?

21. Key stage tests do not take place at an appropriate time to be of use for formative or diagnostic assessment since they take place at the end of a key stage. In France testing took place at the start of the school year, and in Wales the tests are planned to take place during Year 5, in each case to allow a formative function. While teachers do use old copies of key stage and optional tests set earlier in the year formatively, they are not designed for this purpose and alternative forms of assessment might be more effective.

Does testing help to improve levels of attainment?

22. Testing makes students perform better in what is tested, since students and teachers focus on obtaining a good performance. Tests probably do improve levels of attainment in some aspects of mathematics but they also distort the curriculum and tend to emphasise superficial learning which is sufficient to obtain correct answers.

Are they effective in holding schools accountable for their performance?

23. Because of the pressure of league tables and inspections, schools do generally feel accountable for their performance, often to a greater extent than is justified since results do not only reflect teaching quality.

How effective are performance measures such as value-added scores for schools?

24. Many schools, especially those in more deprived areas, unsurprisingly prefer value-added tables. However these are not perfect and seem sometimes to favour schools with particular types of intake, perhaps due to measurement distortions. They also depend crucially on the reliability of the scores at the previous key stage; for secondary schools there are usually a large number of feeder schools which rules out too much bias, but a primary school can to some extent determine its KS1 results to achieve greater value-added at KS2.

Are league tables based on test results an accurate reflection of how well schools are performing?

25. League tables reflect the intake of the school rather more than the quality of the teaching (at GCSE level they can be distorted by e.g. vocational ICT results). Value-added tables are more likely to reflect how well schools are performing but as noted in paragraph 24, these are not completely reliable either. Clearly the league tables only measure what is tested and not wider aspects of school performance.

To what extent is there teaching to the test?

26. Ofsted and QCA have catalogued in various reports the fact that preparation for national tests starts very early in the year e.g. Qualifications and Curriculum Authority Report, 2003/4: 'Even the most successful schools concentrate on drilling pupils for the national tests in Years 2 & 6.' Material that is unlikely to be in the test is not taught.

How much of a factor is 'hot-housing' in the fall-off in pupil performance from Year 6 to Year 7?

27. This seems likely to explain most of the fall. We note that on a numeracy test for which students were not specifically prepared there was still an average drop of around 2% between Year 6 and Year 7[9]. It should also be noted that mathematics is one of many more subjects in Year 7 and is typically taught for 3 rather than 5 hours a week. The mathematics curriculum in Year 7 is also broader, so there is less focus on number work.

Does the importance given to test results mean that teaching generally is narrowly focused?

28. We believe that the teaching of mathematics in the 5-14 age range has become narrowly focused on the content of the tests, to the detriment of deeper learning and more varied teaching styles. See paragraphs 2, 3 and 11.

What role does assessment by teachers have in teaching and learning?

29. There is evidence that formative assessment is the most cost-effective way of raising attainment[10]. However teachers are also in a position to assess a wider range of aspects of mathematics and to avoid some of the sources of unreliability in national tests. The weakness of teacher assessment is lack of comparability but we believe that that could be reduced by the use by teachers of assessments from a national data bank, and by group moderation procedures as proposed by the original TGAT report[11]. We also recommend the IPPR report conclusions, which suggest that national tests might sample a very small portion of the curriculum and act as a moderator of teacher assessment. ACME's own report also proposes a greater emphasis on teacher assessment and a reduction of external testing.

 

 

National Key Stage Tests: The future

Should the system of national tests be changed?

30. Yes. We would suggest a system which used moderated teacher assessment (see paragraph 29).

If so, should the tests be modified or abolished?

31. The problem with national tests is that they have become so high stakes that they badly distort the curriculum, and their lack of reliability is not generally appreciated so that they can be used to give misleading labels to students. One way of retaining some limited testing but reducing the emphasis on them is suggested by the IPPR - see paragraph 29.

The Secretary of State has suggested that there should be a move to more personalised assessment to measure how a pupil's level of attainment has improved over time. Pilot areas to test proposals have just been announced. Would the introduction of this kind of assessment make it possible to make an overall judgement on a school's performance?

32. While we are in favour of more flexible and personalised assessment, ACME responded to this proposal with considerable concern. One problem about using the average gain in pupil levels as a measure to judge schools is that levels are a discrete and not a continuous measure. There are 2 years' learning between the lower and the upper end of a specific level. Thus in the 4-year key stage 2, one child may appear to progress only one level but may start at the bottom of level 3 and finish at the top of level 4, effectively making virtually the expected level of progress of 2 levels, while a second child may progress from the top of level 2 to the bottom of level 5, apparently making 3 levels' progress but effectively making very little more progress than the first child. In small schools the effects of these distortions could be considerable. There is also a problem about whether the distances between levels are equivalent - it may for example be easier to progress from the beginning of level 3 to the beginning of level 4 than it is from the beginning of level 5 to the beginning of level 6.
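The two children described in paragraph 32 can be put into a short worked sketch. It assumes, purely for illustration, that attainment can be placed on a continuous scale on which the reported level is the whole-number part; the decimal positions (3.0, 4.9, 2.9 and 5.0) and the function names are hypothetical stand-ins for "bottom of" and "top of" a level.

```python
import math

# From paragraph 32: about two years' learning lies within a single level.
YEARS_PER_LEVEL = 2


def reported_level(attainment: float) -> int:
    """Whole level that a position on the assumed continuous scale falls in."""
    return math.floor(attainment)


def compare(start: float, end: float) -> tuple[int, float]:
    """Return (apparent gain in whole levels, underlying gain on the scale)."""
    apparent = reported_level(end) - reported_level(start)
    underlying = end - start
    return apparent, underlying


# Child A: bottom of level 3 to top of level 4 (hypothetical positions).
child_a = compare(3.0, 4.9)
# Child B: top of level 2 to bottom of level 5 (hypothetical positions).
child_b = compare(2.9, 5.0)

for name, (apparent, underlying) in (("A", child_a), ("B", child_b)):
    print(
        f"Child {name}: apparent gain {apparent} level(s), "
        f"underlying gain {underlying:.1f} levels "
        f"(about {underlying * YEARS_PER_LEVEL:.1f} years' learning)"
    )
```

On these assumed positions, child A is credited with one level and child B with three, although their underlying gains differ by only about a fifth of a level.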

33. We have heard of examples of where schools are asking teachers to assess in sub-levels with very little guidance or reliability, and requiring all students to progress by 1.5 levels per year. We feel that such use of the current structures, which were not intended or designed for these purposes, is misguided and is likely to lead to bad practice. Learning in mathematics takes time for consolidation and practice; these measures are likely to lead to even more superficial learning, not just at ends of key stages but throughout a child's education.

Would it be possible to make meaningful comparisons between different schools?

34. For reasons expressed in the previous paragraph, there are some problems with using average number of levels progressed as a measure of performance of a primary school. In a large secondary school it will be more reliable, but not entirely so because the levels are not equally far apart in terms of learning times.

What effect would testing at different times have on pupils and schools? Would it create pressure on schools to push pupils to take tests earlier?

35. ACME in its response was very critical of this aspect of the proposals - we feel that the test preparation that now occupies Years 2 and 6 may spread to all years in a primary school. Pressure may well be put on students to keep taking tests until they pass, though without necessarily understanding some of the work. This effect could be reduced by using teacher assessment moderated by assessments drawn from an item bank to help teachers to judge when a student has really achieved a complete level.

If key stage tests remain, what should they be seeking to measure?

36. Probably nothing on their own, since they are neither comprehensive enough nor reliable enough, but they could usefully be used to moderate teacher assessment.

If for example Level 4 is the average level of attainment for an eleven year old, what proportion of children is it reasonable to expect to achieve at or above that level?

37. It is not possible to answer this question without using some model of a distribution of ability; empirical results will depend on the structure of the test used. The original National Curriculum Task Group made some estimates which were fairly close to the actual results in the early years of testing, but are now less close since results have risen, due probably to more focused test preparation.

How are the different levels of performance expected at each age decided on? Is there a broad agreement that the levels are appropriate and meaningful?

38. It was the task of the original National Curriculum Working Groups to give criteria for each level which corresponded to the model of progression set by the Task Group. The current versions of these are known as level descriptions. This was done successfully to a remarkable degree. However the current definition of levels is unsatisfactory e.g. a child can achieve a Level 5 in mathematics on a KS2 test without actually answering any questions designed to match the level descriptions for Level 5 but by collecting marks on questions targeted at level descriptions for levels 3 and 4. Thus the meaning of a level (and probably the standard of the tests) has slipped over time. This would cause problems in setting level tests or in providing guidance for teacher assessment if it were intended that the results be consistent with both the end of key stage test results and with the level descriptions. It would be better to get back to the earlier situation where the level descriptions were used to judge whether a child had achieved that level, rather than the current situation where levels are defined by arbitrary mark ranges on tests. However the current approach does allow for the fact that not all items matching a level description are of similar difficulty, so that the mark range can be set empirically to ensure similar standards to earlier years.

39. In spite of this there is probably a broad agreement in the mathematics education community that the level descriptions are more or less appropriate.

 

Teaching and assessment at 16 and after

Is the testing and assessment in 'summative' tests (for example, GCSE, AS, A2) fit for purpose?

40. As noted in paragraphs 2 and 3, the mathematics items set tend to be single step routine items which do not challenge students to construct solutions to unfamiliar problems, fragment the curriculum, and lead to very procedural teaching.[12] Some aspects of the curriculum cannot easily be assessed in this sort of examination and thus are not often taught.

41. It is also a problem that grades are often awarded in mathematics with very low threshold marks, giving candidates the idea that they do not understand much maths whatever the grade awarded.

Are the changes to GCSE coursework due to come into effect in 2009 reasonable? What alternative forms of assessment might be used?

42. In mathematics the change occurs from 2007. Given the high stakes nature of GCSE and the fact that coursework questions were not changed from year to year as they were originally, it is probably inevitable that coursework was abandoned as the marks were unreliable and the load on teachers, who were expected by schools to ensure that every candidate got a high grade, was unacceptable. Thus we broadly welcome the change.

43. However there were several benefits of coursework in that it provided an extended piece of work, sometimes drawing from different areas of mathematics, and students often felt some ownership of it, especially in making their own decisions about how to proceed with problems or investigations, a rare experience in mathematics lessons. Some way is needed of ensuring that this type of experience is not lost - there may be several mechanisms for this.

44. It is not clear yet what will be the impacts of the change, but we are concerned that some schools appear to be using the removal of coursework, together with staff shortages, as a reason to cut the amount of time devoted to mathematics in KS4. If this happens on a wide scale it will threaten both standards of attainment at GCSE and greater participation at AS/A level.

45. Coursework could be replaced by teacher assessment - giving marks to teachers to award against given criteria - or by externally set long tasks being done under examination conditions.

What are the benefits of exams and coursework? How should they work together? What should the balance between them be?

46. Coursework in its current interpretation, as extended externally set tasks, is not currently acceptable to most mathematics teachers. Longer tasks, set and marked externally and administered under examination conditions, as proposed in some of the functional mathematics assessments, would at least give students experience of more sustained problems requiring decisions, but would not have other benefits of coursework, such as more freedom to plan and time for reflection.

47. There is also the possibility as proposed in the Tomlinson report[13] of coursework being interpreted much more broadly as work done during the course, including classwork and routine tests as well as some longer tasks done in class. This seems worth following up as it should not impose an excessive load on teachers and would allow a wider range of skills and understanding to be assessed. It would need careful moderation to achieve comparable standards, but we believe that this would be achievable, given sufficient resource in trialling and professional development.

48. If there were such a component it would have to be worth more than 20% to be worth doing, but there would probably be opposition to the value being more than 50%. Nevertheless during the late 1980s and early 1990s, at least two mathematics GCSEs successfully and reliably awarded a full range of grades on 100% teacher assessed work, moderated by trained local assessors.

Will the ways in which the new 14-19 diplomas are to be assessed impact on other qualifications, such as GCSE?

49. We do not yet know how the mathematics in the new diplomas will be assessed, even the functional mathematics elements. Some awarding bodies are setting sustained tasks with in some cases pre-circulated data sheets to allow students time to become familiar with some of the data from the questions. We would not want to see at one extreme functional mathematics, or any mathematics units within principal learning in diplomas, assessed only by tests with fragmented short knowledge items. Nor at the other extreme would we want mathematics to be totally absorbed in some sort of work-related portfolio assessment with vague criteria for success (experience of this is that students never fail to meet the mathematics criteria even when their grasp of mathematics is very weak).

50. It is likely that some of these methods will influence GCSE, either directly as functional mathematics becomes a required part of GCSE mathematics, or indirectly as noted in paragraphs 46 & 47.

Is holding formal summative tests at ages 16, 17 and 18 imposing too great a burden on students? If so, what changes should be made?

51. The ACME report suggested that there was too much external assessment in the 14-19 period. One possible solution is to replace the age 16 external assessment with moderated internal assessment, as suggested by the Tomlinson report, since this is no longer necessary when there is a universal requirement to continue with some form of education to age 18. Alternatively there could be a greater use of teacher assessment in the early units of AS and/or A level, provided that there was external assessment of final A2 units to ensure comparability.

To what extent is frequent modular assessment altering both the scope of teaching and the style of teaching?

52. We believe that frequent modular assessment has some advantages in spreading the final assessment load, in encouraging students to focus and work early in their courses and in giving early feedback. However the disadvantages are additional time spent in test preparation and more limited short questions which drive the nature of classroom teaching towards the fragmented and procedural. Frequent re-tests also take time for revision out of teaching time.

How does the national assessment system interact with university entrance? What does it mean for a national system of testing and assessment that universities are setting entrance tests at individual institutions?

53. The setting of separate tests in mathematics happens but is reasonably rare. It reflects the fact that GCE A level is not sensitive enough to differentiate students with high potential from other A grade students, largely because of the impoverished nature of the questions set. We support the continuation of the AEA papers for this reason, but see difficulties if some harder questions are placed on traditional mathematics papers, as it is difficult for students during the paper to decide whether to risk answering more difficult questions rather than sticking to more routine questions.

June 2007



[1] ACME (2005) Assessment in 14-19 Mathematics. London: The Royal Society/Joint Mathematical Council.

[2] Smith, A. (2004) Making Mathematics Count: the report of Professor Adrian Smith's Inquiry into Post-14 Mathematics Education. London: The Stationery Office.

[3] Ofsted (2006). Evaluating mathematics provision for 14-19 year-olds. London: Ofsted.

[4] Smart, T. and Tickly, C. (2006) Career Patterns of Secondary Mathematics Teachers. Leicester: The Mathematical Association.

[5] Many sources, e.g. Matthews, A. & Pepper, D. (2005) Evaluation of participation in A-level Mathematics: Interim report. London: Qualifications and Curriculum Authority.

 

[6] Brooks, R. and Tough, S. (2006) Assessment and Testing: Making space for teaching and learning. London: IPPR.

[7] Ofsted (2006). Evaluating mathematics provision for 14-19 year-olds. London: Ofsted. See also the Annual Report by the Chief Inspector for Schools in 2002/3: 'Only a small proportion of (primary) schools successfully combine high standards in the core subjects of English, mathematics and science with a rich and varied curriculum'.

[8] Brooks, R. and Tough, S. (2006) Assessment and Testing: Making space for teaching and learning. London: IPPR.

[9] Brown, M., Askew, M., Millett, A., & Rhodes, V. (2003) The key role of educational research in the development and evaluation of the National Numeracy Strategy. British Educational Research Journal, 29(5), 655-672.

[10] Wiliam, D. (2007). Assessment for Learning: why, what and how? Professorial lecture, London.

[11] Task Group on Assessment and Testing (1987) A Report. DfES.

[12] Ofsted (2006). Evaluating mathematics provision for 14-19 year-olds. London: Ofsted.

[13] Working Group on 14-19 Reform (2004) 14-19 Qualifications and Curriculum Reform. London: DfES.