Memorandum submitted by Association of Teachers and Lecturers (ATL)


Executive Summary


ATL outlines the excessive burden imposed by the current assessment and examination system, particularly under the yoke of performance league tables, and gives a brief history of the system's development.


Using research evidence, ATL finds that:


• the data provided by the testing and examination system is compromised by the number of purposes for which it is used.


• these purposes can be met through a system of cohort sampling, with evidence that this works in other countries.


• the current high-stakes national assessment and testing system

- narrows the curriculum and reduces flexibility in curriculum coverage

- undermines the Every Child Matters agenda

- has a negative impact on pupil attitude

- depresses staff morale and leads to 'teaching to the test'


• the current system of Key Stage tests

- leads to duplication of testing between stages, particularly between Key Stages 2 and 3

- provides data which is not used by teachers upon which to build further learning

- does not accurately reflect changes in performance over time

- does not provide valid information about students' attainment

- undermines Assessment for Learning approaches

- produces performance levels that are not sustained

- assesses a limited range of skills

- measures schools on indicators that are not only too narrow but are damaging to learning

- leads to a narrow teaching focus; 'teaching to the test'

- excludes many higher-level cognitive skills

- produces simplistic grades which are often of little value in diagnosing learner needs.


ATL proposes a fundamental change to the assessment system: assessment for learning as the primary method of assessment throughout pupils' learning careers, in a league-table-free environment that uses cohort sampling to provide data for national monitoring purposes.


ATL believes that there should be no national assessment system prior to a terminal stage; international evidence links high pupil achievement to systems which postpone national assessment and selection.


ATL outlines the need for schools to provide their students with the skills, understanding and desire for lifelong learning, something which the narrowness and high-pressure of the current assessment system may prevent.


ATL believes that assessment for learning principles and practice should underpin teacher assessment which should be, in the main, formative. This submission provides a wealth of research evidence about assessment for learning (AfL) and teacher assessment in the following areas:

• the positive impact of AfL on standards

• the tension between AfL and summative assessment

• personalised learning and AfL

• AfL and the measuring of achievement

• how AfL's vision of learning and ability is undermined by age-dependent levels

• teacher assessment and the needs of a diverse school population

• perceptions of bias in teacher assessment

• resource needs of AfL

• workload implications of teacher assessment and AfL.


ATL strongly believes that this proposed system cannot exist alongside performance tables which already have a pernicious effect on the current national testing system.


ATL's recommendations for action are for the government to do the following:




• Review the current assessment system with urgency in light of its impact on curriculum coverage and on teaching and learning

• Investigate the purposes applied to the present national assessment system

• Develop AfL pilots in schools exempt from national testing during the pilot period

• Prioritise CPD for teachers in assessment, particularly AfL techniques and strategies

• End the use of national testing as market information and accountability mechanisms

• Explore options of cohort sampling to meet national monitoring needs

• Work with awarding bodies to produce a national bank of test materials as resources for teachers

• Abolish school performance league tables

• Explore alternative options to age-dependent levels

And ultimately

• Postpone national testing until a terminal stage.

ATL - leading education union

1. ATL, as a leading education union, recognises the link between education policy and our members' conditions of employment. Our evidence-based policy making enables us to campaign and negotiate from a position of strength. We champion good practice and achieve better working lives for our members. We help our members, as their careers develop, through first-rate research, advice, information and legal support. Our 160,000 members - teachers, lecturers, headteachers and support staff - are empowered to get active locally and nationally. We are affiliated to the TUC, and work with government and employers by lobbying and through social partnership.


2. ATL has recently produced Subject to Change: New Thinking on the Curriculum which questions whether our current curriculum and assessment systems are fit for purpose for the needs of society and our young people in the 21st century. This submission is based on these very arguments and we strongly welcome this Inquiry into testing and assessment, particularly around areas which challenge the efficacy of current national arrangements such as Key Stage testing.


Current excessive assessment - the historical picture

3. Our current pupil cohorts experience years of national assessment and testing; if you count foundation stage assessment, a pupil who goes on to take A-levels will have undergone national assessments and tests in seven of their 13 years of schooling. Yet prior to 1988, pupils faced only two external national tests - GCSEs and A-Levels - and a system of sample testing existed, which was overseen by the Assessment Performance Unit (APU). During that time, teachers had the power to design and carry out assessment for pupils not yet undertaking GCSE or A-level exams.


4. New arrangements for testing and league tables, including the assessment of all pupils by statutory assessment tasks and tests in core subjects at the ages of seven, 11 and 14 (at the end of Key Stages 1, 2 and 3 respectively) set up by the 1988 Education Reform Act have had, and continue to have, a huge impact on the primary and early secondary curricula as taught in schools.


5. 14-19 debates around curriculum and assessment have often concentrated on the issues of GCSE and AS/A2 provision with a resulting focus on the tensions between academic and vocational qualifications and the demands of external examination processes. The focus on difficulties of delivery has narrowed the debate and future thinking. For example, the 14-19 Diplomas, currently in development, started with a vision of integrating academic and vocational strands but are increasingly mooted as a vocational-only learning route due to the requirements of most stakeholders bar one, the learner.


6. The introduction of league tables of school exam and national test results through legislation in the 1990s has greatly worsened the effects of the national testing regime in schools and has encouraged a risk-averse culture there. By placing such emphasis on 'standards' as evinced through test results, league tables have encouraged 'teaching to the test' and the regurgitation by learners of key 'facts' leading to 'surface' or 'shallow' learning.


7. These measures represent a significant increase in the accountability to government of schools, teachers and learners concerning their performance, creating an imbalance between professional autonomy, professional judgement and accountability where the latter has assumed a disproportionate part of the experience of being a teacher.


The data machine - a centrally run system of testing and assessment

8. What the current centrally run assessment and testing system does give us is a large amount of data on pupil attainment and school performance; indeed at times, this seems to be its primary raison d'être. However, ATL questions whether that data in itself is helpful or useful enough to offset the detrimental effect it is widely acknowledged to have on the teaching of the current curriculum. The Daugherty Assessment Review Group in Wales, reviewing assessment arrangements at Key Stages 2 and 3, considered whether the 'hard data... on pupil attainments and the targets it gives some pupils to aspire to, is of sufficient value to compensate for the evident impoverishment of pupils' learning that is occurring at a critical stage in their educational development'.1 Their conclusion can be inferred by their recommendation to the Welsh Assembly that statutory National Curriculum testing of 11 year olds at Key Stage 2 and 14 year olds at Key Stage 3 should be discontinued.


9. 'While the concept of summative assessment may be simple, the uses of data from summative assessment are varied and the requirements of different uses make varying demands in relation to reliability and validity of the assessment.'2

As outlined above by the Assessment Reform Group, the different uses of summative assessment data have a significant impact on its rigour and its fitness for purpose. Newton (2006) lists 18 current uses for this data:


1. Social evaluation

2. Formative

3. Student monitoring

4. Transfer

5. Placement

6. Diagnosis

7. Life choice

8. Qualification

9. Selection

10. Licensing

11. School choice

12. Institution monitoring

13. Resource allocation

14. Organisational intervention

15. Programme evaluation

16. System monitoring

17. Comparability

18. National accounting3


10. ATL questions whether one system can be fit for all these purposes. In terms of assessment, we understand validity to be the extent to which any assessment succeeds in measuring what it originally set out to measure. However, a plethora of purposes means that in fact we are measuring many other things in addition to the original focus of that assessment; for example, the aggregation of pupils' grades into broad levels for the purposes of monitoring pupils, schools and systems will impact on the formative purpose of the assessment, making the outcome far less meaningful. Swaffield (2003) relates this to the notion of consequential validity: "This means that even a well-constructed test is not valid if the results are used inappropriately - which moves the idea of validity on from something which is the concern of test writers to something which is the responsibility of everyone who interprets and uses assessment results."4


11. ATL believes that clearer distinctions need to be made between the respective uses and purposes of assessment. Other countries' systems make this distinction clearer; strategies used include those which combine teacher-led formative assessment with the utilisation of a national bank of tests applied for summative purposes when learners are ready. National monitoring needs are met through a system of sampling pupils' performance (eg cohort sampling), thus reducing the overall test burden whilst increasing the relevance and breadth of the learner evidence. While there is an economic advantage in collecting readily-available achievement data, eg the results of end-of-key-stage tests, we will demonstrate, throughout this submission, the lack of useful and relevant information it provides. If monitoring were separated from the performance of individual pupils, there would be no need for the central collection of individual pupil assessment data. As the Assessment Reform Group conclude, 'this would remove the "need" for high stakes testing and would ensure that assessment - and, more importantly, what is taught - was no longer restricted to what can be tested. The continuation in several countries of regular surveys of small random samples of pupils indicates the value of this approach.'5 In addition to the US National Assessment of Educational Progress (NAEP), there is New Zealand's National Education Monitoring Project (NEMP) and nearer to home, the Scottish Survey of Achievement (SSA).


Lessons from across the UK and the international scene

12. The Scottish Survey of Achievement could provide a useful model for further investigation into restoring the place of teachers to the heart of curriculum and assessment. From the end of 2002/03, a new system of assessment in Scotland has been introduced. Teachers there have been provided with an online bank of assessment materials, based on the Scottish Survey of Achievement. The aim of these tests is to confirm the teachers' assessments of their pupils' attainment. These are to be administered to pupils when teachers deem they are ready to take them, rather than at a pre-determined time, making testing far more manageable within the school system and less likely to distort teaching and learning. Teachers have been supported in this process by the Assessment is for Learning (AiFL) programme. This has not led to any lack of accountability in the system; HMIE produce full reports on schools, based around a set of 33 quality indicators in 7 key areas and the system strongly encourages schools to continually self-evaluate and assess achievements using these quality indicators. The Scottish Survey of Achievement also provides national figures, thus offering a way of measuring national progress over time without testing every child. The AiFL programme is being fully integrated into the national assessment system. In England, Assessment for Learning (AfL) still appears to be a separate strand from the national testing system, rather than an integrated part of a coherent whole.


13. International comparisons prove particularly interesting when we constantly hear of rising standards. Indeed, test results are improving, yet our international standing is falling in terms of our place on international league tables as evidenced by trends demonstrated in the PISA/OECD surveys. The UK's standing on international league tables for 15 year olds has slipped; although the UK's response rate to the 2003 PISA/OECD survey was too low to ensure comparability, the mean score that was produced was far lower than that achieved in the 2000 survey, leading to a fall in ranking within the OECD countries alone, a drop in place further increased by the inclusion of non-OECD countries within the survey.6


Impact of high-stakes national testing and assessment

14. A central proposition to the introduction of the national curriculum in 1988 was the entitlement of pupils to access a broad and balanced curriculum. However, the amount of high-stakes testing has had a well-documented narrowing effect on the curriculum, undermining this entitlement for many pupils, particularly in schools fearful of low scores on the league tables.


15. Narrowing curriculum and reducing flexibility

Webb and Vulliamy, carrying out research commissioned by ATL, document this effect in the primary sector; the standards agenda, through national curriculum testing in English, Maths and Science at various key stages and related performance league tables, 'focused teachers' attention on curriculum coverage in literacy, numeracy and science to the detriment of the rest of the primary curriculum'.7 However, it is not just teachers and their representatives who are expressing this concern; Ofsted state in their 2005 evaluation of the impact of the Primary National Strategy in schools that the raising standards agenda has been the primary concern of most headteachers and subject leaders, coupled with a far more cautious approach to promoting greater flexibility within the curriculum. Ofsted also recognises the narrowing effect which Key Stage 2 tests have on teaching of the curriculum, in terms of time and also in terms of support for earlier year groups.8


16. Undermining the Every Child Matters agenda

The negative impact of current assessment mechanisms is not only diluting the principles of the curriculum vision of 1988, it is undermining the current Every Child Matters agenda. The longitudinal PACE project in primary schools in England observed that curriculum and testing pressures appeared to be 'diminishing the opportunities for teachers to work in a way that enables them to "develop the whole child" and address the social concerns of the wider society'.9 The Assessment Reform Group notes the lack of correlation between 'the narrow range of learning outcomes assessed by tests...with the broad view of learning goals reflected in the DfES Every Child Matters policy document'.10 This tension at school level between narrow standards and school goals of engendering pupil enjoyment and creativity was strongly expressed by the headteachers who took part in ATL's research by Webb and Vulliamy.


17. Impact on pupil attitude

And what effect does this 'tension' have on our pupils? A view across schools and colleges, observed by researchers, is that pupils have become very utilitarian in their views of what is 'worthwhile to pursue'; Ecclestone and Hall (1999) call this a "...strategic and cynical compliance with assessment requirements" where passing tests is the primary focus and learning is 'marginalised'.11 This is hardly surprising when we consider the high-stakes purposes of individual assessment data in our current system and the sheer volume of assessment which each pupil will face. But there are other pupils for whom such a utilitarian approach is not a possibility; for lower-achieving pupils, research has shown that the experience of frequently failing tests is demoralising, reducing self-esteem, including their belief in their ability to succeed with other tasks.12 Thus, the gap between higher and lower achieving pupils widens, exacerbated by the fact that focus on test outcomes reduces the levels of early identification of under-achievement and appropriate interventions as noted by Ofsted in relation to the impact of Key Stage 2 testing.13


18. Impact on education staff

ATL's members, teachers and support staff, with pupils, are bearing the brunt of the testing overload and the high-stakes pressure. They are frustrated by the narrowing of the curriculum and the need to ready pupils for ever-increasing numbers of tests. This pressure drives many teachers to be complicit in the 'strategic and cynical compliance' of students mentioned earlier and to be 'presenters of content' to ensure that their pupils succeed in the narrow focus of the tests and that the school receives a good ranking on the performance tables. This process is ultimately de-skilling; an enforced focus on performance outcomes lessens and undermines richer assessment skills and feedback and will ultimately weaken these skills within the profession.


How effective are the current Key Stage tests?

19. Key stage tests are effective in producing a vast quantity of data on pupil performance as defined by the tests. However, we have earlier addressed the issues of validity around this data, particularly with regard to the myriad uses to which it is put. Research has shown that key stage tests lead to a narrowing of curriculum, and within high-stakes frameworks which include school performance league tables, to 'teaching to the test' and a destructive emphasis on testing rather than learning. To explore the question further, it is necessary to address the following issues, which investigate this notion of their effectiveness.


20. Limited value of test result data for further stages of learning

An issue with the testing system currently in use is the limited value of its data for further stages of learning. The evidence for this is particularly strong in the transition between Key Stages 2 and 3. Many secondary schools carry out their own testing of Year 7 pupils in the autumn term, 'a considerable duplication when pupils have already been assessed in most aspects of the core subjects at the end of Key Stage 2'. (Ofsted)14 It was also one of the main findings of the PPI survey, commissioned by ACCAC in 2002, that secondary schools did not make extensive use of the statutory assessment data available to them.15


21. Do they adequately reflect levels of performance in children and schools, and changes in performance over time?

Many of the purposes of assessment data can be linked to the standards agenda. Government is particularly concerned with proving through that agenda that their emphasis on, and investment in, education is resulting in rising standards over time. Pupils' grades in national curriculum tests and exams are, indeed, improving over time. However, Wiliam (2001) argues that any attempt to measure standards of achievement over time is 'doomed' as we are not comparing like with like; what is taught in schools changes even if the official curriculum does not. We have already observed the evidence of growing focus on test-preparation and on teaching those subjects, or indeed aspects of subjects, which are tested to the detriment of untested aspects or subjects. Wiliam argues that the idea of measuring standards over time 'in any real sense is nonsense' and that 'while reported standards may rise, actual level of achievement could be falling - tests are no longer an adequate proxy for achievement across the whole domain.'16


22. It is particularly those purposes which add high-stakes contexts to assessment that limit the value of achievement data. Tests do not usually test the full range of what is taught and in low-stakes contexts that limited range of achievement can indicate achievement across the whole subject.17 Yet we know that once assessment occurs within high-stakes contexts, there is pressure on the school and the teacher to focus on the student's performance on the aspects of the subject likely to be tested - within an overburdened curriculum, those aspects will, inevitably, be given more time. Any such concentration of resources will inevitably mean that breadth, and indeed depth, of subject coverage will be sacrificed to the relentless pressure of targets, standards, tests and league tables. The purpose of assessment as an aid to the development of learning is shunted into second place.


23. Harlen and Deakin-Crick (2003) concluded from their research that current high-stakes testing does not provide valid information about students' attainment due to the narrow focus of tests and the consequences of being taught to the test leading to many students not actually possessing the skills or understanding which the test is designed to assess; the focus of teaching in this environment is to teach students to pass tests even where they do not have the skills or understanding.18


24. Do they provide assessment for learning (enabling teachers to concentrate on areas of a pupil's performance that needs improvement)?

A definition of assessment for learning which centres around its purpose and focus describes it thus: 'assessment for learning is any assessment for which the first priority in its design and practice is to serve the purpose of promoting pupils' learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence.'19 This definition demonstrates that key stage tests with their current emphasis on ranking, certification and accountability do not provide assessment for learning. Many good teachers use an assessment for learning approach, working with learners to gather and interpret evidence to use to discover 'where the learners are in their learning, where they need to go and how best to get there.'20 However, the pressures of test preparation and the importance of grade achievement have made it a secondary or 'add-on' practice in many schools and classrooms.


25. Does testing help to improve levels of attainment? Fall-off in pupil performance from Y6 to Y8 due to 'hot housing'

We hear all the time that standards are improving; ATL questions whether this means that our pupils are learning more and better. Research would suggest otherwise. Durham University carried out research, commissioned by the DfES, which noted the lack of evidence to show that pupils reaching Level 4 at Key Stage 2 will retain their learning, let alone progress to higher learning. They cite a study by Watson (2002) which showed how a 'level focus' and booster classes temporarily raised pupils to mathematics level 4 but that this was not sustained over a period of 6 months to a year. Not only were learning outcomes not sustained but the Durham University report also details how high stakes assessment encourages a more rigid teaching style which disadvantages and lowers the self-esteem 'of those who prefer more active and creative ways of learning'.21


26. The current system is perceived as a selection system by pupils. The totemic importance of level 4 at key stage 2 is now so huge that pupils who fail to achieve it cannot be blamed for feeling just that - failures. And we know that this is just how many of them do feel. We also know the effect this has on their future attitudes to learning. It is therefore no surprise that there is a dip in performance between Year 6 and 7. The policy of a differentiated offer post age-14 makes the key stage 3 tests an even clearer selection mechanism, determining how pupils' 'choice' is to be 'guided'.


27. Are they effective in holding schools accountable for their performance?

Whilst ATL must again question the notion of effectiveness in this context, we acknowledge that key stage tests are 'effective' in holding schools accountable for aspects of their performance, ie the performance of pupils in Key Stage tests. However, the cost of this excessive accountability is high. An IPSOS Mori poll in October 2006 found that the current target-driven culture was one of the top factors to demotivate teachers. Also, Key Stage tests are holding schools accountable for their performance across only a part of the curriculum; we have already documented research evidence around curriculum narrowing, the lack of sustainability of learning into subsequent key stages, the negative impact on attitudes towards learning amongst students and the lack of evidence of real attainment across the whole subject or curriculum.


28. 'Teaching to the test' - the high-stakes nature of test results leads to narrow teaching focus

Despite the earlier mentioned demotivating effects of working within the current national assessment system, teachers are working so that their pupils have the opportunity to succeed within those same systems. There is strong evidence that rising test scores are not caused by rising standards of achievement but are rather the effect of growing familiarity amongst teachers and students with test requirements; research shows that changes in the tests are accompanied by a sudden fall in achievement, followed by a rise as teachers begin 'teaching to the new test'.22

29. National curriculum tests and exams as assessment measures

National curriculum tests and exams have long struggled to provide assessment instruments of high validity with optimum reliability; coursework and teacher assessment are examples of attempts to ensure greater validity. However, these were add-ons, expected to fit in around the testing / examination system and thus were compromised in value and in practice. We have already noted that the limited coverage possible in tests combined with a high-stakes environment has a corresponding curtailing effect on the taught curriculum in schools. However, the format of the national tests, which are written tests of limited duration, also 'excludes many of the higher-level cognitive and communication skills and the ability to learn both independently and collaboratively'.23

30. Proponents of exams cite their objectivity, an assertion which needs to be briefly examined before we move on to a viable alternative. Public examination grades are not exact measures; they are approximate with known margins for error. These grades depend upon the judgements of examiners who, though very often highly professional, skilled and experienced people, are also fallible human beings. Grades depend on snapshots of student performance under very particular conditions, at a certain point of time and in response to a certain set of assessment tasks. And e-assessment will not remove these features - it may bring many advantages of efficiency but 'it won't by itself eliminate grade uncertainties'.24


31. In addition, the needs of many of the assessment purposes outlined in paragraph 9 for simplistic grades mean that much useful information about actual performance is lost. Sue Swaffield warns of the limitations of this data: "Summary statistics are often used to compare individual pupils or schools. In doing so, it is important to remember that any single score or level could have been arrived at from a wide variety of individual judgements, and so a level or grade gives no specific information about a pupil's performance. Much more information is needed if teachers in the next year group or school are to build upon pupils' prior attainment."25 Furthermore, there is a danger that we 'fail to appreciate the impact of test unreliability' (it is likely that the proportion of students awarded a level higher or lower than they should be because of test unreliability is at least 30% at KS2, for example) on the 'reliability of change scores for individuals'26 hindering diagnosis of a learning problem, should one exist.


32. Standardised tests can also obfuscate the meaning of pupil performance. For example, many tests offer multiple choice options to the pupil but these can mislead a reader who understood the text perfectly but was confused by the similarity of the choices offered - not by the text.27 Without the teacher there to mediate, clarify and feed back the learning to the pupil, we, and they, lose the meaning and ultimately, it is the learner who loses out.


ATL Vision for the Future


33. Change at a fundamental level

ATL is arguing for a fundamental change in the assessment system; it is not enough to hand over the administration of summative assessment to teachers within a high stakes context and expect real advances in pupil achievement and engagement. Otherwise we are in danger of merely adding workload to teachers with no real addition in terms of professional autonomy nor a move to assessment which puts learning in first place. This fundamental change means that we are proposing assessment for learning as the primary method of assessment throughout the career of pupils in a league-table free environment that uses cohort sampling to provide data for national monitoring purposes.


34. No national assessment system prior to terminal stage

Given the detrimental effect of national curriculum testing on teaching and learning, documented here and elsewhere, ATL believes that there should be no national assessment system prior to a terminal stage. We believe that the present and future needs of our society require an assessment system which focuses learners on learning rather than tests, maintains the breadth which was part of the vision of the National Curriculum in 1988 and which encapsulates part of the current vision for Every Child Matters, and which engages learners as participants in their learning and progress.


35. It can be argued that a system which postpones summative assessment at a national level fits within the earlier recommendations of the Task Group on Assessment and Testing (TGAT). The original vision of TGAT was for 'an assessment system designed for formative purposes' which 'can meet all the needs of national assessment at ages before 16...only at 16 does it seem appropriate for assessment components to be designed specifically for summative purposes (paragraph 26).'28


36. International evidence now clearly links high pupil achievement with systems which postpone national assessment and selection. Finland's education system is a strong example of this: it defers national testing until a terminal stage and has gained a high (often first) place in the OECD Programme for International Student Assessment (PISA) surveys of 2000 and 2003, with top-ranking scores in mathematics, problem solving, science and reading. In fact, not only did Finland's students score highly in terms of performance and proficiency, but they demonstrated positive attitudes towards learning as this excerpt from the Executive Summary of the 2003 survey indicates: "For example, more than half of the students in France and Japan report that they get very tense when they have to do mathematics homework, but only 7 per cent of students in Finland and the Netherlands report this. It is noteworthy that Finland and the Netherlands are also two of the top performing countries."29


37. Focus on learning

Across subjects, there are two key sets of goals: that pupils learn with understanding (developing an understanding of concepts which can be applied in different contexts, identifying the links between different situations, and applying the learning); and that they understand learning (developing an awareness of the process of learning). ATL has argued, and indeed it is widely recognised, that 'students cannot learn in school everything they will need to know in adult life' [OECD, 1999]30 and that schools must therefore provide 'the skills, understanding and desire needed for lifelong learning'. This means that we need to look critically at our assessment systems, which have a huge influence on what is taught in the classroom. As we have demonstrated earlier in this submission, our current assessment system produces 'strategic and cynical' test-takers rather than engaged and questioning lifelong learners with the flexibility needed for a rapidly changing society.


38. Formative assessment, assessment for learning (AfL) and personalised learning

ATL believes that assessment for learning principles and practices should underpin teacher assessment in schools and colleges. When assessment for learning (AfL) is described as a strong assessment model for supporting pupil learning and engagement, it is the formative aspects of assessment that are highlighted: evidence of pupil learning is used to identify learning needs and to adapt teaching accordingly to meet them. The education community is fortunate to have an abundance of evidence to demonstrate the positive effects of formative assessment, even within the current system. Black et al (2002) answer the question, 'Is there evidence that improving formative assessment raises standards?' with 'an unequivocal yes, a conclusion based on a review, by Black and Wiliam (1998a), of evidence published in over 250 articles by researchers from several countries. There have been few initiatives in education with such a strong body of evidence to support a claim to raise standards.'31 They found that an increased focus on using formative assessment as principle and practice within the classroom produced gains in pupil achievement, even when measured in narrow terms such as national curriculum tests and examinations.


39. Research by the Assessment Reform Group endorses this finding regarding the weight of evidence that assessment for learning, with its formative assessment focus, has a positive impact on summative results, citing a quarter to a half GCSE grade improvement per student. However, their research does point to the tension between assessment for learning and summative assessment which clouds the 'improvement' focus of AfL, subsumed by information about successes and failures.'32 This supports ATL's proposition that assessment for learning should become the norm for teachers and pupils throughout the school careers of learners; it cannot fully realise its potential and vision within a system which has summative national tests and examinations at its core.


40. The emergence of 'personalised learning' has brought AfL to the fore. This is unsurprising, as assessment for learning is an approach which has the learning needs of individual students at its heart and which involves students far more directly in the assessment process. The DfES rightly sees the assessment for learning model as school-based, collaborative, whole-school enquiry. Yet this model cannot fit within a high-stakes assessment system which adds huge time and focus pressures to schools, creates a risk-averse school culture and, through league tables, pits school against school. This is a fundamental flaw in the assessment for learning focus within the Making Good Progress project, which will be hampered by having to develop alongside more frequent national testing and targets.


41. AfL requires a fundamental re-think in how we measure achievement

This will require a culture change in schools and indeed, the wider community, about how we see achievement in schools. Many pupils and their parents will see learning tasks as competitions, achievement marked by a grade or a ranking within the class. One of the key problems with this 'win/lose' view is that those who often lose no longer even try; better to switch off rather than risk 'failure'. Teachers working with researchers on formative assessment methods have found that 'whilst pupils' learning can be advanced by feedback through comments, the giving of marks - or grades - has a negative effect in that pupils ignore comments when marks are also given'33. Once grades were removed, pupils concentrated on the feedback given by the teacher and on how it could help them improve.


42. Research shows that grading and feedback have a big impact on pupil motivation and resulting willingness to engage in tasks and learning. Black et al detail key research findings on these effects:


¨ "Pupils told that feedback '...will help you to learn' learn more than those told that 'how you do tells us how smart you are and what grades you'll get'; the difference is greatest for low attainers (Newman & Schwager, 1995)

¨ Those given feedback as marks are likely to see it as a way of comparing themselves with others (ego-involvement), those given only comments see it as helping them to improve (task-involvement): the latter group out-performs the former (Butler, 1987)

¨ In a competitive system, low attainers attribute their performance to lack of 'ability', high attainers to their effort; in a task-oriented system, all attribute to effort, and learning is improved, particularly amongst low attainers (Craven et al, 1991)"34


This evidence shows that the returns from making this kind of change to how we assess learning will be significant, particularly amongst those who are losing out under the current system.


43. Move away from age-dependent levels

Target-setting, within the standards agenda, has led to a system of age-dependent levels. Again, researchers have argued that these militate against learning through an erroneous and demotivating belief about the nature of ability. Wiliam highlights the work of Dweck and her colleagues on students' views of the nature of ability and how these have a profound impact on how they react to challenging tasks. Those who see ability as a fixed entity, 'how clever you are is how clever you stay', will tackle a challenging task if they believe their chance of success is high but will not engage if they believe that their chance of success is low. Those who see ability as incremental will see a challenging task as offering a chance to 'get cleverer', ie to improve ability. As Wiliam observes, 'in order to optimise the conditions for learning, it is therefore necessary for students to believe that ability is incremental, rather than fixed. A system of age-dependent levels would lead to a situation in which many students would get the same grade or level at ages 7, 11 and 14, thus potentially reinforcing a belief in ability as being fixed'.35


44. Teacher-led assessment and the needs of a diverse school population

Our current curriculum and assessment models are based on the idea of 'homogeneous knowledge to be owned by all'. Shohamy (2000) observes this emphasis on homogeneous knowledge: 'This is even more apparent in educational assessment. In a number of situations there is a gap between curricula and assessment as curricula may, at times, contain statements and intentions for the recognition of diverse knowledge, yet the tests are based on homogeneous knowledge'36. It is not possible to de-contextualise assessment, but ATL believes that local teacher-led assessment makes it possible to minimise the use of contexts which will have a detrimental effect on pupils' opportunities for achievement.


45. ATL believes that a fair assessment system is one which 'elicit[s] an individual's best performance', and Gipps details the factors that need to be in place in assessment tasks or tests for this to occur: 'This involves tasks that are concrete and within the experience of the pupil (an equal access issue) presented clearly (the pupil must understand what is required of her if she is to perform well) relevant to the current concerns of the pupil (to engender motivation and engagement) and in conditions that are not threatening (to reduce stress and enhance performance) (Gipps, 1994). This is where teacher assessment can be more equitable since it is under the teacher's control (Gipps, 1994).'37 Teachers are among those best placed to ensure that these conditions are met, and teacher assessment is therefore the method through which pupils have the opportunity to achieve to the best of their ability.


46. Popular concerns regarding teacher bias: the evidence

ATL acknowledges, in proposing a teacher-assessment focus using AfL, that there is a hurdle to be tackled in perceptions about teachers' assessments. Harlen (2004) documents the 'widespread assumptions that teachers' assessments are unreliable and subject to bias - despite their use in some countries as a main feature of national and state systems'.38 But Harlen goes on to propose ways in which that unreliability can be addressed: through training for teachers in identifying and understanding assessment criteria, and through training which highlights sources of potential bias, as revealed through research.39 Studies in Australia have shown that finer specification of criteria, describing progressive levels of competency, can lead to increased reliability of teacher assessment using assessment evidence from the full range of classroom activity.


47. The extent of the evidence base for this perception regarding unreliability and bias is open to challenge. Harlen (2004) highlights a key concern with the process through which such a judgement has been reached in the past: "It should be noted that much of the evidence of bias in teachers' assessment comes mainly from studies where TA is compared with another measure and based on the questionable assumption that the benchmark measure is unbiased and is measuring the same thing as the teachers' assessment. So, whilst it has been reported that teachers under-rate boys more than girls in mathematics and science as compared with their performance in tests (Reeves et al, 2001), the conclusion might equally be that boys perform above expectation on mathematics and science tests."40 Researchers have concluded that TA is prone to bias due to systematic variations between TA and standard task/test performance judgements, based on the assumption that the latter measures are unbiased. Yet bias in terms of gender, first language and SEN has also been found in the results of these standard tasks and tests, so their original conclusion must be called into question. However, as we propose that teacher assessment, through assessment for learning, should be the only form of assessment throughout pupils' school careers, we acknowledge that bias and its effects must be a key part of training for teachers so that non-relevant assessment factors such as pupil behaviour and gender are recognised as potential sources of bias and influence and guarded against by teachers and moderators. The bias of unfamiliar situations is a risk in national standard tasks and tests, a risk which lessens with teacher assessment.


48. Resource needs of AfL

Literature and research around assessment for learning yield a rich source of support, information and advice to teachers, through research observations, case studies and exemplifications of good practice. Much of that relates to involving pupils to a far greater degree, and in a conscious fashion, with their own learning, combining subject-focused skill learning with the learning of cognitive skills. Teachers have access to examples of AfL techniques such as comment-only marking, peer and self-assessment, open questions that engage pupils and the promotion by the teacher of the liberating notion that wrong answers can be as useful as right answers for learning, particularly with the exploration of ideas and concepts.


49. It is crucial that teachers are supported by training and resources. These resources can include exemplifications, concrete examples of good practice, diagnostic instruments, even task banks. Possibly most important is the need for teachers to have space and time to collaborate and to share examples of positive classroom experience (or perhaps examples of where/when things did not go so well), with growing experience leading to fluency and efficiency with methods and to exploration of new ways of working with students. Students who are skilled and equipped to be self- and peer-assessors can check straightforward tasks. Sensitive and robust moderation procedures are a key part of this vision and here we can envisage a role for LEAs, consortia, clusters or networks of schools. Indeed each school needs to be an assessment community where assessment is at the heart of each pupil's, each class's and each department's curriculum.


50. Workload implications of AfL

ATL is aware of the implications of this proposed assessment system in terms of new demands and workload. However, ATL believes that workload is not merely an issue of work level; it is also an issue of responsibility, autonomy, and professional satisfaction. It is important to remember that teachers already spend a large proportion of their time on assessment. Saving half of that time by removing or reducing the burden of national testing would more than compensate for the extra time needed for the embedding of assessment for learning practices and the process of moderation which is a vital component of it.


51. Performance tables

Assessment for learning does not lend itself to the narrow forms of data which currently feed performance league tables, and ATL wishes to make it clear that the system which we have outlined would be damaged by the continuation of these instruments of high-stakes pressure, particularly on schools and LEAs. Any such focus on narrow, hard data will undermine the learning focus of schools, and inevitably some schools will succumb to the pressure to conform to the rigid measures of the standards agenda. League tables also undercut any notion of collaboration between schools. Yet any system which hopes to offer full and broad curricula and personalised learning needs to promote cost-effective ways for schools to meet those needs through the sharing of resources, expertise and knowledge. League tables are not a form of accountability which promotes equitable access to opportunity for all, and ATL has no hesitation in calling for their abolition - there are other, far more meaningful, forms of accountability and of school information.



Conclusion: Recommendations for Action

52. ATL's vision is for a system where assessment is a key part of learning, a central activity for teacher and pupil within a low-stakes context which does not create a culture of competition in which 'losers' become demotivated or disengaged and in which teachers become empowered, further skilled and re-motivated.


53. ATL calls on the government to:


§ Review the current assessment system with urgency in light of its impact on curriculum coverage and on teaching and learning

§ Investigate the purposes applied to the present national assessment system

§ Develop AfL pilots in schools exempt from national testing during the pilot period

§ Prioritise CPD for teachers in assessment, particularly AfL techniques and strategies

§ End the use of national testing as market information and accountability mechanisms

§ Explore options of cohort sampling to meet national monitoring needs

§ Work with awarding bodies to produce a national bank of test materials as resources for teachers

§ Abolish school performance league tables

§ Explore alternative options to age-dependent levels

And ultimately

§ Postpone national testing until a terminal stage


June 2007


Submission References


1. Daugherty Assessment Review Group (May 2004) Learning pathways through statutory assessment: Key stages 2 and 3. Final Report

2. Assessment Reform Group Assessment Systems for the Future, Working Paper 1: Aims and outcomes of the first year's work of the project, Draft 10 (2006)

3. Newton, PE Clarifying the purposes of educational assessment draft June 2006

4. Swaffield, S (2003) Assessment Literacy for Wise Decisions Association of Teachers and Lecturers, London

5. Assessment Reform Group (2006) The role of teachers in the assessment of learning London

6. Organisation for Economic Co-operation & Development (OECD) Learning for tomorrow's world: First results from PISA 2003 France 2004

7. Webb, R and Vulliamy, G (2006) Coming full circle: The impact of New Labour's education policies on primary school teachers' work, Association of Teachers and Lecturers

8. Ofsted (2005) Primary National Strategy: An evaluation of its impact in primary schools 2004/05

9. Pollard, A, Triggs, P, Broadfoot, P, McNess, E, and Osborne, M. (2000) What pupils say: changing policy and practice in primary education, London, Continuum

10. Assessment Reform Group The role of teachers ...

11. Wilmut, J (2004) Experiences of summative teacher assessment in the UK

12. Assessment Reform Group Ibid

13. Ofsted (2005) Ibid

14. Ofsted (2002) Changing schools - effectiveness of transfer arrangements at age 11: an evaluation by Ofsted

15. Daugherty Assessment Review Group Ibid

16. Wiliam, D (2001) Level best? Levels of attainment in national curriculum assessment Association of Teachers and Lecturers, London

17. Wiliam, Ibid

18. Harlen, W & Deakin-Crick, R (2003) Testing and motivation for Learning Assessment in Education, 10.2 169-208

19. Black, P, Harrison, C, Lee, C, Marshall, B, Wiliam, D (2002) Working Inside the Black Box Dept of Education and Professional Studies, King's College, London

20. Assessment Reform Group Role of teachers...

21. Beverton, S, Harris, T, Gallannaugh, F and Galloway, D, (2005) Teaching approaches to promote consistent level 4 performance in Key Stage 2 English and mathematics University of Durham, School of Education.

22. Harlen, W (2004) Can assessment by teachers be a dependable option for summative purposes? Published in Perspectives on pupil assessment GTC, London

23. Assessment Reform Group (2006) ibid Role of teachers...

24. Murphy, R (2004) Grades of uncertainty: Reviewing the uses and misuses of examination results Association of Teachers and Lecturers, London

25. Swaffield, Ibid

26. Wiliam, Ibid

27. Swaffield, Ibid

28. National Curriculum Task Group on Assessment and Testing. A report 1988, London: Department of Education and Science

29. Organisation for Economic Co-operation & Development Learning for tomorrow's world: First results from PISA 2003: Executive Summary France 2004

30. Assessment Reform Group Role of teachers...

31. P Black, C Harrison, C Lee, B Marshall and D Wiliam, (2002) Working inside the Black Box Dept of Education & Professional Studies, King's College, London

32. GTC (2004) The role of teacher in pupil assessment Perspectives on Pupil Assessment GTC, London, November 2004

33. P Black, C Harrison, C Lee, B Marshall and D Wiliam, (2002) Working inside the Black Box Dept of Education & Professional Studies, King's College, London

34. Ibid

35. Wiliam, Ibid

36. C Gipps & G Stobart (2004) Fairness in Assessment Perspectives on Pupil Assessment GTC, London, November 2004

37. Ibid

38. Harlen, W (2004) Can assessment by teachers be a dependable option for summative purposes? Published in Perspectives on pupil assessment GTC, London

39. GTC (2004) The role of teacher in pupil assessment Perspectives on Pupil Assessment GTC, London, November 2004

40. Harlen, W (2004) Ibid.