Memorandum submitted by Association of
Teachers and Lecturers (ATL)
ATL outlines the excessive burden imposed
by the current assessment and examination system, particularly
under the yoke of performance league tables, and gives a brief
history of the system's development.
Using research evidence, ATL finds:
That the data provided by the testing
and examination system is compromised by the number of purposes
for which it is used.
That these purposes can be met through
a system of cohort sampling, with evidence that this works in
other countries.
That the current high-stakes national
assessment and testing system:
narrows the curriculum and reduces
flexibility in curriculum coverage;
undermines the Every Child Matters agenda;
has a negative impact on pupil attitude;
depresses staff morale and leads
to "teaching to the test".
That the current system of Key Stage testing:
leads to duplication of testing between
stages, particularly between Key Stages 2 and 3;
provides data which teachers do not
use to build further learning;
does not accurately reflect changes
in performance over time;
does not provide valid information
about students' attainment;
undermines Assessment for Learning;
produces performance levels that
are not sustained;
assesses a limited range of skills;
measures schools on indicators that
are not only too narrow but are damaging to learning;
leads to a narrow teaching focus,
"teaching to the test";
excludes many higher-level cognitive
and communication skills;
produces simplistic grades which
are often of little value in diagnosing learner needs.
ATL proposes a fundamental change to the assessment
system: assessment for learning as the primary method of assessment
throughout pupils' learning careers, in a league-table-free
environment that uses cohort sampling to provide data for national
monitoring purposes.
ATL believes that there should be no national
assessment system prior to a terminal stage; international
evidence links high pupil achievement to systems which postpone
national assessment and selection.
ATL outlines the need for schools to provide
their students with the skills, understanding and desire for lifelong
learning, something which the narrowness and high pressure of
the current assessment system may prevent.
ATL believes that assessment for learning principles
and practice should underpin teacher assessment which should be,
in the main, formative. This submission provides a wealth of research
evidence about assessment for learning (AfL) and teacher assessment
in the following areas:
the positive impact of AfL on standards;
the tension between AfL and summative assessment;
personalised learning and AfL;
AfL and the measuring of achievement;
how AfL's vision of learning and
ability is undermined by age-dependent levels;
teacher assessment and the needs
of a diverse school population;
perceptions of bias in teacher assessment;
resource needs of AfL; and
workload implications of teacher
assessment and AfL.
ATL strongly believes that this proposed system
cannot exist alongside performance tables which already have a
pernicious effect on the current national testing system.
ATL's recommendations for action are for the
Government to do the following:
Review the current assessment system
with urgency in light of its impact on curriculum coverage and
on teaching and learning.
Investigate the purposes applied
to the present national assessment system.
Develop AfL pilots in schools exempt
from national testing during the pilot period.
Prioritise CPD for teachers in assessment,
particularly AfL techniques and strategies.
End the use of national testing as
market information and accountability mechanisms.
Explore options of cohort sampling
to meet national monitoring needs.
Work with awarding bodies to produce
a national bank of test materials as resources for teachers.
Abolish school performance league tables.
Explore alternative options to age-dependent levels.
Postpone national testing until a terminal stage.
1. ATL, as a leading education union, recognises
the link between education policy and our members' conditions
of employment. Our evidence-based policy making enables us to
campaign and negotiate from a position of strength. We champion
good practice and achieve better working lives for our members.
We help our members, as their careers develop, through first-rate
research, advice, information and legal support. Our 160,000 members
(teachers, lecturers, headteachers and support staff) are empowered
to get active locally and nationally. We are affiliated to the
TUC, and work with Government and employers through lobbying and
negotiation.
2. ATL has recently produced Subject
to Change: New Thinking on the Curriculum which questions
whether our current curriculum and assessment systems are fit
for purpose for the needs of society and our young people in the
21st century. This submission is based on these very arguments
and we strongly welcome this Inquiry into testing and assessment,
particularly around areas which challenge the efficacy of current
national arrangements such as Key Stage testing.
3. Our current pupil cohorts experience
many years of national assessment and testing; if you count foundation
stage assessment, a pupil who goes on to take A-levels will have
undergone national assessments and tests in seven of their 13
years of schooling. Yet prior to 1988, pupils faced only two external
national tests (GCSEs and A-Levels), and a system of
sample testing existed, which was overseen by the Assessment of Performance
Unit (APU). During that time, teachers had the power to design
and carry out assessment for pupils not yet undertaking GCSE or
A-level courses.
4. New arrangements for testing and league
tables, including the assessment of all pupils by statutory assessment
tasks and tests in core subjects at the ages of seven, 11 and
14 (at the end of Key Stages 1, 2 and 3 respectively), set up by
the 1988 Education Reform Act, have had, and continue to have,
a huge impact on the primary and early secondary curricula as
taught in schools.
5. 14-19 debates around curriculum and assessment
have often concentrated on the issues of GCSE and AS/A2 provision
with a resulting focus on the tensions between academic and vocational
qualifications and the demands of external examination processes.
The focus on difficulties of delivery has narrowed the debate
and future thinking. For example, the 14-19 Diplomas, currently
in development, started with a vision of integrating academic
and vocational strands but are increasingly mooted as a vocational-only
learning route, shaped by the requirements of every stakeholder bar
one: the learner.
6. The introduction of league tables of
school exam and national test results through legislation in the
1990s has had an enormous and detrimental impact on the national
testing regime in schools and has encouraged a
risk-averse culture there. By placing such emphasis on "standards"
as evinced through test results, league tables have encouraged
"teaching to the test" and the regurgitation by learners
of key "facts", leading to "surface" or "shallow"
learning.
7. These measures represent a significant
increase in the accountability to government of schools, teachers
and learners concerning their performance, creating an imbalance
between professional autonomy, professional judgement and accountability
where the latter has assumed a disproportionate part of the experience
of being a teacher.
8. What the current centrally run assessment
and testing system does give us is a large amount of data on pupil
attainment and school performance; indeed at times, this seems
to be its primary raison d'être. However, ATL questions
whether that data in itself is helpful or useful enough to offset
the detrimental effect it is widely acknowledged to have on the
teaching of the current curriculum. The Daugherty Assessment Review
Group in Wales, reviewing assessment arrangements at Key Stages
2 and 3, considered whether the "hard data . . . on pupil
attainments and the targets it gives some pupils to aspire to,
is of sufficient value to compensate for the evident impoverishment
of pupils' learning that is occurring at a critical stage in their
educational development".1 Their conclusion can be inferred
from their recommendation to the Welsh Assembly that statutory National
Curriculum testing of 11 year olds at Key Stage 2 and 14 year
olds at Key Stage 3 should be discontinued.
9. "While the concept of summative
assessment may be simple, the uses of data from summative assessment
are varied and the requirements of different uses make varying
demands in relation to reliability and validity of the assessment".2
As outlined above by the Assessment Reform Group,
the different uses of summative assessment data have a significant
impact on its rigour and its fitness for purpose. Newton (2006)
lists 18 current uses for this data:
| 1. Social evaluation | 7. Life choice | 13. Resource allocation |
| 2. Formative | 8. Qualification | 14. Organisational intervention |
| 3. Student monitoring | 9. Selection | 15. Programme evaluation |
| 4. Transfer | 10. Licensing | 16. System monitoring |
| 5. Placement | 11. School choice | 17. |
| 6. Diagnosis | 12. Institution monitoring | 18. National accounting3 |
10. ATL questions whether one system can be fit for all
these purposes. In terms of assessment, we understand validity
to be the extent to which any assessment succeeds in measuring
what it originally set out to measure. However, a plethora of
purposes means that in fact we are measuring many other things
in addition to the original focus of that assessment; for example,
the aggregation of pupils' grades into broad levels for the purposes
of monitoring pupils, schools and systems will impact on the formative
purpose of the assessment, making the outcome far less meaningful.
Swaffield (2003) relates this to the notion of consequential validity:
"This means that even a well-constructed test is not valid
if the results are used inappropriately, which moves the
idea of validity on from something which is the concern of test
writers to something which is the responsibility of everyone who
interprets and uses assessment results".4
11. ATL believes that clearer distinctions need to be
made between the respective uses and purposes of assessment. Other
countries' systems make this distinction clearer; strategies used
include those which combine teacher-led formative assessment with
the utilisation of a national bank of tests applied for summative
purposes when learners are ready. National monitoring needs are
met through a system of sampling pupils' performance (eg cohort
sampling), thus reducing the overall test burden whilst increasing
the relevance and breadth of the learner evidence. While there
is an economic advantage of collecting readily-available achievement
data, eg the results of end-of-Key-Stage tests, we will demonstrate,
throughout this submission, the lack of useful and relevant information
it provides. If monitoring were separated from the performance
of individual pupils, there would be no need for the central collection
of individual pupil assessment data. As the Assessment Reform
Group conclude, "this would remove the 'need' for high stakes
testing and would ensure that assessment (and, more importantly,
what is taught) was no longer restricted to what can be tested.
The continuation in several countries of regular surveys of small
random samples of pupils indicates the value of this approach".5
In addition to the US National Assessment of Educational Progress
(NAEP), there is New Zealand's National Education Monitoring Project
(NEMP) and, nearer to home, the Scottish Survey of Achievement (SSA).
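The statistical logic behind these sampling surveys can be sketched in a few lines of Python. This is an illustrative toy only: the cohort size, score scale and sample size below are assumptions chosen for the sketch, not real national figures.

```python
import random
import statistics

random.seed(1)

# Hypothetical national cohort: 600,000 pupils, each with a "true"
# attainment score (numbers are illustrative assumptions, not real data).
population = [random.gauss(100, 15) for _ in range(600_000)]

# Whole-cohort testing: every pupil sits the test.
full_mean = statistics.mean(population)

# Cohort sampling: only a small random sample sits the test.
sample = random.sample(population, 5_000)
sample_mean = statistics.mean(sample)

# Standard error of the sampled estimate of the national mean.
se = statistics.stdev(sample) / len(sample) ** 0.5

print(f"whole-cohort mean: {full_mean:.2f}")
print(f"sampled estimate:  {sample_mean:.2f} (95% CI +/- {1.96 * se:.2f})")
```

The point of the sketch is that a sample of a few thousand pupils estimates the national picture to within a fraction of a point, which is why a monitoring survey does not need every child to be tested.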
ACROSS THE UK AND INTERNATIONALLY
12. The Scottish Survey of Achievement could provide
a useful model for further investigation into restoring the place
of teachers to the heart of curriculum and assessment. From the
end of 2002-03, a new system of assessment in Scotland has been
introduced. Teachers there have been provided with an online bank
of assessment materials, based on the Scottish Survey of Achievement.
The aim of these tests is to confirm the teachers' assessments
of their pupils' attainment. These are to be administered to pupils
when teachers deem they are ready to take them, rather than at
a pre-determined time, making testing far more manageable within
the school system and less likely to distort teaching and learning.
Teachers have been supported in this process by the Assessment
is for Learning (AiFL) programme. This has not led to any lack
of accountability in the system; HMIE produce full reports on
schools, based around a set of 33 quality indicators in seven
key areas and the system strongly encourages schools to continually
self-evaluate and assess achievements using these quality indicators.
The Scottish Survey of Achievement also provides national figures,
thus offering a way of measuring national progress over time without
testing every child. The AiFL programme is being fully integrated
into the national assessment system. In England, Assessment for
Learning (AfL) still appears to be a separate strand from the
national testing system, rather than an integrated part of a coherent
whole.
13. International comparisons prove particularly interesting
when we constantly hear of rising standards. Indeed, test results
are improving, yet our international standing is falling in terms
of our place on international league tables as evidenced by trends
demonstrated in the PISA/OECD surveys. The UK's standing on international
league tables for 15 year olds has slipped; although the UK's
response rate to the 2003 PISA/OECD survey was too low to ensure
comparability, the mean score that was produced was far lower
than that achieved in the 2000 survey, leading to a fall in ranking
within the OECD countries alone, a drop compounded
by the inclusion of non-OECD countries within the survey.6
14. A central proposition of the introduction of the
national curriculum in 1988 was the entitlement of pupils to access
a broad and balanced curriculum. However, the amount of high-stakes
testing has had a well-documented narrowing effect on the curriculum,
undermining this entitlement for many pupils, particularly in
schools fearful of low scores on the league tables.
15. Narrowing curriculum and reducing flexibility
Webb and Vulliamy, carrying out research commissioned by
ATL, document this effect in the primary sector; the standards
agenda, through national curriculum testing in English, Maths
and Science at various key stages and related performance league
tables, "focused teachers' attention on curriculum coverage
in literacy, numeracy and science to the detriment of the rest
of the primary curriculum".7 However, it is not just teachers
and their representatives who are expressing this concern; Ofsted
state in their 2005 evaluation of the impact of the Primary National
Strategy in schools that the raising-standards agenda has been
the primary concern of most headteachers and subject leaders, coupled
with a far more cautious approach to promoting greater flexibility
within the curriculum. Ofsted also recognises the narrowing effect
which Key Stage 2 tests have on teaching of the curriculum, in
terms of time and also in terms of support for earlier year groups.8
16. Undermining the Every Child Matters agenda
The negative impact of current assessment mechanisms is not
only diluting the principles of the curriculum vision of 1988,
it is undermining the current Every Child Matters agenda.
The longitudinal PACE project in primary schools in England observed
that curriculum and testing pressures appeared to be "diminishing
the opportunities for teachers to work in a way that enables them
to `develop the whole child' and address the social concerns of
the wider society".9 The Assessment Reform Group notes the
lack of correlation between "the narrow range of learning
outcomes assessed by tests . . . with the broad view of learning
goals reflected in the DfES Every Child Matters policy
document".10 This tension at school level between narrow
standards and school goals of engendering pupil enjoyment and
creativity was strongly expressed by the headteachers who took
part in ATL's research by Webb and Vulliamy.
17. Impact on pupil attitude
And what effect does this "tension" have on our
pupils? A view across schools and colleges, observed by researchers,
is that pupils have become very utilitarian in their views of
what is "worthwhile to pursue"; Ecclestone and Hall
(1999) call this a " . . . strategic and cynical compliance
with assessment requirements" where passing tests is the
primary focus and learning is "marginalised".11 This
is hardly surprising when we consider the high-stakes purposes
of individual assessment data in our current system and the sheer
volume of assessment which each pupil will face. But there are
other pupils for whom such a utilitarian approach is not a possibility;
for lower-achieving pupils, research has shown that the experience
of frequently failing tests is demoralising, reducing self-esteem
and pupils' belief in their ability to succeed with other
tasks.12 Thus, the gap between higher and lower achieving pupils
widens, exacerbated by the fact that focus on test outcomes reduces
the levels of early identification of under-achievement and appropriate
interventions as noted by Ofsted in relation to the impact of
Key Stage 2 testing.13
18. Impact on education staff
ATL's members, teachers and support staff, along with pupils, are
bearing the brunt of the testing overload and the high-stakes
pressure. They are frustrated by the narrowing of the curriculum
and the need to ready pupils for ever-increasing numbers of tests.
This pressure drives many teachers to be complicit
with the "strategic and cynical compliance" of students
mentioned earlier and to be "presenters of content"
to ensure that their pupils succeed in the narrow focus of the
tests and that the school receives a good ranking on the performance
tables. This process is ultimately de-skilling; an enforced focus
on performance outcomes lessens and undermines richer assessment
skills and feedback, and will ultimately weaken these skills within
the profession.
19. Key Stage tests are effective in producing a vast
quantity of data on pupil performance as defined by the tests.
However, we have earlier addressed the issues of validity around
this data, particularly with regard to the myriad of uses to which
it is put. Research has shown that Key Stage tests lead to a narrowing
of curriculum, and within high-stakes frameworks which include
school performance league tables, to "teaching to the test"
and a destructive emphasis on testing rather than learning. To
further explore the question, it is necessary to address the following
issues which investigate this notion of their effectiveness.
20. Limited value of test result data for further stages
An issue with the testing system currently in use is the
limited value of its data for further stages of learning. The
evidence for this is particularly strong in the transition between
Key Stages 2 and 3. Many secondary schools carry out their own
testing of Year 7 pupils in the autumn term, "a considerable
duplication when pupils have already been assessed in most aspects
of the core subjects at the end of Key Stage 2". (Ofsted)14
It was also one of the main findings of the PPI survey, commissioned
by ACCAC in 2002, that secondary schools did not make extensive
use of the statutory assessment data available to them.15
21. Do they adequately reflect levels of performance in
children and schools, and changes in performance over time?
Many of the purposes of assessment data can be linked to
the standards agenda. Government is particularly concerned with
proving through that agenda that their emphasis on, and investment
in, education is resulting in rising standards over time. Pupils'
grades in national curriculum tests and exams are, indeed, improving
over time. However, Wiliam (2001) argues that any attempt to measure
standards of achievement over time is "doomed" as we
are not comparing like with like; what is taught in schools changes
even if the official curriculum does not. We have already observed
the evidence of growing focus on test-preparation and on teaching
those subjects, or indeed aspects of subjects, which are tested
to the detriment of untested aspects or subjects. Wiliam argues
that the idea of measuring standards over time "in any real
sense is nonsense" and that "while reported standards
may rise, actual level of achievement could be falling: tests
are no longer an adequate proxy for achievement across the whole
subject".16
22. It is particularly those purposes which add high-stakes
contexts to assessment that limit the value of achievement data.
Tests do not usually test the full range of what is taught and
in low-stakes contexts that limited range of achievement can indicate
achievement across the whole subject.17 Yet we know that once
assessment occurs within high-stakes contexts, there is pressure
on the school and the teacher to focus on the student's performance
on the aspects of the subject likely to be tested; within
an overburdened curriculum, those aspects will, inevitably, be
given more time. Any such concentration of resources will inevitably
mean that breadth, and indeed depth, of subject coverage will
be sacrificed to the relentless pressure of targets, standards,
tests and league tables. The purpose of assessment as an aid to
the development of learning is shunted into second place.
23. Harlen and Deakin-Crick (2003) concluded from their
research that current high-stakes testing does not provide valid
information about students' attainment: the narrow focus of tests,
and the consequences of being taught to the test, mean that many
students do not actually possess the skills or understanding
which the test is designed to assess. The focus of teaching in
this environment is to teach students to pass tests even where
they do not have the skills or understanding.18
24. Do they provide assessment for learning (enabling teachers
to concentrate on areas of a pupil's performance that need improvement)?
A definition of assessment for learning which centres around
its purpose and focus describes it thus; "assessment for
learning is any assessment for which the first priority in its
design and practice is to serve the purpose of promoting pupils'
learning. It thus differs from assessment designed primarily to
serve the purposes of accountability, or of ranking, or of certifying
competence".19 This definition demonstrates that Key Stage
tests with their current emphasis on ranking, certification and
accountability do not provide assessment for learning. Many good
teachers use an assessment for learning approach working with
learners to gather and interpret evidence to use to discover "where
the learners are in their learning, where they need to go and
how best to get there".20 However, the pressures of test
preparation and the importance of grade achievement have made
it a secondary or "add-on" practice in many schools.
25. Does testing help to improve levels of attainment?
Fall-off in pupil performance from Y6 to Y8 due to "hot housing"
We hear all the time that standards are improving; ATL questions
whether this means that our pupils are learning more and better.
Research would suggest otherwise. Durham University carried out
research, commissioned by the DfES, which noted the lack of evidence
to show that pupils reaching Level 4 at Key Stage 2 will retain
their learning, let alone progress to higher learning. They cite
a study by Watson (2002) which showed how a "level focus"
and booster classes temporarily raised pupils to mathematics Level
4 but that was not sustained over a period of six months to a
year. Not only were learning outcomes not sustained, but the Durham
University report also details how high-stakes assessment encourages
a more rigid teaching style which disadvantages and lowers the
self-esteem "of those who prefer more active and creative
ways of learning".21
26. The current system is perceived as a selection system
by pupils. The totemic importance of Level 4 at Key Stage 2 is
now so huge that pupils who fail to achieve it cannot be blamed
for feeling just that: failures. And we know that this is
just how many of them do feel. We also know the effect this has
on their future attitudes to learning. It is therefore no surprise
that there is a dip in performance between Year 6 and 7. The policy
of a differentiated offer post age-14 makes the Key Stage 3 tests
an even clearer selection mechanism, determining how pupils' "choice"
is to be "guided".
27. Are they effective in holding schools accountable for
their performance?
Whilst ATL must again question the notion of effectiveness
in this context, we acknowledge that Key Stage tests are "effective"
in holding schools accountable for aspects of their performance,
ie the performance of pupils in Key Stage tests. However, the
cost of this excessive accountability is high. An Ipsos MORI poll
in October 2006 found that the current target-driven culture was
one of the top factors to demotivate teachers. Also, Key Stage
tests are holding schools accountable for their performance across
only a part of the curriculum; we have already documented research
evidence around curriculum narrowing, the lack of sustainability
of learning into subsequent key stages, the negative impact on
attitudes towards learning amongst students and the lack of evidence
of real attainment across the whole subject or curriculum.
28. "Teaching to the test"the high-stakes
nature of test results leads to narrow teaching focus
Despite the earlier mentioned demotivating effects of working
within the current national assessment system, teachers are working
so that their pupils have the opportunity to succeed within those
same systems. There is strong evidence that rising test scores
are not caused by rising standards of achievement but are rather
the effect of growing familiarity amongst teachers and students
with test requirements; research shows that changes in the tests
are accompanied by a sudden fall in achievement, followed by a
rise as teachers begin "teaching to the new test".22
29. National curriculum tests and exams as assessment measures
National curriculum tests and exams have long struggled to
produce assessment instruments of high validity with optimum reliability;
coursework and teacher assessment are examples of attempts
to ensure greater validity. However, these were add-ons, expected
to fit in around the testing/examination system and thus were
compromised in value and in practice. We have already noted that
the limited coverage possible in tests combined with a high-stakes
environment has a corresponding curtailing effect on the taught
curriculum in schools. However, the format of the national tests
which are written tests of limited duration also "excludes
many of the higher-level cognitive and communication skills and
the ability to learn both independently and collaboratively".23
30. Proponents of exams cite their objectivity, an assertion
which needs to be briefly examined before we move on to a viable
alternative. Public examination grades are not exact measures;
they are approximate, with known margins of error. These grades
depend upon the judgements of examiners who, though very often
highly professional, skilled and experienced, are also fallible
human beings. Grades depend on snapshots of student performance
under very particular conditions, at a certain point of time and
in response to a certain set of assessment tasks. And e-assessment
will not remove these features: it may bring many advantages
of efficiency but "it won't by itself eliminate grade uncertainties".24
31. In addition, the needs of many of the assessment
purposes outlined in paragraph 9 for simplistic grades mean that
much useful information about actual performance is lost.
Sue Swaffield warns of the limitations of this data: "Summary
statistics are often used to compare individual pupils or schools.
In doing so, it is important to remember that any single score
or level could have been arrived at from a wide variety of individual
judgements, and so a level or grade gives no specific information
about a pupil's performance. Much more information is needed if
teachers in the next year group or school are to build upon pupils'
prior attainment".25 Furthermore, there is a danger that
we "fail to appreciate the impact of test unreliability"
(it is likely that the proportion of students awarded a level
higher or lower than they should be because of test unreliability
is at least 30% at KS2, for example) on the "reliability
of change scores for individuals"26 hindering diagnosis of
a learning problem, should one exist.
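The scale of the misclassification problem described above is easy to demonstrate with a toy simulation. The score distribution, error size and level boundaries below are assumptions chosen purely for illustration (they are not official KS2 figures), but any plausible values show the same qualitative effect: modest measurement error flips a substantial proportion of pupils across a level boundary.

```python
import random

random.seed(0)

# Illustrative model: each pupil has a true score, and the test
# observes that score with random measurement error added.
N = 100_000
TRUE_SD = 15.0               # assumed spread of true attainment
ERROR_SD = 8.0               # assumed test measurement error
BOUNDARIES = [85, 100, 115]  # assumed cut-scores between levels

def level(score):
    # Level awarded = number of boundaries at or below the score.
    return sum(score >= b for b in BOUNDARIES)

misclassified = 0
for _ in range(N):
    true = random.gauss(100, TRUE_SD)
    observed = true + random.gauss(0, ERROR_SD)
    if level(observed) != level(true):
        misclassified += 1

print(f"pupils awarded the wrong level: {misclassified / N:.0%}")
```

Because a level is a coarse cut of a noisy score, pupils near a boundary are assigned a level that is close to a coin toss, which is why single levels and year-on-year "change scores" are so unreliable for diagnosis.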
32. Standardised tests can also obfuscate the meaning
of pupil performance. For example, many tests offer multiple choice
options to the pupil, but these can confuse a reader who understood
the text perfectly but was confused by the similarity of the choices
offered, not by the text.27 Without the teacher there to
mediate, clarify and feed back the learning to the pupil, we, and
they, lose the meaning and, ultimately, it is the learner who loses
out.
ATL VISION FOR ASSESSMENT
33. Change at a fundamental level
ATL is arguing for a fundamental change in the assessment
system; it is not enough to hand over the administration of summative
assessment to teachers within a high stakes context and expect
real advances in pupil achievement and engagement. Otherwise we
are in danger of merely adding to teachers' workload with no real
gain in professional autonomy and no move to assessment
which puts learning first. This fundamental change means
that we are proposing assessment for learning as the primary method
of assessment throughout the career of pupils in a league-table
free environment that uses cohort sampling to provide data for
national monitoring purposes.
34. No national assessment system prior to terminal stage
Due to the here- and elsewhere-documented detrimental effect
of national curriculum testing on teaching and learning, ATL believes
that there should be no national assessment system prior to a
terminal stage. We believe that the present and future needs of
our society require an assessment system which focuses learners
on learning rather than tests, maintains the breadth which was
part of the vision of the National Curriculum in 1988 and which
encapsulates part of the current vision for Every Child Matters,
and which engages learners as participants in their learning and
its assessment.
35. It can be argued that a system which postpones summative
assessment at a national level fits within the earlier recommendations
of the Task Group on Assessment and Testing (TGAT). The original
vision of TGAT was for "an assessment system designed for
formative purposes" which "can meet all the needs of
national assessment at ages before 16 . . . only at 16 does it
seem appropriate for assessment components to be designed specifically
for summative purposes (paragraph 26)".28
36. International evidence now clearly links high pupil
achievement with systems which postpone national assessment and
selection. Finland's education system is a strong example of this
as it is one which has gained Finland a high (often first) place
on the OECD Programme for International Student Assessment (PISA)
surveys of 2000 and 2003, with top-ranking scores in mathematics,
problem solving, science and reading, and it defers national testing
until a terminal stage. In fact, not only did Finland's students
score highly in terms of performance and proficiency, but they
demonstrated positive attitudes towards learning as this excerpt
from the Executive Summary of the 2003 survey indicates: "For
example, more than half of the students in France and Japan report
that they get very tense when they have to do mathematics homework,
but only 7% of students in Finland and the Netherlands report
this. It is noteworthy that Finland and the Netherlands are also
two of the top performing countries".29
37. Focus on learning
Across subjects, there are two key sets of goals: learning with understanding (developing an understanding of concepts which can be applied in different contexts, identifying links between different situations, and applying the learning); and understanding learning (developing awareness of the process of learning).
ATL has argued, and indeed it is widely recognised, that "students
cannot learn in school everything they will need to know in adult
life" [OECD, 1999]30 and therefore, schools must provide
"the skills, understanding and desire needed for lifelong
learning". This means that we need to look critically at our assessment systems, which have a huge influence on what is taught in the classroom. As we have demonstrated earlier in this submission, our current assessment system produces "strategic and cynical" test-takers rather than engaged and questioning lifelong learners with the flexibility needed for a rapidly changing world.
38. Formative assessment, assessment for learning (AfL)
and personalised learning
ATL believes that assessment for learning principles and
practices should underpin teacher assessment in schools and colleges.
When assessment for learning (AfL) is put forward as a strong assessment model to support pupil learning and engagement, it is the formative aspects of assessment that are highlighted: evidence of pupil learning is used to identify learning needs and to adapt teaching accordingly to meet them. The education community is fortunate
to have an abundance of evidence to demonstrate the positive effects
of formative assessment, even within the current system. Black
et al (2002) answer the question, "Is there evidence
that improving formative assessment raises standards?"
with "an unequivocal yes, a conclusion based on a review,
by Black and Wiliam (1998a), of evidence published in over 250
articles by researchers from several countries. There have been
few initiatives in education with such a strong body of evidence
to support a claim to raise standards".31 They found that
an increased focus on using formative assessment as principle
and practice within the classroom produced gains in pupil achievement,
even when measured in narrow terms such as national curriculum
tests and examinations.
39. Research by the Assessment Reform Group endorses
this finding regarding the weight of evidence that assessment
for learning, with its formative assessment focus, has a positive
impact on summative results, citing a quarter to a half GCSE grade
improvement per student. However, their research does point to
the tension between assessment for learning and summative assessment, which clouds the "improvement" focus of AfL, leaving it subsumed by information about successes and failures.32 This argues for
ATL's proposition that assessment for learning becomes the norm
for teachers and pupils throughout the school careers of learners;
it cannot fully realise its potential and vision within a system
which has summative national tests and examinations at its core.
40. The advent of "personalised learning" has brought AfL to the fore. This is unsurprising
as assessment for learning is an approach which has the learning
needs of individual students at its heart and is one which involves
students far more directly in the assessment process. The DfES
rightly sees assessment for learning as a school-based, collaborative, whole-school enquiry, and yet this model cannot
fit within a high-stakes assessment system which adds huge time
and focus pressures to schools, creating a risk-averse school
culture and through league tables, pits school against school.
This is a fundamental flaw with the assessment for learning focus
within the Making Good Progress project, which will be hampered by its having to develop alongside more frequent national testing.
41. AfL requires a fundamental re-think in how we measure achievement
This will require a culture change in schools and indeed,
the wider community, about how we see achievement in schools.
Many pupils and their parents will see learning tasks as competitions,
achievement marked by a grade or a ranking within the class. One
of the key problems with this "win/lose" view is that
those who often lose no longer even try; better to switch off
rather than risk "failure". Teachers working with researchers
on formative assessment methods have found that "whilst pupils'
learning can be advanced by feedback through comments, the giving
of marks, or grades, has a negative effect in that pupils
ignore comments when marks are also given".33 Once grades
were removed, pupils concentrated on the feedback given by the
teacher and on how it could help them improve.
42. Research shows that grading and feedback have a big
impact on pupil motivation and resulting willingness to engage
in tasks and learning. Black et al detail key research
findings on these effects:
"Pupils told that feedback ' . . . will help you to learn' learn more than those told that 'how you do tells us how smart you are and what grades you'll get'; the difference
is greatest for low attainers (Newman & Schwager, 1995);
Those given feedback as marks are likely to see
it as a way of comparing themselves with others (ego-involvement),
those given only comments see it as helping them to improve (task-involvement):
the latter group out-performs the former (Butler, 1987); and
In a competitive system, low attainers attribute
their performance to lack of 'ability', high attainers to their
effort; in a task-oriented system, all attribute to effort, and
learning is improved, particularly amongst low attainers (Craven
et al, 1991)".34
This evidence shows that the returns for making this kind
of change to how we assess learning will be significant, particularly amongst those losing out under the current system.
43. Move away from age-dependent levels
Target-setting, within the standards agenda, has led to a
system of age-dependent levels. Again, researchers have argued
that these militate against learning through an erroneous and
demotivating belief about the nature of ability. Wiliam highlights
the work of Dweck and her colleagues on students' views of the
nature of ability and how that has a profound impact on how they
react to challenging tasks. Those who see ability as a fixed entity,
"how clever you are is how clever you stay" will tackle
a challenging task if they believe their chance of success is
high but will not engage if they believe that their chance of
success is low. Those who see ability as incremental will see
a challenging task as offering a chance to "get cleverer",
ie to improve ability. As Wiliam observes, "in order to optimise
the conditions for learning, it is therefore necessary for students
to believe that ability is incremental, rather than fixed. A system
of age-dependent levels would lead to a situation in which many
students would get the same grade or level at ages 7, 11 and 14,
thus potentially reinforcing a belief in ability as being fixed".35
44. Teacher-led assessment and the needs of a diverse school population
Our current curriculum and assessment models are based on
the idea of "homogeneous knowledge to be owned by all".
Shohamy (2000) observes this emphasis on homogeneous knowledge:
"This is even more apparent in educational assessment. In
a number of situations there is a gap between curricula and assessment
as curricula may, at times, contain statements and intentions
for the recognition of diverse knowledge, yet the tests are based
on homogeneous knowledge".36 It is not possible to de-contextualise
assessment, but ATL believes that local teacher-led assessment
makes it possible to minimise the use of contexts which will have
a detrimental effect on pupils' opportunities for achievement.
45. ATL believes that a fair assessment system is one
which "elicit[s] an individual's best performance", and Gipps details the conditions that must be in place in assessment tasks or tests for this to occur: "This involves tasks that are concrete and within the experience of the pupil (an equal access issue), presented clearly (the pupil must understand what is required of her if she is to perform well), relevant to the current concerns of the pupil (to engender motivation and engagement), and in conditions that are not threatening (to reduce stress and enhance performance) (Gipps, 1994). This is where teacher assessment
can be more equitable since it is under the teacher's control
(Gipps, 1994)".37 Teachers are best placed to ensure that these conditions are met, and teacher assessment is therefore the method through which pupils have the opportunity to achieve to the best of their ability.
46. Popular concerns regarding teacher bias: the evidence
ATL acknowledges, in proposing a teacher-assessment focus
using AfL, that there is a hurdle to be tackled in perceptions
about teachers' assessments. Harlen (2004) documents the "widespread
assumptions that teachers' assessments are unreliable and subject
to bias, despite their use in some countries as a main feature
of national and state systems".38 But Harlen goes on to propose ways in which that unreliability can be addressed: through provision
of training around identification and understanding of assessment
criteria by teachers and training which highlights sources of
potential bias, as revealed through research.39 Studies in Australia
have shown that finer specification of criteria, describing progressive
levels of competency, can lead to increased reliability of teacher
assessment using assessment evidence from the full range of classroom work.
47. The extent of the evidence base for this perception regarding
unreliability and bias is open to challenge. Harlen (2004) highlights
a key concern with the process through which such a judgement
has been reached in the past: "It should be noted that much
of the evidence of bias in teachers' assessment comes mainly from
studies where TA is compared with another measure and based on
the questionable assumption that the benchmark measure is unbiased
and is measuring the same thing as the teachers' assessment. So,
whilst it has been reported that teachers under-rate boys more
than girls in mathematics and science as compared with their performance
in tests (Reeves et al, 2001), the conclusion might equally
be that boys perform above expectation on mathematics and science
tests".40 Researchers have concluded that TA is prone to
bias due to systematic variations between TA and standards task/test
performance judgements, based on the assumption that the latter
measures are unbiased. Yet bias in terms of gender, first language
and SEN has also been found in the results of these standard tasks
and tests so their original conclusion must be called into question.
However, as we propose that teacher assessment, through assessment
for learning, should be the only form of assessment throughout
pupils' school careers, we acknowledge that bias and its effects
must be a key part of training for teachers, so that non-relevant assessment factors such as pupil behaviour and gender are recognised as potential sources of bias and influence, and guarded against by teachers and moderators. The bias introduced by unfamiliar contexts is a risk in national standard tasks and tests, and one which lessens with teacher assessment.
48. Resource needs of AfL
Literature and research around assessment for learning yield
a rich source of support, information and advice to teachers,
through research observations, case studies and exemplifications
of good practice. Much of this relates to involving pupils far more consciously in their own learning, combining subject-focused skill learning with the learning of cognitive skills. Teachers have access to examples of AfL techniques
such as comment-only marking, peer and self-assessment, open questions
that engage pupils and the promotion by the teacher of the liberating
notion that wrong answers can be as useful as right answers for learning,
particularly with the exploration of ideas and concepts.
49. It is crucial that teachers are supported by training
and resources. These resources can include exemplifications, concrete
examples of good practice, diagnostic instruments, even task banks.
Possibly most important is the need for teachers to have space
and time to collaborate to share examples of positive classroom
experience (or perhaps examples of where/when things did not go
so well), growing experience leading to fluency and efficiency
with methods and to exploration of new ways of working with students.
Students who are skilled and equipped to be self- and peer-assessors
can check straightforward tasks. Sensitive and robust moderation
procedures are a key part of this vision and here we can envisage
a role for LEAs, consortia, clusters or networks of schools. Indeed, each school needs to be an assessment community where assessment is at the heart of each pupil's, each class's and each teacher's work.
50. Workload implications of AfL
ATL is aware of the implications of this proposed assessment
system in terms of new demands and workload. However, ATL believes
that workload is not merely an issue of work level; it is also
an issue of responsibility, autonomy, and professional satisfaction.
It is important to remember that teachers already spend a large
proportion of their time on assessment. Saving half of that time
by removing or reducing the burden of national testing would more
than compensate for the extra time needed for the embedding of
assessment for learning practices and the process of moderation
which is a vital component of it.
51. Performance tables
Assessment for learning does not lend itself to the narrow
forms of data which currently feed performance league tables and
ATL wishes to make it clear that the system which we have outlined
would be negatively impacted by the continuation of these instruments
of high-stakes pressure, particularly on schools and LEAs. Any
such focus on narrow, hard data will undermine the learning focus
of schools, and inevitably some schools will succumb to the pressure
to conform to the rigid measures of the standards agenda. League
tables also undercut any notion of collaboration between schools
and yet any system which hopes to offer full and broad curricula
and personalised learning needs to promote cost-effective ways
for schools to meet those needs through the sharing of resources,
expertise and knowledge. This is not a form of accountability which promotes equitable access of opportunity to all, and ATL has no hesitation in calling for its abolition: there are other, far more meaningful, forms of accountability and of school evaluation.
52. ATL's vision is for a system where assessment is
a key part of learning, a central activity for teacher and pupil
within a low-stakes context which does not create a culture of
competition in which "losers" become demotivated or
disengaged, and in which teachers become empowered and further skilled.
53. ATL calls on the Government to:
Review the current assessment system with urgency
in light of its impact on curriculum coverage and on teaching and learning.
Investigate the purposes applied to the present
national assessment system.
Develop AfL pilots in schools exempt from national
testing during the pilot period.
Prioritise CPD for teachers in assessment, particularly
AfL techniques and strategies.
End the use of national testing as market information
and accountability mechanisms.
Explore options of cohort sampling to meet national monitoring needs.
Work with awarding bodies to produce a national
bank of test materials as resources for teachers.
Abolish school performance league tables.
Explore alternative options to age-dependent levels.
Postpone national testing until a terminal stage.
1. Daugherty Assessment Review Group (May 2004) Learning
pathways through statutory assessment: Key Stages 2 and 3. Final Report.
2. Assessment Reform Group Assessment Systems for the
Future, Working Paper 1: Aims and outcomes of the first year's
work of the project, Draft 10 (2006).
3. Newton, PE Clarifying the purposes of educational assessment
draft June 2006.
4. Swaffield, S (2003) Assessment Literacy for Wise Decisions
Association of Teachers and Lecturers, London.
5. Assessment Reform Group (2006) The role of teachers
in the assessment of learning London.
6. Organisation for Economic Co-operation & Development
(OECD) Learning for tomorrow's world: First results from PISA
2003 France 2004.
7. Webb, R and Vulliamy, G (2006) Coming full circle: The
impact of New Labour's education policies on primary school teachers'
work, Association of Teachers and Lecturers.
8. Ofsted (2005) Primary National Strategy: An evaluation
of its impact in primary schools 2004-05.
9. Pollard, A, Triggs, P, Broadfoot, P, McNess, E, and
Osborne, M. (2000) What pupils say: changing policy and practice
in primary education, London, Continuum.
10. Assessment Reform Group The role of teachers . . .
11. Wilmut, J (2004) Experiences of summative teacher
assessment in the UK.
12. Assessment Reform Group, ibid.
13. Ofsted (2005), ibid.
14. Ofsted (2002) Changing schools: effectiveness
of transfer arrangements at age 11: an evaluation by Ofsted.
15. Daugherty Assessment Review Group, ibid.
16. Wiliam, D (2001) Level best? Levels of attainment in
national curriculum assessment, Association of Teachers and Lecturers, London.
17. Wiliam, ibid.
18. Harlen, W & Deakin-Crick, R (2003) Testing and
motivation for Learning Assessment in Education, 10.2 169-208.
19. Black, P, Harrison, C, Lee, C, Marshall, B, Wiliam, D
(2002) Working Inside the Black Box Dept of Education and
Professional Studies, King's College, London.
20. Assessment Reform Group Role of teachers . . .
21. Beverton, S, Harris, T, Gallannaugh, F and Galloway,
D, (2005) Teaching approaches to promote consistent Level
4 performance in Key Stage 2 English and mathematics University
of Durham, School of Education.
22. Harlen, W (2004) Can assessment by teachers be a dependable
option for summative purposes? Published in Perspectives
on pupil assessment GTC, London.
23. Assessment Reform Group (2006) ibid, The role of teachers . . .
24. Murphy, R (2004) Grades of uncertainty: Reviewing
the uses and misuses of examination results Association of Teachers
and Lecturers, London.
25. Swaffield, ibid.
26. Wiliam, ibid.
27. Swaffield, ibid.
28. National Curriculum Task Group on Assessment and Testing.
A report 1988, London: Department of Education and Science.
29. Organisation for Economic Co-operation & Development
Learning for tomorrow's world: First results from PISA 2003:
Executive Summary France 2004.
30. Assessment Reform Group Role of teachers . . .
31. P Black, C Harrison, C Lee, B Marshall and D Wiliam,
(2002) Working inside the Black Box Dept of Education &
Professional Studies, King's College, London.
32. GTC (2004) The role of teacher in pupil assessment
Perspectives on Pupil Assessment GTC, London, November 2004.
33. P Black, C Harrison, C Lee, B Marshall and D Wiliam, (2002)
Working inside the Black Box Dept of Education & Professional
Studies, King's College, London.
35. Wiliam, ibid.
36. C Gipps & G Stobart (2004) Fairness in Assessment
Perspectives on Pupil Assessment GTC, London, November 2004.
38. Harlen, W (2004) Can assessment by teachers be a dependable
option for summative purposes? Published in Perspectives
on pupil assessment GTC, London.
39. GTC (2004) The role of teacher in pupil assessment
Perspectives on Pupil Assessment GTC, London, November 2004.
40. Harlen, W (2004) ibid.