APPENDIX 3
Submission by Ofsted to the Tomlinson
inquiry (QCA 25)
A. SUMMARY OF MAIN FINDINGS AND PROPOSALS:
The introduction in September 2000
of the new AS/A2 structure, as part of "Curriculum 2000",
has features which have been widely welcomed in many schools and
colleges. However, its rapid implementation created difficulties
initially with specifications and standards, as well as with the
workload demands on teachers and students, and the assessment
regime associated with it remains burdensome and volatile.
QCA, working with the awarding bodies,
has put in great efforts, but has not succeeded in providing an
adequate system of quality assurance to give national confidence
about the value and consistency of awards.
The roles and operations of QCA have
risked being insufficiently sharply focused for it to be fully
successful as a regulator.
Much more work would be needed to
ensure consistency and comparability across awarding bodies, and
a single national examining body should be considered seriously.
A review of the quality of examiners
and of their recruitment, terms and conditions and remuneration,
as well as of the timing of awards, is a matter of urgency.
Ofsted has the potential to contribute
far more strongly to the setting and maintenance of standards,
drawing on its subject expertise and knowledge of schools and
colleges.
B. COMMENTARY:
The comments below, drawn from all of our
sources of evidence, including more informal intelligence-gathering,
are offered under three topics:
(i) The AS/A2 structure;
(ii) The role of QCA;
(iii) The examination groups.
A digest of key points from inspection is attached
as an annex, together with a note about the sources of Ofsted
evidence on which we have drawn.
(i) The AS/A2 Structure
1. In reporting on the first year of implementation,
HMI pointed to a number of positive features, and these have become
more firmly embedded in the second year. Nevertheless, some of
the problematic inbuilt design features remain a significant cause
for concern. It is undeniable that students face an ever more
exacting schedule of assessment, and that the character of Year
12 has changed dramatically. These changes have had beneficial
effects in concentrating teachers' and students' minds and giving
a real sense of purpose, and have broadly maintained the rigour
and depth expected for advanced study. However, evidence from
our survey and other subject inspections suggests that they have
also on occasion narrowed the students' range of knowledge and
experience within subjects, while not always succeeding in broadening
coverage of the areas of the curriculum through the choice of
a range of contrasting AS courses.
2. The weaknesses in the assessment structure
have been rehearsed at length elsewhere, as has the fact that these
are only in part weaknesses of control. There is an inherent
self-contradiction in the new AS - partially masked in "old"
AS. The standard either is that of a full A Level (in which case
it is often too high for Year 12) or it isn't, in which case it
cannot be right to allow Year 13 students to "improve"
their performance. A statistical change to make AS weighted at,
say, 40% of the full A Level would help, but it would still be
subject to objection. The weighting attached to coursework is
another serious concern. It is a concern not least because of
the risk, in these IT-dominated times, of much re-drafting and
possible cheating. But, in addition, analyses of the spread of
candidates' results have often indicated that coursework marks
and grades can be significantly inflated when compared with students'
unaided work under controlled conditions.
3. One of the instantly glaring anomalies
of the results this year is that candidates were being given a
"U" grade for coursework - something hitherto virtually
unheard of, let alone where candidates were securing A grades
on written papers. These anomalous results were palpably indefensible
and should have been spotted.
4. Modules compound the intrinsic problems
over maintaining standards. In some subjects and some parts of
subjects, specialists feel that it may be perfectly proper to
"sign off" students' achievements before the end of
the course. But there are many others where it is not, because
of the importance of gradual maturation and skills development,
and where both curriculum and assessment standards are potentially
distorted by early completion. All of these factors may contribute
to variability in assessed standards, with the risk that grades
no longer conform to previously accepted levels of achievement;
however, it is also of concern that the lack of clarity over the
expected standard makes it so difficult to determine whether they
do so or not.
5. The AS/A2 structure has the potential
for ending up as something of a compromise, failing for too many
students to achieve either breadth or depth in a satisfying and
sensible way, and not clearly representing improvements in quality
and the safeguarding of standards for all. However, many individual
institutions and students have certainly appreciated the greater
range and tighter structure, as our evidence from surveying "Curriculum
2000" makes clear. Some have suggested various forms of baccalaureate
approach as an alternative. We would argue that there are good
reasons for not moving from AS/A2 to such an approach in the short
term, partly on pragmatic grounds: there has been so much turbulence
in the system that a further radical overhaul is the last thing
that is wanted. In the medium term, however, a case for such an
approach (with strands such as humanities, languages, science,
economics, technology, performing arts) could be made. The structure
of higher and standard level subjects within a coherent and intrinsically
broad curricular framework has been found successful in schools
which have adopted the International Baccalaureate, and HMI inspection
has commented on the high quality of much of the work of candidates
preparing for this qualification. But the demands of such courses
are high, and their assessment systems are by no means proof against
criticism. Meanwhile, the best way to progress is to eradicate
the most glaring weaknesses of the current system (overweighting
AS; retaking modules; excessive reliance on coursework) with a
sufficiently rigorous system of quality assurance.
(ii) The Role of the QCA
6. Events over the past two years particularly
have given widespread credence to the view that QCA has failed
to act as a firm regulator of the system and of the work of the
examining groups. In many ways, the QCA has had a hard set of
challenges, and its staff deserve enormous credit for the way
in which they have sought to cope with the range of initiatives
and new tests and qualifications. But its roles have been too
varied, its teeth too few and its management and managers not
always able to provide a constant level of leadership, partly
because of frequent changes at Chief Executive level. There has
also been much uncertainty about its relationship with its parent
department, the DfES, a matter which requires urgent resolution.
7. The QCA has also been too much involved
in evaluating its own advice or policies, in a way that can lead
to defensiveness and a lack of transparency. In consequence, it
has failed to exercise effective quality controls on the awarding
bodies. The reasons for this are complex. They relate in part
to the weaknesses in the powers which QCA was formally given,
as QCA officers have, reasonably, pointed out. However, there
has also appeared to be a lack of resolution, even within the
powers it has had, in taking decisive action against anomalous
or inconsistent actions on the part of individual awarding bodies.
With its recent lack of involvement with awarding procedures,
Ofsted has no direct, first-hand or up-to-date evidence on this,
but the evidence from our specialist advisers, and from their
contacts in the system, indicates that the scrutinising procedures
adopted have been rather variable in effectiveness. They have
frequently been revamped, but currently, the QCA does not always
have a strong presence in the very processes most critical in
determining standards (awarding, standardising and borderlining
meetings, together with those intra-Board procedures which follow
these, which is where statistical overlays are applied to the
examiners' professional judgements). Furthermore, our subject
monitoring suggests that the new scrutinies have often not been
staffed or managed in such a way as to ensure quality; and the
reports, while engaging with key issues, are at times too anodyne
and lacking in decisive effect.
8. For these reasons, it is evident that
if QCA is to be an effective regulator it needs strong management,
clear powers and a real commitment to setting and monitoring standards.
All of these can, to a large extent, be addressed internally,
and it is already apparent that the new Chief Executive has them
on his agenda. However, it remains the case that QCA can be seen
as too complicit in the very weaknesses that need addressing:
intrinsically part of the problem, not of the solution. Moreover,
although it has much expertise, it inevitably lacks the kind of
perspective on standards in the field possessed by Ofsted.
9. Hence the arguments for a possible development
of Ofsted's remit on these matters seem strong ones. In particular,
the risk of having "standards" apparently "guarded"
by two largely separate arms of government is a real one; and
while in principle the two roles can be seen as mutually supportive
and complementary, in practice this is an unhealthy schism which
can erode confidence and generate uncertainty. Reporting against
nationally assessed standards is crucial to Ofsted's role. There
are currently significant doubts about the extent of grade drift
and about the value of an A at A level or a C at GCSE. A firm
fix on what these grades mean is needed, and at present it is
lacking. Ofsted is, because of its remit, rights of access and
expertise, uniquely well placed to contribute to the independent
review of examining standards for which this year's events have
simply underlined the long pressing need.
10. This argument in no way reflects a belief
that Ofsted should usurp what are properly the functions of others,
but it is born of a strong desire to work in effective partnership
with them. We would suggest that Ofsted's role should be focused
essentially on issues connected with monitoring standards at all
stages: in assessing and reporting on standards of syllabus construction,
of setting questions and writing mark schemes, and of awarding
and grading procedures. This development should encompass a wider
exploration of how Ofsted employs its specialist expertise (eg
through HMI subject advisers) in relation to QCA and its working
groups. A key principle should be the importance of integrating
the evidence of standards and quality provided by inspection and
that emerging from assessments. There is scope for further joint
quality assurance work between Ofsted and the QCA, to evaluate
independently both the standards achieved at the various grades
and the reliability and validity of marking and awarding. The
aftermath of the quinquennial review of the QCA provides a good
opportunity to analyse functions in a coherent and systematic
way, and also to ensure that Ofsted is not excluded from access
to processes which are of the utmost importance in determining
standards. There are also important matters about the role of
teachers in assessment procedures, with scope for exploring more
widespread and planned development of teachers' professional skills
through experience of examining. In summary, therefore, our case
is that:
Ofsted's annual reporting on standards
is strongly inter-dependent with the outcomes of testing and examination
regimes. Unless Ofsted can have complete confidence in the reliability
of those data, a key element of Ofsted's benchmarking is lost.
Closer integration of the scrutiny
of standards which occurs within inspection and that which relates
to external assessment procedures would be possible - with Ofsted's
involvement in the latter.
The links between standards, curriculum,
assessment and pedagogy are so important that there would be advantages
in having a body with the capacity to offer a clear overview of
these interlocking elements.
Ofsted has proved itself successful
in delivering high quality advice on standards, draws on the long
experience of HMI in thinking and writing about the curriculum,
and has a large number of high-level subject specialist HMI who
could valuably be involved more fully in monitoring standards
of assessment.
11. Based on the above analysis, we propose
the following specific areas of work where Ofsted might usefully
become involved:
independent inspections, leading
to public report, on the work of individual awarding bodies;
within or in addition to such inspections,
scrutinies and reports on standardising and awarding procedures
for particular qualifications;
checks on year-on-year consistency
in awarding standards, looking in particular at the effects on
such features as: the level of questions; the effect of changes
to assessment procedures; the relationship between course-assessed
elements and terminal tests; and objective evidence of performance
in basic skills elements (eg written and computational accuracy);
and
evaluation of particular stages/facets
of the curriculum, perhaps leading more broadly to a more formal
locus in advice on curricular matters.
12. To enhance Ofsted's work in this way
would be an evolutionary development, not a radical break with
the past. For many years, both while HMI worked more directly
with the DfES and in the early period of Ofsted's history, it
was standard practice for HMI to attend subject meetings of the
examination boards and hence to scrutinise scripts. However, in
recent years that traditional role has fallen into disuse, not
least because QCA's roles differed in significant respects from
those of its predecessor bodies and because, in consequence of
this, it has set up its own quality arm. Still more recently,
there has been a keen desire on both sides for Ofsted to work
more closely again with the QCA.
(iii) Examination Groups
13. Events in the last two years have demonstrated
that the examining groups are currently not always successful
at self-regulation and that they are subject to inadequate external
controls. This is in no way to minimise the extraordinary job
the three groups have, in many respects, done to cope with change,
keep the system going and meet exacting deadlines and new requirements.
However, the system has creaked and groaned with every innovation
and additional assessment load. Structural weaknesses have been
evident in the examining system, and the questions raised may
affect every level, from the competence of individual examiners,
to the quality of administration and to the whole operation of
grade determination. Nor are problems confined to the general
awards: weaknesses over vocationally-related certification have
been recorded by HMI and others over a number of years.
14. Various suggestions have been proposed,
many of which miss the central point, which is simply one of consistency
and credibility. Any extension of the "free market"
approach is fraught with potential problems, if the key aim is
to achieve sufficient consistency of standards and practice. However,
it is right to continue to pose the question "three or one?"
since the justification for having competition among three boards,
setting syllabuses and examinations to a single national framework
and intending to offer awards which are nationally comparable,
is inherently weak. The temptations in the system (such as providing
the "easiest" - or "hardest" - examinations)
are obvious. A single examining board for all general awards would
be a leviathan and a monopoly; it would be likely to reduce choice
and risk over-centralisation, and might be exceptionally demanding
to manage. However, to many it has an inexorable logic, given
the weight attached to these awards, eg by higher education. The
evidence from comparability studies done over a number of years
is anything but reassuring: the reduction of boards has perhaps
limited the extent of inter-board variability, but Ofsted's subject
evidence shows that this still continues. In addition, "subject
pairs" analysis has exposed that there are not just hard
boards and easy boards, but hard subjects and easy subjects and
hard syllabuses, and options within them, and easy ones: hence
we are nowhere near a world where the standard of an A grade can
be assumed to be constant - across all subjects, all examination
groups and all strands of the assessment process.
15. An added complexity is that of the Key
Stage tests, where the Quinquennial Review had some important
things to say about the QCA's role. One possible course would
be to have a body responsible for all 5-16 National Curriculum
testing (KS 1-4), or all Level One and Two awards, and a separate
body responsible for all further education and sixth-form examinations
at Level Three, whether general or vocational. (This might also
have the effect of helping to develop an integrated structure
at that level, to counteract some of the current uncertainties
over parity of esteem and flaws in vocational assessment, and
would make even better sense if a baccalaureate approach were
to be developed.)
C. KEY ISSUES FOR THE FUTURE
16. Whichever structure is adopted, some
of the quality issues will not wait:
Well-grounded research into "standards
over time" is urgently needed: when Ofsted sought to undertake
this work with QCA, its efforts were bedevilled by the lack of
adequate archive scripts; now that, at least for recent years,
these exist, a proper scrutiny should be possible of standards
achieved by candidates under the pre-2001 system and those in
new AS/A2 arrangements.
A full in-depth study of awarding
procedures is surely a matter of urgency. The evidence is now
in the open that statistical "interference" with examiners'
assessments is common practice, almost certainly exceeding the - very
permissive - bounds tolerated by the examination Code of Practice,
but we still have seen only the tip of the iceberg. This study
would need to encompass the processes of "borderlining".
A review of the qualifications, training
and assessment of examiners - coupled with an analysis of
the remuneration, timing and conditions under which examiners
work - would test fully the vulnerability of the current system.
It is likely that a re-phasing of examining and marking timetables,
to reduce June and July congestion and even to produce "post-award"
offers for higher education, would have considerable benefits.
The proposal to increase the regular
professional engagement of practising teachers in the process
has much to commend it. However, exploring this option should
take place with a recognition that extending examining competence
so widely across the teaching profession is far from being a simple
matter: there is much evidence that not all teachers' own assessments
currently within the system (in KS1-3 or in GCSE/AL coursework,
for example) are completely reliable. Especially in those subjects
where examining is essentially a matter of judgement against the
criteria, rather than marking points right or wrong, the degree
of challenge in securing consistency and quality assurance should
not be under-estimated.
A system of regular independent published
reports, with teeth, from the subject-based scrutinies of GCSE
and A Level would do much to strengthen quality assurance. As
noted above, Ofsted would be well placed to produce such reports.
A central place for Ofsted in all
aspects of assessment procedures, making full use of inspection
evidence, would ensure the necessary link between evaluations
of standards in schools and colleges and those in the awarding
systems.
October 2002
Annex A
SOURCES OF OFSTED EVIDENCE:
Subject monitoring by HMI, especially
through the Curriculum Advice and Inspection Division (CAID) and
the work of Specialist Advisors (SAs) and other specialist HMI.
Inspections of schools (section 10)
and colleges (Learning and Skills Act 2000) and of other parts
of Ofsted's remit.
HMI surveys, especially those on
the implementation of Curriculum 2000 - leading to a published
report (in production) on the second year of implementation.
HMCI's Annual Report: that for 2000-01,
published in February 2002, summarised key points on the first
inspections of the new AS examinations.
Ofsted's advice to DfES on the 14-19
Green Paper (June 2002).
Ofsted's oral evidence to the QCA
Quinquennial Review.
Close and regular contact between
Ofsted and the QCA, through meetings at Chief Executive/Inspector
level and other levels in the organisation and the presence of
an Ofsted observer at QCA board meetings.
Correspondence between Ofsted and
QCA, and Ofsted and DfES, on matters of common concern.
POINTS FROM INSPECTION EVIDENCE:
The following series of points is offered as
a summary of issues to emerge from Ofsted's evidence:
CURRICULUM 2000 (YEAR ONE) - ANNUAL REPORT AND OTHER EVIDENCE:
1. New AS course specifications for Curriculum
2000 were generally well devised; however, in some subjects, the
level was insecure and varied excessively between units.
2. The requirements of internal and external
assessment procedures were excessive for both students and teachers;
the use of assessment data to set students learning targets and
monitor their progress was patchy.
3. Students were generally well motivated,
but there was a perceptible decline in enthusiasm as the year
progressed and the pressures became more evident.
4. Students were subject to excessive, relentless
assessment, which put unreasonable pressure and constraints on
Year 12.
5. Technical problems over the assessment
arrangements were substantial and resulted in a loss of confidence
in the system.
6. Timetabling difficulties were at times
formidable, leading to administrative problems for Centres and
demanding schedules for students.
7. Difficulties over IT exacerbated an already
difficult system, for example in developing the key skills assessments.
8. Awarding bodies were under mounting pressure
over the supply of examiners and other assessors.
9. The impact on numbers taking so-called
"minority subjects" was variable.
10. There was sometimes a narrowing of teaching
approaches, both in content and method, at the expense of students'
independence of learning and development of study and research
skills.
11. Teaching was often initially rather
uncertain, with doubts over the coverage requirements or on the
new specifications.
12. Key skills had only rarely had a positive,
discernible impact in schools on the quality of teaching.
13. A substantial investment in staff development
(notably in further education) often improved quality markedly,
not least in relation to key skills.
14. There was much evidence of appreciable
lengthening of the teaching week and of heavier programmes for
students.
15. The compression of programmes at times
crowded out the development of the habits and attitudes of scholarship.
CURRICULUM 2000 (YEAR TWO)
1. The difficulties of implementation observed
in the first year of this inspection were to some extent overcome
in the second.
2. Curriculum 2000 had been incorporated
into the work of schools and colleges, with considerable difficulty,
but without the loss of the rigour and depth traditionally associated
with advanced study.
3. Teachers' confidence in teaching the
new specifications grew considerably, though further support and
training were still needed.
4. In the schools and colleges visited,
the work seen improved over the two years of this inspection.
5. Teaching was almost always expert, well-planned
and enthusiastic, and given greater clarity of focus by the quality
of the A2 specifications, which were found to be helpful and supportive.
6. Many teachers still felt that they had
little opportunity to go beyond the immediate demands of the specifications.
7. Despite the time teachers and students
spent completing assessments, use of the results of assessment
to set learning targets and to monitor progress remained patchy.
8. Standards of achievement in the schools
and colleges inspected remained high, and had in some respects
risen over the two years of the inspection.
9. Most students were addressing successfully
the additional demands of A2 courses, and were developing at a
high level the skills of analysis, critical thinking and evaluation
of information, as appropriate to the subjects studied.
10. There was some evidence in the institutions
inspected of a broadening of the range of subjects offered.
11. Colleges in particular had seen an increase
in the numbers of students opting for subjects such as information
technology, psychology, media studies and art.
12. Because of increased numbers overall,
the retention of subjects, such as some languages, which attracted
relatively few takers, was often possible.
13. The impact on the curriculum as experienced
by the individual student was often modest.
14. Students, especially in schools, were
much less well-informed about training and employment routes than
about academic and vocational options in schools and colleges.
15. Generally, too, post-16 institutions,
particularly schools, were insufficiently responsive to the views
and needs of employers.
SUBJECT MONITORING
1. Modular arrangements in some subjects
were seen to sit very uneasily with the desire to "maintain
standards".
2. Candidates were often retaking AS modules
later in the course, and with the benefit of significant maturation,
so that their grade profile in advance of taking A2s could be
raised.
3. Candidates were occasionally retaking
modules when they already had high grades (including, in business
studies, candidates with grade A at AS).
4. In order to maintain standards, awarding
bodies appeared to have resorted to statistical manipulation.
In the past, under the Code of Practice, awarding panels were
required to take account of statistical information after they
had set provisional grade boundaries. This meant that judgmental
awarding was informed by the overall statistics, and significant
changes in grade distributions had to be justified. This was perhaps
more difficult this year as examiners were working in a new context.
5. With regard to this year's awards, these
processes perhaps explained the eccentric patterns of attainment.
In the "new" system the moderated module grades had
been declared to schools, as had the AS grades by the time the
AL awarding took place. Inevitably, any adjustment would therefore
fall disproportionately upon the remaining components, usually
A2 coursework and the terminal synoptic paper. Thus some candidates,
for example, had C/D adjusted to U in these components although
their overall grade shifted less.
6. In subjects where modules were newly
introduced, there were concerns. For example, in history there
was a danger of "pick and mix" incoherence or of a focus
on particular periods, such as Europe of the Dictators. In art,
there was a view that the demise of the more "open-ended"
Year 12 course had narrowed the students' experience, inhibiting
experimental approaches.
7. The synoptic papers were an aspect of
the A2 which suffered from the outset from unclear definition.
In history, for example, different awarding bodies interpreted
the synoptic requirements in different ways. The role
of specifications in defining "synoptic", and of papers in
carrying this forward, would merit early review.
8. There was evidence to suggest that the
scrutiny process was still not robust. Before the Code, scrutiny
was by peer review, chaired by the relevant professional officer.
Currently, scrutiny teams had membership from outside the normal
pool of chief examiners, but as a consequence could lack experience.
9. The gravity of unresolved comparability
issues among the examining groups was illustrated by the inexplicable
differences in proportions of candidates reaching particular grade
boundaries. In 2001 D&T, for example, the variations were
very wide:
Group | Percentage A | Percentage A-E
AQA | 13.2 | 89.6
Edexcel | 2.3 | 74.5
OCR | 16.3 | 90.1
10. The proliferation of examinations had exacerbated
the difficulty in getting sufficient markers and moderators. For
example, this summer Edexcel used student teachers to mark history,
and the same was suggested for other subjects, such as art and design.