Select Committee on Education and Skills Appendices to the Minutes of Evidence


APPENDIX 3

Submission by Ofsted to the Tomlinson inquiry (QCA 25)

A.  SUMMARY OF MAIN FINDINGS AND PROPOSALS:

    —  The introduction in September 2000 of the new AS/A2 structure, as part of "Curriculum 2000", has features which have been widely welcomed in many schools and colleges. However, its rapid implementation created difficulties initially with specifications and standards, as well as with the workload demands on teachers and students, and the assessment regime associated with it remains burdensome and volatile.

    —  QCA, working with the awarding bodies, has put in great efforts, but has not succeeded in providing an adequate system of quality assurance to give national confidence about the value and consistency of awards.

    —  The roles and operations of QCA have risked being insufficiently sharply focused for it to be fully successful as a regulator.

    —  Much more work would be needed to ensure consistency and comparability across awarding bodies, and a single national examining body should be considered seriously.

    —  A review of the quality of examiners and of their recruitment, terms and conditions and remuneration, as well as of the timing of awards, is a matter of urgency.

    —  Ofsted has the potential to contribute far more strongly to the setting and maintenance of standards, drawing on its subject expertise and knowledge of schools and colleges.

B.  COMMENTARY:

  The following comments, drawn from all of our sources of evidence, including more informal intelligence-gathering, are offered under the following three topics:

  (i)  The AS/A2 structure;

  (ii)  The role of QCA;

  (iii)  The examination groups.

  A digest of key points from inspection is attached as an annex, together with a note about the sources of Ofsted evidence on which we have drawn.

(i)   The AS/A2 Structure

  1.  In reporting on the first year of implementation, HMI pointed to a number of positive features, and these have become more firmly embedded in the second year. Nevertheless, some of the problematic inbuilt design features remain a significant cause for concern. It is undeniable that students face an ever more exacting schedule of assessment, and that the character of Year 12 has changed dramatically. These changes have had beneficial effects in concentrating teachers' and students' minds and giving a real sense of purpose, and have broadly maintained the rigour and depth expected for advanced study. However, evidence from our survey and other subject inspections suggests that they have also on occasion narrowed the students' range of knowledge and experience within subjects, while not always succeeding in broadening coverage of the areas of the curriculum through the choice of a range of contrasting AS courses.

  2.  The weaknesses in the assessment structure have been rehearsed at length elsewhere, as has the fact that these are only in part weaknesses of control. There is an inherent self-contradiction in the new AS—partially masked in "old" AS. The standard either is that of a full A Level (in which case it is often too high for Year 12) or it isn't, in which case it cannot be right to allow Year 13 students to "improve" their performance. A statistical change to make AS weighted at, say, 40% of the full A Level would help, but it would still be subject to objection. The weighting attached to coursework is another serious concern. It is a concern not least because of the risk, in these IT-dominated times, of much re-drafting and possible cheating. But, in addition, analyses of the spread of candidates' results have often indicated that coursework marks and grades can be significantly inflated when compared with students' unaided work under controlled conditions.

  3.  One of the instantly glaring anomalies of the results this year is that candidates were being given a "U" grade for coursework—something hitherto virtually unheard of, let alone where candidates were securing A grades on written papers. These anomalous results were palpably indefensible and should have been spotted.

  4.  Modules compound the intrinsic problems over maintaining standards. In some subjects and some parts of subjects, specialists feel that it may be perfectly proper to "sign off" students' achievements before the end of the course. But there are many others where it is not, because of the importance of gradual maturation and skills development, and where both curriculum and assessment standards are potentially distorted by early completion. All of these factors may contribute to variability in assessed standards, with the risk that grades no longer conform to previously accepted levels of achievement; however, it is also of concern that the lack of clarity over the expected standard makes it so difficult to determine whether they do so or not.

  5.  The AS/A2 structure has the potential for ending up as something of a compromise, failing for too many students to achieve either breadth or depth in a satisfying and sensible way, and not clearly representing improvements in quality and the safeguarding of standards for all. However, many individual institutions and students have certainly appreciated the greater range and tighter structure, as our evidence from surveying "Curriculum 2000" makes clear. Some have suggested various forms of baccalaureate approach as an alternative. We would argue that there are good reasons for not moving from AS/A2 to such an approach in the short term, partly on pragmatic grounds: there has been so much turbulence in the system that a further radical overhaul is the last thing that is wanted. In the medium term, however, a case for such an approach (with strands such as humanities, languages, science, economics, technology, performing arts) could be made. The structure of higher and standard level subjects within a coherent and intrinsically broad curricular framework has been found successful in schools which have adopted the International Baccalaureate, and HMI inspection has commented on much work of high quality in the work of candidates preparing for this qualification. But the demands of such courses are high, and their assessment systems are by no means proof against criticism. Meanwhile, the best way to progress is to eradicate the most glaring weaknesses of the current system (overweighting AS; retaking modules; excessive reliance on coursework) with a sufficiently rigorous system of quality assurance.

(ii)   The Role of the QCA

  6.  Events over the past two years particularly have given widespread credence to the view that QCA has failed to act as a firm regulator of the system and of the work of the examining groups. In many ways, the QCA has had a hard set of challenges, and its staff deserve enormous credit for the way in which they have sought to cope with the range of initiatives and new tests and qualifications. But its roles have been too varied, its teeth too few and its management and managers not always able to provide a constant level of leadership, partly because of frequent changes at Chief Executive level. There has also been much uncertainty about its relationship with its parent department, the DfES, a matter which requires urgent resolution.

  7.  The QCA has also been too much involved in evaluating its own advice or policies, in a way that can lead to defensiveness and a lack of transparency. In consequence, it has failed to exercise effective quality controls on the awarding bodies. The reasons for this are complex. They relate in part to the weaknesses in the powers, which QCA was formally given, as QCA officers have, reasonably, pointed out. However, there has also appeared to be a lack of resolution, even within the powers it has had, in taking decisive action against anomalous or inconsistent actions on the part of individual awarding bodies. With its recent lack of involvement with awarding procedures, Ofsted has no direct, first-hand or up-to-date evidence on this, but the evidence from our specialist advisers, and from their contacts in the system, indicates that the scrutinising procedures adopted have been rather variable in effectiveness. They have frequently been revamped, but currently, the QCA does not always have a strong presence in the very processes most critical in determining standards (awarding, standardising and borderlining meetings, together with those intra-Board procedures which follow these, which is where statistical overlays are applied to the examiners' professional judgements). Furthermore, our subject monitoring suggests that the new scrutinies have often not been staffed or managed in such a way as to ensure quality; and the reports, while engaging with key issues, are at times too anodyne and lacking in decisive effect.

  8.  For these reasons, it is evident that if QCA is to be an effective regulator it needs strong management, clear powers and a real commitment to setting and monitoring standards. All of these can, to a large extent, be addressed internally, and it is already apparent that the new Chief Executive has them on his agenda. However, it remains the case that QCA can be seen as too complicit in the very weaknesses that need addressing: intrinsically part of the problem, not of the solution. Moreover, although it has much expertise, it inevitably lacks the kind of perspective on standards in the field possessed by Ofsted.

  9.  Hence the arguments for a possible development of Ofsted's remit on these matters seem strong ones. In particular, the risk of having "standards" apparently "guarded" by two largely separate arms of government is a real one; and while in principle the two roles can be seen as mutually supportive and complementary, in practice this is an unhealthy schism which can erode confidence and generate uncertainty. Reporting against nationally assessed standards is crucial to Ofsted's role. There are currently significant doubts about the extent of grade drift and about the value of an A at A level or a C at GCSE. A firm fix on what these grades mean is needed, and at present it is lacking. Ofsted is, because of its remit, rights of access and expertise, uniquely well placed to contribute to the independent review of examining standards for which this year's events have simply underlined the long pressing need.

  10.  This argument in no way reflects a belief that Ofsted should usurp what are properly the functions of others, but it is born of a strong desire to work in effective partnership with them. We would suggest that Ofsted's role should be focused essentially on issues connected with monitoring standards at all stages: in assessing and reporting on standards of syllabus construction, of setting questions and writing mark schemes, and of awarding and grading procedures. This development should encompass a wider exploration of how Ofsted employs its specialist expertise (eg through HMI subject advisers) in relation to QCA, and its working groups. A key principle should be the importance of integrating the evidence of standards and quality provided by inspection and that emerging from assessments. There is scope for further joint quality assurance work between Ofsted and the QCA, to evaluate independently both the standards achieved at the various grades and the reliability and validity of marking and awarding. The aftermath of the quinquennial review of the QCA provides a good opportunity to analyse functions in a coherent and systematic way, and also to ensure that Ofsted is not excluded from access to processes, which are of the utmost importance in determining standards. There are also important matters about the role of teachers in assessment procedures, with scope for exploring more widespread and planned development of teachers' professional skills through experience of examining. In summary, therefore, our case is that:

    —  Ofsted's annual reporting on standards is strongly inter-dependent with the outcomes of testing and examination regimes. Unless Ofsted can have complete confidence in the reliability of those data, a key element of Ofsted's benchmarking is lost.

    —  Closer integration of the scrutiny of standards which occurs within inspection and that which relates to external assessment procedures would be possible with Ofsted's involvement in the latter.

    —  The links between standards, curriculum, assessment and pedagogy are so important that there would be advantages in having a body with the capacity to offer a clear overview of these interlocking elements.

    —  Ofsted has proved itself successful in delivering high quality advice on standards, draws on the long experience of HMI in thinking and writing about the curriculum, and has a large number of high-level subject specialist HMI who could valuably be involved more fully in monitoring standards of assessment.

  11.  Based on the above analysis, we propose the following specific areas of work where Ofsted might usefully become involved:

    —  independent inspections, leading to public report, on the work of individual awarding bodies;

    —  within or in addition to such inspections, scrutinies and reports on standardising and awarding procedures for particular qualifications;

    —  checks on year-on-year consistency in awarding standards, looking in particular at the effects on such features as: the level of questions; the effect of changes to assessment procedures; the relationship between course-assessed elements and terminal tests; and objective evidence of performance in basic skills elements (eg written and computational accuracy); and

    —  evaluation of particular stages/facets of the curriculum, perhaps leading more broadly to a more formal locus in advice on curricular matters.

  12.  To enhance Ofsted's work in this way would be an evolutionary development, not a radical break with the past. For many years, both while HMI worked more directly with the DfES and in the early period of Ofsted's history, it was standard practice for HMI to attend subject meetings of the examination boards and hence to scrutinise scripts. However, in recent years that traditional role has fallen into disuse, not least because QCA's roles differed in significant respects from those of its predecessor bodies and because, in consequence of this, it has set up its own quality arm. Still more recently, there has been a keen desire on both sides for Ofsted to work more closely again with the QCA.

(iii)   Examination Groups

  13.  Events in the last two years have demonstrated that the examining groups are currently not always successful at self-regulation and that they are subject to inadequate external controls. This is in no way to minimise the extraordinary job the three groups have, in many respects, done to cope with change, keep the system going and meet exacting deadlines and new requirements. However, the system has creaked and groaned with every innovation and additional assessment load. Structural weaknesses have been evident in the examining system, and the questions raised may affect every level, from the competence of individual examiners, to the quality of administration and to the whole operation of grade determination. Nor are problems confined to the general awards: weaknesses over vocationally-related certification have been recorded by HMI and others over a number of years.

  14.  Various suggestions have been proposed, many of which miss the central point, which is simply one of consistency and credibility. Any extension of the "free market" approach is fraught with potential problems, if the key aim is to achieve sufficient consistency of standards and practice. However, it is right to continue to pose the question "three or one?" since the justification for having competition among three boards, setting syllabuses and examinations to a single national framework and intending to offer awards which are nationally comparable, is inherently weak. The temptations in the system (such as providing the "easiest" - or "hardest" - examinations) are obvious. A single examining board for all general awards would be a leviathan and a monopoly; it would be likely to reduce choice and risk over-centralisation, and might be exceptionally demanding to manage. However, to many it has an inexorable logic, given the weight attached to these awards, eg by higher education. The evidence from comparability studies done over a number of years is anything but reassuring: the reduction of boards has perhaps limited the extent of inter-board variability, but Ofsted's subject evidence shows that this still continues. In addition, "subject pairs" analysis has exposed that there are not just hard boards and easy boards, but hard subjects and easy subjects and hard syllabuses, and options within them, and easy ones: hence we are nowhere near a world where the standard of an A grade can be assumed to be constant—across all subjects, all examination groups and all strands of the assessment process.

  15.  An added complexity is that of the Key Stage tests, where the Quinquennial Review had some important things to say about the QCA's role. One possible course would be to have a body responsible for all 5-16 National Curriculum testing (KS 1-4), or all Level One and Two awards, and a separate body responsible for all further education and sixth-form examinations at Level Three, whether general or vocational. (This might also have the effect of helping to develop an integrated structure at that level, to counteract some of the current uncertainties over parity of esteem and flaws in vocational assessment, and would make even better sense if a baccalaureate approach were to be developed.)

C.  KEY ISSUES FOR THE FUTURE

  16.  Whichever structure is adopted, some of the quality issues will not wait:

    —  Well-grounded research into "standards over time" is urgently needed: when Ofsted sought to undertake this work with QCA, its efforts were bedevilled by the lack of adequate archive scripts; now that, at least for recent years, these exist, a proper scrutiny should be possible of standards achieved by candidates under the pre-2001 system and those in new AS/A2 arrangements.

    —  A full in-depth study of awarding procedures is surely a matter of urgency. The evidence is now in the open that statistical "interference" with examiners' assessments is common practice, almost certainly exceeding the—very permissive—bounds tolerated by the examination Code of Practice, but we still have seen only the tip of the iceberg. This study would need to encompass the processes of "borderlining".

    —  A review of the qualifications, training and assessment of examiners—coupled with an analysis of the remuneration, timing and conditions under which examiners work—would test fully the vulnerability of the current system. It is likely that a re-phasing of examining and marking timetables, to reduce June and July congestion and even to produce "post-award" offers for higher education, would have considerable benefits.

    —  The proposal to increase the regular professional engagement of practising teachers in the process has much to commend it. However, exploring this option should take place with a recognition that extending examining competence so widely across the teaching profession is far from being a simple matter: there is much evidence that not all teachers' own assessments currently within the system (in KS1-3 or in GCSE/AL coursework, for example) are completely reliable. Especially in those subjects where examining is essentially a matter of judgement against the criteria, rather than marking points right or wrong, the degree of challenge in securing consistency and quality assurance should not be under-estimated.

    —  A system of regular independent published reports, with teeth, from the subject-based scrutinies of GCSE and A Level would do much to strengthen quality assurance. As noted above, Ofsted would be well placed to produce such reports.

    —  A central place for Ofsted in all aspects of assessment procedures, making full use of inspection evidence, would ensure the necessary link between evaluations of standards in schools and colleges and those in the awarding systems.

October 2002

Annex A

SOURCES OF OFSTED EVIDENCE:

    —  Subject monitoring by HMI, especially through the Curriculum Advice and Inspection Division (CAID) and the work of Specialist Advisors (SAs) and other specialist HMI.

    —  Inspections of schools (section 10) and colleges (Learning and Skills Act 2000) and of other parts of Ofsted's remit.

    —  HMI surveys, especially those on the implementation of Curriculum 2000—leading to a published report (in production) on the second year of implementation.

    —  HMCI's Annual Report: that for 2000-01, published in February 2002, summarised key points on the first inspections of the new AS examinations.

    —  Ofsted's advice to DfES on the 14-19 Green Paper (June 2002).

    —  Ofsted's oral evidence to the QCA Quinquennial Review.

    —  Close and regular contact between Ofsted and the QCA, through meetings at Chief Executive/Inspector level and other levels in the organisation and the presence of an Ofsted observer at QCA board meetings.

    —  Correspondence between Ofsted and QCA, and Ofsted and DfES, on matters of common concern.

POINTS FROM INSPECTION EVIDENCE:

  The following series of points is offered as a summary of issues to emerge from Ofsted's evidence:

CURRICULUM 2000 (YEAR ONE)—ANNUAL REPORT AND OTHER EVIDENCE:

  1.  New AS course specifications for Curriculum 2000 were generally well devised; however, in some subjects, the level was insecure and varied excessively between units.

  2.  The requirements of internal and external assessment procedures were excessive for both students and teachers; the use of assessment data to set students learning targets and monitor their progress was patchy.

  3.  Students were generally well motivated, but there was a perceptible decline in enthusiasm as the year progressed and the pressures became more evident.

  4.  Students were subject to excessive, relentless assessment, which put unreasonable pressure and constraints on Year 12.

  5.  Technical problems over the assessment arrangements were substantial and resulted in a loss of confidence in the system.

  6.  Timetabling difficulties were at times formidable, leading to administrative problems for Centres and demanding schedules for students.

  7.  Difficulties over IT exacerbated an already difficult system, for example in developing the key skills assessments.

  8.  Awarding bodies were under mounting pressure over the supply of examiners and other assessors.

  9.  The impact on numbers taking so-called "minority subjects" was variable.

  10.  There was sometimes a narrowing of teaching approaches, both in content and method, at the expense of students' independence of learning and development of study and research skills.

  11.  Teaching was often initially rather uncertain, with doubts over the coverage requirements or on the new specifications.

  12.  Key skills had only rarely had a positive, discernible impact in schools on the quality of teaching.

  13.  A substantial investment in staff development (notably in further education) often improved quality markedly, not least in relation to key skills.

  14.  There was much evidence of appreciable lengthening of the teaching week and of heavier programmes for students.

  15.  The compression of programmes at times crowded out the development of the habits and attitudes of scholarship.

CURRICULUM 2000 (YEAR TWO)

  1.  The difficulties of implementation observed in the first year of this inspection were to some extent overcome in the second.

  2.  Curriculum 2000 had been incorporated into the work of schools and colleges, with considerable difficulty, but without the loss of the rigour and depth traditionally associated with advanced study.

  3.  Teachers' confidence in teaching the new specifications grew considerably, though further support and training were still needed.

  4.  In the schools and colleges visited, the work seen improved over the two years of this inspection.

  5.  Teaching was almost always expert, well-planned and enthusiastic, and given greater clarity of focus by the quality of the A2 specifications, which were found to be helpful and supportive.

  6.  Many teachers still felt that they had little opportunity to go beyond the immediate demands of the specifications.

  7.  Despite the time teachers and students spent completing assessments, use of the results of assessment to set learning targets and to monitor progress remained patchy.

  8.  Standards of achievement in the schools and colleges inspected remained high, and had in some respects risen over the two years of the inspection.

  9.  Most students were addressing successfully the additional demands of A2 courses, and were developing at a high level the skills of analysis, critical thinking and evaluation of information, as appropriate to the subjects studied.

  10.  There was some evidence in the institutions inspected of a broadening of the range of subjects offered.

  11.  Colleges in particular had seen an increase in the numbers of students opting for subjects such as information technology, psychology, media studies and art.

  12.  Because of increased numbers overall, the retention of subjects, such as some languages, which attracted relatively few takers, was often possible.

  13.  The impact on the curriculum as experienced by the individual student was often modest.

  14.  Students, especially in schools, were much less well-informed about training and employment routes than about academic and vocational options in schools and colleges.

  15.  Generally, too, post-16 institutions, particularly schools, were insufficiently responsive to the views and needs of employers.

SUBJECT MONITORING

  1.  Modular arrangements in some subjects were seen to sit very uneasily with the desire to "maintain standards".

  2.  Candidates were often retaking AS modules later in the course, and with the benefit of significant maturation, so that their grade profile in advance of taking A2s could be raised.

  3.  Candidates were occasionally retaking modules when they already had high grades (including, in business studies, candidates with grade A at AS).

  4.  In order to maintain standards, awarding bodies appeared to have resorted to statistical manipulation. In the past, under the Code of Practice, awarding panels were required to take account of statistical information after they had set provisional grade boundaries. This meant that judgmental awarding was informed by the overall statistics, and significant changes in grade distributions had to be justified. This was perhaps more difficult this year as examiners were working in a new context.

  5.  With regard to this year's awards, these processes perhaps explained the eccentric patterns of attainment. In the "new" system the moderated module grades had been declared to schools, as had the AS grades by the time the AL awarding took place. Inevitably, any adjustment would therefore fall disproportionately upon the remaining components, usually A2 coursework and the terminal synoptic paper. Thus some candidates, for example, had C/D adjusted to U in these components although their overall grade shifted less.

  6.  In subjects where modules were newly introduced, there are concerns. For example, in history there was a danger of "pick and mix" incoherence or the focus on particular periods, such as Europe of the Dictators. In art, there was a view that the demise of the more "open-ended" Year 12 course had narrowed the students' experience, inhibiting experimental approaches.

  7.  The synoptic papers were an aspect of the A2 which suffered from the outset from unclear definition. In history, for example, different awarding bodies interpreted the synoptic requirements in different ways. How specifications define the synoptic requirement, and how papers carry that definition forward, would merit early review.

  8.  There was evidence to suggest that the scrutiny process was still not robust. Before the Code, scrutiny was by peer review, chaired by the relevant professional officer. Currently, scrutiny teams had membership from outside the normal pool of chief examiners, but as a consequence could lack experience.

  9.  The gravity of unresolved comparability issues among the examining groups was illustrated by the inexplicable differences in proportions of candidates reaching particular grade boundaries. In 2001 D&T, for example, the variations were very wide:
  Group      Percentage A   Percentage A-E
  AQA        13.2           89.6
  Edexcel    2.3            74.5
  OCR        16.3           90.1


  10.  The proliferation of examinations had exacerbated the difficulty in getting sufficient markers and moderators. For example, this summer Edexcel used student teachers to mark history, and it was suggested for other subjects, such as art and design.


 

© Parliamentary copyright 2003
Prepared 14 April 2003