Memorandum submitted by H Nickels, Headteacher, Silverton CE Primary School

Have we lost our way?

 

The target-driven testing regime of today's primary education has evolved over a number of years. Successive governments have added to it bit by bit, producing the system we have now. Political expediency has not allowed serious consideration of the impact of piecemeal policy making.

The resultant situation has led to:

annual tests for children from the age of seven

judgements of school performance mainly based on five hours of summative testing

schools 'teaching to the test'

a data-driven inspection regime

a data-driven local authority monitoring programme

a narrower curriculum

increased pressure on pupils, teachers and senior management

 

DCS Annual Review Visit (ARV)

The Annual Review Visit (ARV) has, to my mind, been reduced to a narrow focus on pupil data. The advisers now spend too much time in the Headteacher's office analysing data on a laptop screen. Ironically, this very issue has been highlighted in a recent article in "Headteacher Update": "Why fewer want the top job today."

B. Aston (Feb 2007):

'The relentless drive to raise standards, which predominantly is seen as improvement of mathematics, English and science SATs results is a constant worry to Heads. This has had a subtle influence on the nature of the relationship between the LA advisers and Headteachers. The professional dialogue used to be a wide-ranging supportive discussion of ways of improving all aspects of school life, but now visits are focussed exclusively on how to raise SATs results ... No Headteachers ... were willing to go on the record, as admitting to feeling stressed seems to be an admission of weakness and was seen as not helpful to career progression.'

 

Should we really be judging schools on such a narrow band of summative tests, the validity and statistical significance of which are open to much doubt? (See Appendix A for extracts from various articles and studies.)

 

In Wales, Scotland and Northern Ireland, thinking has moved on. Summative tests are less significant and are used at teachers' discretion. They support, where needed, the teacher assessments, and this has an obvious knock-on effect for pupils, staff, teaching and learning. Interestingly, commentators in the USA are identifying the same trend. They see a narrowing of teaching and learning to satisfy Maths and English test result regimes, promoted by the federal Elementary and Secondary Education Act.

They identify Maine and Nebraska as states that are moving away from this narrow trend towards the models in Wales, Scotland and Northern Ireland.

(See Appendix B)

 

 

In my view, if SATs have to be used, they should be available to teachers who need support in making a final judgement on pupils who fall on the boundary between levels. Most pupils can be assessed by teachers, from classroom work, as working at a particular NC level.

The tests would then be given selectively, and in a less threatening atmosphere, to 'cement' the final few judgements.

 

 

Furthermore, some European countries, such as Finland, have NO testing, inspection regime or even National Curriculum. Finland has some of the highest pupil standards and a higher proportion of young people in higher education than we do. (Education Guardian, w/b Feb 19th 2007)

 

Statistical Significance

Roland Oxborough (County LA Statistician) makes the following observations:

 

"Individual pupils

 

SAT results are based on relatively short tests. As a result of this, there is a considerable degree of uncertainty in the outcome for an individual pupil. This could be overcome by increasing the total test time, but at considerable cost to schools. Provided this uncertainty is taken into account the results of SATs do provide a measure of the performance of individuals. One problem is that the users of test results ignore the uncertainties in them.

 

Groups of pupils

 

Generally, group sizes should be 20 or more to be able to regard the group results as a sample of a much larger group and to compare with national data. Smaller sizes should be recognised as a measure of the individuals within the group. This is particularly relevant if 'raw' performance is being measured (for example, percentage achieving level 4 or more at key stage 2).

For small groups, variation between years can be large without the variation being significant, and hence without providing information about the changing performance of an institution.

 

 

Pupil progress ('Value added')

 

The use of progress data is one method used to alleviate the effects of sample size. The performance of individuals is measured at two points in time and the differences compared to national differences for similar pupils. The measurements still suffer from uncertainty, but the previous ability of the pupils is taken into account. Since the variation for similar pupils is measured nationally, this information can be used to estimate the uncertainty within the measurement of value added. Provided this uncertainty is not ignored, the results can provide a valid measure of the progress made by pupils within an institution.

This method is now improved by contextualising the data with a large number of non-performance factors ('contextual value added'). However, the improvements are usually small once gender and date of birth have been considered. The interpretations of data offered to schools do now take account of these uncertainties (RAISEonline, FFT, Smiley), but again, users of the data do not always take account of (or understand) these uncertainties.

A good example of this is the RAISEonline approach to rank ordering CVA scores. A school with a cohort of 20 pupils might have a CVA score of 100 (i.e. progress exactly as expected) and would have a percentile rank of 50. The degree of uncertainty in this measurement (95% confidence interval) implies that the actual percentile rank is probably somewhere between 10 and 90. It is only at the extremes that the percentile rank is a useful indicator."

 

(One example from ARVs regarding groups of pupils illustrates this point: analysing a set of test results in a small cohort to try to identify a "trend", e.g. whether boys are doing better than girls, is a statistical nonsense!)
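The point about small cohorts can be illustrated with a rough calculation. The sketch below is not taken from any of the sources cited here: the cohort sizes and results are hypothetical, and the Wilson score interval is only one of several standard ways of estimating a 95% confidence interval for a percentage.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion.

    successes: number of pupils reaching the threshold (e.g. level 4 or above)
    n:         cohort size
    z:         normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical cohorts in which 80% of pupils reach level 4.
for cohort_size in (20, 200):
    low, high = wilson_interval(round(0.8 * cohort_size), cohort_size)
    print(f"cohort of {cohort_size}: 80% observed, "
          f"plausible range {low:.0%} to {high:.0%}")

# Hypothetical boys/girls split within one small cohort.
boys = wilson_interval(7, 10)   # 7 of 10 boys reach level 4
girls = wilson_interval(9, 10)  # 9 of 10 girls reach level 4
overlap = boys[1] > girls[0]
print(f"boys {boys[0]:.0%} to {boys[1]:.0%}, "
      f"girls {girls[0]:.0%} to {girls[1]:.0%}, overlap: {overlap}")
```

With these made-up figures, the interval for the cohort of 20 spans more than thirty percentage points, and the intervals for boys and girls overlap, so the results alone cannot support a claim that one group is outperforming the other.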

 

 

 

Other thorough research by Stephen Gorard (University of York, 2006) has analysed value-added scores in Primary Schools.

A study analysing 457 schools in Yorkshire concludes that value added does not actually measure what it purports to.

"Scatterplots show that there are no low attaining schools with average or higher value added, and no high attaining schools with below average value added"

The research suggests that value added figures are still at an early stage of development and are "not ready to be moved from being a fascinating research tool to an instrument of judgement on schools" (Abstract from paper: S Gorard, "Re-analysing the value added of primary schools", 2006, University of York)

 

 

Other issues

 

What about raising standards in other subjects? Do advisers ask how the school is improving pupils' experiences/outcomes in Art, History, Geography etc.?

 

Should schools be judged or advised on wider issues, such as elements of the Every Child Matters (ECM) agenda? At a recent Devon conference, one OFSTED inspector said that schools should be judged EQUALLY on 'Enjoy and Achieve' and 'Be Healthy'.

 

 

 

Are there alternative models to national assessment? (See Appendix C). Should we really be testing pupils at the end of every year from Year 2 to Year 6 (as advocated by advisers at ARVs)? Are we, in fact, turning some pupils into failures by the age of 11?

 

"One of the questions worth sharing with schools concerns their use of 'optional' national tests, and other tests that are designed to gather data about students' learning. There is a risk that requiring students to undergo more testing than the basic national requirements could have a damaging long-term effect on the self-esteem, self-efficacy and effort - and thereby the future test performance - of some students."

[http://eppi.ioe.ac.uk/EPPIWeb/home.aspx?Page=/reel/review_groups/assessment/review_one_summaries_adviser.htm]

Should advice be more proactive, as advocated in the eppi.ioe.ac.uk article:

"Above all, schools will need guidance and support in establishing assessment strategies that actually improve learning rather than merely measure it."

School Improvement Partners

The new SIP arrangements are an opportunity for the LA (through its monitoring authority, DCS) to focus on SCHOOL IMPROVEMENT through genuine PARTNERSHIP. Focus on data should be proportionate and relevant, and should reflect statistical significance. The overall agenda should be school-wide, looking at curriculum, pupils and staff, including potential for career development. The "rules of engagement" should be made explicit before each meeting, so that all parties are aware of the process and outcomes.

Conclusion

In conclusion, should the narrowness of summative testing and its associated "judgement" regime be an educational path that our schools find themselves forced down? Should monitoring of performance be biased towards the analysis of data that can be seen to be statistically unreliable and full of uncertainties?

Have we indeed lost our way, through gradual, year-on-year Government pressure to test, meet targets and analyse data?

Appendix A

 

QCA admits primary test improvements to some extent illusory

(TES, 6 May 2005)

http://eppi.ioe.ac.uk/EPPIWeb/home.aspx?Page=/reel/review_groups/assessment/review_one_summaries_adviser.htm

http://eppi.ioe.ac.uk/cms/Default.aspx?tabid=614#assessment

 

http://www.teachingexpertise.com/articles/assessing-what-matters-schools-694

http://www.gtce.org.uk/newsfeatures/features/136105

 

Summative Tests - definition (NFER)

This is used for the recording of the overall achievement of the pupil in a systematic way.  It occurs at the end of a scheme of work or phase of education, and a norm-referenced assessment is often used for this final summing up of performance. KS2 and KS3 tests and GCSE exams are quintessential summative assessments.

As with all tests, criterion-referenced tests (CRTs) and norm-referenced tests (NRTs), no matter what they are called, should not control curriculum and instruction, and important decisions about students, teachers or schools should not be based solely or automatically on test scores.

 

 


"The Learning Curve" BBC Radio 4 ( May 15th 2006) - 
The following point was made:

Finland has the highest levels of children's academic achievement in Europe. And they do it with nothing that resembles KS tests, the National Curriculum or OFSTED! And how? By just leaving it to the teachers!
 

Appendix B

http://www.fairtest.org/facts/csrtests.html

Growing Resistance to "No Child Left Behind"

Monday, December 11, 2006 Monty Neill

http://www.districtadministration.com/pulse/commentpost.aspx?news=no&postid=17931

 
Appendix C
http://www.cambridgeassessment.org.uk/research/confproceedingsetc/publication.2006-12-21.8282008956
 

June 2007