Memorandum submitted by Rebecca Allen,
Institute of Education, University of London and Simon Burgess,
CMPO, University of Bristol
INTRODUCTION
1. This submission relates to the following issues highlighted by the Select Committee: "The performance of Ofsted in carrying out its work; The impact of the inspection process on school improvement".
2. We offer two pieces of evidence:
(a) We report some simple statistics on the nature of schools failing their Ofsted inspections.
(b) We provide a preliminary report on work in progress on the causal impact on a school's progress of failing its Ofsted inspection.
3. This evidence is based on the authors' analysis
of data from the National Pupil Database (NPD) and Ofsted judgment
data. This relates to secondary schools only, covering the years 2003 to 2009 inclusive, this timing being determined by the availability of the NPD data. A few details on the data are provided
at the end of this note.
SUMMARY
4. Our preliminary results show that there is
a positive causal effect of failing an Ofsted inspection on the
school's subsequent GCSE performance. This effect is statistically
and substantively significant, but it appears to be transient.
It peaks two years after the inspection and has disappeared by four years afterwards.
5. We show that schools in affluent neighbourhoods
with few poor pupils and high scoring intakes are only rarely
judged to be failing by Ofsted. By contrast, schools with less
able intakes and greater poverty have much higher failure rates.
The bulk of schools judged to be failing look like this. There are two possible interpretations. The first is that Ofsted is truly measuring the quality of teaching and learning in a school; in that case, these results show that the least effective schools are disproportionately found in poor neighbourhoods, serving poor, low-ability pupils. Alternatively, Ofsted judgments may be strongly influenced by the grades actually achieved, rather than the quality of teaching per se; in that case, it would be important to consider what an Ofsted inspection adds to the information available through test score outcomes themselves.
PERFORMANCE OF OFSTED IN CARRYING OUT ITS WORK
6. We consider the social profile of schools
by Ofsted judgment, analysing the distribution of judgments by
the intake ability profile of schools, the level of poverty of
the students in schools, and the level of neighbourhood poverty
around schools.
7. This sheds light on Ofsted's performance
because it describes the outcomes of the inspection process.
8. In Table 1 we tabulate the characteristics
of schools judged to be failing in comparison to other schools
inspected by Ofsted in 2009. In the first column, we see that
the schools judged to be failing have similar average Key Stage 2 (KS2) test scores to the next category up, but distinctly lower
KS2 scores than the two higher categories. In column 2 we report
the average number of students eligible for Free School Meals
(FSM). There is a strong pattern showing that the schools judged
to be failing have more poor students. Finally, in column 3 we
report the neighbourhood poverty rates of schools. Specifically,
this is the mean IDACI score of students in the schools by Ofsted
judgment. Again, students in failing schools tend to live in poorer
neighbourhoods than students at schools deemed to be excellent.
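The tabulation underlying Table 1 is a simple group-mean calculation. As a minimal sketch, using made-up illustrative records rather than the NPD data:

```python
from collections import defaultdict

# Illustrative records only (not NPD data):
# (Ofsted judgment, KS2 intake score, FSM eligibility %, IDACI score)
schools = [
    ("Excellent", 0.40, 8.0, 0.15),
    ("Excellent", 0.20, 11.0, 0.20),
    ("Fail", -0.10, 16.0, 0.26),
    ("Fail", -0.05, 18.0, 0.28),
]

# Group the records by Ofsted judgment...
groups = defaultdict(list)
for judgment, ks2, fsm, idaci in schools:
    groups[judgment].append((ks2, fsm, idaci))

# ...then average each characteristic within each judgment group
means = {
    judgment: tuple(sum(col) / len(col) for col in zip(*rows))
    for judgment, rows in groups.items()
}
```

Each entry of `means` then holds the group's average KS2 score, FSM rate and IDACI score, as in the columns of Table 1.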
9. We examine trends in these comparisons over
time in Figures 1 to 3. While there is some variation from year
to year, the gap in the mean characteristics of schools deemed
to be failing and those deemed excellent remains largely constant.
10. We now cut the data the other way and compute
the percentage of schools inspected that are judged to be failing,
by the schools' characteristics. The results are in Table 2. Column
1 shows that 9.7% of schools with student intake ability in the
lowest quintile were judged to be failing, while only 2.3% of
schools with the highest ability students were so judged. There
is a similar gap looking at the poverty rate in schools: 8.3%
of the poorest schools were failed relative to 2.2% of the least
poor schools. This pattern is also reflected in the final column,
looking at the schools' neighbourhood poverty rates.
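Cutting the data this way amounts to grouping schools into quintiles of a characteristic and computing the failure rate within each quintile. A minimal sketch, with a hypothetical helper and made-up inputs rather than the NPD data:

```python
def quintile_fail_rates(scores, failed):
    """Split schools into five equal-sized groups by a characteristic
    (e.g. mean KS2 intake score) and return the percentage of schools
    judged to be failing in each group, lowest quintile first."""
    # Rank schools by the characteristic
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    rates = []
    for q in range(5):
        # Take the q-th fifth of the ranked schools
        idx = order[q * n // 5:(q + 1) * n // 5]
        rates.append(100 * sum(failed[i] for i in idx) / len(idx))
    return rates
```

Applied to a characteristic such as intake ability, this yields the columns of Table 2.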
11. We examine whether these patterns have changed
over time. Figures 4 to 6 display the trends for the lowest and
highest quintiles of KS2 scores, school poverty and the poverty
of the schools' neighbourhoods. The failure rate of the most affluent and the high intake ability schools remains low and stable, at around 2% to 3%, throughout this period. The failure rates of schools in the highest poverty quintile and the lowest intake ability quintile are more variable - all having a spike in 2006 - but are uniformly much higher, averaging around 12%.
12. To summarise, our results show that schools
in affluent neighbourhoods with few poor pupils and high scoring
intakes are only rarely judged to be failing by Ofsted. By contrast,
schools with less able intakes and greater poverty have much higher
failure rates. The bulk of schools judged to be failing look like
this. There are two interpretations of this:
(a) Ofsted is truly measuring the quality of
teaching and learning in a school. In this case, these results
show that the least effective schools are disproportionately to
be found in poor neighbourhoods, serving poor, low ability pupils.
(b) Ofsted judgments are strongly influenced
by actual grades achieved, rather than the quality of teaching
per se. In this case, it would be important to consider what an
Ofsted inspection adds to the information available through test
score outcomes themselves.
13. Both of these interpretations are likely
to be true in part. One of the criteria Ofsted use is the level
of test scores achieved by the school. There are also a number
of reasons why teaching effectiveness may be lower in schools
in poorer neighbourhoods. For example, it may be that more effective
teachers and headteachers are to be found more often in the more
affluent schools.
THE IMPACT OF THE INSPECTION PROCESS ON SCHOOL IMPROVEMENT
14. In ongoing work, we are analysing the consequences
for subsequent GCSE exam performance of failing an Ofsted inspection.
This directly addresses the question of the impact of Ofsted on
school improvement. This paper will be finished and available
in November.
15. There are a number of statistical difficulties
in establishing the impact of failing an Ofsted inspection. First,
we need to distinguish the true causal impact of failure from simple mean reversion: the least effective schools in one year are almost bound to improve a little in the following year. This is not part of the effect of Ofsted on school improvement
and needs to be taken out of the estimate. Second, the schools
highlighted by Ofsted are necessarily going to be poorly performing,
and are likely to remain quite poorly performing. We therefore
need to take account of their circumstances and look for any improvement
given those circumstances.
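The mean reversion problem can be illustrated with a small simulation, assuming (purely for illustration) that each school's true quality is fixed and measured performance is quality plus independent year-to-year noise:

```python
import random

random.seed(1)

# Simulated schools: true quality is constant over time;
# measured performance adds fresh noise each year.
n = 10_000
quality = [random.gauss(0, 1) for _ in range(n)]
year1 = [q + random.gauss(0, 1) for q in quality]
year2 = [q + random.gauss(0, 1) for q in quality]

# Select the schools in the bottom decile of year-1 performance
cutoff = sorted(year1)[n // 10]
bottom = [i for i in range(n) if year1[i] <= cutoff]

# These schools improve on average in year 2 with no intervention
# at all: part of their poor year-1 score was simply bad luck.
mean_change = sum(year2[i] - year1[i] for i in bottom) / len(bottom)
```

Here `mean_change` comes out well above zero even though nothing was done to these schools, which is exactly the improvement that must be netted out before attributing any gain to the Ofsted judgment.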
16. To deal with these issues, we need to compare
schools that have just failed their Ofsted inspection with very
similar schools that only just passed. Accordingly, we adopt a Regression Discontinuity Design (RDD), a well-established identification
technique in economics. The idea is to use as a "control"
group for the failed schools the schools that only just passed
their inspection. Details of the statistical procedure and the
definition of the assignment variable for the discontinuity will
be provided in the forthcoming paper. We also focus on changes
in GCSE performance, not levels. This approach takes account of
the fixed but unobservable characteristics of the schools such
as teacher effectiveness, the school's environment and so on,
and also a few observable but changing factors such as the characteristics
of the student intake.
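The comparison at the heart of the RDD can be sketched as follows. The assignment variable and threshold here are hypothetical placeholders (the actual definitions appear in the forthcoming paper), and this naive local-means version stands in for the full statistical procedure:

```python
def rdd_estimate(assignment, gcse_change, threshold=0.0, bandwidth=0.5):
    """Naive RDD sketch: compare the mean change in GCSE performance
    for schools just below the threshold (judged to be failing) with
    schools just above it (narrowly passed)."""
    # Failed schools within the bandwidth below the threshold
    below = [y for x, y in zip(assignment, gcse_change)
             if threshold - bandwidth <= x < threshold]
    # Narrowly passing schools within the bandwidth above it
    above = [y for x, y in zip(assignment, gcse_change)
             if threshold <= x <= threshold + bandwidth]
    # Difference in mean outcomes is the estimated effect of failing
    return sum(below) / len(below) - sum(above) / len(above)
```

Because both groups sit close to the threshold, they should be very similar in everything except the fail judgment itself, which is what lets the difference in outcomes be read as a causal effect.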
17. Before reporting our results, it is worth noting that the outcome could be negative, zero or positive. It could be negative if there is a major impact on staff morale, if key highly effective staff leave, or if dealing with the process diverts resources and time from teaching. It could be positive if the failure (re-)focuses attention on the right things, and provides increased information and motivation.
18. Our preliminary results show that there is
a positive causal effect of failing an Ofsted inspection on the
school's subsequent GCSE performance. This effect is statistically
and substantively significant, but it appears to be transient.
It peaks two years after the inspection and has disappeared by four years afterwards. This is the average effect; we are now investigating
any heterogeneity in this effect.
19. This seems to us to be a reasonable and credible
outcome. The effect is positive not negative on average, which
obviously is an encouraging report of the impact on school improvement
of Ofsted. On the other hand, failing an Ofsted inspection, and its aftermath, is not a highly resourced, high-impact intervention. It is unlikely that it would have a permanently transformative effect on a school.
20. We have also conducted a preliminary analysis
of the question of whether the contemporaneous year 11 cohort
suffers in the year of the Ofsted inspection. We find a rather
small negative effect.
FURTHER DATA DETAILS
21. The number of categories used by Ofsted varies
year by year over this period, so we have amalgamated them into
four groups, with the bottom category being "judged to be
failing".
22. The first set of results reported here is based on a straightforward analysis of the inspections data. We have not
adjusted for the fact that schools deemed to be failing are inspected
more often than those deemed to be excellent. This is likely to
have two offsetting effects on the first part of our results described
here. On the one hand, this means that the sample will contain
more schools strongly at risk of failing, and these are likely
to be schools in poor neighbourhoods. On the other hand, schools
having been deemed to be failing are more likely to have improved
- either through mean reversion or the impact of the Ofsted judgment
- and so are less likely to fail next time around.
TABLES AND FIGURES
Table 1
CHARACTERISTICS OF SECONDARY SCHOOLS BY OFSTED JUDGMENT, 2009

OFSTED judgment | KS2 test score of cohort(1) | Eligibility for FSM (%) | Neighbourhood poverty (IDACI) | Number of inspections
Excellent | 0.276 | 9.5 | 0.193 | 151
2 | 0.052 | 12.2 | 0.225 | 283
3 | -0.099 | 14.6 | 0.250 | 192
Fail | -0.073 | 16.7 | 0.267 | 42
Total | 0.051 | 12.5 | 0.227 | 668
1. KS2 score: normalised within-year to mean zero and standard
deviation 1.
Table 2
SCHOOLS JUDGED TO BE FAILING, BY SCHOOL CHARACTERISTICS,
2009
Quintiles of school characteristics, with the % of schools judged to be failing in each quintile of each characteristic:

Quintile | KS2 test score of cohort(1) | Eligibility for FSM (%) | Neighbourhood poverty (IDACI) | Number of inspections
Lowest | 9.70 | 2.24 | 2.24 | 134
2 | 7.46 | 5.92 | 6.72 | 134
3 | 7.52 | 7.58 | 6.77 | 133
4 | 4.48 | 7.46 | 8.21 | 134
Highest | 2.26 | 8.27 | 7.52 | 133
Total | 6.29 | 6.29 | 6.29 | 668
1. KS2 score: normalised within-year to mean zero and standard
deviation 1.
Figure 1
MEAN KS2 TEST SCORE OF SCHOOLS JUDGED TO BE FAILING (F)
AND EXCELLENT (E)
Figure 2
POVERTY RATE OF SCHOOLS JUDGED TO BE FAILING (F) AND EXCELLENT
(E)
Figure 3
NEIGHBOURHOOD POVERTY RATE OF SCHOOLS JUDGED TO BE FAILING
(F) AND EXCELLENT (E)
Figure 4
PERCENTAGE OF SCHOOLS JUDGED TO BE FAILING BY SCHOOL KS2
TEST SCORE
Figure 5
PERCENTAGE OF SCHOOLS JUDGED TO BE FAILING BY SCHOOL POVERTY
RATE
Figure 6
PERCENTAGE OF SCHOOLS JUDGED TO BE FAILING BY NEIGHBOURHOOD
POVERTY RATE
October 2010