Select Committee on Agriculture Minutes of Evidence

Annex A

Notes on Statistical Aspects of the Badger Culling Trial


  The randomised trial, which is but one of the responsibilities of the ISG (chaired by Professor John Bourne), was recommended by the Krebs Report (1997) to resolve the long-standing issue of the role of badgers in bovine TB. The power calculations, which are the usual sort of approximate basis for determining the scale of effort appropriate, were done at the time of the Krebs Report and led to a randomised block design of 10 triplets, each consisting of three roughly circular areas randomised between three treatments: survey-only, reactive culling and proactive culling. It was estimated that about five years of observation would be needed to get the required sensitivity. At the moment work on eight triplets is in progress and it is anticipated that by 2001 all 10 will be in commission.

  Among the other aspects of ISG work is a case-control study of husbandry methods embedded within the trial. George Gettinby has played a major role in this. It is not discussed here.

  The trial is contentious largely, although not entirely, because the animal rights fraternity seem convinced that badgers are irrelevant to bovine TB and oppose any culling and because, by and large, farmers are convinced of the role of badgers and are thus uneasy at a long wait for an answer and recommendations.


  The role of power calculations is to ensure that the scale of effort is neither so limited as to be incapable of leading to useful answers nor exorbitant. While mathematically a precise answer is obtained once a specification of objectives is given quantitatively, in realistic terms this specification is quite arbitrary and the recommended scale of effort is no more than a rough guide, although a very valuable one. The Krebs calculations were based on the assumption that the residual error, after eliminating inter-triplet variation and regression on base-line covariates, will be Poisson. This is likely to be optimistic, although probably not by much. The projected time-span is based on a cautious assessment of future break-down rates and thus probably pessimistic.

  There has been some concern about the power calculations, in particular on the part of the Agricultural Select Committee, and this has been expressed in the national press. Much, if not all, of the discussion seems to be based on a total misunderstanding of the role of the power calculations. The precision achieved in the trial will be determined by the data obtained, totally independently of the correctness or otherwise of the power calculations. Nor is the schematic analysis on which the power calculation is based at all like the very careful and detailed analysis to be made of the real data.

  There are a number of considerations that bear on the scale of effort, the number of triplets and the time extent of the trial.

  First, to some extent, the number of triplets and the years of observation are interchangeable in that to a crude approximation precision will be determined by the total numbers of breakdowns observed in the three treatment arms of the trial. But this is only approximately true. It is a well established principle (Yates and Cochran, 1938), and indeed just common sense, that very high precision in just one site would be a very insecure basis for a broad practical recommendation or a sound scientific conclusion. Range of validity demands replication across sites (triplets). In an extreme case of heterogeneous response patterns across triplets (triplet by treatment interaction) the most cautious analysis would be purely randomisation-based. Reduction of the number of triplets, say to eight, would reduce the randomisation set in a particular comparison to such a level that power would be drastically reduced.
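  As a rough illustration of how the number of triplets limits a purely randomisation-based analysis, the sketch below treats one pairwise comparison (say survey-only versus proactive) as a sign-flip test on one difference per triplet. This simplification is an assumption made here, not the analysis specified for the trial. Under it the randomisation set has 2^n members for n triplets, so the smallest attainable two-sided p-value is 2/2^n. The function names and the example differences are hypothetical and are not trial data.

```python
from itertools import product


def randomisation_set_size(n_triplets):
    """Number of sign assignments when each triplet's difference may flip sign."""
    return 2 ** n_triplets


def min_two_sided_p(n_triplets):
    """Smallest two-sided p-value attainable from the sign-flip set."""
    return 2 / randomisation_set_size(n_triplets)


def sign_flip_p_value(differences):
    """Exact two-sided randomisation p-value for the sum of the differences."""
    observed = abs(sum(differences))
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(differences)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, differences)))
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical per-triplet differences in log breakdown rate (NOT trial data).
example = [0.30, 0.25, 0.40, 0.10, 0.35, 0.20, 0.45, 0.15, 0.30, 0.25]

for n in (10, 8):
    print(n, randomisation_set_size(n), min_two_sided_p(n),
          sign_flip_p_value(example[:n]))
```

  Under this simplification the set shrinks fourfold, from 1,024 arrangements with 10 triplets to 256 with eight, and even a perfectly consistent pattern of differences cannot give a p-value below 2/256.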

  A different although somewhat related point is that it might become necessary to estimate the error of a particular contrast, say survey-only versus reactive, from the interaction with triplets and this would leave degrees of freedom of error of one fewer than the number of triplets, or perhaps twice that. Ten triplets is from this viewpoint somewhat minimal.

  It might be tempting to suggest more triplets and a shorter time span. However, more triplets would provide diminishing returns (Krebs et al 1997) and the benefits would be questionable. The trial is also massively demanding logistically in terms of its pressures on field workers, and in terms of cost. There are also welfare considerations in terms of the number of badgers sacrificed. There are sound hopes that work in 10 triplets will be operational by 2001 but to go for more, with the additional concern of compromising the quality of data collected, seems totally out of the question.

  From several points of view an alternative and preferable view of calculations of the scale of effort is not in terms of statistical significance but in terms of precision of estimation. This is in line with a general preference for estimation over significance testing. In particular, some assessment of the size of a reduction of breakdown rates by culling, should there be one, will be essential for a rational policy recommendation. See, for example, Cox and Reid (2000, section 8.1 and p 222). With the same approximations used by Krebs the present trial design leads to a fractional standard error of 7 per cent in the comparison of two rates.
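  To indicate what a fractional standard error means in terms of counts, the following sketch uses the usual Poisson approximation that the log of a ratio of two rates, estimated from counts N1 and N2, has standard error of about sqrt(1/N1 + 1/N2). It ignores triplet effects and covariate adjustment, assumes equal counts in the two arms when inverting the formula, and is not a reconstruction of the Krebs calculation. All function names and the example counts are hypothetical.

```python
import math


def fractional_se_of_ratio(n1, n2):
    """Approximate SE of log(rate1/rate2) from two Poisson counts."""
    return math.sqrt(1.0 / n1 + 1.0 / n2)


def breakdowns_per_arm_for(target_se):
    """Count needed in each of two equal arms to reach the target SE."""
    return math.ceil(2.0 / target_se ** 2)


def ratio_confidence_limits(n1, n2, z=1.96):
    """Approximate limits for the ratio of rates, equal exposure assumed."""
    log_ratio = math.log(n1 / n2)
    se = fractional_se_of_ratio(n1, n2)
    return math.exp(log_ratio - z * se), math.exp(log_ratio + z * se)


print(breakdowns_per_arm_for(0.07))
# Hypothetical counts in two arms, for illustration only.
print(ratio_confidence_limits(300, 400))
```

  On these simplified assumptions a fractional standard error of 7 per cent corresponds to roughly 400 breakdowns in each of the two arms being compared.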


  The trial will be a rich source of data, in particular to badger ecologists. These notes concentrate on the methods to be used in the primary comparisons of the three treatments. Such a primary analysis will consist of a regression of log number of breakdowns per trial area of the form:



    Treatments x triplets.

    Poisson error.

  with adjustment for regression on baseline variables such as log geographic area, log number of holdings, log number of herds. A supplementary analysis might include an initial measure of badger activity at survey, although it would be important not to interpret any associated effect causally.

  Following conventional wisdom, if an appreciable interaction term arises, a rational explanation would be found if possible. Otherwise the interaction would be treated as an extra source of random variability, ie of overdispersion relative to the Poisson distribution.
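  One common way of quantifying overdispersion relative to the Poisson distribution, offered here as an assumption rather than as the method the ISG will use, is the Pearson statistic divided by its residual degrees of freedom. Standard errors computed under the Poisson assumption are then multiplied by the square root of that ratio when it exceeds one. The function names and the numbers below are hypothetical.

```python
import math


def pearson_dispersion(observed, fitted, n_parameters):
    """Pearson chi-squared divided by residual degrees of freedom."""
    df = len(observed) - n_parameters
    if df <= 0:
        raise ValueError("no residual degrees of freedom")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / df


def inflate_se(poisson_se, dispersion):
    """Scale a Poisson-based SE; never shrink it below the Poisson value."""
    return poisson_se * math.sqrt(max(dispersion, 1.0))


# Hypothetical observed counts and fitted means (NOT trial data).
observed = [12, 7, 9, 15, 4, 10]
fitted = [10.0, 8.0, 10.0, 13.0, 6.0, 9.0]
phi = pearson_dispersion(observed, fitted, n_parameters=3)
print(phi, inflate_se(0.07, phi))
```

  With 10 triplets, three treatments and an additive model for treatments and triplets, the residual would have 30 - 12 = 18 degrees of freedom before any baseline covariates are fitted.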

  The analysis can be done either by maximum likelihood as a generalised linear model or by empirically weighted least squares, ie if N is a count by assigning log(N) a variance of 1/N in a standard regression calculation. The two are identical to the first order of asymptotic theory. The latter may be more flexible if extended versions of the model are needed.
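  The near-equivalence of the two fitting methods can be checked numerically. The sketch below fits an additive model for triplets and treatments to a small table of counts, first by weighted least squares on log(N) with weight N (that is, variance 1/N) and then by maximum likelihood using iteratively reweighted least squares. It assumes NumPy is available, omits baseline covariates, and requires all counts to be positive. The counts, the four-triplet layout and the function names are hypothetical.

```python
import numpy as np


def design_matrix(n_triplets, n_treatments=3):
    """Intercept, triplet dummies and treatment dummies (first level = baseline)."""
    rows = []
    for i in range(n_triplets):
        for j in range(n_treatments):
            row = [1.0]
            row += [1.0 if i == k else 0.0 for k in range(1, n_triplets)]
            row += [1.0 if j == k else 0.0 for k in range(1, n_treatments)]
            rows.append(row)
    return np.array(rows)


def weighted_least_squares(X, counts):
    """Regress log(N) on X, giving log(N) a variance of 1/N (weight N)."""
    y = np.log(counts)
    cov = np.linalg.inv(X.T @ (counts[:, None] * X))
    beta = cov @ X.T @ (counts * y)
    return beta, np.sqrt(np.diag(cov))


def poisson_ml(X, counts, n_iter=50):
    """Poisson log-linear fit by iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, np.log(counts), rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (counts - mu) / mu          # working response, log link
        beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
    mu = np.exp(X @ beta)
    cov = np.linalg.inv(X.T @ (mu[:, None] * X))
    return beta, np.sqrt(np.diag(cov))


# Hypothetical counts, four triplets by (survey-only, reactive, proactive).
counts = np.array([30, 26, 20, 42, 35, 31, 25, 24, 17, 36, 30, 27], dtype=float)
X = design_matrix(4)
beta_wls, se_wls = weighted_least_squares(X, counts)
beta_ml, se_ml = poisson_ml(X, counts)
# The last two coefficients are the log rate ratios for reactive and
# proactive relative to survey-only.
print(beta_wls[-2:], se_wls[-2:])
print(beta_ml[-2:], se_ml[-2:])
```

  For counts of this size the two sets of treatment estimates and standard errors agree closely, in line with the first-order equivalence noted above.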

  There are many aspects that this does not address which will need attention later. For example, as is appropriate for primary comparisons, the above treats the trial area as a unit of study. Yet there is some information within a trial area arising from examining those holdings which do and do not have breakdowns. Also there is the issue of the year-to-year variation within a trial area. Is there evidence of an increasing or decreasing effectiveness of any effects found?


  It was agreed at an early stage that an interim analysis would be done after about 100 breakdowns had accumulated in trial areas and this point has just recently been reached.

  MAFF have a general policy, which we fully support, of making data public but we have argued very strongly that this cannot apply to the detailed breakdown data or the badger TB prevalence data from the trial without potentially catastrophic effects on the whole enterprise. Thus it is likely that at some point suggestive, potentially important but in fact wholly indecisive effects will appear. However strong the "health warning" that might be put on such data, the potential for destroying co-operation in the trial, which is of course voluntary, seems clear. We believe this point is accepted, albeit reluctantly in some quarters.

  While there is every prospect that the trial will need to run for several years and probably for the initially projected period there is at least some possibility that clear conclusions about some parts of the trial will emerge earlier.

  There is a large but somewhat controversial statistical literature on early stopping of trials but this largely centres on significance testing of effects which are likely to be stable in time. The methods involve setting rather rigid rules about when and how many interim analyses are allowed and when a trial should stop early, although no doubt they are rarely applied in so mechanical a way. We do not think these approaches helpful here.

  The reasons are:

    —  the possibility that effects are not constant in time and that there is appreciable inter-triplet variation in effect means that conclusions from a short time period and a small number of triplets, even if in some sense nominally significant, would not be a secure basis for a conclusion;

    —  if the objective is regarded, as we believe it should be, as primarily that of estimating the magnitude of relative reductions via confidence limits, then the need for various detailed specifications (spending error rate and all that) disappears;

    —  the strategy:

    "continue the investigation, do interim analyses from time to time and stop when and only when the required precision is achieved"

  is entirely appropriate (Anscombe, 1953). Note that this is legitimate from any of the main approaches to statistical inference.
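  The quoted strategy can be sketched as a simple rule applied at each interim analysis: compute the fractional standard error of the comparison from the breakdowns accumulated so far and stop only once it falls to the required level. The sketch reuses the crude Poisson approximation sqrt(1/N1 + 1/N2) for the comparison of two arms. The target precision, the cumulative counts and the function name are hypothetical, since the level of precision required has yet to be decided.

```python
import math


def first_look_reaching_precision(cumulative_counts, target_se):
    """Return (look number, SE) at the first interim look meeting the target.

    cumulative_counts is a sequence of (n1, n2) pairs, one per interim
    analysis, giving the breakdowns accumulated in each arm up to that look.
    Returns None if the target is not reached at any look supplied.
    """
    for look, (n1, n2) in enumerate(cumulative_counts, start=1):
        if n1 > 0 and n2 > 0:
            se = math.sqrt(1.0 / n1 + 1.0 / n2)
            if se <= target_se:
                return look, se
    return None


# Hypothetical cumulative breakdowns at successive interim looks (NOT trial data).
looks = [(40, 35), (95, 80), (160, 130), (240, 200), (330, 270)]
print(first_look_reaching_precision(looks, target_se=0.10))
print(first_look_reaching_precision(looks, target_se=0.05))
```

  Unlike a significance-based stopping rule, this rule needs no allocation of error rates across interim looks; it needs only an agreed target precision.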

  At some point ISG will have to discuss what level of precision is suitable.


  The trial should be regarded as having several objectives.

  One is to provide a firm scientifically-based answer to the question of the role, if any, of badgers in bovine TB. While this may sound at first like testing a null hypothesis of no proactive effect, in fact it hardly makes sense other than as a problem of estimation. If there is a proactive effect, it will be necessary to know whether it is a 20 per cent, 50 per cent or 80 per cent reduction or whatever.

  Assuming some effect of culling is found, the second objective is to provide a basis for a culling policy. The ISG has based its approach on sustainability so all or any of the three approaches could be included in future policy. Any such strategy as:

    "if and only if the projected breakdown rate in an area (a county perhaps) exceeds some threshold rT, institute (or allow or encourage) reactive culling over a distance d from the affected farm"

  would require for further analysis a reasonably precise estimate of the reduction in breakdown rate to be anticipated. This could be fed into an economic analysis to determine suitable values of rT and d.


  Anscombe FJ (1953). Sequential estimation (with discussion). J R Statist Soc B 15, 1-29.

  Cox DR and Reid N (2000). The theory of the design of experiments. Boca Raton and London: Chapman & Hall/CRC Press.

  Krebs J, Anderson R, Clutton-Brock T, Morrison I, Young D, Donnelly C, Frost S and Woodroffe R (1997). Bovine tuberculosis in cattle and badgers. Ministry of Agriculture, Fisheries and Food.

  Yates F and Cochran WG (1938). The design and analysis of series of replicated field trials. J Agric Sci 28, 556-580.

© Parliamentary copyright 2001
Prepared 10 January 2001