Select Committee on Social Security Minutes of Evidence


Memorandum submitted by Professor David Marsland, Brunel University (SM 6)

RESEARCH METHOD

  1.1 In the current important phase of its work, the Committee seems to be concerning itself with some awkwardly academic and technical issues in research methodology.

  1.2 I have enclosed a copy of a paper of mine ("Methodological Inadequacies in British Social Science")[3] in case it may be helpful to Committee members in their later deliberations.

  1.3 I would in particular draw attention to the emphasis I give in the paper to:

    (a)  Data quality—whose main dimensions are validity, reliability, and sensitivity. Much of the data which social scientists use is of scandalously poor quality. Hence misleading descriptions, false explanations, and mistaken evaluations of the social world.

    (b)  The unavoidability of interpretation. However good one's data, and however good one's research design and data analysis, if the interpretative phase of reading off a story for an audience is short on disciplined objectivity, mistaken conclusions will inevitably result. Sloppy interpretation, like poor data quality, produces phoney description, explanation, and evaluation.

METHODS OF EVALUATION

  2.1 More specifically, the Committee's considerations are about methods of evaluation. Evaluation research is just one species of research—practical research designed to investigate the effects and effectiveness of systems, policies, procedures, and so on, and to improve them.

  2.2 I have enclosed a copy of part of a book which I am just now finishing ("Research Methods for Health Care Professionals")[1], in case it may be useful to the Committee or its officers.

  2.3 Here I would particularly draw your attention to:

    (a)  The necessity—if policy is to be amended rationally—of striking a sensible balance between high quality research (conceptualisation, design, data collection, analysis, and interpretation) on the one hand and realistic practicality on the other.

    (b)  Identification of appropriate and well-measured criteria of effectiveness—and sticking to them.

    (c)  The grave danger of ideological bias—with evaluation routinely prone to mis-use for either critical or exculpatory purposes. Evaluation research is not intended either to prove that "we are right" or to demonstrate that "they are wrong", but to test effects and effectiveness as objectively as possible.

    (d)  The importance—even given the difficulty of elucidating cause and effect—of focusing on effects rather than general conditions. Without this sharp focus, little can be done to ensure that policy innovation and policy development are handled coherently.

    (e)  The value in this context of an experimental approach to policy development. Innovations should be introduced gradually and in variant forms to allow for comparative investigation of their effects.
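The comparative logic urged in (e) can be sketched in miniature. The figures and the crude difference-of-means comparison below are entirely invented, purely to illustrate how introducing a policy in two variant forms across matched pilot areas makes their effects directly comparable:

```python
# Hypothetical sketch only: outcome scores (invented) for pilot areas
# running two variant forms of the same policy innovation.
from statistics import mean

variant_a = [62, 58, 65, 60, 63]   # invented outcome scores, variant A areas
variant_b = [55, 57, 54, 59, 56]   # invented outcome scores, variant B areas

# Comparing means across variants is the simplest form of the
# comparative investigation the text recommends.
difference = mean(variant_a) - mean(variant_b)
print(f"Variant A outperforms B by {difference:.1f} points on average")
```

In practice one would of course match areas carefully and test the difference for statistical significance, but the design point stands: without variants, there is nothing to compare.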

WELFARE REFORM SUCCESS MEASURES

  Turning now to the Green Paper and the proposed success measures, I have a number of mainly critical comments to make. I should preface this by saying that I am pleased to see this work going on. It is an important part of the difficult task of rationalising social policy which has been going forward gradually for many years.

  3.1 Evaluation of complex social processes and products is challenging and difficult. It needs time, thought, and expertise. My reading of the documentation suggests to me that—as happened earlier with evaluation of educational policy and even more with health care—there is a danger that the work is being rushed, with too much mere consultation of interested parties and too little in-depth, reflective thought and use of research expertise. The outcomes of welfare policy are considerably more subtle, complex, and contentious than those of educational or health policy, so even more care is needed in assessing them.

  3.2 It might be better to commission from an independent research agency a coherent, fully-argued model of welfare evaluation—from basic concepts to the nuts and bolts of measurement. Better still, commission two distinct agencies and test which model works best.

  3.3 The documentation is less than clear about what is to be evaluated. If the object of evaluation is welfare reform, then the criteria should all be precise comparative measures vis-à-vis the unreformed system. If, on the other hand, the aim is to evaluate the reformed system, criteria of success must be stipulated in terms of the purposes of the reforms.

  3.4 I take it that the eight "principles" itemised in the annex on success measures are the purposes and objectives of reform. They seem to me, as an evaluator, too vague and arbitrary to be evaluated properly. They need to be thought through, argued, and sharpened up. This should not be done by specifying the measures to be used—this mistaken procedure, often found in poor evaluations, makes for a futile, vicious, circular argument. First the purposes of reform need to be clarified, argued, and fully defined in detailed, concrete, practical terms. Then, and only then, one can go on to selecting and developing indicators. The first stage requires a fully-argued expert working-paper. A brief ad hoc note is wholly inadequate.

THE PRINCIPLES

  Detailed observations on the eight principles would take more time than I have had available, but I have some brief comments to make. In general, the problem with them is not, as some have argued, that they are apple-pie obvious, but that they are mostly vague and in some cases vacuous. There are also some key omissions—for example: to maximise self-reliance, to strengthen real families, to reduce expenditure on welfare, to avoid helping those whose behaviour is unacceptable by comparison with the deserving, and to encourage people to think more in terms of duties than rights.

  4.1 As formulated, it does not specify what the Welfare State as such should be doing about employment, or therefore make any explicit test of the effects of welfare on employment feasible. As a matter of principle, the Welfare State ought to be doing nothing to discourage work—this would be a real change!

  4.2 It is no part of the private sector's proper function to ensure this happens. In any case what does "partnership" mean here?

  4.3 Why the whole community? Who says we all need either services or cash supplied by the Welfare State? Why "high quality"—why not "excellent quality", or by contrast "adequate quality"?

  4.4 Who is to judge "dignity"? How widely and loosely is the term "disabled" being used?

  4.5 Very awkwardly formulated. The phrase "scourge of child poverty" sounds Old Labour, if not older still.

  4.6 Both "social exclusion" and "poverty" are vacuous concepts. Their indicators are therefore bound to be arbitrary.

  4.7 Needs re-writing.

  4.8 Good. I would divide this into three separate principles—flexibility, efficiency, and ease of use—and thus give more weight in the evaluation to this crucial domain.

THE INDICATORS

  Again I can only comment briefly, with some general observations and others specific to particular indicators.

  5.1 Quantitative, or at least plausible qualitative, thresholds will need to be fixed throughout. For example, the "reduction" of unemployment specified in Indicator 1.1 could be as low as 1 and still pass! We shall need at least to say "significantly" or "substantially", or better "so many thousands per month on average over so many years". In this respect we need to be at least as clear and bold with welfare indicators as we are in education and health care.
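The point about thresholds can be made concrete with a toy sketch. All figures below are invented for illustration; the only claim is logical, namely that a bare "reduction" criterion is satisfied by a fall of 1, whereas a stipulated per-month threshold is not:

```python
# Illustrative sketch (invented figures): an indicator of "reduction"
# only becomes a real test once a quantitative threshold is fixed.

def passes(baseline: int, current: int, required_monthly_fall: int, months: int) -> bool:
    """True if the fall from baseline meets the stated average fall per month."""
    return (baseline - current) >= required_monthly_fall * months

# A fall of 1 over a year satisfies a bare "reduction" criterion...
print(passes(1_000_000, 999_999, required_monthly_fall=0, months=12))       # True
# ...but fails once, say, 5,000 per month is stipulated as the threshold.
print(passes(1_000_000, 999_999, required_monthly_fall=5_000, months=12))   # False
```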

  5.2 We need to be able to justify the number of indicators for each principle, and to indicate how we would interpret conflicting messages delivered by success on say two out of four indicators and failure on two others. In my view, it is better to have fewer rather than more, including only well-measured indicators which unarguably represent the principle in question. The indicators should properly be selected by empirical research and appropriate statistical analysis rather than by political judgment. We could also work out statistically how best to combine selected indicators.
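One simple statistical approach to combining selected indicators, sketched here with wholly invented data, is to standardise each series so that no indicator dominates by virtue of its scale, and then take an explicitly weighted average. This is an illustration of the general idea, not a recommendation of any particular weighting:

```python
# Hypothetical sketch: combining several indicator series for one
# principle into a single composite score per year.
from statistics import mean, stdev

def zscore(series):
    """Standardise a series so indicators on different scales are comparable."""
    m, s = mean(series), stdev(series)
    return [(x - m) / s for x in series]

def composite(indicators, weights):
    """Weighted mean of the standardised indicator series, year by year."""
    standardised = [zscore(s) for s in indicators]
    total = sum(weights)
    return [sum(w * s[i] for w, s in zip(weights, standardised)) / total
            for i in range(len(indicators[0]))]

# Invented figures: two indicators over four years, the second half-weighted.
jobs   = [100, 110, 120, 130]
claims = [50, 48, 47, 45]          # falling claims count as improvement,
score  = composite([jobs, [-c for c in claims]], weights=[1.0, 0.5])  # hence the sign flip
```

In a serious exercise the weights themselves would be derived empirically (for instance by factor analysis) rather than asserted, which is precisely the paragraph's point about statistical rather than political selection.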

  5.3 A major weakness of much official evaluation is that the demand for (usually good) news makes it too short term. Annual change is often merely random turbulence. Two-year, or better three-year, reports would be more meaningful. In any event, if trends are what matter, indicators must be chosen and set for the long term. Thus, for example, all the indicators for Principle 1 may be too modest for the future.
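The contrast between annual turbulence and a genuine trend is easily shown with invented data. In the sketch below the year-on-year changes flip sign repeatedly, while a three-year moving average of the same series rises steadily:

```python
# Sketch (invented data): annual changes look like random turbulence,
# while a three-year moving average exposes the underlying trend.
series = [100, 104, 99, 108, 105, 112, 109, 118]   # noisy but rising

annual_changes = [b - a for a, b in zip(series, series[1:])]
three_year = [sum(series[i:i + 3]) / 3 for i in range(len(series) - 2)]

print(annual_changes)   # +4, -5, +9, -3, +7, -3, +9: signs flip year to year
print(three_year)       # rises steadily across every three-year window
```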

  5.4 There is a case for including at least one indicator which assesses consumer satisfaction (like 8.1) for each principle. This type of indicator is more meaningful than others (except prices) to the general public, and less easily misunderstood, misinterpreted, or manipulated by spin doctors than some other measures. It also has the advantage of being tied closely as an "effect" to a "cause" within the Welfare State's control, or at least influence. By contrast, too many of the chosen indicators could move this way or that as much despite as because of the Welfare State's influence.

  5.5 Add to indicators for Principle 1 job vacancies in specific skill categories. Indicators 1.2 and 1.3 are absurdly too modest.

  5.6 Indicator 2.1 is vacuous—such a guarantee could exist without affecting reality any more than the "rights" in Stalin's constitution. Why should the improved confidence sought in 2.4 be restricted to the private sector? If the Government intends to maintain a state pension, people need to be confident of that too!

  5.7 Indicators 3.1, 3.2 and 3.3 are all meaningless without specific targets. I doubt if the Welfare State as such is responsible for much of the massive improvement in health, education, and housing which has been ongoing for 100 years or more. Better to have more focused indicators within the control of the Welfare State—in relation to the quality of performance in these areas by state employees.

  5.8 Indicators 4.1, 4.2, 4.3, and 4.4 all seem to me to reflect incoherent thinking underlying Principle 4. 4.3 in particular is a logical mess. 4.4 emphasises the rights side of the equation too much—why no reference here to reduction of fraud, for example? Why not include reduction of the number of disabled among the criteria?

  5.9 Indicators 5.1, 5.2, 5.3, and 5.4 also reflect incoherence in the Principle they are supposed to measure. Does 5.1 mean an increase in the proportion of funding going to families with children compared with families without? Indicators 5.1 and 5.2 may contradict each other—if more money is given to families with children, there is arguably less incentive for the adults involved to seek work.

  5.10 6.1, 6.2, and 6.3 are arbitrary indicators of a meaningless concept. Why these and not others? 6.1 is a glib rag-bag—while truancy must be reduced, exclusion is a completely distinct phenomenon, and not necessarily to be deprecated. What matters is how we treat excluded pupils, rather than how many there are. 6.3 is a Sunday School aspiration which is probably as meaningless as it is infeasible.

  5.11 Indicators 7.1 and 7.2 will require much development work if they are to be coherently assessed. Client satisfaction measures perhaps have a part to play.

  5.12 Indicator 8.2 is surely a part of 8.1 and not separate. 8.3, 8.4, and 8.5 are very important measures, whose progress—or otherwise—will be watched with interest. I would build in some specific comparisons in this section with private sector provider benchmarks.

CONCLUSION

  It seems to me that in this phase of its work the Committee is taking on a most important task. If it is handled well, it could make a very valuable contribution to the quality of public administration across the board. All the public services of a truly modern society stand in need of coherent, rational evaluation. A powerful lead for other spheres of government—beyond even the wide swathe of welfare—could be provided by focusing carefully on two crucial issues:

  6.1 Getting the logic of evaluation sorted out straight once and for all.

  6.2 Securing reliance on up-to-date statistical analysis in the identification, measurement, and combination of indicators.

June 1998


3   See Ev p. 29 para. 4.5.


 

© Parliamentary copyright 1998
Prepared 21 July 1998