Select Committee on Health Written Evidence

Evidence submitted by the Institute for Innovation & Valuation in Health Care (NICE 18)


  1.1  The Institute for Innovation & Valuation in Health Care (InnoVal-HC) welcomes the opportunity to submit a response to the Health Committee's inquiry into aspects of the work of the National Institute for Health and Clinical Excellence (NICE).

  1.2  InnoVal-HC ("the Institute") is an independent not-for-profit scientific organisation dedicated to research into the principles of economic evaluation of health care technologies and their application. The Institute was founded in June 2005 and has since been formally associated with the University of Applied Economic Sciences Ludwigshafen, Germany.

  1.3  The Institute's remit includes conducting analyses and research into the methods and ethical foundations of health economic evaluations, the mechanisms of delivering and financing health care, the valuation of innovative technologies, procedures, and products, and the acceptability of technologies based on their cost-benefit and cost-effectiveness ratios.

  1.4  The Institute does not operate as a contract research organisation. As a matter of principle, the Institute accepts support exclusively under a policy of unrestricted educational grants. To date, the Institute has received support from policy makers', payers', providers', physicians', patients', and pharmacists' organisations, as well as from the pharmaceutical industry.


  We would particularly like to draw the attention of the Committee to the following points:

    —  NICE's international standing

    —  The logic of cost-effectiveness

    —  NICE accountability for reasonableness

    —  Concluding remarks and recommendations


  3.1  Internationally, NICE's technology appraisal programme is broadly considered a role model for health technology assessments including economic evaluation. Following the House of Commons Health Select Committee report of June 2002, NICE commissioned the WHO Regional Office for Europe to carry out a review of its Technology Appraisal Programme. The WHO review team described key principles of the NICE approach as "the use of best available evidence in decision-making, transparency, consultation, inclusion of all key stakeholders, and responsiveness to change." They concluded that, "in all of these areas, it is clear that NICE is setting a new, international benchmark, for which it can and should be congratulated" (Hill et al., 2003).

  3.2  Further to this, NICE has assumed a leading role internationally by fostering methodological advances such as the use of probabilistic sensitivity analyses (intended to capture decision uncertainty) and mixed treatment comparison techniques (in order to enable indirect comparisons of technologies in the absence of head-to-head studies).

  3.3  Against this background, guidance issued by NICE, as well as the underlying technology assessments and appraisals, have attracted much attention internationally. For example, the US National Guideline Clearinghouse routinely lists recommendations by NICE on its website. In addition, policy makers in jurisdictions other than the UK (such as the US and Germany) have engaged in debate about the adoption of NICE-like processes in the context of National Health Technology Assessment (HTA) programmes.

  3.4  In the context of this international debate, we conducted a qualitative study of the robustness of the NICE approach (Schlander, 2007a). Here we report on some of our key observations.


  4.1  The logic of cost-effectiveness as adopted by NICE—in contrast to traditional cost-benefit analysis—does not represent an application of standard economic theory (eg, Birch and Donaldson, 2003; Birch and Gafni, 2006). The approach was rather developed by decision analysts with an operations research background, striving to transfer methods to optimise the efficiency of manufacturing processes to the production of health (cf. Torrance, 2006). Specifically, NICE has chosen to use cost-utility analysis—a variant of cost-effectiveness analysis—as its reference case, with Quality-Adjusted Life Years (QALYs) as a universal and comprehensive measure of health-related outcomes (NICE, 2004).

  4.2  It is a fundamental and well established principle of decision analysis that "the identification and structuring of objectives essentially frames the decision being addressed. It sets the stage for all that follows" (Keeney and Raiffa, 1993). To be relevant, analytic decision support relies on prior clarification of values and objectives to be pursued (Keeney, 1992). To a great extent, therefore, applying the logic of cost-effectiveness to inform health care resource allocation decisions hinges on the assumption that "the principal objective of the National Health Service (NHS) ought to be to maximise the aggregate improvement in the health status of the whole community" (Culyer, 1997; earlier for instance: Weinstein and Stason, 1977). While it appears trivial that health care services (should) produce health, it is by no means self-evident to make a quick leap from here to an assumed "principal objective" of collectively financed health care to simply maximise some construct (QALYs or otherwise) of health-related consequences.

  4.3  In fact, there is little if any evidence that the maximisation view (sometimes justified with an asserted "consensus in the literature" without specifying sources; see Torrance, 2006) is shared by the general population (Coast, 2004). On the contrary, there has been a rapidly growing body of studies that collectively show that this assumption is "empirically flawed" (Dolan et al., 2005; and others). Controversial issues revolve around (but are not limited to) a higher social priority for interventions when the severity of the patient's condition increases, with life-saving interventions most highly valued (this is sometimes referred to as "the rule of rescue", cf. Jonsen, 1986; Hadorn, 1991; Nord, 1999; Ubel, 2000; McKie and Richardson, 2003), and for people in so-called double jeopardy (ie, with more than one condition causing impairment) who have fewer QALYs to gain from successful interventions compared to otherwise healthy individuals (cf. Singer et al., 1995; Harris, 1995; McKie et al., 1996). As a consequence, there has been a call for more research into "empirical ethics" by leading health economists (eg, Richardson and McKie, 2005).

  4.4  The maximisation assumption has also been critiqued from a normative perspective. Concerns prominently include the implied valuation of human life as a function of health status, as opposed to viewing the value of life as a dimension distinct from health (Harris, 1987; Arnesen and Nord, 1999; and many others).

  4.5  In the absence of a gold standard against which to judge the criterion validity of the logic of cost-effectiveness, it has been proposed to use the so-called reflective equilibrium approach to examine the social acceptability of the resulting rankings of health care programmes (Daniels, 2001; Nord, 1992). Thus the problems involved in the application of standard decision rules derived from the logic of cost-effectiveness are perhaps best illustrated using an example: Assuming the (incremental) cost per QALY gained was, for example, approximately £3,600 for sildenafil in erectile dysfunction (Stolk et al., 2000), approximately £7,000 for pharmacotherapy of children with attention deficit hyperactivity disorder (NICE, 2006), and >£120,000 for beta-interferons and glatiramer in multiple sclerosis (NICE, 2002), would this ranking reflect the comparative social desirability of these interventions (cf. McGregor, 2003)?
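
  The ranking implied by these figures can be made explicit with a minimal sketch. The cost-per-QALY values are those quoted in paragraph 4.5 (the multiple sclerosis figure is a lower bound only); the threshold of £30,000 and the function names are assumptions introduced purely for illustration and are not taken from this submission.

```python
# Incremental cost per QALY gained (GBP), as quoted in paragraph 4.5.
# The multiple sclerosis value is reported only as ">120,000", so it is a lower bound.
cost_per_qaly = {
    "sildenafil (erectile dysfunction)": 3_600,
    "pharmacotherapy (ADHD in children)": 7_000,
    "beta-interferons and glatiramer (multiple sclerosis)": 120_000,
}

# Hypothetical acceptance threshold, chosen for illustration only.
ILLUSTRATIVE_THRESHOLD = 30_000


def league_table(icers):
    """Rank interventions from lowest to highest cost per QALY."""
    return sorted(icers.items(), key=lambda item: item[1])


def accepted_under_threshold(icers, threshold):
    """Interventions that a pure cost-per-QALY threshold rule would accept."""
    return [name for name, icer in league_table(icers) if icer <= threshold]


ranking = league_table(cost_per_qaly)
accepted = accepted_under_threshold(cost_per_qaly, ILLUSTRATIVE_THRESHOLD)

for rank, (name, icer) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: about GBP {icer:,} per QALY")
```

  Under such a rule the ordering depends on the cost-per-QALY figure alone; severity of the condition or the availability of alternatives plays no part, which is the point at issue in the question above.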

  4.6  Far from representing a phenomenon encountered in England and Wales only, the issue of counterintuitive rankings had been a major obstacle already faced by the protagonists of cost-effectiveness analysis for resource allocation in the Oregon Health Plan (cf. Hadorn, 1991). It is a conspicuous observation that reviews of the usefulness of such rankings ("QALY league tables") by many health economists have addressed a variety of technical issues in detail but have not paid attention to the validity of such rankings (eg, Drummond et al., 1993; Mauskopf et al., 2003).

  4.7  Importantly, the issue of counterintuitive rankings should not be confused with the problem of distorted human judgments due to "heuristics and biases" (Gilovich et al., 2002), as moral intuitions in the sense of reflected values and beliefs cannot be invalidated simply on grounds of their incompatibility with competing normative claims. Of note, it has even been argued by philosophers that there may exist an irreducible pluralism at the foundations of normative ethics (cf. Nagel, 1979).


  5.1  Recognizing both the difficulty democratic societies have in achieving consensus on distributive principles for health care and the need for legitimacy of allocation decisions, Norman Daniels and James Sabin (2002) proposed a framework for institutional decision-making, which they call "accountability for reasonableness" (A4R). In order to narrow the scope of controversy, A4R relies on "fair deliberative procedures that yield a range of acceptable answers" and consists of four conditions.

  5.1.1  Publicity, ie, resource allocation decisions must be public, including the grounds for making them. Transparency should open decisions and their rationales for scrutiny by all affected, not just the members of the decision-making group.

  5.1.2  Relevance, ie, "the grounds for decisions must be ones that fair-minded people can agree are relevant to meeting health care needs fairly under reasonable resource constraints." Arguments should rest on scientific evidence, though not necessarily of a specific kind, and appeal to the notion of "fair equality of opportunity." Although Daniels and Sabin acknowledge that stakeholder participation may improve deliberation about complicated matters, they believe it is neither a necessary nor a sufficient condition of A4R.

  5.1.3  Revisions and appeal, ie, there must be an institutional mechanism to engage a broader segment of society in the process, providing those affected by a decision with an opportunity to reopen deliberation, and offering decision-makers an option to revise funding decisions in light of further arguments.

  5.1.4  Enforcement entails some form of regulation to make sure that the first three conditions are met.

  5.2  Seeking to combine legitimacy and pragmatism, and realizing that utilitarianism "has next to nothing to offer in eradicating health inequalities" (Rawlins and Dillon, 2005), NICE put aside questions of whether matters of content can "be resolved solely with a reference to 'due process'" (Hasman and Holm, 2005) and has explicitly subscribed to the principles of accountability for reasonableness (Rawlins and Dillon, 2005). At the same time, NICE reaffirmed its preference for cost-utility analysis with QALYs "as its principal (though not only) measure of health gain."

  5.3  A preliminary case study of a recent NICE Technology Appraisal (No. 98) focused on the processes adopted by NICE. It confirmed the high (albeit not perfect) level of transparency, predictability, and the participatory nature of the NICE approach (Schlander, 2007b). While largely in agreement with the positive WHO review (Hill et al., 2003), the analysis also indicated a need for further in-depth inquiry.

  5.4  A subsequent more comprehensive in-depth review focusing on the technology assessment report informing NICE Technology Appraisal No. 98 did not confirm the expected robustness of the NICE evaluation process, revealing a striking number of limitations and anomalies (Schlander, 2007c). Collectively these left the assessment open to critique regarding all essential components of a technology review question, namely the population studied, the choice of interventions, the clinical and economic criteria used, as well as the study designs and selection criteria (cf. CRD, 2001). Furthermore, the structure of the economic model itself was prone to distortion and bias in various ways, and an unsettling number of consistency problems were identified within the assessment report. As a consequence, the assessment did not fully consider the best available evidence and was unable to identify any differences in clinical effectiveness between the treatment options evaluated.

  5.5  A number of underlying problems were suggested to explain the observed limitations, notably including an insufficient integration of clinical and economic perspectives, a high level of standardisation that forces the problem to fit a preconceived solution approach (including [but not limited to] the use of QALYs as the effectiveness measure), and issues related to the technical quality of the assessment itself (Schlander, 2007a).

  5.6  Process-related observations may be compared to the conditions of accountability for reasonableness:

  5.6.1  Publicity. The overall process was well structured and followed well-defined timelines with predictable opportunities for (some) stakeholders to provide input; key documents were continuously published on the NICE website. Major limitations of transparency were related to the use of commercial-in-confidence information (a situation on which NICE has since taken action), the economic model developed by the assessment group, and decision-making criteria beyond cost-effectiveness used by the appraisal committee. Designating economic models as "proprietary" insulates a major component of technology assessments from public scrutiny and does not meet established standards of good economic modelling practice (eg, Philips et al., 2004; Brennan and Akehurst, 2000). It might be added that this practice prevents academic debate as well and, therefore, is not conducive to the further development of health economic evaluation methods. As admitted by NICE (cf. above, 5.2), quasi-utilitarian maximisation of QALY gains irrespective of their distribution does not provide a sufficient basis for health care resource allocation in tune with social preferences. Thus, it is a critical transparency issue that decision criteria other than cost-effectiveness have not (yet) been codified by NICE.

  5.6.2 Relevance. In the absence of codified criteria for fairness and with its heavy (albeit not exclusive) reliance on cost-effectiveness benchmarks, the specific NICE approach may be characterised as an "efficiency-first" strategy (cf. Richardson and McKie, 2006). It has been argued by observers that this approach in practice will result in the marginalization of other factors "as outside of NICE's terms of reference" (Redwood, 2006). It seems unlikely that the current approach will adequately capture social preferences for health care provision. A current example nicely illustrating these issues is the debate about the cost-effectiveness of expensive drugs to treat patients with rare disorders ("orphan drugs"). Given the high fixed (ie, volume-independent) and low variable cost structure of the pharmaceutical industry, applying the logic of cost-effectiveness would inevitably deprive these patients of any chance to receive effective treatment (cf. McCabe et al., 2005, 2006; Hughes, 2005, 2006). While not meant to dismiss any need to make thorny trade-off decisions, this example may serve to illustrate the role of budgetary impact for reimbursement decision-making, which NICE has repeatedly denied taking into consideration (Rawlins and Culyer, 2004; Pearson and Rawlins, 2005), despite at least some indications to the contrary (Dakin et al., 2006). While this position taken by NICE appears questionable on both theoretical and pragmatic grounds, it is evident that recognition of the relevance of budgetary impact would have fatal implications for any attempt to interpret the logic of cost-effectiveness in a normative way (Donaldson et al., 2002; Schlander, 2003, 2005).
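
  The orphan drug argument rests on simple arithmetic, which the following sketch tries to make visible. It assumes, as a deliberate simplification, that the price per patient must recover a volume-independent fixed cost spread over all treated patients plus a small variable cost. Every figure and name in it is hypothetical and chosen only for illustration.

```python
def cost_per_qaly(fixed_cost, variable_cost_per_patient, patients, qalys_per_patient):
    """Cost per QALY when fixed costs are spread evenly over all treated patients.

    Simplifying assumption: price per patient = fixed_cost / patients + variable cost.
    """
    cost_per_patient = fixed_cost / patients + variable_cost_per_patient
    return cost_per_patient / qalys_per_patient


# Hypothetical inputs, identical for both disorders except the number of patients.
FIXED_COST = 500_000_000      # volume-independent cost (GBP)
VARIABLE_COST = 1_000         # cost per treated patient (GBP)
QALYS_PER_PATIENT = 2.0       # health gain per treated patient

common = cost_per_qaly(FIXED_COST, VARIABLE_COST, 1_000_000, QALYS_PER_PATIENT)
rare = cost_per_qaly(FIXED_COST, VARIABLE_COST, 1_000, QALYS_PER_PATIENT)

print(f"common disorder: GBP {common:,.0f} per QALY")
print(f"rare disorder:   GBP {rare:,.0f} per QALY")
```

  With identical clinical benefit per patient, the cost per QALY in this sketch differs only because of the size of the patient population, so a fixed cost-effectiveness benchmark would systematically disadvantage treatments for rare disorders.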

  5.6.3 Revisions and appeal. NICE provisions for appeal are more restrictive than those provided for by A4R. Appeals are limited to specific grounds and do not allow debate to be reopened. It seems unlikely that these limitations are compensated for by opportunities for (invited) consultees and commentators to provide input during the process, owing to the relatively short windows of opportunity compared to the massive amount of data to be reviewed and to their limited transparency (cf. above, 5.6.1).

  5.6.4 Enforcement. There is no indication that NICE has implemented an effective quality assurance system for technology assessments. The design of effective provisions would have to take into account that conventional peer-review processes are unlikely to be up to the task of assessing the quality of economic evaluation models (Brennan and Akehurst, 2000; Hill et al., 2000).

  5.6.5 Implementation. Following Hasman and Holm (2005), proper enforcement of decisions should ensure that reasoning is "decisive in priority setting and not merely a theoretical exercise". Although NICE and the NHS have made substantial efforts to improve actual implementation of guidance, there remain issues in this area as well (cf. Sheldon et al., 2004; Freemantle, 2004). It has been suggested that guidance may be "more likely to be adopted when there is strong professional support, a stable and convincing evidence base" and that "guidance needs to be clear and reflect the clinical context" (Sheldon et al., 2004)—conditions that were arguably not fulfilled in the case of Technology Appraisal No. 98 (Schlander, 2007a,c).

  5.7 NICE has established a "Citizens Council" to provide input "on the topics it wants the council to discuss" and to ensure that its "value judgments resonate broadly with the public" (Rawlins and Culyer, 2004), while maintaining that its guidance "is based on clinical and cost-effectiveness evidence" (NICE, 2007). The Citizens Council has shown some concern for considerations of social justice but endorsed NICE's approach, concluding that "cost-utility analysis is necessary but should not be the sole basis for decisions on cost-effectiveness" (NICE, 2005a,b). It might be worthwhile to explore in more depth whether the Citizens Council was confronted with the issue of cost-per-QALY rankings such as those cited above (see 4.5), ie, with the logic that providing 10 people with a utility gain of 0.1 for the rest of their life (equivalent to sildenafil treatment for men with erectile dysfunction) is indeed considered equivalent to saving the life of a single (otherwise healthy) person.
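
  The equivalence described in paragraph 5.7 follows directly from how QALYs are aggregated, as the short sketch below tries to show. It assumes undiscounted QALYs, the same number of remaining life years for everyone, a utility of 1.0 for the otherwise healthy person whose life is saved, and a utility of 0 for death. The figure of 40 remaining years and the function name are hypothetical.

```python
import math


def qalys_gained(people, utility_gain, years):
    """Undiscounted QALYs: number of people x utility gain x years of benefit."""
    return people * utility_gain * years


REMAINING_YEARS = 40  # hypothetical; the equivalence holds for any common value

# Ten people each gain 0.1 in utility for the rest of their lives.
many_small_gains = qalys_gained(10, 0.1, REMAINING_YEARS)

# One otherwise healthy person is saved (utility 1.0 instead of 0).
one_life_saved = qalys_gained(1, 1.0, REMAINING_YEARS)

print(many_small_gains, one_life_saved)
print(math.isclose(many_small_gains, one_life_saved))
```

  Because only the aggregate enters the calculation, the two programmes score identically under these assumptions, regardless of how the gain is distributed across people.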

  5.8 Summing up, there are good reasons to be suitably impressed by the attempts by NICE to ensure rigorous systematic reviews, objective economic evaluation, stakeholder participation, and transparency of process as well as value judgments. This notwithstanding, NICE is still in its infancy (cf. Williams, 2004), and—in our conclusion—there remains a long way to go before conditions of accountability for reasonableness will have been met.


  6.1  At this point in time, our observations do not confirm "NICE's use of cost effectiveness as an exemplar of a deliberative process", as one of its founding fathers recently claimed (Culyer, 2006). In our conclusion, a more balanced perspective would seem commendable, as there is reason for concern as to the robustness of NICE health technology assessment processes as well as their specific focus on "efficiency" in terms of aggregated QALY maximisation.

  6.2  In particular, in our view it would seem justified to (re)consider (a) more flexible approaches in terms of process as well as analytic procedures (enabling the problem-solving strategy to be adapted to the clinical decision problem at hand), (b) the extent of reliance on QALYs as the (exclusive?) clinical effectiveness measure, (c) the level of integration of clinical and economic perspectives, and (d) the implementation of an effective quality assurance system for technology assessments. From an international perspective, we further note that the value judgments of NICE are not universally shared.

Professor Michael Schlander

InnoValHC, Eschborn, Germany

16 March 2007


© Parliamentary copyright 2007
Prepared 17 May 2007