House of Commons - International Development

Select Committee on International Development Written Evidence

Supplementary memorandum submitted by the Department for International Development (DFID)

ANSWERS TO THE INTERNATIONAL DEVELOPMENT COMMITTEE'S FOLLOW-UP QUESTIONS SECTION B—1 OCTOBER 2007

PSA TARGETS AND PERFORMANCE

1. Question 20

In response to the last sub-question on the scoring exercise for DFID's performance against the PSA target, you say that "DFID commissioned an independent review of the accuracy and consistency of scores over the last three years." We would be grateful to know who carried out the independent review and to receive a copy of the relevant report.

This review, entitled Assessing the Quality of DFID's Project Reviews, was carried out by Nigel Thornton of Agulhas. The part of the report that relates to project scoring is attached.

2. Efficiency Gains

You sent us a spreadsheet ("Turnaround sheet" spreadsheet, "£-Gains" worksheet—attached for ease of reference) showing forecast and actual efficiency savings each quarter. This shows currently forecast savings for each quarter in 2007-08—£163 million, £275 million, £370 million and £517 million. Do these figures represent savings that will be identified in that year, and thereby exclude any earlier identified savings such as the £434 million identified by March 2007?

Yes.

We understand that for the Efficiency Programme departments are required to record savings that are sustainable and will recur in subsequent periods. Is any adjustment made to claimed savings to date for any savings which have only a one-off (ie non-recurring) impact?

No.

ASSESSING THE QUALITY OF DFID'S PROJECT REVIEWS

INTRODUCTION

1. The assignment was commissioned to answer:

(a) whether the quality of DFID project documentation (particularly reviews) is changing over time, and

(b) whether scoring of reviews is consistent across the portfolio.

2. It is primarily a descriptive report of findings. In some cases comments are made, but the principal function is to identify what is taking place.

BACKGROUND

3. The review was undertaken in late 2006 and early 2007. The consultant was provided with data and documentation by Corporate Planning and Performance Group (CPPG) for a sample of projects which PRISM reports as reviewed in 2004 and 2006. The sample was intended by CPPG (as detailed in Annex 1) to allow comparison between these two time periods.

4. The sample included only projects/programmes with a total commitment value of £5 million or over. It was thus expected that documentation would be easily available (as required by corporate guidance). Unfortunately, this did not prove to be the case. Not all the projects had sufficient information to allow full review. It also proved difficult to identify the version, date and authorship of many of the documents. The effort expended by key staff (notably Steve Martin and his team) in trying to obtain the sample's PHS, Log Frames, Project Memoranda and reviews was considerable, and is acknowledged here.

5. Ultimately information relating to 219 projects was captured (876 documents in all, provided both electronically and in hard copy), split between 42 different principal MIS codes (ie countries or regional funds). Since this was a representative sample from across the entire DFID regional portfolio, larger programmes dominated; thus 38 projects in the full sample were from India, 26 from Bangladesh.

6. Unfortunately, during the assessment process it emerged that of the 85 projects reviewed prior to 1 July 2005, only 44 had log frames and 63 had reviews. CPPG's view was that this was too few for the assessment to be seen as statistically representative of DFID's activities from this period. The focus of the assignment thus shifted to the 134 projects reviewed after July 2005, 97 of which had sufficient information to answer this review's questions. Data had, however, been collected for 28 projects from the pre-July 2005 sample prior to the decision to focus on the later information. Whilst not formally significant, CPPG has requested the inclusion of this data in the report as an indicative comparator.

THE TASK

7. Based on the information provided, the consultant was asked to assess the quality of reporting for each project, using a common set of questions agreed with CPPG (see Annex 2 for details of these, Annex 3 for the responses). Sixty items of information per project were collected. These were both enumerative and evaluative, covering quality, clarity and accuracy of information. Where there was doubt, the scoring erred on the side of generosity.

8. The quality review considers (a) target setting, and (b) reviews. Some preliminary recommendations are made at the end of the report.

9. Given the lack of data for earlier projects, it proved impossible to track changes in quality through time conclusively. However, data from the two time periods is presented later in the report.

10. Disclaimer: It is important to note that the consultant was asked to make judgements based only on the information available. The findings here are not a detailed primary evaluation of project performance. The views in this report are the consultant's alone and caution should be exercised in using the findings.

The relevant extract on project scoring is attached.

THE QUALITY OF REVIEWS

11. This assignment was asked to compare project performance entered on PRISM with an assessment based on the evidence provided in the documentation.

12. The following table compares the scores entered on PRISM with the assessed scores. It identifies whether the PRISM scoring was justified by the documentation. As can be seen, the PRISM score and the assessment from the evidence agreed in 62% of all cases.

Purpose Level Scores

	Assessment (post-July 2005, 97 projects)
PRISM	1	2	3	4	5	6/X
1	5%	3%	-	-	-	2%
2	-	35%	11%	-	-	10%
3	-	1%	20%	5%	-	4%
4	-	-	-	-	-	-
5	-	-	-	-	1%	-
6	-	-	-	-	-	1%

— Assessed score higher than PRISM by 1

— Scoring of PRISM and assessment the same

— PRISM score higher than assessed by 1

— PRISM score higher than assessed by 2 or more

— PRISM scores assessed as too early or insufficient data.

13. The table indicates that in 19% of cases scores may have been inflated by one position. In only one case did the evidence suggest that a project was clearly under-scored on PRISM.

14. For a further 16% of the sample, whilst scores were provided, there was either insufficient evidence in the documentation to support any assessment (ie a score with inadequate or no justification), or from the evidence provided it was clearly too early to tell what the performance was.

15. The greatest variation in scoring was across the boundary between box two and box three scores. 63% of all the PRISM box two scores were also assessed as box two. However, 19% were assessed as only having sufficient information to warrant a box three score. There was insufficient data or evidence to score 17% of all the projects that PRISM indicates warranted a box two score.
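The figures quoted in paragraphs 12 to 15 can be re-derived from the post-July 2005 table. The following is a minimal sketch only, not part of the consultant's method. It assumes that box one is the most favourable score, that "X" denotes the "6/X" column, and that a PRISM score of 6 paired with 6/X counts as agreement (the reading that reproduces the reported 62%). The names TABLE, categorise, summarise and row_shares are illustrative.

```python
# Percentages of the 97 post-July 2005 projects, transcribed from the table
# above. Keys are (PRISM score, assessed score); "X" is the "6/X" column.
# Blank cells are omitted.
TABLE = {
    (1, 1): 5, (1, 2): 3, (1, "X"): 2,
    (2, 2): 35, (2, 3): 11, (2, "X"): 10,
    (3, 2): 1, (3, 3): 20, (3, 4): 5, (3, "X"): 4,
    (5, 5): 1,
    (6, "X"): 1,
}


def categorise(prism, assessed):
    """Place one cell in a category, assuming 1 is the most favourable score."""
    if assessed == "X":
        # Assumption: PRISM 6 against 6/X is treated as agreement.
        return "same" if prism == 6 else "insufficient"
    gap = assessed - prism  # positive: PRISM more favourable than the evidence
    if gap == 0:
        return "same"
    if gap == 1:
        return "prism_higher_by_1"
    if gap >= 2:
        return "prism_higher_by_2_plus"
    return "assessed_higher"


def summarise(table):
    """Total the percentages falling in each category."""
    totals = {}
    for (prism, assessed), pct in table.items():
        key = categorise(prism, assessed)
        totals[key] = totals.get(key, 0) + pct
    return totals


def row_shares(table, prism):
    """Share of one PRISM row falling in each assessed column, in percent."""
    row = {a: pct for (p, a), pct in table.items() if p == prism}
    total = sum(row.values())
    return {a: 100 * pct / total for a, pct in row.items()}


totals = summarise(TABLE)
box_two = row_shares(TABLE, 2)
print(totals)
print({column: round(share, 1) for column, share in box_two.items()})
```

Under these assumptions the totals come out at 62 (same), 19 (PRISM higher by one), 1 (assessed higher) and 16 (insufficient data), matching paragraphs 12 to 14. The published cells sum to 98 rather than 100 because they are rounded. For the same reason the box two shares come out at 62.5, 19.6 and 17.9 rather than the 63%, 19% and 17% of paragraph 15, which were presumably calculated from the underlying project counts.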

16. If there is a tendency to inflate scores, it is possible to hypothesise about some causes. The wording of the scoring ("likely to be achieved") and the desire of staff to be positive perhaps influence many reviewers to be optimistic (which will particularly be the case early in a project's life). If there is an inflation of scores across the box two and three boundary, this may relate to the current focus on box one and two scores for corporate reporting.

Has the quality of reviews changed?

17. The assessment of the reviews undertaken prior to July 2005 indicates a broadly similar distribution to those reviewed after July 2005, albeit with a higher proportion over-scored overall (two-thirds of this small sample, compared with 35% of the post-July 2005 sample).

Purpose Level Scores

	Assessment (pre-July 2005, 28 projects, actual numbers)
PRISM	1	2	3	4	5	6/X
1	1	3	-	1	-	3
2	-	3	4	-	-	3
3	-	-	1	-	-	-
4	-	-	-	1	-	4
5	-	-	-	-	-	-
6	-	-	-	-	-	-

— Assessed score higher than PRISM by 1

— Scoring of PRISM and assessment the same

— PRISM score higher than assessed by 1

— PRISM score higher than assessed by 2 or more

— PRISM scores assessed as too early or insufficient data.

18. If this small sample is indicative of the pre-July 2005 situation, it would show that whilst there is a degree of over-scoring, this trait has reduced in the more recent past. However, this sample is not statistically significant and it is not possible from this evidence to conclusively deduce whether this is a pattern which has recently changed.

In addition to the specific analysis on project scoring (above), the report also looked at the clarity of targets, the composition of review teams and a range of issues around lesson learning.

March 2007



© Parliamentary copyright 2007	Prepared 15 November 2007