Memorandum submitted by John Graham-Cumming
(CRU 55)
I am writing at this late juncture regarding
this matter because I have now seen that two separate pieces of
written evidence to your committee mention me (without using my
name) and I feel it is appropriate to provide you with some further
information. I am a professional computer programmer who started
programming almost 30 years ago. I have a BA in Mathematics and
Computation from Oxford University and a DPhil in Computer Security
also from Oxford. My entire career has been spent in computer
software in the UK, US and France.
I am also a frequent blogger on science topics
(my blog was recently named by The Times as one of its top 30
science blogs). Shortly after the release of emails from UEA/CRU
I looked at them out of curiosity and found that there was a large
amount of software along with the messages. Looking at the software
itself I was surprised to see that it was of poor quality. This
resulted in my appearance on BBC Newsnight criticizing the quality
of the UEA/CRU code in early December 2009 (see http://news.bbc.co.uk/1/hi/programmes/newsnight/8395514.stm).
That appearance, and errors I subsequently
found in both the data provided by the Met Office and the code
used to process that data, are referenced in two submissions. I
had not previously planned to submit anything to your committee,
as I felt that I had nothing relevant to say, but the two submissions
which reference me warrant some clarification directly from me,
the source.
I have never been a climate change skeptic and
until the release of emails from UEA/CRU I had paid little attention
to the science surrounding it.
In the written submission by Professor Hans von Storch
and Dr Myles R Allen there are three paragraphs that concern me:
"3.1 An allegation aired on BBC's "Newsnight"
that software used in the production of this dataset was unreliable.
It emerged on investigation that the neither of the two pieces
of software produced in support of this allegation was anything
to do with the HadCRUT instrumental temperature record. Newsnight
have declined to answer the question of whether they were aware
of this at the time their allegations were made.
3.2 A problem identified by an amateur computer
analyst with estimates of average climate (not climate trends)
affecting less than 1% of the HadCRUT data, mostly in Australasia,
and some station identifiers being incorrect. These, it appears,
were genuine issues with some of the input data (not analysis
software) of HadCRUT which have been acknowledged by the Met Office
and corrected. They do not affect trends estimated from the data,
and hence have no bearing on conclusions regarding the detection
and attribution of external influence on climate.
4. It is possible, of course, that further
scrutiny will reveal more serious problems, but given the intensity
of the scrutiny to date, we do not think this is particularly
likely. The close correspondence between the HadCRUT data and
the other two internationally recognised surface temperature datasets
suggests that key conclusions, such as the unequivocal warming
over the past century, are not sensitive to the analysis procedure."
I am the "computer analyst" mentioned
in 3.2 who found the errors mentioned. I am also the person mentioned
in 3.1 who looked at the code on Newsnight.
In paragraph 4 the authors write "It is
possible, of course, that further scrutiny will reveal more serious
problems, but given the intensity of the scrutiny to date, we
do not think this is particularly likely." This has turned
out to be incorrect. On 7 February 2010 I emailed the Met Office
to tell them that I believed I had found a wide-ranging problem
in the data (and by extension the code used to generate the data)
concerning error estimates surrounding the global warming trend.
On 24 February 2010 the Met Office confirmed via their press office
to Newsnight that I had found a genuine problem with the generation
of "station errors" (part of the global warming error
estimate).
In the written submission by Sir Edward Acton
there are two paragraphs that concern the things I have looked
at:
"3.4.7 CRU has been accused of the effective,
if not deliberate, falsification of findings through deployment
of "substandard" computer programs and documentation.
But the criticized computer programs were not used to produce
CRUTEM3 data, nor were they written for third-party users. They
were written for/by researchers who understand their limitations
and who inspect intermediate results to identify and solve errors.
3.4.8 The different computer program used
to produce the CRUTEM3 dataset has now been released by the MOHC
with the support of CRU."
My points:
1. Although the code I criticized on Newsnight
was not the CRUTEM3 code, the fact that the other code written
at CRU was of low standard is relevant. My point on Newsnight
was that it appeared that the organization writing the code did
not adhere to standards one might find in professional software
engineering. The code had easily identified bugs, no visible test
mechanism, was not apparently under version control and was poorly
documented. It would not be surprising to find that other code
written at the same organization was of similar quality. The fact
that I subsequently found a bug in the actual CRUTEM3 code only
reinforces my opinion.
2. I would urge the committee to look into whether
statement 3.4.8 is accurate. The Met Office has released code
for calculating CRUTEM3 but they have not released everything
(for example, they have not released the code for "station
errors" in which I identified a wide-ranging bug, or the
code for generating the error range based on the station coverage),
and when they released the code they did not indicate that it
was the program normally used for CRUTEM3 (as implied by 3.4.8)
but stated "[the code] takes the station data files and makes
gridded fields in the same way as used in CRUTEM3." Whether
3.4.8 is accurate or not probably rests on the interpretation
of "in the same way as". My reading is that this implies
that the released code is not the actual code used for CRUTEM3.
It would be worrying to discover that 3.4.8 is inaccurate, and
I believe it should be clarified.
I remain at your disposal for further information,
or to appear personally if necessary.
March 2010