1. Members of the Sub-Committee visited the Human
Genome Campus at Hinxton, Cambridge on 31 January 2001 to see
at first hand the technology of genome sequencing and to discuss
present and projected developments in both the technology and
the science with the leading practitioners who worked there.
2. The visiting party consisted of Lord Oxburgh
(Chairman of the Sub-Committee), Lord Jenkin of Roding, Lord Patel,
Lord Perry of Walton, Lord Rea, Lord Turnberg, Lord Wade of Chorlton
and Baroness Wilcox. The party was supported by the Sub-Committee's
Specialist Adviser (Professor Paul Elliott) and Clerk (Mr Roger
Morgan), and the Select Committee's Specialist Assistant (Dr Adam
Heathfield). Dr Richard Pitts of the HGC Secretariat was also
present to assist in the discussions.
INTRODUCTION
3. Martin Bobrow, Professor of Medical Genetics
at Cambridge University and a governor of The Wellcome Trust,
welcomed the Sub-Committee to the Trust's Human Genome Campus.
He noted that the science of genomics represented a step change in biology,
one that would revolutionise understanding as well as
produce practical benefits in the short to medium term. He
said that those involved in creating the Sanger Centre had been
visionary in perceiving that long and complex genomes really could
be sequenced. The expertise concentrated on the campus placed
the United Kingdom at the forefront of this work, and the Centre's
role in ensuring that genome databases were publicly available
had been vital.
4. Professor Allan Bradley, the recently appointed
Director of the Sanger Centre, gave a brief outline of the Centre's
history, covering its conception in 1992 and its establishment on
a joint campus with the European Bioinformatics Institute (EBI)
in 1993. He described the data sequencing strategy the Sanger
Centre used, and the way the end data were annotated and stored.
The Centre was the largest public contributor to the Human Genome
Project, and employed about 575 people. Its principal work programmes
involved sequencing human, mouse, zebrafish and pathogen genomes;
annotating them; and investigating sequence variations and their
association with disease[73].
5. Professor Bradley discussed the sorts of research
that would be needed to understand how complex biological systems
were influenced by different parts of the genome sequences. It
was now possible to perform experiments with entire sets of genes
(30,000-40,000 in humans, for example), but these required enormous
computational resources and skilled people to analyse and interpret
all the data they produced. Laboratory work was an important component
of the research and, in this context, Professor Bradley stressed
that the effects of variations in DNA on a complex organism could
not be fully understood by simple cell culture studies. Studying
the effects of mutations in whole animal models would be vital
to understanding the genome data. If the Government failed to
support the need for animal procedures, and to protect researchers
from intimidation, the United Kingdom's ability to profit from
its work on genome sequencing would be severely compromised. Biological
sample collections, like cancer cell lines, and patient databases
would also be vital to interpreting all the sequenced genome data.
6. Dr Graham Cameron, joint head of the EBI, outlined
the work of the Institute. The EBI was part of the European Molecular
Biology Laboratory (EMBL); it stored, organised, updated, and
made publicly available all of the data sets assembled with EMBL
help. The types of data it held included nucleotide sequence databases,
gene expression databases, and protein structure databases.
7. Dr Cameron described the growing computational
demands of bioinformatics, contrasting Moore's law, which predicted
a doubling of computer power every 18 months, with the growth
rate of sequenced DNA data, which doubled every 10 months. Thus
the computer resources to manage the data needed not only to be
updated but to be significantly expanded. However, he pointed out
that only a small proportion of the expenditure on genome
projects had been allocated to the information resources: the
collection of the data was far more costly than its storage and
analysis. He also highlighted the need for secure institutions
to act as custodians of the electronic scientific records.
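The contrast Dr Cameron drew between the two doubling periods can be made concrete with a little arithmetic. The following is a minimal sketch in Python, assuming idealised exponential growth at exactly the two rates quoted (18 months for computer power, 10 months for sequenced DNA data). The function names and the five-year horizon are illustrative assumptions, not anything presented during the visit.

```python
def growth_factor(months, doubling_months):
    """Factor by which a quantity grows over `months` if it doubles
    every `doubling_months` (idealised exponential growth)."""
    return 2.0 ** (months / doubling_months)


def shortfall(months, data_doubling, compute_doubling):
    """Ratio of data growth to computing growth over `months`.
    A value above 1 means data has outpaced computer power."""
    return (growth_factor(months, data_doubling)
            / growth_factor(months, compute_doubling))


def gap_doubling_months(data_doubling, compute_doubling):
    """Months for the shortfall itself to double.
    shortfall = 2 ** (t * (1/d - 1/c)), so it doubles every 1 / (1/d - 1/c)."""
    return 1.0 / (1.0 / data_doubling - 1.0 / compute_doubling)


if __name__ == "__main__":
    DATA_DOUBLING = 10     # months, figure quoted for sequenced DNA data
    COMPUTE_DOUBLING = 18  # months, figure quoted for Moore's law
    HORIZON = 60           # months; hypothetical five-year horizon

    print("data growth:   ", growth_factor(HORIZON, DATA_DOUBLING))
    print("compute growth:", growth_factor(HORIZON, COMPUTE_DOUBLING))
    print("shortfall:     ", shortfall(HORIZON, DATA_DOUBLING, COMPUTE_DOUBLING))
    print("gap doubles every",
          gap_doubling_months(DATA_DOUBLING, COMPUTE_DOUBLING), "months")
```

Under these assumptions the data would grow 64-fold over five years against roughly 10-fold for computer power, and the shortfall would itself double about every 22.5 months. This illustrates why resources would need to be expanded rather than merely replaced with newer equipment.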
THE SANGER CENTRE
Guided by Christine Rees, the visiting party then
toured the Sanger Centre, receiving a number of short presentations
from members of staff:
- Dr Julian Parkhill described the sequencing of
various pathogen genomes such as TB and plague;
- Dr Stefan Beck spoke about the technical process
of sequencing DNA and checking the data accuracy;
- Dr Rachel Ainscough showed the Sub-Committee
the equipment used in the sequencing work;
- Mr Phil Butcher talked about the IT infrastructure
that supported the processing, storage and dissemination of the
data; and
- Dr Ewan Birney outlined the joint EBI/Sanger
'Ensembl' project which maintained a constantly updated, annotated
version of the genome sequence.
DISCUSSION
8. In general discussion with Professor Bobrow,
Professor Bradley, Sir John Sulston (Founder Director of the Sanger
Centre), Dr Cameron, and Professor Michael Ashburner[74]
(the other joint head of the EBI), and other senior staff from
both organisations, the following main points were noted.
- The Sanger Centre was a world leader in the Human
Genome Project. In providing about a third of the data, it was
the single largest public provider. Its "finishing"
arrangements also meant that the finalised data were of a very
high degree of accuracy.
- The Project was not mapping the 'standard' genome.
The aim was to provide a reference model as a basis for further
work and refinement as variations between individuals became better
understood. Keeping track of all this material, together with
information on annotations and functionality, was undertaken by
the EBI, whose co-location with the Sanger Centre provided great
synergy.
- The work of both centres made full use of the
latest technologies. The demands, particularly as regards the
computing resource, were substantial. The Sanger Centre deposited
information on about 50 million base pairs of genetic data every
24 hours. Nearly 20 per cent of its 575 staff worked on bioinformatics.
- Actual genes made up only about 3 per cent of
the human genome. Even so, the full might of the Sanger Centre's
operation would take about a week to sequence all of an individual's
genes. Even if only sites of possible key variations (in about
400 genes) were targeted, the Centre could process only about
100 cases a week. Handling such data made substantial demands
on IT in terms of both storage and processing. There would probably
need to be improvements of an order of magnitude or more to secure
the full benefits of the 500,000-person MRC/Wellcome Trust study.
- Planning for large cohort studies should proceed
with a clear view of the time scale and the investment needed.
It would be foolish to embark on a project where computing capacity
would provide a real limitation, but work of universal benefit
should be undertaken even when the exact outcomes were not clear.
Sequencing the human genome was a good example of a project where
it would have been easy, but unwise, to be deterred by apparent
initial difficulties.
- The potential benefits of generating genetic
and biomedical databases could not be overstated. The EBI's experience
already showed that the scientific community's use of the databases
it held vastly speeded up the development of hypotheses for testing
in the laboratory and otherwise.
- The rapid development of the databases created
problems in archiving and time-stamping references for citation
which had not yet been fully resolved. Research based on a genome
database could be hard to replicate or build on once the database
had been updated or re-annotated.
- The Sanger Centre had been highly influential
in ensuring that basic human genome data were publicly available
from centralised databases at no charge. The need to provide open
access to such basic scientific data was also the driving force
of the EBI. The public good of these data was such that making
them available should be supported by public investment.
- The position over intellectual property rights
and patenting in the new situation had not been fully thought
through. Biology had advanced dramatically in the last 20 years,
but IPR and patent law had not kept up. It was clear that underlying
data should be freely available, but unclear as to where in the
development of commercial products protection should be available
for the developer.
- Concern was expressed about opposition to animal
testing and potential changes to consent arrangements for medical
information and sample-taking, both of which would impede progress in
important biomedical research. It was noted that the scientific
community could do more to communicate the benefits of its work
that depended on such tests and procedures.
9. Members endorsed the Chairman's thanks to The
Wellcome Trust and the various participants for having provided
a most stimulating and informative day.
73 The Sanger Centre had also submitted written evidence
for the Inquiry (p 88).
74 Dr Cameron and Professor Ashburner had also submitted written
evidence for the Inquiry (p 110).