Select Committee on Science and Technology Written Evidence


Memorandum submitted by the School of Computer Science, Cardiff University

  This submission from the School of Computer Science at Cardiff University is made by staff who are members of the School's Knowledge and Information Systems Research Group. The Group's work provides design research and practical software and database resources to support the Species 2000 Catalogue of Life, a UK-led international programme constructing and making available, in digital form, the first comprehensive listing of all the world's species of biological organisms (animals, plants, fungi and micro-organisms). The edition of the Catalogue of Life to be released in April 2008 documents 1.1 million of the estimated 1.8 million known species. The Catalogue is an essential tool for organising biodiversity data, improving retrieval and minimising the loss of data which can occur because of the necessary changes in the names and classification of organisms as knowledge improves. It is also an essential core component of international biodiversity knowledge organisations such as GBIF (Global Biodiversity Information Facility) and the Encyclopedia of Life.

  As non-taxonomists ourselves, we wish to emphasise some of the ways in which systematics and taxonomy research in the UK is linked to other kinds of research, and to the sources of information used by scientists and professionals in other disciplines. We have become associated with the Species 2000 Catalogue of Life and its Secretariat at the University of Reading through such linkages. Richard White is a member of the Species 2000 Project Team (the Executive) and is the Convenor of the Species 2000 Information Systems Group, of which Andrew Jones is a member. This Group oversees the technical computing aspects of the Catalogue of Life, such as its adherence to international standards to enhance its interoperability with other information and knowledge systems. Alex Gray is a Director of Species 2000 which, although an international co-operative programme, is registered as a UK not-for-profit organisation in order to handle matters of finance and ownership.

Response to question 9

  There are two interpretations of the phrase "web-based taxonomy", deriving from the dual meaning of "taxonomy" as both the science and process of carrying out taxonomic revisions and also the result of carrying out these processes on a particular group of organisms, which usually results in revised and improved classifications. To make a mechanism for carrying out taxonomic revisions accessible on the Web to those actually performing them (taxonomists and other providers of the information they use) is different from, and more challenging than, the delivery of the results of a taxonomic revision on the Web (to scientists, professionals and the general public). It is important to make this distinction clear in the context of what is meant by "web-based taxonomy". We will refer to these two interpretations as "web-based revision" and "web-based delivery" respectively.

CARRYING OUT TAXONOMIC REVISIONS ON THE WEB

  Web-based revision is in its infancy, and working taxonomists are not all convinced of its value. But research in progress shows how it could be done. It has many parallels with performing other complex collaborative tasks on the Web. It will be able to make use of principles and practice being developed in other disciplines, especially in commerce and education. Unlike the use of web-based blogs, wikis and the like to create what are essentially simple documents collaboratively, the process of taxonomic revision requires rigorous recording of "provenance" (the originators, dates and details of data values, analyses, decisions and changes) and the ability to back-track to substantiate or reverse past decisions. The NERC CATE project is beginning to tackle these issues.

  These requirements can in turn be addressed, for example, by a suite of techniques collectively known as "virtual organisation" facilities, which are being developed in collaborations between computer scientists and business and commercial organisations. The overall goal is to allow partners to discover each other and work together in a secure Internet environment to achieve more through their collaboration than they could have achieved separately. There is much here of potential mutual benefit to taxonomic practitioners and computer scientists, and this is one of the reasons for the joint activities of our group with those who are creating and distributing taxonomic products such as the Catalogue of Life. In Cardiff, we are involved in initiatives and programmes which will put in place elements of a system which may make web-based taxonomic revision widely available in the future.

DELIVERING TAXONOMIC OUTPUTS ON THE WEB

  At its simplest, web-based delivery of taxonomic results is a much easier task, and many organisations, projects and individuals are doing this already. Web pages are much easier for scientists, other professional users and the general public to find than the printed publications in which taxonomic revisions and classifications are traditionally published. The Species 2000 Catalogue of Life is delivering taxonomic outputs (the Catalogue of Life itself) on the web, at http://www.sp2000.org.

  However, there is a translation and packaging process which is necessary if user communities are to make full use of the results of taxonomic revisions. This point also addresses questions 2 and 3 in the request for submissions. What most users want to use is not the taxonomic revisions themselves but improved and reliable resources based on them: outputs and services such as stable nomenclature, checklists and improved classifications which can be used as the framework for assembling information. What is important to them is a stable framework of classification and nomenclature organised and made available on top of the foundation established by the taxonomic revisions. These resources are often not created by the taxonomists themselves, but by organisations such as Species 2000 who understand the need for them and the data, information and knowledge they will help to organise.

  In delivering these outputs and services, the Catalogue of Life supports an increasing variety of user communities, and also demonstrates the need for continuing taxonomic and computing activities to complete them. It provides a consensus view of the taxonomic outputs which makes them easier to use, by effectively filtering out the "noise" in the process of delivering taxonomic summary outputs to the users, so that they need not consider individually every revision and name change or worry about whether it is accepted by all taxonomists before they adopt it.

  Despite the vital role of the Species 2000 Catalogue of Life in helping to organise biodiversity knowledge, it currently receives little funding for either its data content (filling in the taxonomic groups which still lack reliable checklists) or its computing infrastructure (improved techniques and the software to implement them can accelerate its completion and increase its usability for many users).

Response to question 10

  Both of our interpretations of "web-based taxonomy" involve processes which encourage the full exposure to scrutiny that always tends to improve quality, reliability and user-friendliness. If the work is web-based, every step in the processes of both taxonomic revision and delivery can be made open and accessible to scrutiny. Taxonomy should not be seen as an impenetrable process of preparation carried out by experts before their conclusions are finally revealed. What makes information useful is not hiding it away until it is deemed to be complete and finished, but providing the right access methods to give taxonomists and users the views of the data that they individually want and can understand, even while the taxonomists are still working on it. After all, many taxonomic revisions take a long time, but the data being used by the taxonomists, and their preliminary conclusions, may be of use to users, who may even be able to add to them. The intermediate layer of "resources" between the taxonomic revisions and the knowledge layers that user communities are building, described in our response to question 9, can be seen as a set of tools to provide the views that the users need.

  Openness encourages the development of these tools. Standards for the various levels of data and computing interoperability are important to encourage diversity and innovation in tool development, rather than dependence on one supplier of tools, and diversity of use of the tools and data will lead to broad knowledge development.

  International organisations, especially GBIF and the Biological Information Standards organisation (TDWG), have a vision for the delivery of taxonomic information to users as part of a complete, organic, dynamic, distributed information system. This will facilitate the growth of both interpretations of web-based taxonomy. They have activities and plans for assisting with the growth of such a system, involving the Catalogue of Life, and are also encouraging the development of open standards for data and information exchange and the use of software tools to create and deliver the resources that users need. This was very clear at the recent European EDIT project's Symposium "Future Trends of Taxonomy" and General Meeting in January 2008, both in the talks and in the corridors. There is a clear and timely opportunity for the UK to maintain and demonstrate its lead in these areas, with relatively small amounts of additional funding.

4 February 2008


 

© Parliamentary copyright 2008