Select Committee on Science and Technology Fifth Report

CHAPTER 2: Overview: the Internet and personal security

The Internet: basic definitions

2.1.  A computer network is a group of computers connected by means of a telecommunications system, so that they can communicate with each other in order to share resources or information. An internet is a set of interconnected computer networks, and the Internet (capitalised to distinguish the specific example from the generic term) is the global network of interconnected networks that transmits data by means of the "Internet Protocol" (IP), a specific set of rules and conventions that defines how information is communicated over the many disparate networks that make up the Internet.

2.2.  As illustrations (such as the widely disseminated image that appears on the front cover of this Report) make clear, the Internet is not a single network, but rather a complex network of networks. These networks are linked by virtue of a shared paradigm for communicating information known as "packet switching".

2.3.  Packet switching was first developed in the 1960s for the United States Department of Defense-sponsored ARPANET, the precursor of the modern Internet. When end-users communicate in a traditional "circuit switching" system a dedicated channel is established between them that others cannot use. In a "packet switching" network, the data sent between the end points are broken down into "packets", which are then routed between the various "nodes"—that is, devices—that make up the network. The routing may change from packet to packet and, at any given time, a link between particular nodes may be shared by many packets passing between different end users. Each packet carries the address to which it is sent and it is only at that end-point that the data stream is reconstructed. The way in which information is processed within the network as generic packets means that different technologies (wireless networks, fibre-optic cables, and so on) can be used interchangeably.
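The splitting and reassembly described in paragraph 2.3 can be illustrated with a minimal Python sketch. It is an illustration only: the function names, the packet size and the sequence number added to each packet are assumptions made for the example, and the address is taken from a range reserved for documentation.

```python
import random

def to_packets(message: bytes, destination: str, size: int = 8) -> list:
    """Split a message into packets, each carrying the destination address."""
    return [
        # "seq" is an assumed field that lets the end point restore the order.
        {"dest": destination, "seq": i, "data": message[offset:offset + size]}
        for i, offset in enumerate(range(0, len(message), size))
    ]

def reassemble(packets: list) -> bytes:
    """Rebuild the data stream at the end point, whatever order packets arrived in."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["data"] for p in ordered)

message = b"Packets may take different routes through the network."
packets = to_packets(message, destination="192.0.2.10")

# Shuffling stands in for packets taking different routes and arriving out of order.
random.shuffle(packets)

print(reassemble(packets) == message)
```

The nodes in between need only the destination field; the content is put back together solely at the receiving end.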

2.4.  Packet switching underpins the Internet Protocol, allowing a more efficient and robust use of communications networks. It has also contributed to the astonishing creativity and innovation of the online world, by allowing the separation, or "abstraction", of the functions of the various layers of the network. This was described in very clear terms in a briefing paper[2] annexed to the written evidence by LINX, the London Internet Exchange:

"The principle of Abstraction of Network Layers states that there are different layers in a network and each one has a specific function, with clear boundaries between adjacent layers. For example, only the application layer understands the content that is being carried over the network. The networking layer is only responsible for addressing and routing, and understands neither the data that it is transporting nor the physical characteristics nor location of the underlying physical layer."

2.5.  Thus the fundamental core of the network, the wires, cables, and so on, can remain relatively stable whilst new communications technologies, such as wireless networking, can be used to supplement, without needing to replace, existing infrastructure. Above the physical and datalink layers is the network layer, which deals with the transmission of packets, via intermediate routers, to their intended destinations. At the topmost layer are the applications that run on the end-user machines, interpreting data and providing a user interface. This layering is enormously valuable in allowing innovation at all levels. In the words of Malcolm Hutty, of LINX,

"By keeping all these things separate and by keeping all the complexity at the edges, we are able to create new services and to upgrade existing services over time, without having to rewrite everything and without needing the co-operation of every single party in it … This, to our mind, has been the principle reason why the Internet has been so successful … because it allows everybody to bring along their own contributions without needing everybody else's co-operation" (Q 725).

2.6.  The most striking example of such innovation was the development of the World Wide Web, by Tim Berners-Lee and his colleague Robert Cailliau at CERN, which unlocked the potential of the Internet for the general user. Their proposals for a World Wide Web, published in 1990, described in outline a system that allowed both the location of pages of information by means of Uniform Resource Locators (URLs, more correctly now known as Uniform Resource Identifiers or URIs), and the creation of links between such pages of information by means of "hypertext".

2.7.  Many terms that are commonplace today, such as "web page" and "website", not to mention activities such as "browsing" or "surfing", derive from the World Wide Web. Indeed, the World Wide Web and the Internet are often confused, so that there is little distinction in popular speech between "surfing the Web" and "surfing the Internet". But in reality, the World Wide Web is a system of linked documents and files, which operates over and is accessible by means of the Internet, but is entirely distinct from the network of networks, the Internet itself. Indeed, many other forms of communication, such as Internet Relay Chat (IRC), or Voice over IP (VoIP), using different protocols, co-exist with the World Wide Web on the Internet. The fact that the World Wide Web could be introduced in the early 1990s without requiring a fundamental redesign of the Internet is the most striking demonstration of the huge potential for innovation and growth inherent in the principle of abstraction of network layers.
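The separation of layers described in paragraphs 2.4 to 2.7 can be sketched in a few lines of Python. This is a simplified model rather than a description of any real protocol stack: the class, the routing table, the router names and the two "applications" are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    destination: str
    payload: bytes  # opaque to the network layer

# Hypothetical routing table: destination address -> next hop.
ROUTES = {"192.0.2.10": "router-a", "198.51.100.20": "router-b"}

def forward(packet: Packet) -> str:
    """Network layer: choose a next hop from the address alone, never the payload."""
    return ROUTES.get(packet.destination, "default-route")

def read_web_page(packet: Packet) -> str:
    """One application at the edge: treats the payload as a page of text."""
    return packet.payload.decode("utf-8")

def read_chat_line(packet: Packet) -> tuple:
    """A different application: treats the payload as 'nickname: text'."""
    nickname, _, text = packet.payload.decode("utf-8").partition(": ")
    return nickname, text

web = Packet("192.0.2.10", b"<html>Hello</html>")
chat = Packet("198.51.100.20", b"alice: hello")

# The same forwarding function serves both applications unchanged.
print(forward(web), forward(chat))
print(read_web_page(web), read_chat_line(chat))
```

Adding a third application in this model would require a new reader function at the edge and no change to `forward`, which is the point made about innovation without redesign of the network.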

2.8.  However, the abstraction of network layers has other consequences as well. It is sometimes said that the Internet was built with no "identity layer"—in other words, the network level is designed to operate without knowing to whom and to what you are connecting. This is a necessary corollary of the abstraction of information into packets and the abstract layering of the Internet's design. In traditional telecommunications the existence of a dedicated connection between two identified end-points allows identity to be known by every part of the system. On the Internet, however, packets are effectively anonymous; they are simply chunks of data, routed highly efficiently—though to all appearances indiscriminately—around the network of networks. The information is then reassembled at the end point, by means of applications installed on end-user machines. It is these applications, not the network, that are concerned about the identity of the source of the information.

2.9.  This creates fundamental problems for end-user security, which were outlined for us by Professor Jonathan Zittrain, of the Oxford Internet Institute: "the way the Internet was built was to be able to carry data from one arbitrary point to another without any gate-keeping in the middle. It has been a wonderful feature, so-called end-to-end or network neutrality. This design principle means that any desire to control the flow of data, including data which might be harmful data, is not very easy to effect on today's Internet" (Q 957).

Tracing Internet traffic

2.10.  The previous section describes in general terms the structure of the Internet and the difficulty of identifying and tracing the packets of data that traverse it. This section provides more technical detail on traceability.

2.11.  Every machine directly connected to the Internet is given a unique identity, a 32-bit value called its "IP address".[3] The routing systems ensure that packets are delivered to appropriate machines, by consulting the destination IP address placed into the packet by the sender. To avoid every router having to know the location of every machine, the address space is arranged in a hierarchical manner with blocks of addresses (of varying sizes from hundreds to millions) being allocated to Internet Service Providers (ISPs). The ISPs then make allocations from these blocks to their individual customers. Thus routers need only ascertain the address block and relay the packet to the appropriate ISP. Once the packet arrives at the ISP, it can use more fine-grained routing information to deliver it to the correct machine.
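The hierarchical allocation described in paragraph 2.11 can be illustrated using Python's standard `ipaddress` module. The ISP names and block assignments below are hypothetical, and the blocks are drawn from ranges reserved for documentation; the sketch shows only the principle that an address can be matched to the block, and hence the ISP, that contains it.

```python
import ipaddress
from typing import Optional

# Hypothetical allocations of address blocks to ISPs.
ISP_BLOCKS = {
    ipaddress.ip_network("192.0.2.0/24"): "ISP A",
    ipaddress.ip_network("198.51.100.0/24"): "ISP B",
    ipaddress.ip_network("203.0.113.0/25"): "ISP C",
    ipaddress.ip_network("203.0.113.128/25"): "ISP D",
}

def isp_for(address: str) -> Optional[str]:
    """Return the ISP whose block contains the address, or None if no block matches."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in ISP_BLOCKS if ip in net]
    if not matches:
        return None
    # If blocks were nested, the most specific (longest prefix) would win.
    return ISP_BLOCKS[max(matches, key=lambda net: net.prefixlen)]

print(isp_for("192.0.2.55"))
print(isp_for("203.0.113.200"))
print(int(ipaddress.ip_address("192.0.2.55")) < 2**32)  # the address is a 32-bit value
```

A router in this model needs one entry per block rather than one per machine, which is why the table stays small relative to the number of connected machines.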

2.12.  When a new connection is made to a computer that is offering an Internet service, it will determine where to respond by inspecting the "source address" of the incoming packet. It sends a packet back to that source, and—provided that an acceptable reply is received from that source (some random numbers are included in these "handshake packets" to prevent spoofing)—it will then open the connection and be prepared to send and/or receive real data.
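The role of the random numbers mentioned in paragraph 2.12 can be shown with a toy model in Python. It is not an implementation of any real protocol: the class and method names are hypothetical, and the model assumes only that the server's reply is delivered to whatever source address the first packet claimed.

```python
import secrets

class Server:
    """Toy model of a service that checks the source address before opening a connection."""

    def __init__(self) -> None:
        self.pending = {}         # claimed source address -> random number sent to it
        self.connections = set()  # source addresses with an open connection

    def first_packet(self, claimed_source: str) -> int:
        """Record a fresh random number; the reply carrying it goes to the claimed source."""
        challenge = secrets.randbits(32)
        self.pending[claimed_source] = challenge
        return challenge

    def reply_packet(self, claimed_source: str, echoed: int) -> bool:
        """Open the connection only if the random number comes back correctly."""
        expected = self.pending.pop(claimed_source, None)
        if expected is not None and echoed == expected:
            self.connections.add(claimed_source)
            return True
        return False

server = Server()

# Genuine client: the reply reaches it, so it can echo the number.
challenge = server.first_packet("192.0.2.10")
genuine_ok = server.reply_packet("192.0.2.10", challenge)

# Spoofer: claims another machine's address, so the reply (and the number) goes there.
# Its guess is modelled here as any value other than the real one.
hidden = server.first_packet("198.51.100.7")
spoofed_ok = server.reply_packet("198.51.100.7", hidden ^ 1)

print(genuine_ok, spoofed_ok)  # True False
```

A sender that fakes its source address never sees the number it would need to echo, so in this model it cannot complete the exchange except by guessing.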

2.13.  If the connection turns out to be abusive (for example, it is an incoming spam email advertising fake medicines) then the source address can be traced back by determining which block of addresses it comes from, and hence which ISP allocated the address. The records at that ISP can then identify the customer to whom the IP address was issued. Since many ISPs allocate the same address to different customers at different times, the exact time of the connection will often be needed, in order to correctly identify the customer who was using these "dynamic addresses".
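The need for an exact time when tracing a dynamic address can be illustrated as follows. The lease log, its layout and the account names are invented for the example; real ISP records will differ.

```python
from datetime import datetime
from typing import Optional

# Hypothetical ISP lease log: (address, lease start, lease end, customer account).
LEASES = [
    ("192.0.2.44", datetime(2007, 3, 1, 8, 0), datetime(2007, 3, 1, 12, 0), "account-1001"),
    ("192.0.2.44", datetime(2007, 3, 1, 12, 0), datetime(2007, 3, 1, 20, 0), "account-2002"),
    ("192.0.2.45", datetime(2007, 3, 1, 8, 0), datetime(2007, 3, 1, 20, 0), "account-3003"),
]

def customer_at(address: str, when: datetime) -> Optional[str]:
    """Return the account that held the address at the given moment, if any."""
    for ip, start, end, account in LEASES:
        if ip == address and start <= when < end:
            return account
    return None

# The same address maps to different customers depending on the time of the connection.
print(customer_at("192.0.2.44", datetime(2007, 3, 1, 9, 30)))
print(customer_at("192.0.2.44", datetime(2007, 3, 1, 13, 0)))
```

Without the timestamp, the address 192.0.2.44 in this example would point to two different accounts.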

2.14.  This "traceability" of IP addresses therefore permits the identification of the source ISP—who may be prepared to act to prevent further abuse. It also permits the identification of the customer account, although the ISP may not be prepared to divulge this information until the necessary legal paperwork has been processed in the appropriate jurisdiction.

2.15.  However, if the requirement is to identify who is ultimately responsible for the abusive act, then considerable further investigation may be required. The source may be a machine in a cyber-café, or a hotel, available for many people to use. The source may be a wireless connection, in an airport, a company or an individual's home that can be used by anyone within transmission range. Most commonly of all, the source will be an identifiable consumer's machine—but if it is insecurely configured or is inadvertently running a malicious program, then it may be innocently relaying traffic from elsewhere and the tracing will need to be recommenced to determine where that might be. In practice, "multi-hop" tracing is seldom attempted and even less often successful.

Security threats on the Internet today

2.16.  The design of the Internet Protocol permits the mounting of "denial of service" attacks. Here, many machines running malicious programs will send packets to a single machine—which is overwhelmed by the traffic and cannot respond to legitimate connections. Since the senders are not interested in return traffic, they can fake the source addresses in their packets, making it much harder to identify the source of the attack. Alternatively in a "reflection attack", they can send packets to legitimate machines, but with the source address set to the machine to be attacked—which will then receive responses from lots of machines that are perfectly identifiable, but are merely providing valid responses to the packets they are sent.

2.17.  These types of attacks are usually called "distributed denial of service" (DDoS) attacks, and there will be large numbers, normally thousands, of machines participating in them. In some cases they can threaten the integrity not of individual machines, but of Government or company networks or top level domain names (such as ".uk" or ".com"). On 7 February 2007 a DDoS attack, emanating from sources in the Asia-Pacific region, was launched on nine of the 13 "root servers" that support the domain name system. It was unsuccessful, but as we heard when visiting Verisign, which runs two of these root servers, the level of bad traffic is now peaking at 170 times the basic level of Internet traffic; by 2010 it is predicted to be 500 times the basic level. Massive over-capacity and redundancy is built into the network to allow enough headroom to accommodate such traffic. This affects critical national infrastructure rather than personal Internet security in the first instance, and we have therefore not explored this issue in detail.

2.18.  A major cause of abusive traffic on the Internet, be it DDoS attacks or the sending of email spam, is the presence of malicious code, or malware, on consumer machines. It used to be considered important to distinguish between "worms" that spread to vulnerable machines without human intervention and "viruses" that attach themselves to other traffic, such as email. However, the distinctions have blurred considerably in recent years and we will use the generic term "malware". This malware can still arrive via email, or via direct connections from other machines, but an important new source of infection is from visiting a website and inadvertently downloading the malicious code. The website may have been specially devised to spread infection, or it may be a legitimate site that is itself insecure, the owner unaware of its unwanted new functionality.

2.19.  In general terms, malware used to be created by individuals who wanted to become famous and gain the admiration of their peers. The aim was to spread as far and as fast as possible—demonstrated most famously by the "ILOVEYOU" worm of May 2000, created by a disaffected student in the Philippines. This has now changed, and the prevailing motivation for those creating malware is to make use of infected machines in order to make money. This means that considerable effort is now put into creating malware that will spread in a low-key manner. It is designed to be hard for the infected machine's owner to detect.

2.20.  Although traditional defences such as virus checkers (which determine whether a piece of code is known to be malicious) continue to be useful, they are no longer the universal shield that they once were. Jerry Martin, of Team Cymru, a network of researchers who monitor underground traffic and support Internet security, told us of the team's database of samples of malicious code, which is currently being added to at an average rate of 6,200 new samples a day. Of these samples, typically, around 28 percent were immediately detected by anti-virus software. They submitted the samples to the anti-virus companies, and a month later the average detection rate would rise to around 70 percent. In face of the flood of new malware the anti-virus companies have little option but to adopt a risk-based approach, prioritising the most dangerous malware and the most widespread.
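The figures quoted in paragraph 2.20 can be turned into rough daily numbers. The calculation below assumes, purely for illustration, that the 28 percent and 70 percent detection rates apply uniformly to a single day's intake of 6,200 samples.

```python
samples_per_day = 6200
detected_on_arrival_pct = 28   # detected immediately by anti-virus software
detected_after_month_pct = 70  # detected one month after submission

# Integer arithmetic keeps the results exact.
missed_on_arrival = samples_per_day * (100 - detected_on_arrival_pct) // 100
missed_after_month = samples_per_day * (100 - detected_after_month_pct) // 100

print(missed_on_arrival)   # 4464
print(missed_after_month)  # 1860
```

On that assumption, around 4,460 of each day's new samples would go undetected on arrival, and around 1,860 would still be undetected a month later.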

2.21.  Putting malware onto machines is often done in order to create a "botnet". The individual machines, usually called "zombies", are controlled by a "botmaster" who can command them to act as a group. Botnets are hired out by their botmasters for the purpose of hosting illegal websites, for sending email spam, and for performing DDoS attacks. These activities take place without the knowledge of the individual machine's owner—although normal traceability will enable the source of individual examples of the traffic to be identified. The total number of "zombies" is unknown, but in the course of our visit to the Center for Information Technology Research in the Interest of Society (CITRIS) at the University of California, Berkeley, we heard an estimate that the number might be of the order of five percent of all machines, or up to 20 million in total. The cost of renting a platform for spamming is around 3-7 US cents per zombie per week.
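Two pieces of arithmetic follow from the estimates in paragraph 2.21. The total number of machines is inferred from the quoted figures rather than stated in evidence, and the botnet size of 10,000 machines is a hypothetical chosen for the example.

```python
# "Five percent of all machines, or up to 20 million" implies a total of this order.
zombies_estimate = 20_000_000
zombie_share_pct = 5
implied_total_machines = zombies_estimate * 100 // zombie_share_pct

# Weekly rental cost of a hypothetical 10,000-machine botnet at 3-7 US cents per zombie.
botnet_size = 10_000
low_cents, high_cents = 3, 7
weekly_cost_low_usd = botnet_size * low_cents // 100
weekly_cost_high_usd = botnet_size * high_cents // 100

print(implied_total_machines)                     # 400000000
print(weekly_cost_low_usd, weekly_cost_high_usd)  # 300 700
```

At these prices a spamming platform of that size would cost between $300 and $700 a week to rent.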

2.22.  Malware can also search the hard disk of the compromised machine to locate email addresses to add to spammers' lists of where to send their email—and, more significantly for the machine's owner, it will search the hard disk for CD keys or passwords for systems such as online games. Additionally, it may install a "keylogger" which will record any passwords used for online banking, permitting the criminal to access the account and steal the money it contains.

2.23.  Online banking or trading can also be compromised by so-called "phishing" attacks. The user is sent an email purporting to come from their bank or some other company with which they do business, such as eBay. It contains some sort of urgent message—an imminent account suspension, an apparently fraudulent payment that they will wish to disavow, or even a monetary reward for answering some marketing questions. Clicking on the link within the email will result in a visit to a fraudulent website that will record the user's credentials (name, account number, password, mother's maiden name and so on) so that the criminal can—once again—take over the account and transfer money.

2.24.  Although phishing emails were originally written in poor English and were relatively easy to detect, they have grown in sophistication, and millions of individuals have been misled.[4] The number of phishing emails is enormous: in the second half of 2006, between 900 and 1,000 unique phishing messages, generating almost 8 million emails, were blocked by Symantec software alone on a typical working day,[5] though according to MessageLabs phishing still represents just 0.36 percent of total emails.[6] Bank payments association APACS recorded 1,513 unique phishing attacks directed at United Kingdom banks in September 2006, up from just 18 in January 2005 (p 29).

2.25.  United States banks are by far the most targeted by phishing, with their losses estimated to be around $2 billion. Most United Kingdom banks have also been attacked, though losses have been much lower, with losses from direct online banking fraud reaching £33.5 million in 2006. However, the United Kingdom trend is firmly upwards; losses were £23.2 million in 2005 and just £12.2 million in 2004. Total losses from "card-not-present" fraud (that is, the use of stolen credit card numbers for Internet or telephone ordering of goods) in 2005 were £183.2 million (up 21 percent from 2004), of which some £117.1 million were estimated to be Internet-based (p 30). But these figures tell only part of the story, as in many cases the losses from credit card fraud are off-loaded by the banks onto merchants.
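The upward trend described in paragraph 2.25 can be quantified from the figures given. The derived values below (growth rates, the Internet-based share and the implied 2004 total) are calculations from those figures, not numbers taken from the evidence.

```python
# Direct online banking fraud losses, in millions of pounds.
losses = {2004: 12.2, 2005: 23.2, 2006: 33.5}

growth_2005_pct = (losses[2005] - losses[2004]) / losses[2004] * 100
growth_2006_pct = (losses[2006] - losses[2005]) / losses[2005] * 100

# "Card-not-present" fraud in 2005, in millions of pounds.
cnp_2005 = 183.2
cnp_internet_2005 = 117.1
internet_share_pct = cnp_internet_2005 / cnp_2005 * 100

# The 2005 total was "up 21 percent from 2004", which implies a 2004 total of:
cnp_2004_implied = cnp_2005 / 1.21

print(round(growth_2005_pct, 1), round(growth_2006_pct, 1))
print(round(internet_share_pct, 1), round(cnp_2004_implied, 1))
```

On these figures, direct online banking losses grew by roughly 90 percent in 2005 and roughly 44 percent in 2006, and about 64 percent of card-not-present fraud in 2005 was estimated to be Internet-based.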

2.26.  There has also been some "identity theft", where significant amounts of information about individuals are stolen and then used to impersonate them by, for example, obtaining loans in their name. However, the scale of online identity theft is unclear, with "card not present" credit card fraud also being treated as identity theft.

The scale of the problem

2.27.  Figures on the scale of the problem are hard to come by. Indeed, the lack of data on identity theft is symptomatic of a lack of agreed definitions or detailed statistics on almost all aspects of Internet security. In February 2006 the Financial Services Authority estimated the cost of identity fraud to the United Kingdom economy at £1.7 billion per annum.[7] But this included over £500 million losses reported by APACS, the United Kingdom payments association, covering counterfeit cards, lost or stolen cards, card not present fraud, through to full account takeover (the latter put at just £23.8 million). It also included £215 million for missing trader VAT fraud, £395 million for money-laundering and even £63 million for the anti-fraud procedures in the UK passport office. It is impossible to deduce from these figures how much online identity theft costs the United Kingdom economy.
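Adding up the components listed in paragraph 2.27 shows how much of the £1.7 billion estimate they account for. Because the APACS figure is given only as "over £500 million", the total below is a lower bound and the remainder an upper bound.

```python
fsa_estimate = 1700  # £ million per annum

# Components named in the paragraph, in £ million.
components = {
    "card losses reported by APACS (stated as over 500)": 500,
    "missing trader VAT fraud": 215,
    "money-laundering": 395,
    "UK passport office anti-fraud procedures": 63,
}

itemised = sum(components.values())
remainder = fsa_estimate - itemised
itemised_share_pct = itemised * 100 // fsa_estimate

print(itemised, remainder, itemised_share_pct)  # 1173 527 69
```

At least £1,173 million, or about 69 percent of the estimate, therefore relates to the categories listed, none of which isolates online identity theft.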

2.28.  Still less clear is the scale of online fraud and theft. The problem here is compounded by the lack of clear definitions that might help to differentiate online fraud from "traditional" fraud. For example, Tim Wright, of the Home Office, asked how many prosecutions there had been for "e-crimes", responded, "Not only do the police databases not distinguish between whether crimes are committed electronically or not, but nor do the Prosecution or the Home Office figures distinguish between the two. So we do not know how many people have been prosecuted for e-crimes as distinct from offline crimes" (Q 25).

2.29.  We understand the logic of this—fraud is fraud, child abuse is child abuse, regardless of whether offences are initiated in person or online. But in the absence of any attempt to identify crimes committed online it is simply impossible to assess the scale of the problem. Thus when we asked John Carr, Executive Secretary of the Children's Charities Coalition on Internet Safety, about the relative frequency of online abuse and abuse committed by family members, he commented that "the way the crime figures are collected does not help us with providing an objective answer … even today in the crime statistics it is not recorded whether or not a computer was a key part of the way in which the crime was committed" (Q 251). Bill Hughes, Director General of the Serious Organised Crime Agency, argued that there "would be benefit" in identifying the e-component of conventional crimes, which "would help us to pick up on quantifying what the actual problem is" (Q 1042).

2.30.  Where data are collected, they often lack context. In the United States the National Cyber Security Alliance in 2005[8] published a survey showing that 81 percent of home computers in that country lacked core protection such as up-to-date anti-virus, firewall or anti-spyware software. This survey was backed up by scans of equipment, which showed that 12 percent of users had some sort of virus infection, and 61 percent some form of spyware or adware installed on the system. But this survey was based on a sample of just 354 individuals. Nor is it possible to deduce from these figures the actual level of economic damage that these security breaches were causing to the individuals concerned.

2.31.  What is abundantly clear is that the underground economy living off Internet crime is flourishing, and shares information openly online. Team Cymru have studied this phenomenon in detail, and have recently published some of their research.[9] Focusing on just one conduit of communication, Internet Relay Chat (IRC), Team Cymru show that entire IRC networks are devoted to the underground economy, with 35 to 40 particularly active servers. On a single server in a typical month in late 2005, compromised card details for sale included 31,932 Visa cards, 13,218 MasterCards, 31 American Express cards and 1,213 Discover cards (an American card company). Basic card details are on sale to fraudsters for $1 each (or $2 for United Kingdom cards); the "full info" for an account, including passwords, address details, dates of birth, mother's maiden names, and so on, can cost up to $50, allowing entire accounts to be cleared. The total value of accounts on offer on a single IRC channel over a 24-hour period was $1,599,335.80.

2.32.  With money available on this scale, it is hardly surprising that those responsible for e-crime, commonly known in the IT world as the "bad guys", include major organised crime groups, typically, though not exclusively, based in eastern Europe. They are well resourced, and employ specialists to perform particular tasks, such as hacking vulnerable websites, cashing cheques, receiving goods fraudulently purchased online, and so on. In summary, the Internet now supports a mature criminal economy.

2.33.  We were unable to get a clear answer to questions regarding the overall cost to the United Kingdom economy, let alone the global economy, of e-crime. One of the few witnesses prepared to take a holistic approach to the question, and, in the absence of firm data, to indicate at least the sort of areas that would have to be included in a comprehensive answer, was Bruce Schneier. He drew attention, for instance, to identity fraud, with costs "in the billions", and to the "multibillion pound industry" in computer security, as well as to unknowns, such as the costs to banking, to companies whose reputation and share price are affected by security breaches, and so on. In conclusion, he could not give an answer on the cost of e-crime, just a "flavour" for what it might be (Q 527).

2.34.  It is not surprising therefore that public anxiety over e-crime is growing. A survey by Get Safe Online, a partnership of Government and industry, which appeared shortly before our inquiry started, produced the startling and headline-grabbing conclusion that 21 percent of people thought e-crime was the type of crime they were most likely to encounter. It also showed that e-crime was feared more than mugging, car theft or burglary. Yet when we asked the Government about these results it was clear that they felt that this was an aberration. Geoff Smith from the Department of Trade and Industry (DTI; now replaced by the Department for Business, Enterprise and Regulatory Reform) described it as "counter-intuitive", and added that his department had been "a bit uneasy about using that as our headline message" (Q 38).

2.35.  Despite the DTI's down-playing of a survey they themselves had sponsored, the lack of hard data, combined with the alarmist stories appearing day to day in the press, means that public anxiety will probably continue to grow. This raises the question of whether the Government need to do more to help establish a true picture of the scale of the problem, the risks to individuals and the cost to the economy. We believe the answer is yes. Unless the Government take action, starting with the establishment of a framework for collecting and classifying data on e-crime and moving on to a more rigorous and co-ordinated analysis of the incidence and costs of such crime, they will never be able to develop a proportionate and effective response. Without such a response, the risk is that the enormous benefits to society that the Internet continues to offer will be wasted.

Research and data collection

2.36.  The Internet is a relatively new technology, and online security is a correspondingly new academic discipline. The evidence from the Research Councils (RCUK) claimed that "The UK has a very strong Information and Communications Technology Research Community, and the underpinning research into both hardware and software is of a high international standing." RCUK also provided a helpful annex of major IT research projects funded by the Engineering and Physical Sciences Research Council. However, RCUK also conceded that "the UK does not specifically have a leading reputation for academic research on IT Security". It drew attention to discussions on improving collaboration between academic researchers and industry, but gave few concrete examples. The reality appears to be that there are only a few centres of IT security research in the United Kingdom—indeed, our evidence reflects the views of researchers from virtually all these centres.

2.37.  Despite the quality of the research undertaken at these few centres, overall the investment in IT security research does not appear to us commensurate with the importance of the Internet to the economy or the seriousness of the problems affecting it. During our visit to the United States in March we were fortunate to be able to visit the Center for Information Technology Research in the Interest of Society (CITRIS), at Berkeley. CITRIS receives a small amount of funding from the State of California to cover operating costs, but the bulk of its funding comes from partner organisations, either within federal government or industry. It brings together technologists, social scientists and other experts in a range of multi-disciplinary, time-limited research projects. While there are several research centres within the United Kingdom working on aspects of the subject, there is a clear need for the development of a large-scale, multi-disciplinary centre such as CITRIS to act as a focus for academic and industry expertise.

2.38.  It is notable that while the private sector partners supporting CITRIS include major companies in the IT and telecommunications industries, companies from manufacturing, energy and other sectors also contribute.[10] As computing becomes ever more pervasive, more and more private sector companies—for example, those providing financial services—rely on IT security, and will have an interest in sponsoring research into IT security. There is therefore an opportunity to attract a wide range of private sector partners, with diverse interests, to support a major research initiative in this area.

2.39.  At the same time, there are new legal constraints affecting IT security researchers. There has been a strong tradition within the IT community of "ethical" hackers—experts, generally unpaid enthusiasts, who test out networks and security systems by attempting to "hack" them. We agree wholeheartedly with the remarks of Bruce Schneier on the importance of their work: "You learn about security by breaking things. That is the way you learn. If you cannot break things, you cannot learn. The criminals are always going to learn, always going to break stuff. We need to be smarter than them. We are not going to be smarter than them unless we can break things too" (Q 565).

2.40.  However, the amendments to the Computer Misuse Act 1990, which were introduced by means of the Police and Justice Act 2006 and are expected to come into force in April 2008, created a new offence of making, supplying or obtaining articles likely to be used to commit computer crimes; there are also related provisions in the Fraud Act 2006. As Alan Cox told us, these are "unfortunately the same tools that you need to identify the security holes and test a security hole has been fixed and so on" (Q 327). At the time of writing, Crown Prosecution Service guidance on the application of these provisions had yet to be published; the Minister, Vernon Coaker MP, promised that it would appear "by the end of the summer" (Q 886).

2.41.  More general issues, affecting IT security experts in many countries, were touched on in our discussions at CITRIS in California. Vern Paxson drew attention to restrictions on wire tapping, as well as to difficulties encountered in monitoring the incidence of malware—the only way to monitor, say, the incidence of botnets, was to set up a platform that would both receive and respond to messages from botmasters. This meant that the researchers could find themselves guilty of negligence in allowing their computer to be used to propagate malware or spam to other users.

Conclusions and recommendations

2.42.  The benefits, costs and dangers of the Internet are poorly appreciated by the general public. This is not surprising, given the lack of reliable data, for which the Government must bear some responsibility. The Government are not themselves in a position directly to gather the necessary data, but they do have a responsibility to show leadership in pulling together the data that are available, interpreting them for the public and setting them in context, balancing risks and benefits. Instead of doing this, the Government have not even agreed definitions of key concepts such as "e-crime".

2.43.  We recommend that the Government establish a cross-departmental group, bringing in experts from industry and academia, to develop a more co-ordinated approach to data collection in future. This should include a classification scheme for recording the incidence of all forms of e-crime. Such a scheme should cover not just Internet-specific crimes, such as Distributed Denial of Service attacks, but also e-enabled crimes—that is to say, traditional crimes committed by electronic means or where there is a significant electronic aspect to their commission.

2.44.  Research into IT security in the United Kingdom is high in quality but limited in quantity. More support for research is needed—above all, from industry. The development of one or more major multi-disciplinary research centres, following the model of CITRIS, is necessary to attract private funding and bring together experts from different academic departments and industry in a more integrated, multi-disciplinary research effort. We recommend that the Research Councils take the lead in initiating discussions with Government, universities and industry with a view to the prompt establishment of an initial centre in this country.

2.45.  Legitimate security researchers are at risk of being criminalised as a result of the recent amendments to the Computer Misuse Act 1990. We welcome the Minister's assurance that guidance on this point will appear later in the summer, but urge the Crown Prosecution Service to publish this guidance as soon as possible, so as to avoid undermining such research in the interim.

2   Not published as evidence.

3   Although 32-bit addresses are by far the most prevalent, some machines operate with "IPv6", a more recent version of the Internet Protocol, which uses 128-bit addresses.

4   See

5   Symantec Internet Security Threat Report, July-December 2006.

6   MessageLabs 2006 Annual Security Report.

7

8   See

9   The figures quoted are taken from The underground economy: priceless, by Rob Thomas and Jerry Martin, December 2006.

10   See


© Parliamentary copyright 2007