IT failures in the Financial Services Sector

3 Common causes of IT incidents

82.We took evidence on the common causes of IT incidents in the financial services sector. Many respondents referred to data published by the FCA. The most common root causes of incidents reported to the FCA in the year to September 2018 are shown in the graph below:103

Figure 1: Root Cause Trend: Oct-17 to Sep-18

Some of the common causes are examined in more detail in the following sections. The risks created by legacy systems are covered first, as these permeate multiple causes of IT failures.

Legacy systems

83.Aging architecture, or the use of legacy systems, is often cited as a cause of IT incidents. While some firms described the modern state of their systems, and their significant investment in upgrading legacy systems, the Regulators outlined the continuing prevalence of legacy architecture:

We have observed that legacy systems still support important business services in some firms and FMIs. [ … ] Firms and FMIs have built or procured digital services for their customers which often sit over the top of legacy systems rather than fully replacing them.104

Firms were themselves concerned about this risk. The Regulators told us larger firms were concerned about:

The scale of system obsolescence, and the absence of a common mechanism by which to assess the range of risks (for example, strategy, costs, service availability, security etc.) and therefore to inform mitigation.105

84.PwC explained that legacy systems can make managing cyber risk particularly difficult:

One of the challenges with legacy systems is that many mainframe systems were designed before the introduction of the internet. In our experience, it has been problematic for some firms over recent years to ensure that these systems are protected from cyber attacks and that information is secured.106

85.Legacy systems can also make change difficult, partly because they are often complex, as their use has evolved over many years. In oral evidence to our Economic Crime inquiry, Chris Hemsley, Co-managing Director of the Payment Systems Regulator, explained that the introduction of confirmation of payee would mean that “the individual IT systems all then need to be updated. That in itself, because of the legacy systems and the range of different systems that exist, is much more complicated.”107 Similarly, Robin Bulloch, Lloyds Banking Group, told us, in our Consumers’ access to financial services inquiry, that:

We have systems that are sizeable in scale [ … ] but, when we are lifting the bonnet and going to change the engine, it takes time, because we have to be very careful about the changes that we make, and we would absolutely wish to guarantee that the changes are done once, rather than trying to do things quickly and finding that we have not done what we intended.108

86.Furthermore, the Regulators explained that legacy systems often involve key person risk:

The challenges of maintaining older systems is exacerbated where the engineers and other experts with the knowledge to support them have left or retired, reducing the knowledge base available over time. The documentation on these systems relied on by their successors may not always be adequate.109

87.Yet some firms observed that not all legacy systems are problematic. Barclays explained that “While a system running a service will inevitably age, this does not necessarily mean that it poses greater risk.”110 Also, PwC commented that “we don’t believe that “legacy” and “fragile” should be used interchangeably when applied to systems [ … ]. There are some very stable and secure systems that have been in place for a number of years”.111

88.Many respondents agreed that updating legacy systems, even if the upgrade was needed, created significant risks. The Regulators explained that “Firms and FMIs may prefer to patch and upgrade their systems where the risk appetite for wholesale transformation and system replacement is low”.112 Similarly, the Center for Evidence-Based Management explained that:

Deciding to replace a firm’s core systems is possibly one of the riskiest strategic technology decisions that a board can make. [ … ] It is often referred to as the apocryphal ‘changing the engine while the plane is in flight’, the risk being that business-as-usual may be severely disrupted during the several years of strategy execution.113

There are also cost reasons why firms do not invest in upgrading legacy architecture. PwC emphasised that:

Since the financial crisis profitability in the banking sector in the UK (and globally) has been depressed compared to previous periods. Most major financial institutions have attempted to cut costs significantly, and this, in our experience, has resulted in reduced expenditure on technology upgrades and other important infrastructure improvements.114

89.Yet there are also factors that may prompt a reduction in legacy risks in the future. Lyndon Nelson, Deputy Chief Executive Officer and Executive Director for Regulatory Operations and Supervisory Risk Specialists, PRA, admitted that firms with legacy systems “cannot carry on with the legacies and the approach that they have, because I think it would become a front office business issue about them not being agile enough, when consumers are demanding those things.”115 Similarly, PwC highlighted the role of competition: “New entrants to the market that are able to build bespoke and coherent systems will, in our view, have an advantage over incumbent firms.”116

90.The Regulators may also need to promote change themselves. Charles Randall, Chair of the FCA, told us that “As a regulator, we need to have a level of intervention that ensures the management of a firm does not sit there saying, “This is such a nightmare; I will leave it to the next lot”.”117 Lyndon Nelson, PRA, described how the business services approach in the Regulators’ Discussion Paper may reduce the use of legacy systems:

I am hoping that the discussion paper [ … ] will effectively eliminate that; because, [ … ] the firm will have to think about what services to provide to the consumer, for example, and what is in the production line to get that service to them. Our best estimate is that, if there is a legacy system in there, their response time or their recovery time is going to be a lot higher.118

91.On the Regulators’ role, the Center for Evidence-Based Management recommended that, where firms are not addressing legacy risks, they should be required to prepare plans for mitigating those risks:

Supervisors should require that firms produce concrete plans to mitigate any serious core systems risks, if necessary, initiating a CSR [core systems replacement] programme.119

The Regulators also have the option of other tools to understand the risks of legacy systems, for example commissioning Section 166 skilled persons reviews.120

92.Many financial institutions face the challenge of aging, legacy infrastructure that is hard to maintain, yet expensive and risky to replace. We do not believe enough is being done by firms to mitigate the operational risks they face from their own legacy technology, such as by moving to newer technology.

93.While legacy systems can in some cases be robust, firms must ensure that their use remains appropriate. This should include considering the availability of expertise to maintain the systems, the systems’ resilience, and their remaining useful life. Firms must not use the cost or difficulty of upgrades as excuses for not making vital upgrades to legacy systems. Regulators should have a strong framework to oversee firms’ assessments, and challenge these where necessary.

94.We welcome the indications from the Regulators that the approach set out in the Discussion Paper, if adopted, should trigger an improvement in firms’ management of legacy systems. However, given the potential for short-sightedness by management teams, if improvements are not forthcoming, the Regulators must intervene to ensure that firms are not exposing customers to risks due to legacy IT systems. The Regulators should make use of the full range of their tools to achieve this, including commissioning independent Section 166 skilled person reviews.

Level of change and change management

95.The level of change the financial services sector is currently undertaking is significant. RBS highlighted that:

We are facing rapid and significant changes at an industry level. Industry changes include Open Banking [ … ] computing capability associated with cloud computing and agile delivery methodologies.121

96.Change management was reported as one of the largest causes of operational incidents in the financial services sector, accounting for 20 per cent of the incidents reported to the FCA in the year to September 2018.122 Some of the most high-profile and damaging incidents have occurred as a result of poorly implemented change, and these IT failures caused significant disruption and inconvenience for customers. The Regulators highlighted two cases:

97.The Regulators explained how firms should manage risks of change programmes:

We expect firms and FMIs to have robust controls in place [ … ], including strong governance and senior management oversight, clear approvals processes, and independent testing.125

Megan Butler, FCA, highlighted in a speech that “we are worried that a lot of firms seem overly confident about their ability to manage flagship IT change programmes and keep their systems up to date”.126 Similarly, Lyndon Nelson, PRA, explained that “there are a number of weaknesses in risk management”.127

98.One of the most important preventative measures against IT incidents as a result of system change is testing before the new system goes live to all customers. Speaking about Barclays’ incident in September 2018, Graham Bastin, Head of Operational Resilience, Barclays, explained that “The change itself had been tested thoroughly, but when it went into the live production environment, we should have tested it for a little bit longer with customers, so that we might have been able to see this issue and back it out”.128

99.The Center for Evidence-Based Management explained that “Every time a system is changed and either code released or hardware upgraded there is a risk of making a mistake”, and that problems can arise when you see “aggressive management pushing changes to be released before they are fully tested in order to meet deadlines”.129 In the case of TSB’s incident in 2018, Andrew Bailey, FCA, told us that “testing is going to be one of the key questions”.130

100.We also heard that the approach to change matters. Anne Boden, CEO of Starling Bank, explained that:

Modern technology is now released and change managed by implementing a little bit of change often. By having a little bit of change [ … ] and doing it several times a day, you minimise the impact of that change. We all know that when you do big change—big migrations, big separations of banks, and big migrations of systems—we put customers at risk.131

101.Despite the risks, many instances of change have been successful. The FCA highlighted in a previous Committee session the successful implementation of ring-fencing.132 This demonstrated that in the right circumstances the industry had the capability to deliver major change initiatives.

102.Given cases of good and poor change management, the Regulators claimed that “This is an area where greater information-sharing about best practice would help to strengthen operational resilience”.133 Industry collaboration is covered further in Chapter 5 of this report.

103.The Regulators have considered initial learnings from the TSB migration. Sam Woods, Deputy Governor Prudential Regulation and Chief Executive Officer of the PRA, told us that the PRA:

Put in place an extra capital requirement [ … ] precisely to cover an unspecified possibility of it going wrong. I can tell you, now that it has gone wrong, it has proved very expensive. It is a very good thing we have that capital requirement in place. That is one learning for us: that we should always do that where firms have a big programme of this kind.134

104.Poor change management is one of the primary causes of IT failures. As firms embrace new technology to improve customer experience, and grapple with upgrading legacy systems to meet the expectations of digital banking, further IT change in the financial services sector is inevitable. It is important that firms have strong and well-rehearsed change management procedures. As a matter of urgency, firms should address any issues identified in their risk management, including ensuring that they have sufficient skills and experience to manage change.

105.We are concerned that time and cost pressures may cause firms to cut corners when implementing change programmes, for example by compressing testing schedules. Firms engaging in change programmes should not be allowed to gamble with their service availability.

106.While we accept that the ultimate responsibility for executing change programmes lies with firms, there is a role for the Regulators where customers are at risk. Given their unique position, with oversight of many change projects, the Regulators should ensure that best practice and lessons learnt from past change projects are disseminated to the industry.

107.The Regulators must also review their approach to supervising firms’ large-scale change programmes to ensure that proactive intervention is possible, ahead of IT failures, so that customers are protected. This should include the level of engagement with firms, the level of specialist resource required, and the degree of assurance sought.

Outsourcing and third-party failure

108.In addition to in-house provision of services, financial services sector firms rely on third parties to provide services, for example technology and business processes. The Regulators wrote that “Industry trends show that firms and FMIs are increasing their use of third parties”.135 Some firms, however, described the reverse trend, having begun to bring some services back in house. For example, Graham Bastin, Barclays, explained that “with outsourced providers we have insourced about 65 per cent of our suppliers”.136

109.Overseeing firms’ outsourcing arrangements is important to the Regulators, who explained that “Outsourcing and wider use of third-party providers is a priority area of focus for the Authorities137”.138 Alison Barker, Director of Specialist Supervision at the FCA, explained that the FCA does not have a preference for insourcing or outsourcing but firms “can’t outsource the responsibility for overseeing that it is working and understanding the impact of it when it does not work.”139

110.There are risks involved in outsourcing. The Regulators stated that increased outsourcing has “a consequent implication for the operational resilience of firms and FMIs, and potentially the market, should an issue arise at a third-party supplier”.140 The FCA reported that third-party failure is the second most common cause of incidents in the financial services sector.141

111.Some firms have difficulties in managing third parties, which can weaken their operational resilience. PwC described some of the issues that firms face:

We observe that some firms currently struggle with third party management across their operations. [ … ] The current approach in firms to vendor management and supplier risk is often siloed, with individual teams focusing on different areas. In our opinion, there is more work needed by firms to gain visibility across the vendor landscape, to reduce the risk of outages and accidental information disclosure.142

Also, Megan Butler, FCA, highlighted in a speech that:

Only 66 per cent of large firms, and 59 per cent of smaller firms, tell us that they understand the response and recovery plans of their third parties. On top of this, we know there is a real problem at the moment around recruiting the right skills at the top level; to steer, set strategy and oversee this model.143

112.Despite the risks inherent in outsourcing there are also many benefits. UK Finance explained that “Outsourcing should also allow a firm’s management to increase its focus on the core business functions, expand the availability of business services, and accelerate the delivery of such services”.144 Furthermore, PwC stated that firms are “increasingly seeking to outsource critical functions to a concentrated set of vendors to reduce costs and gain access to external capabilities”.145

113.Given the prominence of operational incidents caused by third parties, we support the need for the industry to improve risk management of third-party relationships. Firms cannot use third-party failures as an excuse when incidents occur. If the Regulators are not observing a good standard of management of third parties by regulated firms, they should amend, as appropriate, their rules or guidance to prompt an improvement.

Cyber risk

114.Cyber attack was the fourth most common cause of incidents, as reported by the FCA.146 PwC described how cyber risk can affect financial services sector firms:

Cyber-attacks on the financial services sector are increasingly common and represent a growing risk. Denial of service (DoS) attacks are one of the most common [ … ] and are designed to shut down machines or networks by flooding the target with traffic, making them unavailable to intended users. Over the last few years a number of banks have been victims of DoS attacks with disruption lasting up to 48 hours.147

115.Cyber attacks on the financial services sector can also take the form of malicious attacks for financial gain. For example, Tesco Bank suffered a cyber attack in 2016,148 which resulted in current account holders having unauthorised transactions on their accounts. Attackers may also seek to gain access to the wealth of data held by financial services firms. This could have a significant impact, not only through the loss of data security but also if the data were corrupted.

116.The impact of cyber attacks can be long lasting. David Bailey, Executive Director for Financial Market Infrastructure at the Bank of England, explained that slow recovery times might be necessary in the “case of a cyber-attack which compromises the data integrity sitting within a financial market infrastructure, because quite frankly it is not worth coming back up if the data is corrupted”.149 Lyndon Nelson, PRA, commented that recovery time following a data integrity issue “could be months”.150

117.Many financial services sector firms described combatting cyber risk as a priority. TheCityUK found that cyber attacks were referred to as the “most urgent concern” amongst industry executives.151 Furthermore, there is a good level of coordination between firms on cyber risks. Graham Bastin, Head of Operational Resilience at Barclays, told us that:

The highest levels of collaboration that I see are around cyber. The banks and the financial industry generally have determined that there is no competitive advantage from being better than the other guy at cyber. So we work with the intelligence agencies, the cyber-defence agency, GCHQ. We share information and intelligence for the benefit of all.152

118.Cyber attacks are increasingly a concern for financial services sector firms. We welcome the level of coordination among firms, and the priority they give to combatting cyber risks. We encourage the participation of all firms and the Regulators in these interactions.

104 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

105 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

106 PwC (OPR0008)

107 Treasury Committee: Oral evidence: Economic Crime, HC 940, 15 May 2019 [Q860]

108 Treasury Committee: Oral evidence: Consumers’ access to financial services, HC 1642, 5 February 2019 [Q242]

109 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

110 Barclays (OPR0009)

111 PwC (OPR0008)

112 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

113 Center for Evidence-Based Management (OPR0003)

114 PwC (OPR0008)

116 PwC (OPR0008)

117 Treasury Committee: Oral evidence: The Work of the Financial Conduct Authority, HC 475, 15 January 2019 [Q420]

119 Center for Evidence-Based Management (OPR0003)

120 Financial Services and Markets Act 2000 (as amended), Section 166. The Regulators can appoint an independent skilled person to provide them with information or documents to assist them in regulation of a firm or the industry.

121 RBS (OPR0004)

123 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

124 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

125 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

126 Megan Butler, Speech: Cyber and technology resilience in UK financial services. 27 November 2018.

129 Center for Evidence-Based Management (OPR0003)

130 Treasury Committee: Oral evidence: Service Disruption at TSB, HC 1009, 6 June 2018 [Q180]

132 Treasury Committee: Oral evidence: The Work of the Financial Conduct Authority, HC 475, 15 January 2019 [Q419]

133 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

134 Treasury Committee: Oral evidence: The Work of the Prudential Regulation Authority, HC 704, 23 January [Q170]

135 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

137 The Authorities refers to the Financial Conduct Authority, the Bank of England, and the Prudential Regulation Authority

138 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

140 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

142 PwC (OPR0008)

143 Megan Butler, Speech: Cyber and technology resilience in UK financial services. 27 November 2018.

144 UK Finance (OPR0005)

145 PwC (OPR0008)

147 PwC (OPR0008)

148 Financial Conduct Authority, Bank of England and Prudential Regulation Authority (OPR0012)

Published: 28 October 2019