Imagine a scenario where the internet grinds to a halt for not just one organisation but entire nations. This isn't a hypothetical situation; it's a reality that struck 13 African countries last month due to severed subsea communication cables. It’s an event you couldn’t predict – or could you?
Some events might seem like they are novel risks you could never predict, but history and a little bit of critical thinking tell us otherwise. This blog explores the importance of operational resilience in facing such disruptions, using recent incidents as a case study to unveil strategies for navigating risks.
We will cover:
- The March 2024 Internet disruption in Africa
- Black Swans and Grey Rhinos
- The importance of operational resilience
- Using scenarios to identify and address vulnerabilities
Internet disruption in Africa
On 14 March 2024, the internet went down for 13 countries across Africa. The cause was damage to four undersea fibre optic cables off the coast of West Africa that deliver communication services to countries ranging from Senegal to South Africa.
The initial disruption lasted from five minutes to several hours, but even as of 19 March – five days later – some African countries were still experiencing partial outages. While not yet confirmed, the suspected cause is seismic activity. The location of the cable damage, being 3 km below the surface, is believed to rule out accidental human activity.
On Grey Rhinos and Black Swans
Imagine this scenario from your organisation’s perspective. Your initial instinct might be “This is something we could never have predicted”, and to an extent that’s true. You certainly couldn’t have predicted the exact scenario of four cables being cut off the coast of Africa on 14 March 2024.
But let’s consider some other information that is known:
- On average, 100 undersea cables are cut or damaged each year
- In February 2024, three undersea cables were damaged in the Red Sea, slowing internet connectivity between Asia, Africa and Europe
- In October 2022, cable cuts affected millions of users across countries in the Middle East, Africa and Asia, and also affected major services such as Google Cloud. The disruption lasted around four hours before repairs were made
- In January 2022, Tonga suffered an outage due to a volcanic eruption that damaged a nearby undersea cable. The outage lasted 38 days
- In 2007, Vietnamese fishermen cut subsea cables in an effort to resell the materials. Vietnam lost most of its connectivity to the wider world for three weeks
In risk parlance, Black Swans are events that we could not reasonably have foreseen, even though in hindsight we can identify all the potential causes. The term was popularised by Nassim Taleb [1]. Before European contact with Australia, Europeans may have spoken of a “black swan”, but only as a myth: having never seen or heard of one, they assumed that none existed.
The term Grey Rhino was coined by Michele Wucker as a contrast to Black Swans [2]. Grey Rhinos are highly likely or almost inevitable events that are widely ignored until it is too late. That neglect may come from a few different factors:
- Being distracted by something urgent, until the rhino is almost upon you
- An assumption that when the rhino gets close, you’ll ‘figure it out then’
- A hope (not a good risk management strategy!) that if it does hit, it won’t be that bad
When considering the evidence above, severed subsea cables look a lot more like Grey Rhinos than Black Swans.
The importance of operational resilience
The concept of operational resilience has made waves over the last few years, particularly in financial services and critical infrastructure due to regulatory drivers, but awareness is increasing across all sectors. We typically see operational resilience as comprising the following characteristics:
- Prevent: Reduce the likelihood of disruption by addressing its causes or reducing your exposure to them
- Absorb: Reduce the immediate impact at the time of disruption
- Recover: To the extent you are impacted, recover as quickly and effectively as possible
- Adapt: Pivot quickly and permanently if you need to adapt to a ‘new normal’
- Learn: Embed learnings from disruption to further improve your resilience. This shifts from ‘bouncing back’ to ‘bouncing forward’
One of the key concepts of operational resilience is being able to manage through disruption. How will you continue to provide outcomes to your customers, at least to some minimum acceptable level, if some of your key resources are unavailable?
At a high level, an operational resilience programme comprises the following iterative steps:
- Identify the important business services that customers rely on
- Set tolerance levels for those important business services
- Map out the processes needed to deliver the important business services, and the resources that support those processes
- Assess vulnerabilities that may exist in those processes and resources
- Run scenarios to assess the ability to operate within defined tolerance levels
- Based on scenario outcomes, address any vulnerabilities
- Integrate with enterprise risk management activities, such as controls assurance, attestations, and risk assessments
- Report to executives, board and other stakeholders on the overall resilience of the organisation
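The mapping steps above can be sketched as a small data model. This is a minimal illustration, not a prescribed implementation: the class names, the `tolerance_hours` field and the example services are all hypothetical, and a real programme would record far more detail for each service, process and resource.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """A process and the resources it relies on (names are illustrative)."""
    name: str
    resources: list[str] = field(default_factory=list)


@dataclass
class BusinessService:
    """An important business service with a hypothetical tolerance level."""
    name: str
    tolerance_hours: float  # assumed unit: maximum tolerable outage in hours
    processes: list[Process] = field(default_factory=list)

    def resources(self) -> set[str]:
        """Every resource that any supporting process depends on."""
        return {r for p in self.processes for r in p.resources}


def services_exposed_to(resource: str, services: list[BusinessService]) -> list[str]:
    """Names of the services that depend on the given resource."""
    return [s.name for s in services if resource in s.resources()]


# Hypothetical example data
phone_payments = BusinessService(
    name="Phone-based transactions",
    tolerance_hours=4.0,
    processes=[Process("Handle customer calls", ["offshore contact centre", "subsea cable"])],
)
online_banking = BusinessService(
    name="Online banking",
    tolerance_hours=2.0,
    processes=[Process("Serve web sessions", ["onshore data centre"])],
)

exposed = services_exposed_to("subsea cable", [phone_payments, online_banking])
print(exposed)  # ['Phone-based transactions']
```

Once services are linked to processes and resources in this way, a single lookup shows which important business services a lost resource would put at risk.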
Let’s take a closer look at scenarios – addressing a hypothetical Grey Rhino can be much more cost effective than facing down a real one.
Using scenarios to identify and address vulnerabilities
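Event-based scenarios need to be paired with resource-based scenarios as part of an operational resilience programme. An event-based scenario starts from a cause, such as a severed cable, while a resource-based scenario starts from the loss of a resource, such as a contact centre, whatever the cause. The outage in Africa highlights why that difference can be important.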
One way to consider operational resilience scenarios and retain a consistent format is to follow a standard scenario statement template that includes key components of causes, affected resources and/or processes, disrupted services, and impact on customers. If you are using technology to support your operational resilience programme, you can link your scenario to the affected resources or processes to ensure consistent mapping.
Here is an example scenario statement for a financial services provider:
“Transatlantic telecommunications cable broken or disrupted, making the offshore contact centre operations that support phone-based transactions unavailable, resulting in customers being unable to access funds.”
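One way to keep scenario statements in a consistent format is to hold the four template components as separate fields and render the sentence from them. The sketch below assumes a hypothetical `ScenarioStatement` class; the field names and the sentence pattern are illustrative choices rather than a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioStatement:
    """The four components of the scenario statement template (hypothetical names)."""
    cause: str
    affected_resource: str
    disrupted_service: str
    customer_impact: str

    def render(self) -> str:
        """Assemble the components into a single statement."""
        return (
            f"{self.cause}, making {self.affected_resource} unavailable, "
            f"which disrupts {self.disrupted_service}, "
            f"resulting in {self.customer_impact}."
        )


statement = ScenarioStatement(
    cause="Transatlantic telecommunications cable broken or disrupted",
    affected_resource="offshore contact centre operations",
    disrupted_service="phone-based transactions",
    customer_impact="customers being unable to access funds",
)
print(statement.render())
```

Holding the components separately means each one can be linked to the resource or process records it refers to, rather than living only in free text.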
If we run this scenario against our capability via a desktop walkthrough, we might identify that we have a business continuity plan for failure of this contact centre, which is to activate contingency arrangements with a third party in the same region. In this scenario, that third party would also be affected by the disruption. Further discussion during the walkthrough identifies that, without a pre-arranged contingency with an onshore backup, we are unlikely to meet our defined impact tolerance.
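The walkthrough logic can be approximated in code: a contingency only helps if it does not depend on the disrupted resource and can be activated within tolerance. This is a rough sketch under stated assumptions. The `Contingency` class, the dependency names, the activation times and the four-hour tolerance are all hypothetical figures chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """A fallback arrangement and what it relies on (hypothetical fields)."""
    name: str
    depends_on: frozenset
    activation_hours: float


def viable_contingencies(disrupted, contingencies, tolerance_hours):
    """Contingencies untouched by the disruption that activate within tolerance."""
    disrupted = set(disrupted)
    return [
        c.name
        for c in contingencies
        if not (c.depends_on & disrupted) and c.activation_hours <= tolerance_hours
    ]


disrupted = {"transatlantic cable"}
regional_third_party = Contingency(
    "Regional third party", frozenset({"transatlantic cable"}), 2.0
)
onshore_backup = Contingency(
    "Onshore backup", frozenset({"domestic network"}), 3.0
)

# Current plan only: the third party shares the disrupted dependency
print(viable_contingencies(disrupted, [regional_third_party], 4.0))
# With a pre-arranged onshore backup added
print(viable_contingencies(disrupted, [regional_third_party, onshore_backup], 4.0))
```

With only the regional third party in place, the check returns nothing viable, which mirrors the finding from the walkthrough; adding the onshore backup changes the result.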
Conclusions and next steps for your organisation
It’s easy to look at an improbable event in isolation and assume it will never harm us. That assumption might result in zero preparation, not only for that specific event but for other events like it. There is no such thing as perfect resilience, but we can aim to continually improve it. Here are a few questions to ask about your operational resilience programme:
- What Grey Rhinos are we ignoring, and what evidence supports their inevitability?
- Can we use that evidence to highlight the need for action and identify solutions?
- What is the difference in impact between being prepared and not being prepared?
- Do we have the frameworks, systems and tools to implement a comprehensive operational resilience programme?
Operational resilience is relevant and important to every organisation, whatever its size or sector. It should form an integral part of any organisation’s risk management framework, and we believe the best solution is to incorporate the complete end-to-end operational resilience process within your ERM system.
[1] The Black Swan, Nassim Taleb
[2] The Grey Rhino, Michele Wucker