The drivers of operational resilience are creating a perfect storm. On one hand, the financial services regulators are demanding action while on the other, COVID-19 and other external shocks have made us all too aware of the necessity to be more resilient.
In this article, I share with you the Q&A from our latest thought leadership webinar on Operational Resilience. Click on the links below to jump to the question.
APAC Session questions:
- How is BCP different from Operational Resilience?
- The connection between Risk Bow Tie Analysis and Operational Resilience
- Advice on how to kick off a BIA in a siloed organisation?
EMEA Session questions:
- Using subjective and objective assessments in Operational Resilience
- When will the visual mapping/process maps be available in Protecht.ERM?
- Would using the Protecht Ops Resilience modules help organisations meet FCA requirements?
- Do you have any dashboards/reports to monitor ops resilience health and gaps?
- How many IBS could you expect to see? How granular would you expect the process maps to go?
- Is there a regulatory interest to request organisations "Assurance" over these Operational Resilience programs?
- Who should be responsible for operational resilience within an organisation?
Questions and Answers from the Live Session - APAC
1. The operational resilience example that David went through is similar to what is done in the business continuity (BC) process. Any comments on how they are (or meant to be) different?
David Tattam (DT): That's a really good question. It seems that a lot of risk-based companies are kind of saying operational resilience is basically DRP in sheep's clothing or BCP in sheep's clothing.
David Bergmark (DB): That is a great question, because even throughout our multiple internal discussions, there is a lot of similarity between BCP and Op Resilience. I think the key thing we're focusing on at this stage is the definition of the critical service, which, certainly in the regulator's mind, tends to be more focused on the delivery of that service to customers.
BCP often gets a bit more granular in various parts of the organisation. So I think it's almost like starting at that top level, figuring out what those critical services are and then working through those first, like the Maximum Acceptable Outage (MAO) and Recovery Time Objective (RTO).
DT: Yes, I totally agree. There is a big overlap there but in a practical sense, Operational Resilience is more than BCP. It is BCP, it is DRP, it is contingency planning and all of those but it's more than that.
So think about something like COVID hitting: for some industries, suddenly the customer was gone, or they didn't have the legal licence to operate the business. Now, I'm not convinced that many BCP plans would have a scenario where customers can't use you anymore.
Maybe I haven't seen the right BCP plans, whereas in resilience that should be a scenario where we look at "customers can't utilise us anymore, what are we going to do?" And obviously, the response to a shock in BCP/DRP is typically to go to backups, to go to our BCP/DRP site and recover accordingly.
Whereas with COVID-19, losing a customer was very much about the ability to pivot. How quickly can we change our business to be able to capture another customer? And we had all the breweries creating hand sanitiser, particularly gin breweries! That was a great example of pivoting. Now that showed resilience. With the airlines, however, it was and is much more difficult. What else do you do with an aircraft when no one is allowed on it? So all I would say is there is definitely a linkage. There's definitely a connection, but it is more than that.
So when I see some firms talking operational resilience and, when you drill down, it's basically taking their BCP/DRP capability and just rebranding it with the words operational resilience, I have a bit of a problem.
I think it is more than that. We need to see it as a much bigger picture, complete end to end rather than just say BCP/DRP. That would be my personal view.
DB: I might just add to that. Coming back to the question, traditionally a lot of the BCP plans I look at might focus on an event, right? So if we think about Dave's C-19 example, the scenario might be a pandemic, right? And often that was the starting point for the BCP. Or a flood. I know certainly one of the ADIs I've worked with is close to a river, and it's more focused on the flood. Whereas for resilience, I think, the key focus really is on critical services.
There will be BCP scenarios that are connected to those critical services, but I think that's the fundamental difference. It's starting with "what are your critical services" and working it down from there, but personal view, as always.
DT: All I would say to sum it up, is I wouldn't suggest you just simply rebrand BCP/DRP as resilience. I think it's more than that, and the regulator will be looking for more than that. And they've openly said that it is not just BCP, so just be careful.
2. Recovery = right hand side of a bow tie, resilience is both sides?
David Tattam (DT): Absolutely spot on for those of you that understand or have done bow tie risk analysis. Put very simply, the main event is in the centre, and this is the point in the whole scenario that you effectively lose control.
Now, up to that point, you always try to not lose control. So quite rightly, on the left-hand side of a bow tie from root cause to the event, it's all about reducing the likelihood of having that event occur in the first place.
This is about being resilient by dodging the person that's going to punch you: preventative type controls, all those kinds of things. In the five examples I gave you with the boxer, that is the resilience that is about dodging the event in the first place.
So I totally agree. Now on the right-hand side of a bow tie, after the event has happened, it's all about how you mitigate and manage the consequences or impacts. So very much on that side, we have recovery controls.
I don't know what you guys call them, reactive/corrective. They all mean the same thing. This is about recovering once the event has occurred. Now, as we said before, resilience covers all of these. It covers the left-hand side, about preventing the event from happening in the first place. That's the internal event, I don't mean the external shock.
And then it also encapsulates, if you do get hit and go down like a sack of potatoes, how quickly you are going to recover. So, Ian, I absolutely love your statement. Recovery is the right-hand side and resilience is on both sides [of the bow tie technique].
Read our ebook to learn what makes Risk Bow Tie Analysis one of the best tools used in risk management.
43% of our APAC Operational Resilience webinar respondents said that Good Risk Management is the main driver for Operational Resilience in their organisation.
3. I'm new to a very siloed organisation without a strong sense of cross-functional processes (non-Financial Services). Just wondering if you have any advice on how to kick off a BIA?
David Bergmark (DB): Rachel, yeah, tough initially. Like certainly, you've got to get a sense of community when you're trying to do business impact analysis, otherwise, you're going to struggle. And maybe it is as simple as starting with the critical processes that the organisation delivers to customer services.
And then, you know, maybe talking to those siloed business units and just getting them to think about how do they contribute to the delivery of that critical service? And once you start perhaps mapping it out and showing that these are the business units that are contributing to that, I think it'd be clear to them that it's not something that should be treated in isolation. You can see that each of the business units have a role in delivering that particular service to the customer. I'm not sure exactly what industry you're in but if a customer is at the heart of what you do, then really, we all should be on the same page to deliver a strong, resilient process to that customer.
David Tattam (DT): Rachel, great question. Silos are the Achilles heel of enterprise risk management and we hear it so often: "Oh, we're doing risk in silos, we don't talk to each other" and so on. Now this is a huge impediment to operational resilience because operational resilience doesn't care about silos. It doesn't care about business units. It cares about the ultimate delivery of service to a customer or to a stakeholder. And that can cut across a multitude of business areas, business activities and so on.
So as you quite rightly say, business impact analysis doesn't care about silos. So we've got to somehow collaborate with our partners in crime across the whole value chain to be able to work out how something that happens in business unit number one impacts business unit number two, three, four, five and then, bang, there is a huge problem with the customer at the end. Now the issue is that it's easy to say, but solving that problem is incredibly hard.
The first huge problem is change management. A lot of humans love being in their silos and they never talk to anybody else and they go, we're all right, Jack, you know, you can worry about yourself. So we've got to break down the silo culture, which is incredibly difficult.
The second thing we've got to do is be able to understand that end to end process. Now the risk you run of trying to understand the end to end process is granularity. If you do it at the higher level, I think you've got some hope of doing it.
If you get to a more granular level, you can have death by flowcharts. Without naming it, I've been involved with one organisation. They aren't a system client, and they're quite a large organisation. They wanted to understand their end to end processes. Walking around the organisation a number of times, I saw glass-walled rooms full of people, with paper wrapped around the whole room, mapping out one end to end process. The last message I got was that it became so cumbersome and difficult that they kind of gave up. Maybe half their problem was doing it on bits of paper with a pen.
Maybe they needed a decent system to map it all out. But I think your comment, Rachel, is absolutely spot on: we need to break down silos to be able to do it, and honestly that means change management, culture, and getting people to be accountable for the end to end process. And this is very important. You can be responsible for something yourself, which might be my business unit outcomes, but I'm accountable across the end to end chain with my fellow managers. So the sense of accountability is critically important to get that end to end view.
54% of respondents in our APAC Operational Resilience webinar say that they manage their Operational Resilience using Excel, Word or PowerPoint.
Questions and Answers from the Live Session - EMEA
1. My executive would like to know what is the likelihood that we miss our operational/budgeted target. As I see this, using qualitative and biased human assessments - I cannot see how you could ever respond to that, rather relevant, question to a risk manager.
David Tattam (DT): It's such a great question, and I've got a quick answer. First and foremost is the argument that a lot of people will not rely on subjective qualitative judgments. We often call it finger in the air, licking a finger and putting it in the air, and people get upset and say we need mathematical formulas to come up with this because there are human biases and so on.
Now the only problem with this is you've got to come back to the fundamental question, "what is risk?" Risk relates to a future potential event that we're not sure about. Now, I haven't found the mathematical formula that tells us that. People often use statistics. They use Monte Carlo simulations. These are still art forms. They just happen to be art forms with numbers. So as a result of this, I think the important thing is to appreciate that risk management is always going to have an element of art form in it, and an element of subjectivity.
However, that said, what we're trying to do is move from purely subjective to a combination of subjective and quantitative. So for us, as a firm, we don't just look at a qualitative assessment of, say, a risk using likelihood and consequence of wet fingers.
That's one component, but we also take the same risk and map our risk metrics, the key risk indicators. We map our incidents. We map the results of our controls assurance. We map our outstanding audit points. And all of these components are brought together, often called the triangulation of data, to make sure they're consistent.
That is what talks to the likelihood that we miss that budget and by how much. Now at Protecht, we call that risk in motion, which is trying to get that kind of dynamic, integrated view of risk. It does, to a degree, reduce subjectivity, though some is still there.
It reduces the subjectivity because you've got multiple sources of information that together give you that picture. So I do take your point, but the issue is going to be how do you move from purely subjective to slightly more objective? But I don't believe you're ever going to eliminate a bit of subjectivity because we're dealing with a thing called risk.
David Bergmark (DB): Yeah, I absolutely agree, and Hans, you know, I certainly have seen, as an example when doing a risk assessment, some customers putting in min, max and expected loss, running it through a Monte Carlo simulation and then determining that here's our estimated loss for that particular risk. I think there are problems with that, not only the bias that's associated with those min-max values, but also correlation issues across the risk profile as well, which makes it tricky.
And when we think about what gives that view of a risk, I think Dave sort of nailed it in, you know, in the sense that even if you were forming a view on min-max-expected, you're probably going to look at incidents or maybe some data that's out in the market, one of the loss databases. But that's, you know, just another set of data as opposed to all of the other pieces, like internal audit findings that are overdue, control test failures that are within the organisation, indicating that that particular control supporting a key risk is breaking down.
Trying to quantify that and then roll that up into some sort of magical algorithm that enables you to determine how you're mapping against budget would be really quite difficult. But I do appreciate the methodologies that are out there.
I think they've all got a role to play, but I'm really scared to just base it on a single Monte Carlo sim in that sort of count.
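To make the correlation concern concrete, here is a minimal sketch rather than anything described in the webinar. It assumes each risk's min, expected and max loss define a triangular distribution with the expected value as the most likely value. The risk names, loss figures and budget are all hypothetical. The same three risks are simulated twice, once as independent and once as moving together, to show how much the answer depends on that choice.

```python
import random
from math import sqrt

# Hypothetical risks: (min, most likely, max) annual loss. Illustrative only.
RISKS = {
    "payment outage": (10_000, 50_000, 400_000),
    "data breach": (0, 20_000, 900_000),
    "supplier failure": (5_000, 30_000, 250_000),
}


def triangular_ppf(u, low, mode, high):
    """Inverse CDF of a triangular distribution for u in [0, 1]."""
    cut = (mode - low) / (high - low)
    if u < cut:
        return low + sqrt(u * (high - low) * (mode - low))
    return high - sqrt((1 - u) * (high - low) * (high - mode))


def prob_exceeds(budget, correlated, trials=20_000, seed=1):
    """Estimate the probability that total simulated loss exceeds the budget.

    correlated=False draws each risk independently.
    correlated=True drives every risk from one shared draw, so all risks
    have a good or bad year together (an extreme assumption).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        shared = rng.random()
        total = 0.0
        for low, mode, high in RISKS.values():
            u = shared if correlated else rng.random()
            total += triangular_ppf(u, low, mode, high)
        if total > budget:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    budget = 900_000  # hypothetical loss budget
    print("independent:", prob_exceeds(budget, correlated=False))
    print("correlated: ", prob_exceeds(budget, correlated=True))
```

With identical min-max-expected inputs, the fully correlated run breaches the budget noticeably more often than the independent run. A single simulated figure therefore carries the correlation assumption as well as any bias in the inputs.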
DT: Hans just came out with a brilliant comment. "All models are wrong, but some are more useful than others." I've used that so many times, and I think that is the key to this is that all models are wrong, but some are just a bit better than others and more useful than others.
And all we're trying to do is always get to that point where they're valuable.
2. When will the visual mapping/process maps be available?
David Bergmark (DB): That will be available in the first quarter of next year, fiscal year 2022. We're hoping to deliver by the end of March.
56% of respondents in our EMEA Operational Resilience webinar said that they use Excel, Word and PowerPoint to manage Operational Resilience in their organisation.
3. If a financial services firm were to use the new Protecht Ops Resilience modules effectively across the organisation, would you say that the FCA requirements would be met?
David Bergmark (DB): I can tell you that the product team and I have certainly built the initial cut on FCA requirements, so we would hope so. I think, like anything with the regulator, they're looking to the market to evolve over time. But I think for us, as a first cut, it's very close to their paper.
4. Do you have any dashboards/reports as best of breed to monitor ops resilience health, gaps, etc?
David Bergmark (DB): We will. They're being designed at the moment. I think there's a lot of great ideas coming around, you know, in terms of having that summary of all IBS and all of the connected components aggregated into a single dashboard to show the health of those and the health of the connected resources is where we're heading with that, but they'll definitely be coming as part of that build.
5. In a typical organisation, how many IBS could you expect to see? How granular would you expect the process maps to go?
David Bergmark (DB): Yeah, that is something that will evolve as customers start getting into it. The initial feedback we're hearing is somewhere between eight and ten as a starting point. We'll see how we go with that volume.
I mean, I know we've not mapped it out. I sort of had a few more than that, but you could probably consolidate those. So that would be the starting point. In terms of the granularity, yeah, you can go pretty deep if you get carried away.
So there are obviously connections. If we think about a process, there might be a sub-process to that; if there's a resource, there might be a resource dependency on that. And you can keep going. I think you've just got to be careful.
We've got to sort of think about what are we really trying to do? And I think the focus should be initially, get your IBS mapped, get the critical resources mapped first. Think about those and then we can start worrying about going more granular. I think we've all got to learn to walk before we run in this particular area.
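As a rough illustration of "get your IBS mapped, get the critical resources mapped first", the sketch below keeps the map at just three levels: service, process and resource. It is not the Protecht.ERM data model. The services, processes and resources are hypothetical, and the helper functions are made up for this example.

```python
from collections import defaultdict

# Hypothetical map: important business service -> processes -> resources.
IBS_MAP = {
    "Retail payments": {
        "Authorise payment": ["Core banking system", "Card network link"],
        "Settle payment": ["Core banking system", "Settlement team"],
    },
    "Customer onboarding": {
        "Verify identity": ["ID check vendor", "Onboarding team"],
        "Open account": ["Core banking system"],
    },
}


def services_by_resource(ibs_map):
    """Invert the map so each resource lists the services relying on it."""
    usage = defaultdict(set)
    for service, processes in ibs_map.items():
        for resources in processes.values():
            for resource in resources:
                usage[resource].add(service)
    return usage


def shared_resources(ibs_map):
    """Resources that more than one service depends on, sorted by name."""
    usage = services_by_resource(ibs_map)
    return sorted(r for r, services in usage.items() if len(services) > 1)


if __name__ == "__main__":
    for resource in shared_resources(IBS_MAP):
        print(resource, "supports more than one important business service")
```

Even at this coarse level, the inverted view shows which resources sit under several services. That is a reasonable place to focus before adding sub-processes and deeper dependencies.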
63% of respondents in our EMEA Operational Resilience webinar say that Good Risk Management is the main driver for Operational Resilience in their organisation, followed by 50% who say Regulatory Pressure is the main driver.
6. Is there a regulatory interest to request organisations "Assurance" over these Operational Resilience programs?
David Tattam (DT): The answer's yes if you're FCA regulated and I'm sure they're going to set the global standard. So with the FCA, they are requiring an annual self-assessment of your operational resilience program that must be made available to the regulator. Now, I can't quite remember exactly what they say, but that is an annual self-assessment that you have to complete and provide to the regulator.
Now, I can't remember, but that has to independently be checked. But given the general regime in financial services regulation, they generally require an independent review of that every, say, three years. Now I can't remember exactly, don't quote me for the FCA, but the quick answer is yes, and it's going to be done by self-assessment.
38% of respondents from our EMEA Operational Resilience webinar said that the Operational Resilience program in their organisation is in the early stages.
7. Who do you think would be responsible or should be responsible for operational resilience within an organisation?
David Tattam (DT): I'll try to answer that question, and the quick answer is I don't know. Why? Because there's a COO element here, it's very operational. David talked about the mapping of processes from end to end. So there's that. There's a lot of risk management stuff sitting in there as well. So is it with risk management? There's a lot of DRP stuff there as well. So does it sit with DRP, business continuity planning and so on? Now, in a training course I did not long ago on operational resilience, I actually did a poll and asked that, and it was all over the place as to where they think the responsibility sits.
Probably the main one initially was risk management/BCP/DRP with strong involvement from the Chief Operating Officer and so on.
David Bergmark (DB): I'm going to say that that really depends on the size of the organisation and the resources committed to the risk function. Mid-market we're typically seeing it just resting in the risk function. Larger organisations will have specific BCP teams and it's migrating into those.
And now we're starting to see people with resilience titles. So it'll be their area, would be my comment on that.
Watch the full webinar recording here.