APRA’s July 2024 date for entities to define critical operations and identify material service providers is rapidly approaching, with defining tolerance levels set to follow. Protecht’s CPS 230: Road to readiness webinar, led by Protecht's Michael Howell and Hela Ebrahimi, considered the practical steps needed to navigate these requirements successfully.
It was great to see such a strong turnout and so much engagement from our audience. We received a large number of questions, along with lots of interesting comments about the challenges our customers are facing. While some questions cross multiple topics, we’ve broken them down into the five rough areas that we covered in the webinar:
- Identifying critical operations
- Setting tolerance levels
- Material service providers
- Process mapping
- Governance and general questions
In case you missed the live session, or want to share with colleagues, the webinar is available to watch on-demand:
Polls
How far are you into your CPS 230 journey?
At the beginning of the webinar, we quoted a recent statement from APRA’s Deputy Chair, Margaret Cole – essentially that entities that have not already started are headed towards noncompliance. While I don’t think it is unreasonable to prepare a list of critical operations and material service providers in the next 3 months, there is a lot of other preparatory work to be done. With almost a quarter of respondents still in those early phases, now is the time to kick into gear.
As expected, the majority are on track for APRA’s first milestone. A small but notable portion are already powering ahead, and moving towards setting of tolerance levels or beyond. We certainly don’t think it is too early to be talking about tolerances because…
Which will be the most challenging part of CPS 230 in the lead up to July 2025?
…defining tolerance levels attracted the highest proportion of responses as the most challenging element of CPS 230. Given our own discussions on the worked examples while preparing for this webinar, this was not unexpected! I expect the setting of tolerance levels will generate significant debate in organisations. It’s also not surprising that managing material service providers is the runner-up. The arm’s length nature of those service providers, and the assurance required over their controls and their ability to support your critical operations, make them challenging to manage efficiently while maintaining visibility over your risk profile.
How will Protecht support CPS 230?
Protecht are here to be your partners as you travel the road to readiness for CPS 230. We do this through consulting, training and systems. Our Operational Resilience & Business Continuity Management module and our Vendor Risk Management module support the business continuity and material service provider requirements of CPS 230, and these are integrated with our core ERM platform, in alignment with APRA’s view of an integrated risk profile.
Questions
Identifying critical operations
Q1: In response to the Generic Bank Co worked example, which of the following (complaints management, home loan origination, fraud management) would count as critical operations?
Q2: We are having some debate over the definition of ‘custody’ and ‘settlements’, as APRA have given no definition. What is your view?
Q3: Would APRA be releasing more material to guide on the process to identify Critical Operations?
Q4: What should be the approach to defining end-to-end processes, should this be process or product focused?
Q5: If complaints management is a critical process would that make AFCA a material service provider?
Q6: We are considering Claims as a Critical Operation, and any subprocess that touches or impacts customers to be Critical Services under Claims. Is this a correct approach?
Q7: What’s your view on payroll being a critical operation?
Q8: If the impact of disruption is felt only after a week, do I still classify that as critical – how do we set the limit?
Q9: Is an outsourced service provider still a material provider if we have the ability to insource in house if the need arises?
Setting tolerance levels
Q10: Don't costs/resources need to be considered when determining risk appetite, which implies they are a factor in determining tolerance levels?
Q11: Should you set tolerances before identifying materiality of service providers?
Q12: How accurate/granular do you need to be with setting the tolerance levels?
Q13: How can you define ‘intolerable’ tolerance levels, given the subjectivity?
Material service providers
Q14: How is the material service provider framework related or different from third party risk management?
Q15: How can you adapt CPS 230 processes to deal with what your material service providers are willing/able to provide?
Q16: Fourth party assessments, how much/how far will you need to go down?
Q17: Does “assurance of material service providers” mean assurance of their compliance and adherence to the agreed SLA to ensure they are meeting the required compliance?
Q18: Can you assess a cohort of similar service providers on the basis of a service management plan, rather than the individual providers?
Process mapping
Q19: Can you please comment on key data?
Q20: In the CPS 230 paper, you presented a model which links risks with processes. In this presentation you have linked risks to resources. Which one is correct?
Q21: Do you see clients mapping their processes in a dedicated business process mapping tool, and then just replicating the critical operation in Protecht ERM?
Q22: Can we rely on detailed Standard Operating Procedures instead of process maps?
Q23: Do you have any advice to what level of process maps entities should go to?
Governance and general questions
Q24: What are your definitions in relation to roles of risk owner, action owner, etc?
Q25: Does the reference to Senior Managers in the standard include Division Directors or Chief Operating Officers?
Q26: How would you define a ‘credible’ business continuity plan?
Identifying critical operations
Q1: In response to the Generic Bank Co worked example, which of the following (complaints management, home loan origination, fraud management) would count as critical operations?
Given this was a worked example, we made a lot of assumptions. The key to assessing critical operations is whether disruption would result in material adverse impact to customers, or affect your role in the financial system. Here is some of our initial justification. You will, of course, have much more context and data to support the way you classify your critical operations.
You might include complaints management as part of customer enquiries. We separated it on the basis that the tolerance level might be different, and that the two likely use different sets of resources. Different tolerance levels mean we can apply different recovery requirements for the underlying resources.
Home loan origination was on the basis that if someone was in the process of purchasing a property, the failure to settle in a timely manner may result in material adverse impact.
Fraud management (which was not in our example) may depend on context. Fraud management may be embedded as a sub-process within other critical operations. If considered in a broader context, such as preventing external fraud and scams that would impact on customers, is that a critical operation? I’m going to be honest and say I’m not sure whether I’d classify it as one or not. On the one hand, it is not an end-to-end process. But if it fails – it can cause material adverse impact on customers.
Q2: We are having some debate over the definition of ‘custody’ and ‘settlements’, as APRA have given no definition. What is your view?
I’d assume custody covers the safekeeping of financial assets, which might be considered an ongoing service. As a critical operation, it can include the related activities that ensure the assets held are managed correctly by the custodian, which might include keeping appropriate records, acting on and providing reporting, administration, and dividend collection.
Settlement may include the processes that ensure trades are executed correctly, verified and recorded appropriately.
APRA may provide additional guidance, but if they do, I doubt they will be prescriptive. If you have different types of custodian services or types of settlements, you may want to separate them if having different tolerance levels is appropriate. Go back to the core definitions of what might cause material adverse impact, or affect your role in the financial system, and go as granular as makes sense for your operations and customer demographics.
Q3: Would APRA be releasing more material to guide on the process to identify Critical Operations?
CPS 230 Guidance is currently in draft, so they may provide more information when that draft is finalised, based on feedback from industry. Given that they may have an interim list of critical operations from entities in the middle of this year, it may prompt them to communicate with the sector on any particular gaps or misalignment they have observed in work to date.
Q4: What should be the approach to defining end-to-end processes, should this be process or product focused?
Start with the outcome to the customer, or the impact to the financial system. Then consider the process or collection of processes that deliver that outcome – that should be the level of critical operation (assuming it is material enough). If the boundaries don’t start at the beginning or finish at the end, you need to think wider. For example, I would suggest ‘Initial lodgement of an insurance claim’ is not a critical operation. The customer does not want to lodge an insurance claim – they want it managed through to finalisation of the claim. Initial lodgement is one step in managing that end-to-end claim.
Q5: If complaints management is a critical process would that make AFCA a material service provider?
Great question! If AFCA were not available for an indefinite period of time, could you complete the end-to-end process of complaints management? I’m going to lean towards yes, you could. Even if you were in the middle of resolving a complaint via AFCA, you could resume handling it yourself.
We would not expect AFCA to be a material service provider.
Q6: We are considering Claims as a Critical Operation, and any subprocess that touches or impacts customers to be Critical Services under Claims. Is this a correct approach?
I’ll avoid the word ‘correct’, but that approach sounds valid. I would say that if you have defined a critical operation, any sub-processes under it should not be considered critical operations themselves; they simply form part of the critical operation.
Q7: What’s your view on payroll being a critical operation?
This is the go-to example when comparing traditional business continuity with operational resilience. Business continuity traditionally takes the internal view – what is the impact to the organisation? Operational Resilience, and critical operations under CPS 230, consider the external lens.
On the surface, disruption to payroll does not have a material adverse impact on customers, and is not a process that directly relates to your role in the financial system. However, what happens to your organisation if you can’t process payroll for a week? A month? How will your staff respond? Internally it might result in major morale problems, walkouts and resignation of lots of staff, and related legal issues. On the other hand, it may end up undermining your ability to operate at all, in which case you might justify it as a critical operation.
Typically, I would assume that it is not a critical operation, but you’ll need to make your own assessment. Note that nothing in CPS 230 prevents you from including internal processes that are not classified as critical operations in your business continuity plans.
Q8: If the impact of disruption is felt only after a week, do I still classify that as critical – how do we set the limit?
While every operation serves a purpose, some may not ultimately be critical – at least not to the extent that they cause material adverse impact. The indefinite disruption to some services may be really annoying to customers, but they might have other alternatives, or it simply isn’t important (we used rewards cards in our example).
You may choose to introduce a time-based component or threshold when identifying what makes it into your critical operations register. Just make sure there is reasonable justification.
Q9: Is an outsourced service provider still a material provider if we have the ability to insource in house if the need arises?
Yes. Material service providers are those that support your critical operations/critical functions. You need to set the tolerance levels for the critical operations, and then ensure you can meet those tolerance levels, regardless of whether service providers are involved. You can then develop scenarios to test your ability to meet those tolerance levels. If one of those scenarios includes failure of the service provider, this can include confirming that you can insource the operation in a timely manner and resume it yourself.
Setting tolerance levels
Q10: Don't costs/resources need to be considered when determining risk appetite, which implies they are a factor in determining tolerance levels?
While APRA have used the term ‘risk appetite for disruption’ in their draft guidance, you need to consider the key defining factors when setting tolerance levels:
- The material adverse impact to your customers
- Your role in the financial system
I would suggest drafting or establishing your tolerance levels based on these key criteria. You might then identify a gap that requires additional investment in order to meet those tolerance levels during disruption. At this point, cost COULD play a part, but I expect you would have to demonstrate how that additional cost is such a burden that it undermines your ability to be a going concern.
Q11: Should you set tolerances before identifying materiality of service providers?
Firstly, I made an erroneous assumption when answering this during the webinar (the asker sent a clarifying question); I initially interpreted the question as “Shouldn’t I know who my material service providers are in order to know what tolerance levels I can set?” My original answer stands – you should set your tolerance levels based on material adverse impact to your customers, and then identify whether you, or your material service providers, have the capability to meet those tolerance levels. If there is a gap, you need to close it, rather than adjust your tolerance level to suit your existing capability.
The original intent of the question was a little simpler – what order should you do these in? From an APRA timeline expectation, the list of material service providers comes first.
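To make the gap check described above concrete, here is a minimal Python sketch. It is an illustration only, not something shown in the webinar: the class, the field names and every figure in it are hypothetical. It assumes tolerance is expressed as a maximum outage in hours and that you have a tested recovery time to compare against it.

```python
from dataclasses import dataclass


@dataclass
class CriticalOperation:
    name: str
    tolerance_hours: float  # set from material adverse impact to customers
    recovery_hours: float   # tested capability (yours or your provider's)


def capability_gaps(operations):
    """Return (name, shortfall in hours) where capability falls short of tolerance."""
    return [
        (op.name, op.recovery_hours - op.tolerance_hours)
        for op in operations
        if op.recovery_hours > op.tolerance_hours
    ]


# Hypothetical figures for illustration only.
operations = [
    CriticalOperation("Payments", tolerance_hours=4, recovery_hours=6),
    CriticalOperation("Customer enquiries", tolerance_hours=24, recovery_hours=12),
]

gaps = capability_gaps(operations)
print(gaps)  # [('Payments', 2)]
```

Note the direction of the comparison: the tolerance is the fixed input, and a shortfall is something to close through investment or redesign, not a reason to relax the tolerance.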
Q12: How accurate/granular do you need to be with setting the tolerance levels - are bands such as 4-8 hours, 8-24 hours, etc sufficient?
My personal experience with business continuity has been to use wider types of ‘bandings’ when doing business impact analysis on a critical operation / process. However, I’ve used these to then set the Maximum Allowable Outage (or whatever your preferred term), which is usually a single figure. E.g. 4 hours.
The relevant wording of the standard for tolerance levels is “maximum period of time… maximum extent of data loss… minimum service levels”. Given these are maximums or minimums, my view is that tolerance levels should be set at a specific value.
This aligns with the approach we have taken in our Operational Resilience module – ranges for assessing different types of impact but defining a specific value for the tolerance level.
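The distinction between bands for impact assessment and a single value for the tolerance level can be sketched in Python as follows. This is a hypothetical illustration and not how the Operational Resilience module is implemented. The band boundaries, the choice of hours as the unit for data loss, the percentage service level and all names are assumptions.

```python
from dataclasses import dataclass

# Hypothetical impact bands: (upper bound in hours, label).
IMPACT_BANDS = [(4, "0-4 hours"), (8, "4-8 hours"), (24, "8-24 hours")]


def band_for(hours):
    """Return the label of the band a disruption duration falls into."""
    for upper, label in IMPACT_BANDS:
        if hours <= upper:
            return label
    return "over 24 hours"


@dataclass
class ToleranceLevel:
    max_outage_hours: float       # maximum period of time
    max_data_loss_hours: float    # maximum extent of data loss (assumed unit)
    min_service_level_pct: float  # minimum service level (assumed unit)

    def breached_by(self, outage_hours, data_loss_hours, service_level_pct):
        """True if any single dimension falls outside tolerance."""
        return (
            outage_hours > self.max_outage_hours
            or data_loss_hours > self.max_data_loss_hours
            or service_level_pct < self.min_service_level_pct
        )


# Bands help during impact assessment; the tolerance itself is one figure per dimension.
tolerance = ToleranceLevel(max_outage_hours=4, max_data_loss_hours=1, min_service_level_pct=80)

print(band_for(6))                         # 4-8 hours
print(tolerance.breached_by(3, 0.5, 90))   # False
print(tolerance.breached_by(5, 0.5, 90))   # True
```

A band such as "4-8 hours" cannot answer whether a 6-hour outage was within tolerance, which is why the tolerance is held as a specific value.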
Q13: How can you define ‘intolerable’ tolerance levels, given the subjectivity?
It's worth noting that APRA do not use the term ‘intolerable’ in the standard or the guidance; they refer to material adverse impact. We discussed the concept of ‘intolerable harm’ in the webinar, which comes from very similar regulations in the UK. Those concepts include assessing:
- Whether harm is more than simply annoying
- Whether the customer can be returned to their previous position
- Whether there are non-financial impacts that are difficult to assess or recover from
This is a little more tangible than APRA’s guidance to date, but is just one way to consider material adverse impact. We would suggest that, when developing your framework for setting tolerance levels, you document and agree on what represents material adverse impact.
Material service providers
Q14: How is the material service provider framework related or different from third party risk management?
This is likely in reference to how we framed one of our poll questions. We suggest there is no difference, and that you should consider material service providers as part of your relevant frameworks – whether you call them third party management, third party risk management, vendor management framework, service provider framework etc. Those frameworks should then specify how you identify which of those service providers are material. This is consistent with APRA’s guidance. We were mainly emphasising material service providers and the effort related to the assurance.
Q15: How can you adapt CPS 230 processes to deal with what your material service providers are willing/able to provide?
It’s an interesting dilemma. The biggest material service providers may try to dictate terms – but ultimately it is up to regulated entities to meet the requirements of CPS 230. Given some material service providers serve many in the industry, they are likely to be working towards standardisation from their perspective – potentially giving you data or assurances in a format that is undesirable to you. The implications of this challenge show up in the delayed transitional arrangements until June 2026 to review all contracts with material service providers.
It might be worth noting that some material service providers, particularly in the data sector, may be classified as critical infrastructure under the Security of Critical Infrastructure Rules. While these are not identical to CPS 230, they do place the onus on those providers to have a risk management plan, including over their supply chains. The implementation of those plans may also provide artefacts that will help with the assurance clauses of CPS 230.
Q16: Fourth party assessments, how much/how far will you need to go down?
There is no clear answer here. The most common approach is to ensure that your material service providers have their own service provider frameworks that you can obtain assurance over. They may then require their own third parties (your fourth parties) to have similar frameworks in place. Getting detail of the individual service providers further down the chain may be challenging, for a variety of reasons.
Depending on the maturity of your business continuity testing program, you may choose to involve your third parties in scenario testing, which may also require involvement from fourth parties. Confidentiality is likely to be the biggest barrier, but this type of cross-collaboration may break down barriers when looking across the extended enterprise.
Q17: Does “assurance of material service providers” mean assurance of their compliance and adherence to the agreed SLA to ensure they are meeting the required compliance?
In short, yes, though it also goes beyond that. You absolutely want them to provide assurance that they are meeting their contractual obligations. I would extend that to include risk metrics and controls assurance that help you evaluate the level of risk your service providers expose you to.
Q18: Can you assess a cohort of similar service providers on the basis of a service management plan, rather than the individual providers?
This sounds like an interesting approach – though I may be making some assumptions about what might be in a service management plan in this context. If I’ve understood the approach, it’s to group similar service providers together, and set some tolerances around them as a group. I assume this avoids the rigorous material service provider clauses being required for each individual service provider, but sets triggers for action if some individual service providers fall short.
I’m sure there is more to explore here – and I’m equally sure APRA will weigh in on innovative approaches to ensure they meet their expectations while not over-burdening regulated entities.
Process mapping
Q19: Can you please comment on key data?
I misinterpreted this question when I answered this in the webinar. Let’s look at section 27(b) of the standard (emphasis added):
As part of maintaining a comprehensive assessment of its operational risk profile, an APRA-regulated entity must identify and document the processes and resources needed to deliver critical operations, including people, technology, information, facilities and service providers, the interdependencies across them, and the associated risks, obligations, key data and controls.
I answered the question based on information, which is one of the resources needed to deliver the critical operations. Using our critical operation example in the webinar, you can’t process a payment at the merchant if you don’t have access to the customer’s balance at the time of payment.
On the other hand, we see the key data as the information that supports your operational risk management processes. This can include risk metrics and key risk indicators, incident data, attestations linked to obligations, controls assurance outcomes, or related audit findings. At Protecht, we pull this together in aggregated reporting that we call Risk in Motion.
Q20: In the CPS 230 paper, you presented a model which links risks with processes. In this presentation you have linked risks to resources. Which one is correct?
Good pick up! Either is ‘correct’. APRA aren’t prescriptive about how you document these relationships. We have customers who have different approaches, and we have adapted our implementation of our Operational Resilience and BCP module to individual customers when necessary. My personal preference is to map to resources, and this is our default configuration in our system.
Q21: Do you see clients mapping their processes in a dedicated business process mapping tool, and then just replicating the critical operation in Protecht ERM?
Our Process Mapping tool allows for decision points or branching. Multiple flows can also be captured in a single map linked to a critical operation. For ‘customer enquiries’ for example, you might map multiple points.
For some critical operations, particularly where you expect or identify vulnerabilities, you may need to drill down into specific details. If you are using more complex modelling techniques (e.g. the Business Process Model and Notation standard, BPMN) at the more granular level, these may need to be completed in a dedicated tool. The outputs can be uploaded to your critical operations register if needed.
The major benefit of using Protecht’s process mapping tool is that, once you’ve associated resources with your processes, that linkage is already done for you if you re-use that process in another critical operation. When it comes time to test your scenarios (which affect specific resources), you can easily identify which critical operations would be affected. Similarly, when updating information about a resource (vulnerability assessments, adding or removing resources), it will be replicated in all of the processes in which that resource is used.
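The idea of tracing a disrupted resource back to the critical operations that depend on it can be sketched in a few lines of Python. This is a generic illustration of the linkage, not Protecht’s implementation. The process names, resource names and mappings are hypothetical.

```python
# Hypothetical mapping: process -> resources it depends on.
process_resources = {
    "Authorise card payment": {"Core banking system", "Card scheme gateway"},
    "Answer phone enquiry": {"Contact centre staff", "Core banking system"},
    "Respond to complaint": {"Complaints team", "Case management system"},
}

# Hypothetical mapping: critical operation -> processes that deliver it.
operation_processes = {
    "Payments": ["Authorise card payment"],
    "Customer enquiries": ["Answer phone enquiry"],
    "Complaints management": ["Respond to complaint"],
}


def affected_operations(disrupted_resources):
    """Return critical operations with at least one process using a disrupted resource."""
    disrupted = set(disrupted_resources)
    return sorted(
        operation
        for operation, processes in operation_processes.items()
        if any(process_resources[p] & disrupted for p in processes)
    )


# Scenario: the core banking system is unavailable.
print(affected_operations({"Core banking system"}))
# ['Customer enquiries', 'Payments']
```

Because each resource is recorded once against a process, adding that process to another critical operation automatically carries its resource dependencies with it.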
Q22: Can we rely on detailed Standard Operating Procedures instead of process maps?
What APRA expects is that you understand your operations. While process mapping is highly encouraged, it is not an explicit requirement. Standard Operating Procedures may include all of the detail required, though I suspect they are not as easy to digest or interpret when considering end-to-end processes.
One of the benefits of process mapping is visualisation, which makes it easier to communicate how processes are delivered, and easier to flag the individual resources, including those that are provided by service providers.
Q23: Do you have any advice to what level of process maps entities should go to?
The simplest answer is: at the highest level possible that provides both value and assurance. You need to understand the processes well enough that you can achieve resilience. This means having enough detail to identify potential vulnerabilities. Particularly if you are starting out, we recommend the following:
- Identify the critical operations
- Identify the processes that deliver those critical operations
- Identify the resources needed for those processes
Once you get to this level, you may need to triage which processes may require more in-depth process mapping. This may be systems that are complex, or where processes require data to be pulled from multiple sources.
Governance and general questions
Q24: What are your definitions in relation to roles of risk owner, action owner, etc?
A simple rule we use at Protecht is ‘Whoever owns the objective owns the risk’. It’s not always so simple and sometimes they cross boundaries, but this is a good starting point. From a critical operations perspective, risks related to that critical operation might be owned by the person accountable for that end-to-end process.
More practically, if someone is a risk owner, they should have adequate resources and decision-making capability to manage that risk effectively, such as proposing or implementing new controls. Making someone a risk owner who has no budget or authority is a waste of time. It’s similar for action owners.
Some risks cross boundaries, such as enterprise risks managed by a central team (e.g. IT or People & Culture) that impact all departments or multiple objectives. In those cases it is essential that information about risks and how they are managed is communicated across those teams. Protecht achieves this through its Risk in Motion dashboard.
Q25: Does the reference to Senior Managers in the standard include Division Directors or Chief Operating Officers?
Typically, Senior Managers will be Executives, though titles are not the defining factor. Some of the criteria that define senior managers include:
- Ability to make or participate in decisions that affect the whole or a substantial part of the organisation
- Having capacity to significantly affect the financial standing of the organisation
- Affecting the whole organisation through responsibilities related to implementing board-approved policies and strategies, implementing risk management frameworks, and monitoring the adequacy of risk management.
You’ll have to make your own assessment, but I’d assume those titles are likely to fit these criteria.
Q26: How would you define a ‘credible’ business continuity plan?
I would say it is one where you have reasonable assurance that it will help you maintain your critical operations within tolerance levels during disruption, and otherwise meets the objectives of that business continuity plan. That reasonable assurance would typically be obtained through running scenarios, tests and exercises.
My view is that APRA want to ensure that the outcomes you’ve documented in your business continuity plan are actually achievable, not just aspirational.
Next steps for your organisation
Find out more about Protecht and CPS 230: