Supplier Risk Management of cloud services for business continuity
Supplier Risk Management is the process of evaluating and assessing your suppliers based on the impact they have on your business. If you are highly dependent on your supply chain to be able to deliver to your customers (who isn’t?), managing supplier risk is as important as any of your internal procedures.
The most obvious example of SRM in practice is usually manufacturing. In order to manufacture your product, you need the components or raw materials to make it.
In IT, SRM is often tied to disaster recovery. Sourcing replacement hardware is time-critical so we negotiate relationships with our suppliers for replacement servers in specific time-frames to meet our recovery needs.
As use of cloud computing increases, this risk is decreasing. Cloud-based DR allows resources to be made available on demand. We don’t need like-for-like replacement hardware in hours anymore.
But, the use of cloud computing for production systems raises its own SRM questions – so what are the new issues we need to consider?
Hyperscale cloud services
By “hyperscale cloud services” we are referring to the very largest utility computing providers, such as AWS, Microsoft or Google.
The first issue to note is that only the very largest businesses, or even governments, have the bargaining power to negotiate terms with these providers. Everyone else is forced to accept standard terms and conditions, or simply not use those services. There are some positives to relying on these businesses to deliver your IT services:
- Their scale makes them less likely to fail
- Their Dependency Ratio will be very low (the percentage of their total revenue that you contribute)
- There is generally greater visibility: their financial performance is publicly available for you to assess, and it is easier to find out about any legal cases or judgements filed against them
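To make the Dependency Ratio in the list above concrete, here is a minimal sketch of the arithmetic. The function name and the spend and revenue figures are invented for illustration and are not real financial data.

```python
def dependency_ratio(your_annual_spend: float, supplier_annual_revenue: float) -> float:
    """Percentage of the supplier's total revenue that your spend represents."""
    if supplier_annual_revenue <= 0:
        raise ValueError("supplier_annual_revenue must be positive")
    return 100.0 * your_annual_spend / supplier_annual_revenue


# Illustrative figures only (assumptions, not real company data).
YOUR_SPEND = 250_000

# A small specialist supplier: you are a significant share of their revenue.
small_supplier = dependency_ratio(YOUR_SPEND, 2_000_000)

# A hyperscale provider: your spend is a tiny fraction of their revenue.
hyperscale = dependency_ratio(YOUR_SPEND, 50_000_000_000)

print(f"small supplier: {small_supplier:.4f}%")
print(f"hyperscale:     {hyperscale:.4f}%")
```

A low ratio means the supplier does not depend on you to survive, which is the positive described above. It is also the reason you have little leverage over their terms.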
However, that scale also brings challenges:
- Lack of power in the supplier-customer relationship. If they change terms at short notice, you may not have a chance to make alternative arrangements
- If they suffer an outage there may not be an alternative supplier to use or it may not be feasible to use an alternative supplier
These aren’t new SRM challenges – but perhaps ones that we’ve not had to worry about in IT outside relationships with the major software vendors.
SaaS products (Office365, G Suite)
With Software as a Service models, the cloud vendor takes responsibility further up the stack, including the application itself. When service is good, this means fewer operational concerns for the customer, but if service is down, it can be more difficult to use alternative methods.
Let’s assume you use Office365. If that service becomes unavailable for a period of time, what can you do? If you have files saved locally and the applications installed locally, you can continue to work. Email, however, won’t work. If you need to continue to send and receive email, you will need an additional service to supplement it, such as a Secure Email Gateway with continuity built in.
Infrastructure services such as Azure, Google Cloud Platform or Amazon Web Services present slightly different challenges.
There are two fundamental SRM continuity challenges you must address:
- Short term operational resilience – if there is a problem with the cloud service provider, how do you maintain continuity?
- Long term supplier problems – if you no longer want to use this cloud service provider, how do you move to another service and maintain ownership of your data?
The first challenge is what we are usually talking about in business continuity. If there is a problem with a server instance or a database or some aspect of the computing infrastructure, how do you recover and maintain continuity?
Broadly, there are two ways you can do this: within the same cloud, or outside that cloud (either recovering back to your premises or to another cloud).
Recovering within the same cloud is the simpler option. Hyperscale infrastructure is designed to run de-coupled services, so your storage, processing and other computing building-blocks scale and operate independently. Each provider also operates multiple data centres, or groups of data centres. The specific terminology varies between services (Google uses ‘Regions’ and ‘Zones’, AWS uses ‘Regions’ and ‘Availability Zones’, and Azure uses ‘Regions’, many of which are also split into ‘Availability Zones’) but ultimately they are all a method of building in resiliency while still using the same service. All of the hyperscale clouds make it easy for customers to build this resiliency without leaving their service.
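As a rough sketch of what resiliency within a single cloud looks like in practice, the example below walks an ordered list of deployments (same zone first, then another zone, then another region) and picks the first healthy one. The `Endpoint` class, the region and zone names, the URLs and the simulated health check are all hypothetical. A real deployment would normally rely on the provider's own load balancing and health checks rather than hand-written logic like this.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Endpoint:
    region: str
    zone: str
    url: str


def pick_endpoint(
    endpoints: Iterable[Endpoint],
    is_healthy: Callable[[Endpoint], bool],
) -> Optional[Endpoint]:
    """Return the first healthy endpoint in preference order, or None if all are down."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Hypothetical deployment, listed in order of preference.
ENDPOINTS = [
    Endpoint("region-1", "zone-a", "https://app-1a.example.invalid"),
    Endpoint("region-1", "zone-b", "https://app-1b.example.invalid"),
    Endpoint("region-2", "zone-a", "https://app-2a.example.invalid"),
]


def simulated_health(down: Set[Tuple[str, str]]) -> Callable[[Endpoint], bool]:
    """Build a fake health check that treats the given (region, zone) pairs as failed."""
    return lambda endpoint: (endpoint.region, endpoint.zone) not in down


# A single zone fails: traffic moves to another zone in the same region.
zone_outage = pick_endpoint(ENDPOINTS, simulated_health({("region-1", "zone-a")}))

# A whole region fails: traffic moves to the second region.
region_outage = pick_endpoint(
    ENDPOINTS,
    simulated_health({("region-1", "zone-a"), ("region-1", "zone-b")}),
)

print(zone_outage, region_outage)
```

Note that every fallback here is still inside the same provider. If the provider itself has a problem that spans regions, this approach has nowhere left to go.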
To have resiliency across multiple suppliers is more difficult and more expensive. It’s perfectly possible, and will become cheaper over time. Using containers (for example Docker) and an orchestration tool like Kubernetes, organisations can put a layer over their cloud environments which abstracts away many of the differences between providers. After that, so long as organisations are using cloud-agnostic code, there is little to stop them moving a workload such as MySQL between, say, AWS and Google, although the data itself still has to be replicated or migrated to the new provider.
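One way to read “cloud-agnostic code” is that the application only talks to a small interface you own, with a thin adapter per provider behind it. The sketch below illustrates the idea under that assumption. `BlobStore`, `InMemoryStore`, `save_invoice` and `copy_all` are hypothetical names, and the in-memory class merely stands in for real provider adapters, which would wrap each provider's own SDK.

```python
from typing import Dict, Iterable, Protocol


class BlobStore(Protocol):
    """The only storage operations the application is allowed to rely on."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> Iterable[str]: ...


class InMemoryStore:
    """Stand-in for a provider-specific adapter (hypothetical, for illustration)."""

    def __init__(self, provider: str) -> None:
        self.provider = provider
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def keys(self) -> Iterable[str]:
        return list(self._blobs)


def save_invoice(store: BlobStore, invoice_id: str, body: str) -> None:
    """Application code: depends on the interface, not on any one provider."""
    store.put(f"invoices/{invoice_id}", body.encode("utf-8"))


def copy_all(source: BlobStore, target: BlobStore) -> int:
    """Copy every object from one store to another; returns the number copied."""
    copied = 0
    for key in source.keys():
        target.put(key, source.get(key))
        copied += 1
    return copied


primary = InMemoryStore("provider-a")
secondary = InMemoryStore("provider-b")
save_invoice(primary, "0001", "first invoice")
save_invoice(primary, "0002", "second invoice")
copied = copy_all(primary, secondary)
print(f"copied {copied} objects to {secondary.provider}")
```

The trade-off is that the interface can only expose what every provider supports, so provider-specific features are given up in exchange for portability.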
Long-term supplier independence
If you need to move away from your chosen cloud service provider, how easy is it to do so? Significant data transfer in and out of these services does incur cost. Building applications to make best use of those cloud building blocks also builds lock-in to those services.
Copying the data on its own out of the cloud, either back to your site or to another cloud, is relatively easy; getting the applications working properly on another cloud will take far longer.
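Even the relatively easy part, copying the data out, is worth sizing in advance. The sketch below estimates the transfer charge and the time needed to move a given volume of data. The price per GB, the link speed and the function name are placeholder assumptions, not any provider's actual pricing, and the estimate ignores protocol overhead and throttling.

```python
from typing import Tuple

SECONDS_PER_DAY = 86_400


def estimate_exit(data_tb: float, price_per_gb: float, link_mbps: float) -> Tuple[float, float]:
    """Return (transfer_cost, transfer_days) for copying data_tb terabytes out.

    Uses decimal units: 1 TB = 1,000 GB and 1 GB = 8,000 megabits.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    data_gb = data_tb * 1_000
    cost = data_gb * price_per_gb
    seconds = data_gb * 8_000 / link_mbps
    return cost, seconds / SECONDS_PER_DAY


# Placeholder inputs (assumptions): 50 TB of data, 0.05 currency units per GB,
# and a sustained 1,000 Mbps link.
cost, days = estimate_exit(data_tb=50, price_per_gb=0.05, link_mbps=1_000)
print(f"estimated transfer cost: {cost:,.2f}")
print(f"estimated transfer time: {days:.1f} days")
```

Running the numbers with your own data volume and your provider's published rates shows quickly whether an exit is a weekend job or a multi-week project.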
The answer is to consider these implications at the point of setting up these clouds. Again, the orchestration tools used to address the short-term resiliency issues also help with the long-term problems. The alternative is to have some kind of copy of your data and applications elsewhere, either on premises or with a backup and recovery third party.
SRM is about making sure that you have continuity in your operations. In some cases that might mean decreasing your risk by having several suppliers who can all offer an undifferentiated service.
Using hyperscale cloud providers means being very reliant on a small number of suppliers (or even a single supplier).
It is possible to build continuity and operational resilience within each of the IaaS offerings from the hyperscale cloud providers, but greater resilience comes at a higher cost, and true resilience necessitates the use of multiple suppliers.