What goes up must come down, as the old saying goes, and the cloud is no exception. Like traditional on-premise networks, even the biggest, best-resourced cloud providers can experience unplanned outages, service disruptions and downtime. Earlier this year Amazon Web Services suffered a significant outage at one of its data centers in the US, affecting major AWS customers such as Netflix, Tinder and IMDb as well as a number of AWS services such as CloudWatch and Cognito.
So if you have business-critical applications running in either a public or private cloud environment, how should you prepare to weather an outage? What contingency plans and procedures should you put in place to help get you through, and minimise the potential disruption to your business when the cloud falls? Here are my key tips:
Watch the skies: Cloud monitoring tools track the performance of your cloud applications, checking that everything’s functioning as it should, and that it’s accessible from where it should be. They can also alert you early to any emerging performance problems, enabling you to stay on top of potential issues and reducing the chance of your networks and applications being taken down unexpectedly.
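To make the idea of early alerting concrete, here is a minimal Python sketch of a health check that raises an alert only after several consecutive failed probes, so that a single blip does not page anyone. The class name HealthMonitor, the http_probe helper and the threshold of three failures are illustrative assumptions rather than part of any particular monitoring product.

```python
import http.client
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts and HTTP error statuses all count as a failed probe.
        return False


class HealthMonitor:
    """Hypothetical helper: counts consecutive probe failures."""

    def __init__(self, probe: Callable[[], bool], failure_threshold: int = 3):
        self.probe = probe
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def check_once(self) -> bool:
        """Run one probe; return True only when the failure threshold is first reached."""
        if self.probe():
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures == self.failure_threshold
```

In practice you might build the monitor with something like HealthMonitor(lambda: http_probe("https://app.example.com/health")), where the URL is a placeholder, call check_once() on a schedule, and notify your on-call team whenever it returns True. Running the same probe from more than one location also helps confirm that the application is reachable from where it should be.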
Backup: Just as you would regularly back up data and resources that are stored on-premise, you should back up what you store in the cloud to ensure that you don’t lose valuable data. Organizations that use multiple cloud providers may be able to back up their data between the different service providers, giving a valuable failover capability.
High availability: High-availability cloud services are designed to all but eliminate downtime, typically by failing over automatically so that any interruption is brief enough for users not to notice. They provide users with the always-on service they expect and help protect against outages and data loss, although service providers typically charge a premium for these types of service.
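Availability targets are commonly quoted as percentages, and it can help to translate them into the downtime they actually allow. The short Python sketch below does that arithmetic. The function name and the example targets are illustrative assumptions, not figures from any specific provider's service agreement.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, that an availability target permits."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare a few example targets side by side.
    for target in (99.0, 99.9, 99.99):
        print(f"{target}% allows about {allowed_downtime_minutes(target):.1f} minutes per year")
```

A 99.9% target, for instance, still permits roughly 525.6 minutes (almost nine hours) of downtime a year, while 99.99% permits roughly 52.6 minutes, which is one reason the higher tiers tend to cost more.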
Redundancy: This means that if one server goes down, another one takes over so your end users don’t even notice the problem. AWS, for example, has multiple data centers in different regions, so you can run one server in one data center and another in a different one to provide redundancy. If your cloud provider does not offer multiple data centers, you could, again, consider using more than one cloud provider, and distribute your application workloads between them.
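The takeover behaviour described above can be sketched in a few lines of Python: try each redundant endpoint in order and return the first successful response. This is only an illustration under stated assumptions. The function call_with_failover, the AllEndpointsFailed exception and the endpoint names are hypothetical, and in production this job is usually done by a load balancer or DNS failover rather than by application code.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllEndpointsFailed(Exception):
    """Raised when every redundant endpoint has been tried without success."""


def call_with_failover(endpoints: Sequence[str], request: Callable[[str], T]) -> T:
    """Call request(endpoint) on each endpoint in order; return the first success."""
    errors = []
    for endpoint in endpoints:
        try:
            return request(endpoint)
        except Exception as exc:
            # Remember why this endpoint failed, then fall through to the next one.
            errors.append((endpoint, exc))
    raise AllEndpointsFailed(errors)
```

Here endpoints might hold the addresses of the same application running in two different data centers, or with two different providers, and request is whatever function you already use to call the application. Keeping the list of failures makes it easier to tell a single-site outage from a wider problem.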
Visibility: It’s also critical to have good visibility across your network infrastructure (both cloud and on-premise), as well as the ability to automate processes that involve changing security policies for key business applications that run across the hybrid environment. This ensures that you can maintain security and compliance, while handling all the complex changes and tasks involved in managing a hybrid environment, as we detailed in this blog post.
These points should all be considered as key elements of your cloud IT strategy to help reduce the risk of outages, and maintain continuity when – and it is when, not if – unexpected downtime happens.
Remember that no single server, network, data center, or cloud service can ever be 100% reliable, so build your infrastructure with this in mind. However, with these tips, no-one could accuse you of having your head in the clouds when it comes to keeping key business applications running.