There are many reasons to move your enterprise systems to the cloud: cost optimization, reliability, reduced or eliminated operational overhead, security, scalability and availability – to name a few. There is plenty to say about the pros of cloud services, but this post focuses on the last three: security, scalability and availability.
Some of the cons are vendor lock-in, limited control over the infrastructure and the need for a stable, fast internet connection to reach the different systems. These arguments are valid, but the pros outweigh them, and some of the cons look different from another point of view. We will see that losing control over the infrastructure is not necessarily a drawback, and while vendor lock-in does exist, a similar dependency on your chosen vendors is already there with the on-premises solution you would be migrating to the cloud.
Security is a top priority for every cloud service provider, because without it, it would be impossible to build trust with clients and users. It is an extremely broad topic that covers safety as well, not just security. It includes physical security and its protocols: who can enter a data center, how frequently security reviews are held, and what kind of security equipment (cameras, alarms) is installed on-site. Other aspects are the standards the infrastructure complies with, data protection (both encryption at rest and encryption in transit), and protection against denial-of-service attacks. These are all factors that we, as customers, do not and cannot influence.
Updating software is also extremely important. The database and web servers get the latest security patches without any effort on our part, and the same is true for virtual machines. Services and systems are monitored by the cloud service provider, so if there is any suspicion of an attack or breach, the provider will most likely notice it before we do. Third-party companies also help by finding vulnerabilities and reporting them responsibly, so that they can be fixed before a malicious attacker can exploit them.
The sad truth is that data breaches do happen and make the news, but this alone should not be interpreted as a problem. The situation is similar to plane crashes: they are newsworthy precisely because they are rare, and in the meantime flying is still considered the safest way to travel. The analogy also holds in that both industries have experts who investigate and fix the issues. Microsoft, for example, has a dedicated team that continuously improves the security of Azure. Matching that level of protection in your own on-premises environment would take a great deal of expertise and human resources.
According to an urban legend, another big cloud provider, Amazon Web Services (Amazon being just a webshop back then), had to make sure its systems would survive the Christmas rush, so it expanded its server capacity. For the rest of the year they were left with loads of unused resources, which they started renting out, launching their own cloud services. Unfortunately, this story is not true, but many people know it, most likely because it could plausibly have happened that way.
Ideally, you have to size your system for peak traffic, not for the average. The fewer peak periods you have, the less cost-effective it is to host the system on-premises, because it does not need all of those resources most of the time. Cloud providers solve this with automation and scalability: the system scales out during busy periods and scales back in during quiet ones. You do not need to watch resource metrics and add new servers manually, as everything happens automatically if configured properly. All cloud providers, including Azure, offer scalable services, e.g. Azure App Service, Azure Kubernetes Service or Azure Functions.
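To make the idea concrete, here is a minimal Python sketch of the decision loop an autoscaler keeps repeating. It is not Azure's actual API; in practice you express the same thresholds declaratively as autoscale rules on App Service or as a Horizontal Pod Autoscaler in AKS. The get_average_cpu and set_instance_count helpers are hypothetical stand-ins for the platform's metric and scaling endpoints.

```python
import random
import time

# Hypothetical stand-ins for the platform's metric and scaling endpoints.
def get_average_cpu() -> float:
    """Simulated average CPU utilization (percent) across the current instances."""
    return random.uniform(10, 90)

def set_instance_count(count: int) -> None:
    """In a real setup this would call the cloud provider's API; here we just log."""
    print(f"scaling to {count} instance(s)")

SCALE_OUT_AT = 70.0            # add an instance when average CPU rises above this
SCALE_IN_AT = 30.0             # remove one when average CPU drops below this
MIN_INSTANCES, MAX_INSTANCES = 2, 10

def autoscale(current: int = MIN_INSTANCES, iterations: int = 10) -> None:
    """Evaluate the scaling rule a few times; a real rule runs on a schedule."""
    for _ in range(iterations):
        cpu = get_average_cpu()
        if cpu > SCALE_OUT_AT and current < MAX_INSTANCES:
            current += 1                      # busy period: scale out
            set_instance_count(current)
        elif cpu < SCALE_IN_AT and current > MIN_INSTANCES:
            current -= 1                      # quiet period: scale back in
            set_instance_count(current)
        time.sleep(1)                         # a real rule evaluates every few minutes

if __name__ == "__main__":
    autoscale()
```

The point is that you never write this loop yourself in the cloud: you only declare the thresholds and the minimum and maximum instance counts, and the platform evaluates them for you.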
Netflix, one of the biggest streaming providers today, has a unique way of testing the availability of its systems, as its main objective is that users can watch movies and series whenever they want to, maximizing their satisfaction. The tool is called Chaos Monkey, and it got its name from its logic: it simulates monkeys on the loose in the server room, pulling out random cables and making components unavailable. The software turns off services randomly, so that the engineers can check whether the system is still functional. For Netflix, writing fault-tolerant code is therefore not an option but a must.
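Chaos Monkey itself is Netflix's own tool, but the core idea fits in a few lines. The toy Python sketch below (the service names are made up, this is not Netflix's code) randomly "kills" one running service, which is exactly the kind of disruption the rest of the system has to survive.

```python
import random

# Toy model of a deployment: service name -> is it currently running?
# The service names are made up for illustration.
services = {
    "catalog": True,
    "recommendations": True,
    "playback": True,
    "billing": True,
}

def unleash_the_monkey() -> str:
    """Randomly 'kill' one running service, like pulling a random cable in the server room."""
    running = [name for name, up in services.items() if up]
    victim = random.choice(running)
    services[victim] = False
    return victim

if __name__ == "__main__":
    victim = unleash_the_monkey()
    print(f"The monkey took down '{victim}'. Can users still watch their movies?")
```

If a feature still works while one of its dependencies is down, the fault tolerance is doing its job; if not, the engineers have found a weakness before a real outage did.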
Cloud providers operate multiple regions nowadays, and regions are often divided into interconnected availability zones. Data and services are replicated to another zone or region, so that if one becomes unavailable because of a natural disaster, a power outage or a similar event, another can take over its place and the system stays available.
Despite all these measures, it does happen that a region or zone becomes unavailable. Like the vulnerabilities mentioned earlier, these are rare occasions, and they often turn up in the media, sometimes accompanied by strange symptoms: in September 2020, for example, robot vacuum cleaners stopped working due to an Amazon outage in the US.
As mentioned at the start of the post, there is plenty to write about when it comes to the pros of cloud computing, so moving your systems there is a no-brainer. If you want to read more about how to optimize your deployment to eliminate downtime and maximize availability, read the post about Azure Kubernetes Service on our blog!