Gartner highlights 9 principles to improve cloud resilience

Partial Failures, Degradations of Service and Local Problems Are Typical

Infrastructure & operations (I&O) leaders must implement 9 core principles to bolster the resilience of cloud environments, as advised by Gartner, Inc.

“The cloud is not magically resilient and software bugs, not physical failures, cause almost all cloud outages,” said Chris Saunderson, Senior Director Analyst at Gartner. “In the cloud, outages almost never involve the entire cloud provider, nor are service outages likely to be total. Instead, partial failures, degradations of service, individual service problems or local problems are typical.”

The I&O team needs to understand the typical characteristics of cloud outages and the main reasons behind them. These disruptions are often only partial, and they tend to be intermittent or to involve reduced performance, which makes them harder to detect immediately. Resilience also differs between the services that cloud vendors provide.

“Resilience is not a binary state,” said Saunderson. “No one can claim absolute resilience — not you, and not any cloud provider. Clouds should be as or even more resilient than on-premises infrastructure, but only if the I&O team uses them in a resilient manner.”

Analysts at Gartner recommend I&O leaders focus on 9 main principles to enhance cloud resilience (see Figure 1).

Source: Gartner (November 2023)

  1. Business Alignment: Ensure that resilience requirements are aligned with business needs. Misalignment can lead to insufficient resilience measures or excessive spending.
  2. Risk-Based Approach: Adopt a risk-based approach to resiliency planning that looks beyond just catastrophic events. Put more emphasis on the more common and manageable failures that organizations can mitigate.
  3. Dependency Mapping: Develop dependency graphs that map all system components, databases, cloud services, and integration points so that they can be architected and configured for resilience and included in both reliability and disaster recovery planning.
  4. Continuous Availability: Aim for uninterrupted availability of applications, services, and data, minimizing downtime and impact in failure scenarios.
  5. Resilient-By-Design: Design applications to be naturally resilient. Infrastructure resilience alone is not sufficient to offer the zero-downtime services that end users expect.
  6. DR Automation: Use fully (or nearly fully) automated disaster recovery solutions, either through the organization’s own tools or via third-party cloud-native disaster recovery tools. Automation provides the foundation needed to meet aggressive recovery time objectives (RTOs) and facilitate regular testing.
  7. Resilience Standards: Embrace resilience standards that extend beyond architecture and disaster recovery, focusing on quality, automation, and continuous improvement throughout an application’s lifecycle.
  8. Favour Cloud-Native Solutions: Leverage the extensive range of resilience solutions offered by cloud providers, avoiding the complexity of creating in-house alternatives.
  9. Business Functions Focus: In disaster recovery, consider lightweight or minimal functionality options that maintain critical business operations, rather than exact full-scale replicas.
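
As an illustration of principle 3, a dependency graph can be kept as plain data and queried for impact and recovery order. The following is a minimal Python sketch, not a method described by Gartner: the component names, the `DEPENDS_ON` mapping, and both helper functions are hypothetical and stand in for an organization's real inventory.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical inventory: each component maps to the components it depends on.
DEPENDS_ON = {
    "web-frontend": {"orders-api", "auth-service"},
    "orders-api": {"orders-db", "message-queue"},
    "auth-service": {"identity-db"},
    "orders-db": set(),
    "identity-db": set(),
    "message-queue": set(),
}


def impacted_by(failed, depends_on):
    """Return every component that directly or transitively depends on `failed`."""
    impacted = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for component, deps in depends_on.items():
            if current in deps and component not in impacted:
                impacted.add(component)
                frontier.append(component)
    return impacted


def recovery_order(depends_on):
    """List components so that each appears after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(sorted(impacted_by("orders-db", DEPENDS_ON)))
    print(recovery_order(DEPENDS_ON))
```

A map like this can feed both reliability planning (which services degrade when one dependency has a partial failure) and disaster recovery planning (the order in which components should be restored).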

Explore detailed insights in Gartner’s ‘9 Principles for Improving Cloud Resilience’ and ‘Quick Answer: How Should Executive Leaders Plan for Cloud Outages?’
