Resilient Architecture Concepts


  • One benefit of the cloud is the potential to provide services that are resilient to failures at different levels
    • e.g., components, servers, local networks, sites, datacenters, and wide area networks
  • The cloud service provider (CSP) uses a virtualization layer to ensure that compute, storage, and network provisioning meets the availability criteria set out in its SLA

High Availability

High availability (HA) is a metric that defines how closely systems approach the goal of providing data availability 100% of the time while maintaining a high level of system performance.

  • The CSP uses redundancy, making multiple disk controllers and storage devices available to a pool of storage resources
    • data may be replicated between pools or groups
    • each pool is supported by separate hardware resources
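
An availability target is easier to reason about as the downtime it permits. The following is a minimal sketch of that arithmetic; the function name and the list of targets are illustrative assumptions, not values from any particular SLA.

```python
# Convert an availability target into the downtime it permits per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum downtime per year, in minutes, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Hypothetical targets, chosen only to show how quickly the budget shrinks.
for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes/year")
```

Under these assumptions, a 99.9% target leaves a budget of roughly 526 minutes (under nine hours) of downtime per year, and each additional "nine" cuts that budget by a factor of ten.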

Replication

Replication is the automatic copying of data between two processing systems, either simultaneously on both systems (synchronous) or from a primary to a secondary location after the primary write completes (asynchronous).

  • allows businesses to copy data to where it can be utilized most effectively
  • cloud may be used as a central storage area
    • makes data available among all business units
  • requires:
    • low latency network connections
    • security
    • data integrity
  • CSPs offer several data storage performance tiers
    • e.g., hot storage and cold storage
    • the quicker the data retrieval, the higher the cost
  • different applications have different replication requirements
    • e.g., a database typically needs low-latency, synchronous replication
      • a transaction cannot be considered complete until it has been applied to all replicas
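
The difference between synchronous and asynchronous replication can be sketched with a toy in-memory model. Everything here (the `Replica` and `Primary` classes, the method names, the sample keys) is hypothetical and stands in for real storage nodes; it only illustrates when a write is acknowledged relative to when the replicas receive it.

```python
from collections import deque


class Replica:
    """A hypothetical in-memory store standing in for a storage node."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary(Replica):
    def __init__(self, name, replicas):
        super().__init__(name)
        self.replicas = replicas
        self.pending = deque()  # writes not yet shipped to the replicas

    def write_sync(self, key, value):
        """Synchronous: acknowledge only after every replica has applied the write."""
        self.apply(key, value)
        for replica in self.replicas:
            replica.apply(key, value)
        return "committed"

    def write_async(self, key, value):
        """Asynchronous: acknowledge immediately and queue the write for later."""
        self.apply(key, value)
        self.pending.append((key, value))
        return "accepted"

    def flush(self):
        """Ship queued writes to the replicas, as a background task might."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.apply(key, value)


secondary = Replica("secondary")
primary = Primary("primary", [secondary])

primary.write_sync("order-1", "paid")
sync_visible = "order-1" in secondary.data  # replica already holds the write

primary.write_async("order-2", "paid")
async_visible_before = "order-2" in secondary.data  # replica lags the primary
primary.flush()
async_visible_after = "order-2" in secondary.data  # replica has caught up
```

In this sketch the synchronous write pays the cost of contacting every replica before returning, which is why it depends on low-latency links. The asynchronous write returns sooner but leaves a window in which the secondary is behind the primary.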

High Availability Across Zones

  • CSPs divide the world into regions
  • Each region is independent
  • Regions are divided into availability zones
    • zones have independent datacenters with their own power, cooling, and network connectivity
  • Provisioning resources in multiple zones and regions can:
    • improve performance
    • increase redundancy
  • Multi-zone and multi-region provisioning requires an adequate level of replication performance
  • CSPs offer several tiers of replication, representing different high availability service levels:
    • Local replication
      • replicates your data within a single datacenter in the region where you created your storage account
      • replicas are often in separate fault domains and upgrade domains
    • Regional replication
      • also known as zone-redundant storage (ZRS)
      • replicates your data across multiple datacenters within one or two regions
      • safeguards data and access in the event a single datacenter is destroyed or goes offline
    • Geo-redundant storage (GRS)
      • replicates your data to a secondary region that is distant from the primary region
      • safeguards data in the event of a regional outage or a disaster
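
One way to compare the three tiers is by the largest failure each is expected to survive. The sketch below is a simplified model based only on the descriptions above; the `FailureScope` levels, the tier names, and the mapping between them are illustrative assumptions, and a real provider's guarantees should be taken from its own documentation.

```python
from enum import IntEnum


class FailureScope(IntEnum):
    """Failure domains ordered from smallest to largest."""
    SERVER = 1
    DATACENTER = 2
    REGION = 3


# Largest failure each tier is assumed to survive in this simplified model.
SURVIVES = {
    "local": FailureScope.SERVER,          # replicas in separate fault domains
    "regional": FailureScope.DATACENTER,   # replicas in multiple datacenters
    "geo-redundant": FailureScope.REGION,  # replica in a distant secondary region
}


def tiers_surviving(failure: FailureScope) -> list[str]:
    """Return the tiers whose data remains safe after the given failure."""
    return [tier for tier, limit in SURVIVES.items() if limit >= failure]


datacenter_loss = tiers_surviving(FailureScope.DATACENTER)
region_loss = tiers_surviving(FailureScope.REGION)
```

In this model, losing a single datacenter leaves the regional and geo-redundant tiers intact, while a regional outage is survived only by the geo-redundant tier.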