Disaster Recovery Sites


  • Within the scope of business continuity planning,
    • disaster recovery plans (DRPs) describe the specific procedures to follow to recover a system or site to a working state
      • disaster could be anything from:
        • a loss of power or failure of a minor component
        • to human-made or natural disasters

Spare Sites

A spare site is another location that can provide the same (or similar) level of service.

  • a disaster or system failure at one site causes services to fail over to the alternate site
  • DRP must state:
    • how this will happen
    • what checks need to be made to ensure that failover has occurred successfully
      • without loss of transactional data or service availability
    • how to revert to the primary site once functionality is restored there
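
The post-failover checks a DRP calls for can be sketched in code. The example below is a minimal illustration, not a prescribed procedure: the SiteStatus type, its field names, and the idea of comparing transaction IDs are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class SiteStatus:
    """Hypothetical snapshot of one site's state; field names are illustrative."""
    name: str
    healthy: bool              # does the site answer its health check?
    last_transaction_id: int   # highest transaction ID committed at the site


def verify_failover(primary: SiteStatus, alternate: SiteStatus) -> list[str]:
    """Return the problems found after failover; an empty list means success."""
    problems = []
    # Service availability: the alternate site must be serving requests.
    if not alternate.healthy:
        problems.append(f"{alternate.name} is not healthy")
    # Transactional data: the alternate must hold every transaction
    # that the primary had committed before it failed.
    missing = primary.last_transaction_id - alternate.last_transaction_id
    if missing > 0:
        problems.append(f"{alternate.name} is missing {missing} transaction(s)")
    return problems


primary = SiteStatus("primary", healthy=False, last_transaction_id=1042)
alternate = SiteStatus("alternate", healthy=True, last_transaction_id=1040)
print(verify_failover(primary, alternate))
# prints ['alternate is missing 2 transaction(s)']
```

The same checks could be run in the opposite direction before reverting to the primary site.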

Site Resiliency

  • levels of site resiliency:
    • hot site
      • quickest access to restore critical data in the event of a disaster or catastrophe
      • can fail over almost immediately
      • site is already within the organization’s ownership and is ready to deploy
      • typically uses current, high-performance storage equipment and fast protocols
      • typically located close to the client or in multiple locations to ensure fast access
      • e.g.,
        • hot site could consist of a building with operational computer equipment that is kept updated with a live dataset
    • warm site
      • similar to a hot site, but the latest dataset must be loaded before it can take over
      • alternate processing location that is dormant or performs noncritical functions under normal conditions, but which can be rapidly converted to a key operations site if needed
    • cold site
      • is used less frequently and is maintained with minimal, lower-performance equipment
        • returning to normal operations is slower
        • significant advantage of a cold site
          • it is less expensive than a hot site
      • may be an empty building with a lease agreement in place to install whatever equipment is required when necessary
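
The difference between the three levels can be summarized by the work that remains before a site can take over. The sketch below is only an illustration of the descriptions above; the SiteTier enum, the function name, and the step wording are assumptions made for the example.

```python
from enum import Enum


class SiteTier(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


def steps_before_service(tier: SiteTier) -> list[str]:
    """List the work remaining before a site of this tier can serve users."""
    steps = []
    # A cold site may be an empty building, so equipment comes first.
    if tier is SiteTier.COLD:
        steps.append("install equipment")
    # Only a hot site is kept updated with a live dataset.
    if tier in (SiteTier.COLD, SiteTier.WARM):
        steps.append("load latest dataset")
    # Every tier ends with the actual switch of services.
    steps.append("fail over services")
    return steps


for tier in SiteTier:
    print(tier.value, steps_before_service(tier))
```

The longer the list, the slower the return to normal operations, which is the trade-off against the lower cost of warm and cold sites.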

Geographic dispersion is a resiliency mechanism where processing and data storage resources are replicated between physically distant sites.

  • aims to ensure that recovery sites are located far enough apart to minimize the impact of regional disasters
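
Whether two sites are "far enough apart" can be checked with a great-circle distance calculation. The sketch below uses the haversine formula; the function names, the example coordinates, and the 150 km threshold are assumptions for illustration, not values taken from any standard.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def far_enough(site_a, site_b, min_km):
    """True if the two (lat, lon) sites are at least min_km apart."""
    return distance_km(*site_a, *site_b) >= min_km


# Example coordinates (approximate city centers), used only for illustration.
london = (51.5074, -0.1278)
paris = (48.8566, 2.3522)
print(far_enough(london, paris, min_km=150))  # hypothetical minimum separation
```

The right minimum distance depends on the regional risks being planned for, such as flooding, earthquakes, or shared power infrastructure.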

Cloud as Disaster Recovery

  • redundancy at scale is expensive and complex
    • sites are often leased from service providers
    • business can also enter reciprocal arrangements to provide mutual support
      • cost-effective but complex to set up
  • often the most cost-effective solution is to use a cloud site
    • cloud operator should be able to maintain hot site redundancy so that a geographic disaster in one area will not disrupt service, because another region can take over
  • cloud providers offer more affordable redundancy and backup options
    • due to their economies of scale
  • cloud sites also offer
    • scalability
    • geographic diversity
    • faster deployment
    • simplified management