High Availability

High availability (HA) means ensuring that systems remain operational and accessible with minimal downtime.

- involves designing and implementing infrastructure for fault tolerance and redundancy
- availability is measured over a defined period (e.g., one year)
- can be measured as:
  - Uptime
    - the amount of time client data and resources are available on the servers
  - Downtime
    - the amount or percentage of time that a system is unavailable
    - the maximum tolerable downtime (MTD) metric expresses the availability requirement for a particular business function
    - downtime is calculated as the sum of scheduled service intervals plus unplanned outages over the period
- high availability is usually loosely described as
  - 24x7
  - 24x365
- Availability is often expressed as the number of nines (including those before the decimal point) in the percentage
  - e.g., 99.999% uptime is stated as “five nines”
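
The downtime calculation above (scheduled service intervals plus unplanned outages over the period) can be sketched in a few lines of Python. This is only an illustration: the function name `availability_percent` and its parameters are hypothetical, and all durations are assumed to be given in hours.

```python
def availability_percent(period_hours: float,
                         scheduled_hours: float,
                         unplanned_hours: float) -> float:
    """Return availability as a percentage of the measurement period.

    Downtime is the sum of scheduled service intervals and unplanned
    outages, as described in the notes above.
    """
    downtime = scheduled_hours + unplanned_hours
    if period_hours <= 0 or not 0 <= downtime <= period_hours:
        raise ValueError("downtime must fall within a positive period")
    uptime = period_hours - downtime
    return 100.0 * uptime / period_hours


if __name__ == "__main__":
    # One non-leap year (8760 hours) with 6 h of maintenance
    # and 2.76 h of unplanned outages.
    print(f"{availability_percent(8760, 6, 2.76):.3f}%")
```
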

Nines Value	Availability	Annual Downtime (hh:mm:ss)
Six	99.9999%	00:00:32
Five	99.999%	00:05:15
Four	99.99%	00:52:34
Three	99.9%	08:45:36
Two	99%	87:36:00
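
The annual downtime figures in the table can be reproduced with a short calculation. The sketch below assumes a 365-day year and rounds to the nearest second; the function name `annual_downtime` is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def annual_downtime(availability_percent: float) -> str:
    """Convert an availability percentage to annual downtime (hh:mm:ss)."""
    # Fraction of the year the system is allowed to be unavailable.
    unavailable_fraction = 1 - availability_percent / 100
    seconds = round(unavailable_fraction * SECONDS_PER_YEAR)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    for pct in (99.9999, 99.999, 99.99, 99.9, 99):
        print(f"{pct}%\t{annual_downtime(pct)}")
```
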

Scalability and Elasticity

Fault Tolerance and Redundancy

Fault tolerance is protection against system failure by providing extra (redundant) capacity.

Fault-tolerant systems identify and eliminate single points of failure.

Redundancy is overprovisioning resources at the component, host, and/or site level so that there is failover to a working instance in the event of a problem.
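
One way to see why redundancy helps: if redundant instances fail independently, the service is down only when all of them are down at the same time. The sketch below applies that standard probability calculation. The independence assumption is a simplification (real instances often share power, network, or software faults), and the function name `combined_availability` is hypothetical.

```python
from math import prod


def combined_availability(instance_availabilities: list[float]) -> float:
    """Availability of a redundant group, assuming independent failures.

    Each entry is a fraction between 0 and 1. The group is unavailable
    only if every instance is unavailable simultaneously.
    """
    if not instance_availabilities:
        raise ValueError("at least one instance is required")
    all_down = prod(1 - a for a in instance_availabilities)
    return 1 - all_down


if __name__ == "__main__":
    # Two instances at 99% each: the chance both are down is 0.01 * 0.01.
    print(f"{combined_availability([0.99, 0.99]):.4%}")
```
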

Site Considerations

Disaster Recovery Sites

Cloud as Disaster Recovery (DR)

Testing Redundancy and High Availability

Load testing
- uses specialized software tools to
  - validate a system’s performance under expected or peak loads
  - identify bottlenecks or scalability issues
Failover testing
- focuses on validating failover processes to ensure a seamless transition between primary and secondary infrastructure
Testing monitoring systems
- validates that failures and performance issues are detected and responded to effectively
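
A failover test can be illustrated with a toy simulation: take the primary offline and check that requests are served by the secondary. Everything here (the `Node` class, `serve_with_failover`, and `run_failover_test`) is hypothetical and stands in for real infrastructure and tooling.

```python
class Node:
    """A stand-in for a server that can be marked unhealthy."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"{self.name} handled {request}"


def serve_with_failover(nodes: list[Node], request: str) -> str:
    """Try each node in priority order and return the first success."""
    for node in nodes:
        try:
            return node.handle(request)
        except ConnectionError:
            continue  # fail over to the next node
    raise RuntimeError("all nodes failed")


def run_failover_test() -> str:
    """Simulate a primary outage and return the response that was served."""
    primary, secondary = Node("primary"), Node("secondary")
    # Baseline: the primary serves traffic while it is healthy.
    assert serve_with_failover([primary, secondary], "ping") == "primary handled ping"
    primary.healthy = False  # inject the failure
    return serve_with_failover([primary, secondary], "ping")


if __name__ == "__main__":
    print(run_failover_test())
```

A real failover test would also measure how long the transition takes and confirm that monitoring raised an alert for the failed primary.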

adam's notes
