Cloud Datacenter Monitoring


  • Cloud providers need to monitor how hardware, software, and network are being utilized in their datacenters
    • to better meet customer needs (and comply with SLAs)
  • Monitoring fulfills three IT service management functions
    • Service-level management
      • confirms that the IT organization is fulfilling its obligations to internal and external customers
    • Availability management
      • improves resiliency of IT services
    • Capacity management
      • ensures sufficient IT resources are available to meet demand
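
Worked example: the link between monitoring data and an SLA can be shown with simple availability arithmetic. The sketch below is illustrative only. The 99.9% target, the 30-day period, and the 50 minutes of downtime are assumed values, and the function names are hypothetical.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted in the period while still meeting the SLA."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


def measured_availability(downtime_minutes: float, period_days: int = 30) -> float:
    """Availability over the period, as a percentage."""
    total_minutes = period_days * 24 * 60
    return 100 * (total_minutes - downtime_minutes) / total_minutes


# Assumed values for illustration: 99.9% SLA, 30-day period, 50 min of outage.
budget = allowed_downtime_minutes(99.9)
actual = measured_availability(downtime_minutes=50)

print(f"Downtime budget: {budget:.1f} min")
print(f"Measured availability: {actual:.3f}%")
print("SLA met" if actual >= 99.9 else "SLA missed")
```

With these assumed numbers the downtime budget is about 43.2 minutes, so 50 minutes of outage misses the target.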

Monitoring

  • Tools to monitor datacenter:
    • OS logging
      • monitors performance and events
      • can be configured to raise an alert when usage approaches a set capacity threshold or when performance degrades
      • includes:
        • CPU usage, memory usage, disk space, disk I/O timing
    • Cloud application telemetry
      • the rich information sources that some cloud services provide, which organizations may include in their security monitoring
      • introduces new challenges
        • can be difficult to import or export log sources from the cloud
        • log entries may lag behind real time
    • Hardware monitoring
      • performance-monitoring tools are often included in device builds
      • can monitor:
        • CPU usage, fan speed, voltages (consumption and throughput), CPU load, clock speed, drive temperature
      • can use third-party monitoring products
    • Network monitoring
      • monitor network hardware, software, and distribution components (cabling, SDN control planes)
      • ensure capacity meets customer needs
      • ensure the network is not overburdened or subject to unacceptable latency
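
Worked example: the threshold-based alerting described under OS logging can be sketched as a simple comparison of current readings against limits. This is a minimal illustration, not a real monitoring agent. The threshold values, metric names, and sample readings are assumptions, and only the disk figure is read from the local machine (using the standard library's shutil.disk_usage).

```python
import shutil

# Hypothetical warning thresholds, in percent of capacity.
THRESHOLDS = {"cpu": 85.0, "memory": 90.0, "disk": 80.0}


def disk_percent_used(path: str = "/") -> float:
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


def find_alerts(metrics, thresholds=THRESHOLDS):
    """Return one message per metric that has reached its threshold."""
    alerts = []
    for name, value in metrics.items():
        limit = thresholds.get(name)
        if limit is not None and value >= limit:
            alerts.append(f"{name} at {value:.1f}% (threshold {limit:.1f}%)")
    return alerts


# CPU and memory readings are made-up sample values; disk is measured locally.
sample = {"cpu": 91.2, "memory": 40.0, "disk": disk_percent_used()}
for alert in find_alerts(sample):
    print("ALERT:", alert)
```

Setting the thresholds below 100% is the point of the design: the alert fires as usage approaches capacity, leaving time to add resources before customers are affected.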