What is Monitoring and Incident Management for Amazon EKS in AMS Accelerate? - AMS Accelerate User Guide

What is Monitoring and Incident Management for Amazon EKS in AMS Accelerate?

Monitoring and Incident Management for EKS provides the following:

  • A default configuration that creates, manages, and deploys monitors and policies across your managed account for Amazon EKS clusters that you select.

  • A monitoring baseline that helps increase the availability of your Amazon EKS workloads, even if you don't configure any other monitoring for your Amazon EKS clusters. For more information, see Baseline alerts in Monitoring and Incident Management for Amazon EKS in AMS Accelerate.

  • Notifications that are generated by the baseline monitoring configured for your Amazon EKS cluster. These notifications are known as alerts. Alerts are generated for imminent, ongoing, receding, or potential failures, performance degradation, or security issues. An alert can be, for example, a Prometheus alert, an event, or a finding from an AWS service such as Amazon GuardDuty.

  • Alert investigation with guidance on appropriate remediation actions that you can take. For more information, see Incident reports and service requests in AMS Accelerate.

  • Remediation of alerts and incidents by AMS operations, when possible and with your approval, to prevent or reduce the impact to your applications. For more information, see Incident reports and service requests in AMS Accelerate.

  • Optional predefined Amazon Managed Grafana dashboards that provide visibility into resource utilization, performance, CoreDNS health, active alerts, and previously resolved alerts. If you configure Amazon Managed Grafana using the AMS-provided template, then you can open the Amazon Managed Grafana console to view metrics and alerts for your Amazon EKS cluster.
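
The alert description in the list above can be pictured as a small data model. The following Python sketch is illustrative only: the class names, field names, and the cluster name are hypothetical and are not part of any AMS or AWS API. It only encodes the categories named in the text (where an alert comes from, whether the problem is imminent, ongoing, receding, or potential, and what kind of problem it is).

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    # The four stages named in the alert description.
    IMMINENT = "imminent"
    ONGOING = "ongoing"
    RECEDING = "receding"
    POTENTIAL = "potential"


class Concern(Enum):
    # The kinds of problem an alert can report.
    FAILURE = "failure"
    PERFORMANCE_DEGRADATION = "performance degradation"
    SECURITY_ISSUE = "security issue"


class Source(Enum):
    # Example origins listed in the text; not an exhaustive list.
    PROMETHEUS_ALERT = "Prometheus alert"
    EVENT = "event"
    SERVICE_FINDING = "AWS service finding"


@dataclass(frozen=True)
class Alert:
    """Hypothetical record for one alert; not an AMS data structure."""
    cluster: str
    source: Source
    stage: Stage
    concern: Concern

    def summary(self) -> str:
        # Build a one-line, human-readable description of the alert.
        return f"[{self.cluster}] {self.stage.value} {self.concern.value} ({self.source.value})"


# "demo-cluster" is a placeholder cluster name.
example = Alert("demo-cluster", Source.SERVICE_FINDING, Stage.POTENTIAL, Concern.SECURITY_ISSUE)
print(example.summary())
```

In this sketch, a finding from a service such as Amazon GuardDuty would be represented with `Source.SERVICE_FINDING`, while a rule firing in Prometheus would use `Source.PROMETHEUS_ALERT`.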