

# What is Amazon DevOps Guru?
<a name="welcome"></a>

Welcome to the Amazon DevOps Guru user guide.

DevOps Guru is a fully managed operations service that makes it easy for developers and operators to improve the performance and availability of their applications. DevOps Guru lets you offload the administrative tasks associated with identifying operational issues so that you can quickly implement recommendations to improve your application. DevOps Guru creates reactive insights you can use to improve your application now. It also creates proactive insights to help you avoid operational issues that might affect your application in the future. 

DevOps Guru applies machine learning to analyze your operational data and application metrics and events to identify behaviors that deviate from normal operating patterns. You are notified when DevOps Guru detects an operational issue or risk. For each issue, DevOps Guru presents intelligent recommendations to address current and predicted future operational issues. 

To get started, see [How do I get started with DevOps Guru?](#how-do-i-get-started) 

# How does DevOps Guru work?
<a name="how-it-works"></a>

The DevOps Guru workflow begins when you configure its coverage and notifications. After you set up DevOps Guru, it starts to analyze your operational data. When it detects anomalous behavior, it creates an insight that contains recommendations and lists of the metrics, log groups, and events related to the issue, and it notifies you about each insight. If you enabled AWS Systems Manager OpsCenter, an OpsItem is also created for each insight so that you can track and manage your work on it in OpsCenter. Use the information in an insight to help you understand and address the anomalous behavior.

See [High level DevOps Guru workflow](high-level-workflow.md) for more detail about the three high-level workflow steps. See [Detailed DevOps Guru workflow](detailed-workflow.md) to learn about the more detailed DevOps Guru workflow, including how it interacts with other AWS services. 

**Topics**
+ [High level DevOps Guru workflow](high-level-workflow.md)
+ [Detailed DevOps Guru workflow](detailed-workflow.md)

# High level DevOps Guru workflow
<a name="high-level-workflow"></a>

The Amazon DevOps Guru workflow can be broken down into three high-level steps. 

1.  Specify DevOps Guru coverage by telling it which AWS resources in your AWS account you want it to analyze. 

1.  DevOps Guru starts analyzing Amazon CloudWatch metrics, AWS CloudTrail, and other operational data to identify problems that you can fix to improve your operations. 

1.  DevOps Guru makes sure that you know about insights and important information by sending you a notification for each important DevOps Guru event. 

 You can also configure DevOps Guru to create an OpsItem in AWS Systems Manager OpsCenter to help you track your insights. The following diagram shows this high-level workflow. 

![\[Coverage, insights, and notification integration in a DevOps Guru workflow.\]](http://docs.aws.amazon.com/devops-guru/latest/userguide/images/how-capstone-works.png)


1. In the first step, you choose your coverage by specifying which AWS resources in your AWS account are analyzed. DevOps Guru can cover, or analyze, all the resources in an AWS account, or you can use AWS CloudFormation stacks or AWS tags to specify a subset of the resources in your account to analyze. Make sure that the resources you specify make up your business-critical applications, workloads, and microservices. For more information about the supported services and resources, see [Amazon DevOps Guru pricing](https://aws.amazon.com/devops-guru/pricing/).

1. In the second step, DevOps Guru analyzes the resources to generate insights. This is an ongoing process. You can view the insights and see the recommendations and related information they contain in the DevOps Guru console. DevOps Guru analyzes the following data to find issues and create insights. 
   + Individual Amazon CloudWatch metrics emitted by your AWS resources. When an issue is identified, DevOps Guru collects those metrics together. 
   +  Log anomalies from Amazon CloudWatch log groups. If you enable log anomaly detection, DevOps Guru displays related log anomalies when an issue occurs. 
   + DevOps Guru pulls enrichment data from AWS CloudTrail management logs to find events that are related to the collected metrics. The events can be resource deployment events and configuration changes. 
   + If you use AWS CodeDeploy, DevOps Guru analyzes deployment events to help generate insights. Events for all types of CodeDeploy deployments (on-premises server, Amazon EC2 server, Lambda, or Amazon ECS) are analyzed. 
   + When DevOps Guru finds a specific pattern, it generates one or more recommendations to help mitigate or fix the identified issue. The recommendations are collected in one insight. The insight also contains a list of the metrics and events that are related to the issue. Use the insight data to understand and address the identified problem. 

1. In the third step, DevOps Guru integrates insight notification into your workflow to help you manage issues and quickly address them. 
   + Insights generated in your AWS account are published to the Amazon Simple Notification Service (Amazon SNS) topic chosen during DevOps Guru setup. This is how you are notified as soon as an insight is created. For more information, see [Updating your notifications in DevOps Guru](update-notifications.md).
   + If you enabled AWS Systems Manager during DevOps Guru setup, each insight creates a corresponding OpsItem to help you track and manage the issues discovered. For more information, see [Updating AWS Systems Manager integration in DevOps Guru](update-settings.md#update-systems-manager-integration).
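
The coverage choice in the first step can be sketched as a small helper that builds a resource collection from either CloudFormation stack names or a tag. This is a minimal illustration, not an official client: the helper name `build_resource_collection` is hypothetical, the payload shape follows the `ResourceCollection` structure of the DevOps Guru API as we understand it, and the `devops-guru-` tag key prefix check is an assumption you should verify against the API reference. The sketch also assumes you pick one mode at a time, mirroring the stacks-or-tags choice described above.

```python
from typing import Any, Dict, List, Optional

# Assumed prefix for tag keys that define coverage; verify before relying on it.
TAG_KEY_PREFIX = "devops-guru-"


def build_resource_collection(
    stack_names: Optional[List[str]] = None,
    tag_key: Optional[str] = None,
    tag_values: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a coverage payload from CloudFormation stacks or from one tag."""
    if stack_names and tag_key:
        raise ValueError("choose CloudFormation stacks or a tag, not both")
    if stack_names:
        return {"CloudFormation": {"StackNames": list(stack_names)}}
    if tag_key:
        # The prefix comparison is case-insensitive in this sketch.
        if not tag_key.lower().startswith(TAG_KEY_PREFIX):
            raise ValueError(f"tag key must start with {TAG_KEY_PREFIX!r}")
        if not tag_values:
            raise ValueError("provide at least one tag value")
        return {"Tags": [{"AppBoundaryKey": tag_key, "TagValues": list(tag_values)}]}
    raise ValueError("specify at least one stack name or a tag key")


# Example: cover the resources defined in one (hypothetical) stack.
payload = build_resource_collection(stack_names=["checkout-service"])
```

If you use the AWS SDK for Python, a payload like this would typically be passed to the `devops-guru` client's `update_resource_collection` call with `Action="ADD"`; confirm the parameters in the SDK documentation before using it.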

# Detailed DevOps Guru workflow
<a name="detailed-workflow"></a>

 The DevOps Guru workflow integrates with several AWS services, including Amazon CloudWatch, AWS CloudTrail, Amazon Simple Notification Service, and AWS Systems Manager. The following diagram shows a detailed workflow that includes how it works with other AWS services. 

![\[Resources, analysis, and notifications in the detailed workflow for DevOps Guru.\]](http://docs.aws.amazon.com/devops-guru/latest/userguide/images/capstone-workflow-diagram.png)


This diagram shows a scenario in which DevOps Guru coverage is specified by the AWS resources that are defined in AWS CloudFormation stacks or using AWS tags. If no stacks or tags are chosen, then DevOps Guru coverage analyzes all AWS resources in your account. For more information, see [Defining applications using AWS resources](working-with-resource-collections.md) and [Determine coverage for DevOps Guru](setting-up.md#setting-up-determine-coverage).

1. During setup, you specify one or two Amazon SNS topics that are used to notify you about important DevOps Guru events, such as when an insight is created. Next, you can specify AWS CloudFormation stacks that define the resources you want analyzed. You can also enable Systems Manager to generate an OpsItem for each insight to help you manage your insights. 

1. After DevOps Guru is configured, it starts analyzing CloudWatch metrics, log groups, and events that are emitted from your resources and AWS CloudTrail data related to the CloudWatch metrics. If your operations include CodeDeploy deployments, DevOps Guru also analyzes deployment events. 

   DevOps Guru creates insights when it identifies unusual, anomalous behavior in the analyzed data. Each insight contains one or more recommendations, a list of the metrics used to generate the insight, a list of related log groups, and a list of the events used to generate the insight. Use this information to address the identified problem. 

1. After each insight is created, DevOps Guru sends a notification using the Amazon SNS topic or topics specified during DevOps Guru setup. If you enabled DevOps Guru to generate an OpsItem in Systems Manager OpsCenter, then each insight also triggers a new Systems Manager OpsItem. You can use Systems Manager to manage your insight OpsItems. 
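
One way to act on the notifications in the last step is to subscribe an AWS Lambda function to the SNS topic and summarize each message. The following is a rough sketch only. The outer `Records[].Sns.Message` envelope is the standard shape of an SNS event delivered to Lambda, but the field names read from the message body (`messageType`, `insightId`, `insightSeverity`) are assumptions for illustration, and the function name `summarize_notifications` is hypothetical. Inspect a real notification before depending on any of these fields.

```python
import json
from typing import Any, Dict, List


def summarize_notifications(event: Dict[str, Any]) -> List[str]:
    """Return one summary line per SNS record in a Lambda-style event."""
    summaries = []
    for record in event.get("Records", []):
        # SNS delivers the published message as a string under Sns.Message.
        raw = record["Sns"]["Message"]
        try:
            body = json.loads(raw)
        except json.JSONDecodeError:
            body = None
        if not isinstance(body, dict):
            summaries.append(f"unparsed message: {raw[:80]}")
            continue
        # Assumed field names; confirm against an actual notification.
        kind = body.get("messageType", "UNKNOWN")
        insight = body.get("insightId", "unknown-insight")
        severity = body.get("insightSeverity", "unspecified")
        summaries.append(f"{kind}: insight {insight} (severity {severity})")
    return summaries
```

Using `.get` with defaults keeps the handler from failing when a field is missing, which matters here because the message schema is assumed rather than confirmed.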

## How do I get started with DevOps Guru?
<a name="how-do-i-get-started"></a>

 We recommend that you complete the following steps: 

1. **Learn** more about DevOps Guru by reading the information in [DevOps Guru concepts](concepts.md). 

1. **Set up** your AWS account, the AWS CLI, and an administrative user by following the steps in [Setting up Amazon DevOps Guru](setting-up.md). 

1. **Use** DevOps Guru, following the instructions in [Getting started with DevOps Guru](getting-started.md). 

## How do I stop incurring DevOps Guru charges?
<a name="how-do-i-disable-devops-guru"></a>

To disable Amazon DevOps Guru so that it stops incurring charges from analyzing resources in your AWS account and Region, update your coverage settings so that it doesn't analyze resources. To do this, follow the steps in [Updating your AWS analysis coverage in DevOps Guru](update-settings.md#update-coverage) and choose **None** in step 4. You must do this for each AWS account and Region where DevOps Guru analyzes resources.

**Note**  
If you update your coverage to stop analyzing resources, you might continue to incur minor charges if you review existing insights generated by DevOps Guru in the past. These charges are associated with API calls used to retrieve and display insight information. For more information, see [Amazon DevOps Guru pricing](https://aws.amazon.com/devops-guru/pricing/).

# DevOps Guru concepts
<a name="concepts"></a>

The following concepts are important for understanding how Amazon DevOps Guru works.

**Topics**
+ [Anomaly](#concept-anomaly)
+ [Insight](#concept-insight)
+ [Metrics and operational events](#metrics-and-operational-events)
+ [Log groups and log anomalies](#log-groups-and-anomalies)
+ [Recommendations](#recommendation)

## Anomaly
<a name="concept-anomaly"></a>

An anomaly represents one or more related metrics detected by DevOps Guru that are unexpected or unusual. DevOps Guru generates anomalies by using machine learning to analyze metrics and operational data that are related to your AWS resources. You specify the AWS resources that you want analyzed when you set up Amazon DevOps Guru. For more information, see [Setting up Amazon DevOps Guru](setting-up.md). 

## Insight
<a name="concept-insight"></a>

An insight is a collection of anomalies that are created during the analysis of the AWS resources you specify when you set up DevOps Guru. Each insight contains observations, recommendations, and analytical data you can use to improve your operational performance. There are two types of insights: 
+ *Reactive*: A reactive insight identifies anomalous behavior as it occurs. It contains anomalies with recommendations, related metrics, and events to help you understand and address the issues now. 
+ *Proactive*: A proactive insight lets you know about anomalous behavior before it occurs. It contains anomalies with recommendations to help you address the issues before they are predicted to happen. 
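
The relationship between anomalies and insights described above can be pictured as a small data model. This is a conceptual sketch only: the class and field names (`Anomaly`, `Insight`, `metric_names`, and so on) are hypothetical and are not the shapes returned by the DevOps Guru API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class InsightType(Enum):
    REACTIVE = "reactive"    # anomalous behavior identified as it occurs
    PROACTIVE = "proactive"  # anomalous behavior predicted before it occurs


@dataclass
class Anomaly:
    """One or more related metrics that are unexpected or unusual."""
    metric_names: List[str]


@dataclass
class Insight:
    """A collection of anomalies plus information to help address them."""
    insight_type: InsightType
    anomalies: List[Anomaly] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)
    related_events: List[str] = field(default_factory=list)

    def anomalous_metrics(self) -> List[str]:
        """List metric names across all anomalies, in order, without repeats."""
        seen: List[str] = []
        for anomaly in self.anomalies:
            for name in anomaly.metric_names:
                if name not in seen:
                    seen.append(name)
        return seen


# Example: a reactive insight grouping two anomalies that share a metric.
insight = Insight(
    insight_type=InsightType.REACTIVE,
    anomalies=[Anomaly(["Duration", "Throttles"]), Anomaly(["Duration"])],
    recommendations=["Review the most recent deployment."],
)
```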

## Metrics and operational events
<a name="metrics-and-operational-events"></a>

The anomalies that make up an insight are generated by analyzing the metrics returned by Amazon CloudWatch and operational events emitted by your AWS resources. You can view the metrics and the operational events that create an insight to help you better understand issues in your application. 

## Log groups and log anomalies
<a name="log-groups-and-anomalies"></a>

When you enable log anomaly detection, relevant log groups are displayed on DevOps Guru insight pages in the DevOps Guru console. A log group gives you critical diagnostic information about how a resource is performing and being accessed.

A log anomaly represents a cluster of similar anomalous log events found within a log group. Examples of anomalous log events that may be displayed in DevOps Guru include keyword anomalies, format anomalies, HTTP code anomalies, and more. 

You can use log anomalies to diagnose the root cause of an operational issue. DevOps Guru also references log lines in insight recommendations to provide more context for recommended solutions. 

**Note**  
DevOps Guru works with Amazon CloudWatch to enable log anomaly detection. When you enable log anomaly detection, DevOps Guru adds tags to your CloudWatch log groups. When you turn off log anomaly detection, DevOps Guru removes tags from your CloudWatch log groups.  
In addition, administrators should ensure that only users with permissions to view CloudWatch logs have permissions to view anomalous CloudWatch logs. We recommend that you use IAM policies to allow or deny access to the `ListAnomalousLogs` operation. For more information, see [Identity and Access Management for DevOps Guru](https://docs.aws.amazon.com/devops-guru/latest/userguide/security-iam.html).
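
As an illustration of the IAM recommendation in the preceding note, the following sketch builds a policy document that denies the `ListAnomalousLogs` operation. It assumes the IAM action name is `devops-guru:ListAnomalousLogs` (the service prefix is an assumption to verify in the IAM documentation for DevOps Guru), and the function name and `Sid` value are hypothetical.

```python
import json
from typing import Any, Dict


def deny_anomalous_logs_policy() -> Dict[str, Any]:
    """Return an IAM policy document that denies viewing anomalous logs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAnomalousLogAccess",  # hypothetical identifier
                "Effect": "Deny",
                # Assumed action name; confirm the service prefix before use.
                "Action": "devops-guru:ListAnomalousLogs",
                "Resource": "*",
            }
        ],
    }


# Print the JSON you would attach to users who must not see anomalous logs.
print(json.dumps(deny_anomalous_logs_policy(), indent=2))
```

An explicit `Deny` takes precedence over any `Allow` in IAM policy evaluation, so attaching a statement like this is one way to make sure broader DevOps Guru permissions do not expose log content.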

## Recommendations
<a name="recommendation"></a>

Each insight provides recommendations with suggestions to help you improve the performance of your application. A recommendation includes the following: 
+ A description of the recommended actions to address the anomalies that make up the insight. 
+ A list of the analyzed metrics in which DevOps Guru found anomalous behavior. Each metric includes the name of the CloudFormation stack that generated the resource associated with the metrics, the resource's name, and the name of the AWS service associated with the resource. 
+ A list of the events that are related to the anomalous metrics associated with the insight. Each related event contains the name of the CloudFormation stack that generated the resource associated with the event, the name of the resource that generated the event, and the name of the AWS service associated with the event. 
+ A list of log groups that are related to the anomalous behavior associated with the insight. Each log group contains a sample log message, information about the kinds of log anomalies reported, the times the log anomalies occurred, and a link to view the log lines on CloudWatch.

# DevOps Guru coverage
<a name="coverage"></a>

DevOps Guru addresses and creates insights for a number of different AWS services. For each service that DevOps Guru creates insights for, DevOps Guru displays a variety of analyzed metrics and generated insights. 

Example use case for reactive insights:


| Service Name | Use Case | Examples | Metrics | 
| --- | --- | --- | --- | 
|  AWS Lambda  |  Detect latency or duration anomalies for Lambda functions caused by various root causes like cold starts, increased requests, downstream throttling, or code deployments. Recommend ways to quickly mitigate.  |  Code deployment: Amazon API Gateway latency is affected by an increase in Lambda latency after a recent Lambda code deployment. Downstream throttling: the operator reduced capacity on read units for DynamoDB, causing increased retries. This results in throttling. Cold start: the Lambda function is under-provisioned, so Lambda takes longer when requests are made.   |  Duration, Throttles  | 

Example use case for proactive insights:


| Service Name | Use Case | Metrics | 
| --- | --- | --- | 
|  Amazon DynamoDB  |  **The DynamoDB table read consumed capacity is at risk of reaching table limit.** Recommended action: if you are using provisioned capacity mode, use auto scaling to actively manage throughput capacity for tables or purchase reserved capacity in advance for tables. Switch to on-demand capacity mode to pay per read request, paying only for what is used. Detection time: 6 days   |  ConsumedReadCapacityUnits  | 

## Service coverage list
<a name="coverage-services"></a>

For some services, DevOps Guru creates reactive insights. A reactive insight identifies anomalous behavior as it occurs. It contains anomalies with recommendations, related metrics, and events to help you understand and address the issues now.

For some services, DevOps Guru creates proactive insights. A proactive insight lets you know about anomalous behavior before it occurs. It contains anomalies with recommendations to help you address the issues before they are predicted to happen.

**DevOps Guru creates reactive insights for services such as the following:**
+ Amazon API Gateway
+ Amazon CloudFront
+ Amazon DynamoDB
+ Amazon EC2
  **Note**  
  DevOps Guru monitoring is at the Auto Scaling group level, not at the single-instance level.
+ Amazon ECS
+ Amazon EKS
+ AWS Elastic Beanstalk
+ Elastic Load Balancing
+ Amazon Kinesis
+ AWS Lambda
+ Amazon OpenSearch Service
+ Amazon RDS
+ Amazon Redshift
+ Amazon Route 53
+ Amazon S3
+ Amazon SageMaker AI
+ AWS Step Functions
+ Amazon SNS
+ Amazon SQS
+ Amazon SWF
+ Amazon VPC

**DevOps Guru creates proactive insights for services such as the following:**
+ Amazon DynamoDB
+ Amazon Kinesis
+ AWS Lambda
+ Amazon RDS
+ Amazon SQS