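# Monitoring Amazon DocumentDB
<a name="monitoring_docdb"></a>

Monitoring your AWS services is an important part of keeping your systems healthy and running optimally. Collect monitoring data from all parts of your AWS solution so that you can more easily debug and fix failures or performance degradation when they occur. Before you begin monitoring your AWS solution, we recommend that you consider and answer the following questions:
+ What are your monitoring goals?
+ Which resources will you monitor?
+ How often will you monitor these resources?
+ Which monitoring tools will you use?
+ Who is responsible for performing the monitoring?
+ Who is notified when something goes wrong, and by what means?

To understand your current performance patterns, identify performance anomalies, and work out how to address issues, establish baseline performance metrics at various times and under different load conditions. As you monitor your AWS solution, we recommend that you store historical monitoring data, both for future reference and to help you establish your baselines.

In general, acceptable values for performance metrics depend on your baseline and on what your application is doing. Investigate consistent or trending variances from your baseline. Advice about specific types of metrics follows:
+ **High CPU or RAM use**: High values for CPU or RAM use might be appropriate, provided that they are in keeping with your goals for your application (such as throughput or concurrency) and are expected.
+ **Storage volume consumption**: Investigate storage consumption (`VolumeBytesUsed`) if the space used is consistently at or above 85 percent of the total storage volume space. Determine whether you can delete data from the storage volume or archive data to a different system to free up space. For more information, see [Amazon DocumentDB storage](how-it-works.md#how-it-works.storage) and [Amazon DocumentDB quotas and limits](limits.md).
+ **Network traffic**: For network traffic, talk with your system administrator to understand the expected throughput for your domain network and internet connection. Investigate network traffic if throughput is consistently lower than expected.
+ **Database connections**: If you see high numbers of user connections together with decreased instance performance and longer response times, consider constraining database connections. The best number of user connections for an instance varies with the instance class and the complexity of the operations being performed.
+ **IOPS metrics**: The expected values for IOPS metrics depend on disk specification and server configuration, so use your baseline to know what is typical. Investigate if values are consistently different from your baseline. For best IOPS performance, make sure that your typical working set fits into memory to minimize read and write operations.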
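
The 85 percent storage guideline above can be turned into a CloudWatch alarm. The following is a minimal sketch, not an official procedure: the function names, the alarm name, the 5-minute period, and the three evaluation periods are illustrative assumptions, and you must supply the storage ceiling that applies to your cluster (see the quotas page) because it is not stated here.

```shell
# Hypothetical helper: prints 85 percent of a storage ceiling given in bytes.
storage_alarm_threshold() {
  local max_bytes="$1"
  echo $(( max_bytes * 85 / 100 ))
}

# Hypothetical helper: creates an alarm on VolumeBytesUsed for one cluster.
# Arguments: cluster identifier, storage ceiling in bytes, SNS topic ARN.
create_storage_alarm() {
  local cluster_id="$1" max_bytes="$2" topic_arn="$3"
  aws cloudwatch put-metric-alarm \
      --alarm-name "${cluster_id}-volume-bytes-used" \
      --namespace AWS/DocDB \
      --metric-name VolumeBytesUsed \
      --dimensions "Name=DBClusterIdentifier,Value=${cluster_id}" \
      --statistic Average \
      --period 300 \
      --evaluation-periods 3 \
      --threshold "$(storage_alarm_threshold "$max_bytes")" \
      --comparison-operator GreaterThanOrEqualToThreshold \
      --alarm-actions "$topic_arn"
}

# Usage (placeholder values):
#   create_storage_alarm sample-cluster <ceiling-in-bytes> <sns-topic-arn>
```

Requiring several consecutive periods above the threshold reflects the word "consistently" in the guideline; adjust the count to suit your baseline.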

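Amazon DocumentDB (with MongoDB compatibility) provides a variety of Amazon CloudWatch metrics that you can monitor to determine the health and performance of your Amazon DocumentDB clusters and instances. You can view Amazon DocumentDB metrics using various tools, including the Amazon DocumentDB console, the AWS CLI, the CloudWatch API, and Performance Insights.

**Topics**
+ [Monitoring the status of a cluster](monitoring_docdb-cluster_status.md)
+ [Monitoring the status of an instance](monitoring_docdb-instance_status.md)
+ [Viewing Amazon DocumentDB recommendations](view-docdb-recommendations.md)
+ [Event subscriptions](event-subscriptions.md)
+ [Monitoring Amazon DocumentDB with CloudWatch](cloud_watch.md)
+ [Logging Amazon DocumentDB API calls with CloudTrail](logging-with-cloudtrail.md)
+ [Profiling operations](profiling.md)
+ [Monitoring with Performance Insights](performance-insights.md)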

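# Monitoring the status of an Amazon DocumentDB cluster
<a name="monitoring_docdb-cluster_status"></a>

The status of a cluster indicates the health of the cluster. You can view the status of a cluster by using the Amazon DocumentDB console or the AWS CLI command `describe-db-clusters`.

**Topics**
+ [Cluster status values](#monitoring_docdb-status_values)
+ [Monitoring a cluster's status](#monitor-cluster-status)

## Cluster status values
<a name="monitoring_docdb-status_values"></a>

The following table lists the valid values for a cluster's status.


| Cluster status | Description |
| --- | --- |
| active | The cluster is active. This status applies to elastic clusters only. |
| available | The cluster is healthy and available. This status applies to instance-based clusters only. |
| backing-up | The cluster is currently being backed up. |
| creating | The cluster is being created. It is inaccessible while it is being created. |
| deleting | The cluster is being deleted. It is inaccessible while it is being deleted. |
| failing-over | A failover from the primary instance to an Amazon DocumentDB replica is being performed. |
| inaccessible-encryption-credentials | The AWS KMS key used to encrypt or decrypt the cluster can't be accessed. |
| maintenance | A maintenance update is being applied to the cluster. This status is used for cluster-level maintenance that Amazon DocumentDB schedules in advance. |
| migrating | A cluster snapshot is being restored to a cluster. |
| migration-failed | A migration failed. |
| modifying | The cluster is being modified because of a customer request. |
| renaming | The cluster is being renamed because of a customer request. |
| resetting-master-credentials | The master credentials for the cluster are being reset because of a customer request. |
| upgrading | The cluster engine version is being upgraded. |

## Monitoring a cluster's status
<a name="monitor-cluster-status"></a>

------
#### [ Using the AWS Management Console ]

To determine the status of a cluster using the AWS Management Console, use the following procedure.

1. Sign in to the AWS Management Console, and open the Amazon DocumentDB console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Clusters**.

1. In the Clusters navigation box, you'll see the **Cluster identifier** column. Your instances are listed under clusters, similar to the following screenshot.  
![\[Clusters table showing how instances are nested under clusters.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/choose-clusters.png)

1. In the **Cluster identifier** column, find the name of the instance that you're interested in. Then, to find the status of that instance, read across the row to the **Status** column, as shown following.  
![\[Cluster instance showing the available status.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/db-cluster-status-con.png)

------
#### [ Using the AWS CLI ]

To determine the status of a cluster using the AWS CLI, use the `describe-db-clusters` operation. The following code finds the status of the cluster `sample-cluster`.

For Linux, macOS, or Unix: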

```
aws docdb describe-db-clusters \
    --db-cluster-identifier sample-cluster  \
    --query 'DBClusters[*].[DBClusterIdentifier,Status]'
```

For Windows:

```
aws docdb describe-db-clusters ^
    --db-cluster-identifier sample-cluster ^
    --query "DBClusters[*].[DBClusterIdentifier,Status]"
```

Output from this operation looks something like the following.

```
[
    [
        "sample-cluster",
        "available"
    ]
]
```

------
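
Scripts often need to wait until a cluster reaches the `available` status before they continue. The following is a minimal polling sketch built on the `describe-db-clusters` call shown above; the function names, the default of 40 attempts, and the 15-second delay are illustrative assumptions rather than recommended values.

```shell
# Hypothetical helper: prints the current status of one cluster as plain text.
cluster_status() {
  aws docdb describe-db-clusters \
      --db-cluster-identifier "$1" \
      --query 'DBClusters[0].Status' \
      --output text
}

# Hypothetical helper: polls until the cluster reports "available".
# Arguments: cluster identifier, maximum attempts (default 40),
# delay between attempts in seconds (default 15).
wait_for_cluster_available() {
  local cluster_id="$1" attempts="${2:-40}" delay="${3:-15}"
  local i=0 status
  while [ "$i" -lt "$attempts" ]; do
    status="$(cluster_status "$cluster_id")"
    if [ "$status" = "available" ]; then
      echo "$cluster_id is available"
      return 0
    fi
    echo "$cluster_id is $status; retrying in ${delay}s" >&2
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "$cluster_id did not become available after $attempts attempts" >&2
  return 1
}

# Usage: wait_for_cluster_available sample-cluster
```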

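# Monitoring the status of an Amazon DocumentDB instance
<a name="monitoring_docdb-instance_status"></a>

Amazon DocumentDB provides information about the current condition of each configured instance in the database.

There are three types of status that you can view for an Amazon DocumentDB instance:
+ Instance status: This status is displayed in the **Status** column of the clusters table in the AWS Management Console and shows the current lifecycle condition of the instance. The value shown in the **Status** column is derived from the `Status` field of the `DescribeDBCluster` API response.
+ Instance health: This status is displayed in the **Instance health** column of the clusters table in the AWS Management Console and shows whether the database engine, the component responsible for managing and retrieving data, is running. The value shown in the **Instance health** column is based on the Amazon CloudWatch `EngineUptime` system metric.
+ Maintenance status: This status is displayed in the **Maintenance** column of the clusters table in the AWS Management Console and indicates the status of any maintenance event that needs to be applied to an instance. Maintenance status is independent of the other instance statuses and is derived from the `PendingMaintenanceAction` API. For more information about maintenance status, see [Maintaining Amazon DocumentDB](https://docs.aws.amazon.com/documentdb/latest/developerguide/db-instance-maintain.html).

**Topics**
+ [Instance status values](#monitoring_docdb-instance_status-values)
+ [Monitoring instance status using the AWS Management Console or AWS CLI](#monitoring-instance-status)
+ [Instance health status values](#instance-health-status-values)
+ [Monitoring instance health status using the AWS Management Console](#monitoring-instance-health-status)

## Instance status values
<a name="monitoring_docdb-instance_status-values"></a>

The following table lists the possible status values for instances and how you are billed for each status. It shows whether you are billed for the instance and storage, billed only for storage, or not billed. For all instance statuses, you are always billed for backup usage.


| Instance status | Billed | Description |
| --- | --- | --- |
| available | Billed | The instance is healthy and available. |
| backing-up | Billed | The instance is currently being backed up. |
| configuring-log-exports | Billed | Publishing log files to Amazon CloudWatch Logs is being enabled or disabled for this instance. |
| creating | Not billed | The instance is being created. The instance is inaccessible while it is being created. |
| deleting | Not billed | The instance is being deleted. |
| failed | Not billed | The instance has failed and Amazon DocumentDB was unable to recover it. To recover the data, perform a point-in-time restore to the latest restorable time of the instance. |
| inaccessible-encryption-credentials | Not billed | The AWS KMS key used to encrypt or decrypt the instance can't be accessed. |
| incompatible-network | Not billed | Amazon DocumentDB is attempting to perform a recovery action on an instance but is unable to do so because the VPC is in a state that prevents the action from being completed. This status can occur if, for example, all available IP addresses in a subnet are in use and Amazon DocumentDB is unable to get an IP address for the instance. |
| maintenance | Billed | Amazon DocumentDB is applying a maintenance update to the instance. This status is used for instance-level maintenance that Amazon DocumentDB schedules in advance. We are evaluating ways to expose additional maintenance actions to customers through this status. |
| modifying | Billed | The instance is being modified because of a request. |
| rebooting | Billed | The instance is being rebooted because of a request or an Amazon DocumentDB process that requires a reboot of the instance. |
| renaming | Billed | The instance is being renamed because of a request. |
| resetting-master-credentials | Billed | The master credentials for the instance are being reset because of a request. |
| restore-error | Billed | The instance encountered an error while attempting to restore to a point in time or from a snapshot. |
| starting | Billed for storage | The instance is starting. |
| stopped | Billed for storage | The instance is stopped. |
| stopping | Billed for storage | The instance is being stopped. |
| storage-full | Billed | The instance has reached its storage capacity allocation. This is a critical status and should be remedied immediately; scale up your storage by modifying the instance. Set Amazon CloudWatch alarms to warn you when storage space is getting low so that you don't run into this situation. |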
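## Monitoring instance status using the AWS Management Console or AWS CLI
<a name="monitoring-instance-status"></a>

Use the AWS Management Console or the AWS CLI to monitor the status of your instance.

------
#### [ Using the AWS Management Console ]

To determine the status of an instance using the AWS Management Console, use the following procedure.

1. Sign in to the AWS Management Console, and open the Amazon DocumentDB console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Clusters**.
**Note**  
In the Clusters navigation box, the **Cluster identifier** column shows both clusters and instances. Instances are listed under clusters, similar to the following image.  
![\[List of clusters and instances on the Clusters page of the Amazon DocumentDB console.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/clusters.png)

1. Find the name of the instance that you're interested in. Then, to find the status of the instance, read across that row to the **Status** column, as shown following.  
![\[Status column showing the available status for the cluster and instances on the Clusters page.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/instance-status.png)

------
#### [ Using the AWS CLI ]

To determine the status of an instance using the AWS CLI, use the `describe-db-instances` operation. The following code finds the status of the instance `sample-cluster-instance-01`.

For Linux, macOS, or Unix: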

```
aws docdb describe-db-instances \
          --db-instance-identifier sample-cluster-instance-01  \
          --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus]'
```

For Windows:

```
aws docdb describe-db-instances ^
          --db-instance-identifier sample-cluster-instance-01 ^
          --query "DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus]"
```

Output from this operation looks something like the following.

```
[
    [
        "sample-cluster-instance-01",
        "available"
    ]
]
```

------
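
To check every instance in a cluster at once instead of one instance at a time, you can filter `describe-db-instances` by cluster. The following sketch assumes the `db-cluster-id` filter and tab-separated text output; the function names are hypothetical.

```shell
# Hypothetical helper: lists "identifier<TAB>status" for every instance in a cluster.
list_instance_statuses() {
  aws docdb describe-db-instances \
      --filters "Name=db-cluster-id,Values=$1" \
      --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus]' \
      --output text
}

# Hypothetical helper: prints only the instances whose status is not "available".
list_unavailable_instances() {
  list_instance_statuses "$1" | awk '$2 != "available" { print $1 ": " $2 }'
}

# Usage: list_unavailable_instances sample-cluster
```

An empty result from `list_unavailable_instances` means that every instance returned for the cluster reported `available`.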

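## Instance health status values
<a name="instance-health-status-values"></a>

The following table lists the possible health status values for an instance. The **Instance health** column, located in the **Clusters** table in the AWS Management Console, shows whether the database engine, the component responsible for storing, managing, and retrieving data, is running normally. The health status of each instance is derived from the `EngineUptime` system metric available in CloudWatch.


| Instance health | Description |
| --- | --- |
| healthy | The database engine is running in the Amazon DocumentDB instance. |
| unhealthy | The database engine is not running or restarted less than one minute ago. |

## Monitoring instance health status using the AWS Management Console
<a name="monitoring-instance-health-status"></a>

Use the AWS Management Console to monitor the health status of your instance.

To find the health status of an instance using the AWS Management Console, use the following procedure.

1. Sign in to the AWS Management Console, and open the Amazon DocumentDB console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Clusters**.
**Note**  
In the **Clusters** navigation box, the **Cluster identifier** column shows both clusters and instances. Instances are listed under clusters, similar to the following image.  
![\[List of clusters and instances on the Clusters page of the Amazon DocumentDB console.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/clusters.png)

1. Find the name of the instance that you're interested in. Then, to find the health status of the instance, read across that row to the **Instance health** column, as shown in the following image:  
![\[Instance health column showing healthy and unhealthy statuses for the instances listed on the Clusters page.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/health-status-1.png)
**Note**  
Instance health is polled every 60 seconds, based on the CloudWatch `EngineUptime` system metric. The values in the **Instance health** column are updated automatically.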
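
The health rule in the table above can be approximated outside the console. The following sketch is an interpretation, not the console's actual logic: it assumes that you have already read the latest `EngineUptime` value (in seconds) from CloudWatch, and it treats a missing value or an uptime under 60 seconds as unhealthy. The function name is hypothetical.

```shell
# Hypothetical rule: an instance is "unhealthy" when no EngineUptime value is
# available or the engine has been running for less than 60 seconds.
instance_health() {
  local uptime_seconds="$1"
  if [ -z "$uptime_seconds" ] || [ "$uptime_seconds" = "None" ]; then
    echo "unhealthy"
  elif awk -v u="$uptime_seconds" 'BEGIN { exit !(u < 60) }'; then
    echo "unhealthy"
  else
    echo "healthy"
  fi
}

# Usage: instance_health 3600
```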

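# Viewing Amazon DocumentDB recommendations
<a name="view-docdb-recommendations"></a>

Amazon DocumentDB provides a list of automated recommendations for database resources, such as instances and clusters. These recommendations provide best practice guidance by analyzing cluster and instance configurations.

For an example of these recommendations, see the following:


| Type | Description | Recommendation | Additional information |
| --- | --- | --- | --- |
|  One instance  |  The cluster contains only one instance  |  Performance and availability: We recommend adding another instance with the same instance class in a different Availability Zone.  |  [Amazon DocumentDB high availability and replication](https://docs.aws.amazon.com/documentdb/latest/developerguide/replication.html)  |

Amazon DocumentDB generates recommendations for a resource when the resource is created or modified. Amazon DocumentDB also periodically scans your resources and generates recommendations.

**To view and act on Amazon DocumentDB recommendations**

1. Sign in to the AWS Management Console, and open the Amazon DocumentDB console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Recommendations**:  
![\[Amazon DocumentDB console navigation pane with the Recommendations option selected.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/recommendations-nav-1.png)

1. In the **Recommendations** dialog, expand the section of interest and select the recommended task.

   In the following example, the recommended task applies to an Amazon DocumentDB cluster that has only one instance. The recommendation is to add another instance to improve performance and availability.  
![\[Recommendations form showing the recommended task selected for an Amazon DocumentDB cluster.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/recommendations-1.png)

1. Choose **Apply now**.

   For this example, the **Add instances** dialog appears:  
![\[Add instances form with instance settings options.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/add-instances-1.png)

1. Modify the settings for your new instance, and then choose **Create**.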
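
The example recommendation (add a second instance in a different Availability Zone) can also be acted on from the AWS CLI. The following is a sketch under stated assumptions: the function name and all argument values are placeholders, and the instance class should match the class of the existing instance, as the recommendation suggests.

```shell
# Hypothetical helper: adds one instance to an existing cluster.
# Arguments: cluster identifier, new instance identifier, instance class,
# Availability Zone (choose one that differs from the existing instance).
add_replica_instance() {
  local cluster_id="$1" instance_id="$2" instance_class="$3" az="$4"
  aws docdb create-db-instance \
      --db-cluster-identifier "$cluster_id" \
      --db-instance-identifier "$instance_id" \
      --db-instance-class "$instance_class" \
      --availability-zone "$az" \
      --engine docdb
}

# Usage (placeholder values):
#   add_replica_instance sample-cluster sample-cluster-instance-02 <instance-class> <availability-zone>
```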

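# Using Amazon DocumentDB event subscriptions
<a name="event-subscriptions"></a>

Amazon DocumentDB uses Amazon Simple Notification Service (Amazon SNS) to provide notifications when an Amazon DocumentDB event occurs. These notifications can take any form that Amazon SNS supports for an AWS Region, such as an email, a text message, or a call to an HTTP endpoint.

Amazon DocumentDB groups these events into categories that you can subscribe to so that you are notified when an event in that category occurs. You can subscribe to an event category for an instance, cluster, snapshot, cluster snapshot, or parameter group. For example, if you subscribe to the Backup category for a given instance, you are notified whenever a backup-related event occurs that affects the instance. You also receive a notification when an event subscription changes.

Events occur at both the cluster and the instance level, so you can receive events if you subscribe to a cluster or to an instance.

Event notifications are sent to the addresses that you provide when you create the subscription. You might want to create several different subscriptions, such as one subscription that receives all event notifications and another that receives only critical events for your production instances. You can easily turn off notifications without deleting a subscription. To do so, set the **Enabled** radio button to **No** in the Amazon DocumentDB console.

**Important**  
Amazon DocumentDB doesn't guarantee the order of events sent in an event stream. The event order is subject to change.

Amazon DocumentDB uses the Amazon Resource Name (ARN) of an Amazon SNS topic to identify each subscription. The Amazon DocumentDB console creates the ARN for you when you create the subscription.

Billing for Amazon DocumentDB event subscriptions is through Amazon SNS. Amazon SNS fees apply when you use event notifications. For more information, see Amazon Simple Notification Service Pricing. Other than Amazon SNS charges, Amazon DocumentDB doesn't bill for event subscriptions.

**Topics**
+ [Subscribing to events](event-subscriptions.subscribe.md)
+ [Managing subscriptions](event-subscriptions.managing.md)
+ [Categories and messages](event-subscriptions.categories-messages.md)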

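# Subscribing to Amazon DocumentDB events
<a name="event-subscriptions.subscribe"></a>

You can use the Amazon DocumentDB console to subscribe to events as follows:

1. Sign in to the AWS Management Console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Event subscriptions**.  
![\[Amazon DocumentDB console navigation pane with the Event subscriptions option highlighted.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/subscribe-event-subs.png)

1. In the **Event subscriptions** pane, choose **Create event subscription**.  
![\[Event subscriptions pane with the Create event subscription button highlighted in the upper-right corner.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/subscribe-create.png)

1. In the **Create event subscription** dialog box, do the following:
   + For **Name**, enter a name for the event notification subscription.  
![\[Create event subscription form showing the Details section and the Name input field.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/subscribe-name.png)
   + For **Target**, choose where you want notifications to be sent. You can choose an existing **ARN**, or choose **New email topic** to enter a topic name and a list of recipients.  
![\[Target section with options for specifying where notifications are sent.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/subscribe-target.png)
   + For **Source**, choose a source type. Depending on the source type that you select, choose the event categories and the sources that you want to receive event notifications from.  
![\[Source section for selecting the source type to receive event notifications from.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/subscribe-source.png)
   + Choose **Create**.  
![\[Source section with the Create button in the lower-right corner.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/subscribe-create-2.png)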
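
A subscription like the one created in the console steps above can also be sketched with the AWS CLI. In this sketch the function name, the choice of the `db-instance` source type, and the `availability` and `failure` category values are assumptions for illustration; run `aws docdb describe-event-categories` to confirm the category names that apply to your source type before relying on them. The SNS topic must already exist.

```shell
# Hypothetical helper: subscribes an existing SNS topic to selected events
# for one instance.
# Arguments: subscription name, SNS topic ARN, instance identifier.
subscribe_instance_events() {
  local name="$1" topic_arn="$2" instance_id="$3"
  aws docdb create-event-subscription \
      --subscription-name "$name" \
      --sns-topic-arn "$topic_arn" \
      --source-type db-instance \
      --source-ids "$instance_id" \
      --event-categories availability failure \
      --enabled
}

# Usage (placeholder values):
#   subscribe_instance_events sample-subscription <sns-topic-arn> sample-cluster-instance-01
```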

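# Managing Amazon DocumentDB event notification subscriptions
<a name="event-subscriptions.managing"></a>

If you choose **Event subscriptions** in the navigation pane of the Amazon DocumentDB console, you can view subscription categories and a list of your current subscriptions. You can also modify or delete a specific subscription.

## Modifying your current Amazon DocumentDB event notification subscriptions
<a name="event-subscriptions.modify"></a>

1. Sign in to the AWS Management Console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Event subscriptions**. The **Event subscriptions** pane shows all of your event notification subscriptions.  
![\[Amazon DocumentDB console navigation pane with the Event subscriptions option highlighted.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/modify-event-subs.png)

1. In the **Event subscriptions** pane, choose the subscription that you want to modify, and then choose **Edit**.  
![\[Event subscriptions pane showing the selected subscription and the Edit button.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/modify-edit.png)

1. Make your changes to the subscription in either the **Target** or the **Source** section. You can add or remove source identifiers by selecting or deselecting them in the **Source** section.  
![\[Modify event subscription form with the Target section highlighted.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/modify-target.png)

1. Choose **Modify**. The Amazon DocumentDB console indicates that the subscription is being modified.  
![\[End of the Modify event subscription form with the Modify button highlighted.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/modify-button.png)

## Deleting an Amazon DocumentDB event notification subscription
<a name="event-subscriptions.delete"></a>

You can delete a subscription when you no longer need it. All subscribers to the topic stop receiving the event notifications specified by the subscription.

1. Sign in to the AWS Management Console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Event subscriptions**.  
![\[Amazon DocumentDB console navigation pane with the Event subscriptions option highlighted.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/delete-event-subs.png)

1. In the **Event subscriptions** pane, choose the subscription that you want to delete.  
![\[Event subscriptions pane showing the selected subscription.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/delete-select.png)

1. Choose **Delete**.  
![\[Event subscriptions pane with the Delete button highlighted.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/delete-delete.png)

1. A pop-up window appears asking whether you want to permanently delete this notification. Choose **Delete**.  
![\[Dialog box confirming deletion of the event subscription, with the Delete button highlighted in the lower-right corner.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/event-subs/delete-delete-2.png)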
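
The same management tasks (listing, turning off without deleting, and deleting) can be sketched with the AWS CLI. The function names below are hypothetical, and the `--query` expression assumes the `EventSubscriptionsList` response shape with `CustSubscriptionId`, `Status`, and `Enabled` fields.

```shell
# Hypothetical helper: lists "name<TAB>status<TAB>enabled" for each subscription.
list_event_subscriptions() {
  aws docdb describe-event-subscriptions \
      --query 'EventSubscriptionsList[*].[CustSubscriptionId,Status,Enabled]' \
      --output text
}

# Hypothetical helper: turns off notifications but keeps the subscription.
disable_event_subscription() {
  aws docdb modify-event-subscription \
      --subscription-name "$1" \
      --no-enabled
}

# Hypothetical helper: permanently deletes the subscription.
delete_event_subscription() {
  aws docdb delete-event-subscription \
      --subscription-name "$1"
}

# Usage: disable_event_subscription sample-subscription
```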

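# Amazon DocumentDB event categories and messages
<a name="event-subscriptions.categories-messages"></a>

Amazon DocumentDB generates a significant number of events in categories that you can subscribe to by using the console. Each category applies to a source type, which can be an instance, cluster, snapshot, or parameter group.

**Note**  
Amazon DocumentDB uses existing Amazon RDS event definitions and IDs.

## Amazon DocumentDB events originating from instances
<a name="event-subscriptions.db-origin"></a>


| Category | Description |
| --- | --- |
| Availability | The instance restarted. |
| Availability | The instance shut down. |
| Configuration change | Applying modification to an instance class. |
| Configuration change | Finished applying modification to an instance class. |
| Configuration change | Reset master credentials. |
| Creation | The instance was created. |
| Deletion | The instance was deleted. |
| Failure | The instance has failed because of an incompatible configuration or an underlying storage issue. Begin a point-in-time restore for the instance. |
| Notification | The instance stopped. |
| Notification | The instance started. |
| Notification | The instance is being started because it exceeded the maximum allowed time being stopped. |
| Recovery | Recovery of the instance has started. Recovery time varies with the amount of data to be recovered. |
| Recovery | Recovery of the instance is complete. |
| Security patching | Operating system updates are available for your instance. For information about applying updates, see [Maintaining Amazon DocumentDB](https://docs.aws.amazon.com/documentdb/latest/developerguide/db-instance-maintain.html). |

## Amazon DocumentDB events originating from clusters
<a name="event-subscriptions.cluster-origin"></a>


| Category | Description |
| --- | --- |
| Creation | The cluster was created. |
| Deletion | The cluster was deleted. |
| Failover | Promoting the previous primary again. |
| Failover | Completed failover to instance. |
| Failover | Started failover to DB instance: %s |
| Failover | Started same-AZ failover to instance: %s |
| Failover | Started cross-AZ failover to instance: %s |
| Maintenance | The cluster has been patched. |
| Maintenance | The database cluster is in a state that cannot be upgraded: %s |
| Notification | The cluster stopped. |
| Notification | The cluster started. |
| Notification | The cluster stop failed. |
| Notification | The cluster is being started because it exceeded the maximum allowed time being stopped. |
| Notification | Renamed cluster from %s to %s. |

## Amazon DocumentDB events originating from cluster snapshots
<a name="event-subscriptions.snapshot-origin"></a>

The following table shows the event category and a list of events when an Amazon DocumentDB cluster snapshot is the source type.


| Category | Description |
| --- | --- |
| Backup | Creating a manual cluster snapshot. |
| Backup | Manual cluster snapshot created. |
| Backup | Creating an automated cluster snapshot. |
| Backup | Automated cluster snapshot created. |

## Amazon DocumentDB events originating from parameter groups
<a name="event-subscriptions.parameter"></a>

The following table shows the event category and a list of events when a parameter group is the source type.


| Category | Description |
| --- | --- |
| Configuration change | Updated parameter %s to %s with apply method %s |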

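# Monitoring Amazon DocumentDB with CloudWatch
<a name="cloud_watch"></a>

Amazon DocumentDB (with MongoDB compatibility) integrates with Amazon CloudWatch so that you can gather and analyze operational metrics for your clusters. You can monitor these metrics using the CloudWatch console, the Amazon DocumentDB console, the AWS Command Line Interface (AWS CLI), or the CloudWatch API.

CloudWatch also lets you set alarms so that you are notified when a metric value breaches a threshold that you specify. You can even set up Amazon CloudWatch Events to take corrective action when a breach occurs. For more information about using CloudWatch and alarms, see the [Amazon CloudWatch documentation](https://docs.aws.amazon.com/cloudwatch/index.html).

**Topics**
+ [Amazon DocumentDB metrics](#cloud_watch-metrics_list)
+ [Viewing CloudWatch data](#cloud_watch-view_data)
+ [Amazon DocumentDB dimensions](#cloud_watch-metrics_dimensions)
+ [Monitoring Opcounter metrics](#cloud_watch-monitoring_opcounters)
+ [Monitoring database connections](#cloud_watch-monitoring_connections)

## Amazon DocumentDB metrics
<a name="cloud_watch-metrics_list"></a>

To monitor the health and performance of your Amazon DocumentDB clusters and instances, you can view the following metrics in the Amazon DocumentDB console.

**Note**  
The metrics in the following tables apply to both instance-based clusters and elastic clusters.

**Topics**
+ [Resource utilization metrics](#resource-utilization)
+ [Latency metrics](#latency-metrics)
+ [NVMe-backed instance metrics](#nvme-metrics)
+ [Operations metrics](#operations-metrics)
+ [Throughput metrics](#throughput-metrics)
+ [System metrics](#system-metrics)
+ [T3 instance metrics](#t3-instance-metrics)

### Resource utilization metrics
<a name="resource-utilization"></a>


| Metric | Description |
| --- | --- |
| BackupRetentionPeriodStorageUsed | The total amount of backup storage, in bytes, used to support the point-in-time restore feature within the Amazon DocumentDB retention window. Included in the total reported by the TotalBackupStorageBilled metric. Computed separately for each Amazon DocumentDB cluster. |
| ChangeStreamLogSize | The amount of storage, in megabytes, that your cluster uses to store the change stream log. This value is a subset of the total storage for the cluster (VolumeBytesUsed) and affects the cost of the cluster. For storage pricing information, see the [Amazon DocumentDB product page](https://aws.amazon.com//documentdb/pricing). The change stream log size depends on how much change is happening on your cluster and on the change stream log retention duration. For more information about change streams, see [Using change streams with Amazon DocumentDB](change_streams.md). |
| CPUUtilization | The percentage of CPU used by an instance. |
| DatabaseConnections | The number of connections (active and idle) open on an instance, sampled at a one-minute frequency. |
| DatabaseConnectionsMax | The maximum number of database connections (active and idle) open on an instance in a one-minute period. |
| DatabaseConnectionsLimit | The maximum number of concurrent database connections (active and idle) allowed on an instance at any given time. |
| DatabaseCursors | The number of cursors open on an instance, sampled at a one-minute frequency. |
| DatabaseCursorsMax | The maximum number of cursors open on an instance in a one-minute period. |
| DatabaseCursorsLimit | The maximum number of cursors allowed on an instance at any given time. |
| DatabaseCursorsTimedOut | The number of cursors that timed out in a one-minute period. |
| FreeableMemory | The amount of available random access memory, in bytes. |
| FreeLocalStorage | The amount of storage available to each instance for temporary tables and logs. This value depends on the instance class. You can increase the amount of free storage space for an instance by choosing a larger instance class. (This does not apply to DocumentDB serverless.) |
| LowMemThrottleQueueDepth | The queue depth for requests that are throttled because of low available memory, sampled at a one-minute frequency. |
| LowMemThrottleMaxQueueDepth | The maximum queue depth for requests that are throttled because of low available memory in a one-minute period. |
| LowMemNumOperationsThrottled | The number of requests that are throttled because of low available memory in a one-minute period. |
| SnapshotStorageUsed | The total amount of backup storage, in bytes, consumed by all snapshots for a given Amazon DocumentDB cluster outside its backup retention window. Included in the total reported by the TotalBackupStorageBilled metric. Computed separately for each Amazon DocumentDB cluster. |
| SwapUsage | The amount of swap space used on the instance. |
| TotalBackupStorageBilled | The total amount of backup storage, in bytes, for which you are billed for a given Amazon DocumentDB cluster. Includes the backup storage measured by the BackupRetentionPeriodStorageUsed and SnapshotStorageUsed metrics. Computed separately for each Amazon DocumentDB cluster. |
| TransactionsOpen | The number of transactions open on an instance, sampled at a one-minute frequency. |
| TransactionsOpenMax | The maximum number of transactions open on an instance in a one-minute period. |
| TransactionsOpenLimit | The maximum number of concurrent transactions allowed on an instance at any given time. |
| VolumeBytesUsed | The amount of storage, in bytes, used by your cluster. This value affects the cost of the cluster. For pricing information, see the [Amazon DocumentDB pricing page](https://aws.amazon.com//documentdb/pricing). |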

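### Latency metrics
<a name="latency-metrics"></a>


| Metric | Description |
| --- | --- |
| DBClusterReplicaLagMaximum | The maximum amount of lag, in milliseconds, between the primary instance and each Amazon DocumentDB instance in the cluster. |
| DBClusterReplicaLagMinimum | The minimum amount of lag, in milliseconds, between the primary instance and each replica instance in the cluster. |
| DBInstanceReplicaLag | The amount of lag, in milliseconds, when replicating updates from the primary instance to a replica instance. |
| ReadLatency | The average amount of time taken per disk I/O operation. |
| WriteLatency | The average amount of time, in milliseconds, taken per disk I/O operation. |

### NVMe-backed instance metrics
<a name="nvme-metrics"></a>


| Metric | Description |
| --- | --- |
| NVMeStorageCacheHitRatio | The percentage of requests that are served by the tiered cache. |
| FreeNVMeStorage | The amount of available ephemeral NVMe storage. |
| ReadIOPSNVMeStorage | The average number of disk read I/O operations to ephemeral NVMe storage. |
| ReadLatencyNVMeStorage | The average amount of time taken per disk read I/O operation for ephemeral NVMe storage. |
| ReadThroughputNVMeStorage | The average number of bytes read from disk per second for ephemeral NVMe storage. |
| WriteIOPSNVMeStorage | The average number of disk write I/O operations to ephemeral NVMe storage. |
| WriteLatencyNVMeStorage | The average amount of time taken per disk write I/O operation for ephemeral NVMe storage. |
| WriteThroughputNVMeStorage | The average number of bytes written to disk per second for ephemeral NVMe storage. |

### Operations metrics
<a name="operations-metrics"></a>


| Metric | Description |
| --- | --- |
| DocumentsDeleted | The number of documents deleted in a one-minute period. |
| DocumentsInserted | The number of documents inserted in a one-minute period. |
| DocumentsReturned | The number of documents returned in a one-minute period. |
| DocumentsUpdated | The number of documents updated in a one-minute period. |
| OpcountersCommand | The number of commands issued in a one-minute period. |
| OpcountersDelete | The number of delete operations issued in a one-minute period. |
| OpcountersGetmore | The number of getmore operations issued in a one-minute period. |
| OpcountersInsert | The number of insert operations issued in a one-minute period. |
| OpcountersQuery | The number of queries issued in a one-minute period. |
| OpcountersUpdate | The number of update operations issued in a one-minute period. |
| TransactionsStarted | The number of transactions started on an instance in a one-minute period. |
| TransactionsCommitted | The number of transactions committed on an instance in a one-minute period. |
| TransactionsAborted | The number of transactions aborted on an instance in a one-minute period. |
| TTLDeletedDocuments | The number of documents deleted by a TTLMonitor in a one-minute period. |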

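### Throughput metrics
<a name="throughput-metrics"></a>


| Metric | Description |
| --- | --- |
| NetworkReceiveThroughput | The amount of network throughput, in bytes per second, received from clients by each instance in the cluster. This throughput doesn't include network traffic between instances in the cluster and the cluster volume. |
| NetworkThroughput | The amount of network throughput, in bytes per second, both received from and transmitted to clients by each instance in the Amazon DocumentDB cluster. This throughput doesn't include network traffic between instances in the cluster and the cluster volume. |
| NetworkTransmitThroughput | The amount of network throughput, in bytes per second, sent to clients by each instance in the cluster. This throughput doesn't include network traffic between instances in the cluster and the cluster volume. |
| ReadIOPS | The average number of disk read I/O operations per second. Amazon DocumentDB reports read and write IOPS separately, at one-minute intervals. |
| ReadThroughput | The average number of bytes read from disk per second. |
| StorageNetworkReceiveThroughput | The amount of network throughput, in bytes per second, received from the Amazon DocumentDB cluster storage volume by each instance in the cluster. |
| StorageNetworkTransmitThroughput | The amount of network throughput, in bytes per second, sent to the Amazon DocumentDB cluster storage volume by each instance in the cluster. |
| StorageNetworkThroughput | The amount of network throughput, in bytes per second, both received from and sent to the Amazon DocumentDB cluster storage volume by each instance in the Amazon DocumentDB cluster. |
| VolumeReadIOPs |  The average number of billed read I/O operations from a cluster volume, reported at 5-minute intervals. Billed read operations are calculated at the cluster volume level, aggregated from all instances in the cluster, and then reported at 5-minute intervals. The value is calculated by taking the value of the read operations metric over a 5-minute period. You can determine the number of billed read operations per second by taking the value of the billed read operations metric and dividing it by 300 seconds. For example, if `VolumeReadIOPs` returns 13,686, the billed read operations per second is 45 (13,686 / 300 = 45.62). You accrue billed read operations for queries that request database pages that aren't present in the buffer cache and therefore must be loaded from storage. You might see spikes in billed read operations as query results are read from storage and then loaded into the buffer cache.  |
| VolumeWriteIOPs |  The average number of billed write I/O operations to a cluster volume, reported at 5-minute intervals. Billed write operations are calculated at the cluster volume level, aggregated from all instances in the cluster, and then reported at 5-minute intervals. The value is calculated by taking the value of the write operations metric over a 5-minute period. You can determine the number of billed write operations per second by taking the value of the billed write operations metric and dividing it by 300 seconds. For example, if `VolumeWriteIOPs` returns 13,686, the billed write operations per second is 45 (13,686 / 300 = 45.62). Note that the `VolumeReadIOPs` and `VolumeWriteIOPs` metrics are calculated by the DocumentDB storage layer and include I/Os performed by both the primary and replica instances. The data is aggregated every 20-30 minutes and then reported at 5-minute intervals, so the metric emits the same data point throughout that time period. If you are looking for a metric that correlates with your insert operations at a one-minute interval, you can use the instance-level `WriteIOPS` metric. That metric is available on the **Monitoring** tab of your Amazon DocumentDB primary instance.  |
| WriteIOPS | The average number of disk write I/O operations per second. When used at the cluster level, `WriteIOPS` is evaluated across all instances in the cluster. Read and write IOPS are reported separately, at one-minute intervals. |
| WriteThroughput | The average number of bytes written to disk per second. |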
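
The per-second conversion described for `VolumeReadIOPs` and `VolumeWriteIOPs` is a division by the 300 seconds in each 5-minute reporting interval. The following sketch reproduces the worked example from the table; the function name is hypothetical.

```shell
# Hypothetical helper: converts a 5-minute billed I/O value to operations per second.
billed_ops_per_second() {
  awk -v total="$1" 'BEGIN { printf "%.2f\n", total / 300 }'
}

billed_ops_per_second 13686   # prints 45.62
```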

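### System metrics
<a name="system-metrics"></a>


| Metric | Description |
| --- | --- |
| AvailableMVCCIds | A counter that shows the number of write operations remaining before reaching zero. When this counter reaches zero, your cluster enters read-only mode until IDs are reclaimed and recycled. The counter decreases with each write operation and increases as garbage collection recycles old MVCC IDs. |
| BufferCacheHitRatio | The percentage of requests that are served by the buffer cache. |
| DiskQueueDepth | The number of I/O operations that are waiting to be written to or read from disk. |
| EngineUptime | The amount of time, in seconds, that the instance has been running. |
| IndexBufferCacheHitRatio | The percentage of index requests that are served by the buffer cache. You might see this metric spike above 100% right after you drop an index, collection, or database. This corrects itself automatically after 60 seconds. This limitation will be fixed in a future patch update. |
| LongestActiveGCRuntime | The duration, in seconds, of the longest active garbage collection process. Updated every minute; tracks only active operations and excludes processes that complete within the one-minute window. |

### T3 instance metrics
<a name="t3-instance-metrics"></a>


| Metric | Description |
| --- | --- |
| CPUCreditUsage | The number of CPU credits spent during the measurement period. |
| CPUCreditBalance | The number of CPU credits that an instance has accrued. This balance is depleted when the CPU bursts and CPU credits are spent more quickly than they are earned. |
| CPUSurplusCreditBalance | The number of surplus CPU credits spent to sustain CPU performance when the CPUCreditBalance value is zero. |
| CPUSurplusCreditsCharged | The number of surplus CPU credits that exceed the maximum number of CPU credits that can be earned in a 24-hour period and therefore incur an additional charge. For more information, see [Monitoring your CPU credits](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/burstable-performance-instances-monitoring-cpu-credits.html). |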

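## Viewing CloudWatch data
<a name="cloud_watch-view_data"></a>

You can view Amazon CloudWatch data using the CloudWatch console, the Amazon DocumentDB console, the AWS Command Line Interface (AWS CLI), or the CloudWatch API.

------
#### [ Using the AWS Management Console ]

To view CloudWatch metrics using the Amazon DocumentDB management console, complete the following steps.

1. Sign in to the AWS Management Console, and open the Amazon DocumentDB console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. In the navigation pane, choose **Clusters**.
**Tip**  
If you don't see the navigation pane on the left side of your screen, choose the menu icon (![\[Hamburger menu icon with three horizontal lines.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/docdb-menu-icon.png)) in the upper-left corner of the page.

1. In the Clusters navigation box, you'll see the **Cluster identifier** column. Your instances are listed under clusters, similar to the following screenshot.  
![\[Clusters table showing how instances are nested under clusters.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/choose-clusters.png)

1. From the list of instances, choose the name of the instance that you want metrics for.

1. On the resulting instance summary page, choose the **Monitoring** tab to view graphical representations of your Amazon DocumentDB instance's metrics. Because a graph must be generated for each metric, it might take a few minutes for the **CloudWatch** graphs to populate.

   The following image shows the graphical representations of two CloudWatch metrics in the Amazon DocumentDB console, `WriteIOPS` and `ReadIOPS`.  
![\[Two line charts representing the WriteIOPS and ReadIOPS CloudWatch metrics in the Amazon DocumentDB console.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/cw-metrics-con.png)

------
#### [ Using the CloudWatch Management Console ]

To view CloudWatch metrics using the CloudWatch management console, complete the following steps.

1. Sign in to the AWS Management Console, and open the CloudWatch console at [https://console.aws.amazon.com/cloudwatch](https://console.aws.amazon.com/cloudwatch).

1. In the navigation pane, choose **Metrics**. Then, from the list of service names, choose **DocDB**.

1. Choose a metric dimension (for example, **Cluster Metrics**).

1. The **All metrics** tab displays all metrics for that dimension in **DocDB**.

   1. To sort the table, use the column heading.

   1. To graph a metric, select the check box next to the metric. To select all metrics, select the check box in the heading row of the table.

   1. To filter by metric, hover over the metric name and choose the dropdown arrow next to it. Then, choose **Add to search**, as shown in the following image.  
![\[All metrics tab listing metrics and showing the dropdown list for a metric name.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/cloudwatch-filter-metrics.png)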

------
#### [ Using the AWS CLI ]

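To view CloudWatch data for Amazon DocumentDB, use the CloudWatch `get-metric-statistics` operation with the following parameters.

**Parameters**
+ **--namespace**: Required. The service namespace for which you want CloudWatch metrics. For Amazon DocumentDB, this must be `AWS/DocDB`.
+ **--metric-name**: Required. The name of the metric for which you want data.
+ **--start-time**: Required. The timestamp that determines the first data point to return.

  The value specified is inclusive; results include data points with the specified timestamp. The timestamp must be in ISO 8601 UTC format (for example, 2016-10-03T23:00:00Z).
+ **--end-time**: Required. The timestamp that determines the last data point to return.

  The value specified is exclusive; results include data points up to the specified timestamp. The timestamp must be in ISO 8601 UTC format (for example, 2016-10-03T23:00:00Z).
+ **--period**: Required. The granularity, in seconds, of the returned data points. For metrics with regular resolution, a period can be as short as one minute (60 seconds) and must be a multiple of 60. For high-resolution metrics that are collected at intervals of less than one minute, the period can be 1, 5, 10, 30, 60, or any multiple of 60.
+ **--dimensions**: Optional. If the metric contains multiple dimensions, you must include a value for each dimension. CloudWatch treats each unique combination of dimensions as a separate metric. You can't retrieve statistics using combinations of dimensions that were not specifically published. You must specify the same dimensions that were used when the metric was created.
+ **--statistics**: Optional. The metric statistics, other than percentiles. For percentile statistics, use `ExtendedStatistics`. When calling `GetMetricStatistics`, you must specify either `Statistics` or `ExtendedStatistics`, but not both.

  **Permitted values:**
  + `SampleCount`
  + `Average`
  + `Sum`
  + `Minimum`
  + `Maximum`
+ **--extended-statistics**: Optional. The `percentile` statistics. Specify values between p0.0 and p100. When calling `GetMetricStatistics`, you must specify either `Statistics` or `ExtendedStatistics`, but not both.
+ **--unit**: Optional. The unit for a given metric. Metrics can be reported in multiple units. If you don't supply a unit, all units are returned. If you specify only a unit that the metric doesn't report, the results of the call are null.

  **Possible values:**
  + `Seconds`
  + `Microseconds`
  + `Milliseconds`
  + `Bytes`
  + `Kilobytes`
  + `Megabytes`
  + `Gigabytes`
  + `Terabytes`
  + `Bits`
  + `Kilobits`
  + `Megabits`
  + `Gigabits`
  + `Terabits`
  + `Percent`
  + `Count`
  + `Bytes/Second`
  + `Kilobytes/Second`
  + `Megabytes/Second`
  + `Gigabytes/Second`
  + `Terabytes/Second`
  + `Bits/Second`
  + `Kilobits/Second`
  + `Megabits/Second`
  + `Gigabits/Second`
  + `Terabits/Second`
  + `Count/Second`
  + `None`

**Example**  
The following example finds the maximum `CPUUtilization` for a 2-hour period, taking a sample every 60 seconds.  
For Linux, macOS, or Unix:  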

```
aws cloudwatch get-metric-statistics \
       --namespace AWS/DocDB \
       --dimensions \
           Name=DBInstanceIdentifier,Value=docdb-2019-01-09-23-55-38 \
       --metric-name CPUUtilization \
       --start-time 2019-02-11T05:00:00Z \
       --end-time 2019-02-11T07:00:00Z \
       --period 60 \
       --statistics Maximum
```
For Windows:  

```
aws cloudwatch get-metric-statistics ^
       --namespace AWS/DocDB ^
       --dimensions ^
           Name=DBInstanceIdentifier,Value=docdb-2019-01-09-23-55-38 ^
       --metric-name CPUUtilization ^
       --start-time 2019-02-11T05:00:00Z ^
       --end-time 2019-02-11T07:00:00Z ^
       --period 60 ^
       --statistics Maximum
```
Output from this operation looks something like the following:  

```
{
    "Label": "CPUUtilization",
    "Datapoints": [
        {
            "Unit": "Percent",
            "Maximum": 4.49152542374361,
            "Timestamp": "2019-02-11T05:51:00Z"
        },
        {
            "Unit": "Percent",
            "Maximum": 4.25000000000485,
            "Timestamp": "2019-02-11T06:44:00Z"
        },

        ********* some output omitted for brevity *********

        {
            "Unit": "Percent",
            "Maximum": 4.33333333331878,
            "Timestamp": "2019-02-11T06:07:00Z"
        }
    ]
}
```

------
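
As the sample output shows, the data points are not necessarily returned in time order. The following sketch wraps the same `get-metric-statistics` call and uses a client-side `--query` expression to sort the results by timestamp; the function name is hypothetical, and the start and end times are arguments that you supply in ISO 8601 UTC format.

```shell
# Hypothetical helper: prints "timestamp<TAB>maximum" rows for CPUUtilization,
# sorted by timestamp.
# Arguments: instance identifier, start time, end time.
cpu_max_by_time() {
  local instance_id="$1" start="$2" end="$3"
  aws cloudwatch get-metric-statistics \
      --namespace AWS/DocDB \
      --metric-name CPUUtilization \
      --dimensions "Name=DBInstanceIdentifier,Value=${instance_id}" \
      --start-time "$start" \
      --end-time "$end" \
      --period 60 \
      --statistics Maximum \
      --query 'sort_by(Datapoints, &Timestamp)[*].[Timestamp, Maximum]' \
      --output text
}

# Usage:
#   cpu_max_by_time docdb-2019-01-09-23-55-38 2019-02-11T05:00:00Z 2019-02-11T07:00:00Z
```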

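## Amazon DocumentDB dimensions
<a name="cloud_watch-metrics_dimensions"></a>

The metrics for Amazon DocumentDB are qualified by the values for the account or operation. You can use the CloudWatch console to retrieve Amazon DocumentDB data filtered by any of the dimensions in the following table.


| Dimension | Description |
| --- | --- |
| DBClusterIdentifier | Filters the data that you request for a specific Amazon DocumentDB cluster. |
| DBClusterIdentifier, Role | Filters the data that you request for a specific Amazon DocumentDB cluster, aggregating the metric by instance role (WRITER/READER). For example, you can aggregate metrics for all READER instances that belong to a cluster. |
| DBInstanceIdentifier | Filters the data that you request for a specific database instance. |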
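
The `DBClusterIdentifier, Role` combination can also be passed to the AWS CLI as two dimensions. The following sketch aggregates `CPUUtilization` across the READER instances of one cluster; the function name, the 5-minute period, and the `Average` statistic are illustrative choices.

```shell
# Hypothetical helper: average CPUUtilization across all READER instances.
# Arguments: cluster identifier, start time, end time (ISO 8601 UTC).
reader_cpu_average() {
  local cluster_id="$1" start="$2" end="$3"
  aws cloudwatch get-metric-statistics \
      --namespace AWS/DocDB \
      --metric-name CPUUtilization \
      --dimensions "Name=DBClusterIdentifier,Value=${cluster_id}" "Name=Role,Value=READER" \
      --start-time "$start" \
      --end-time "$end" \
      --period 300 \
      --statistics Average
}

# Usage:
#   reader_cpu_average sample-cluster 2019-02-11T05:00:00Z 2019-02-11T07:00:00Z
```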

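## Monitoring Opcounter metrics
<a name="cloud_watch-monitoring_opcounters"></a>

For idle clusters, Opcounter metrics have a non-zero value (usually around 50). This occurs because Amazon DocumentDB performs periodic health checks, internal operations, and metrics collection tasks.

## Monitoring database connections
<a name="cloud_watch-monitoring_connections"></a>

When you view the number of connections using a database engine command such as `db.runCommand( { serverStatus: 1 })`, you might see up to 10 more connections than you see in `DatabaseConnections` through CloudWatch. This occurs because Amazon DocumentDB performs periodic health checks and metrics collection tasks that aren't counted in `DatabaseConnections`. `DatabaseConnections` represents customer-initiated connections only.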

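# Logging Amazon DocumentDB API calls with AWS CloudTrail
<a name="logging-with-cloudtrail"></a>

Amazon DocumentDB (with MongoDB compatibility) is integrated with AWS CloudTrail, a service that provides a record of actions taken by users, roles, or an AWS service in Amazon DocumentDB (with MongoDB compatibility). CloudTrail captures all API calls for Amazon DocumentDB as events, including calls from the Amazon DocumentDB console, the AWS CLI, and code calls to the Amazon DocumentDB API operations. If you create a trail, you can enable continuous delivery of CloudTrail events to an Amazon S3 bucket, including events for Amazon DocumentDB. If you don't configure a trail, you can still view the most recent events in the CloudTrail console in **Event history**. Using the information collected by CloudTrail, you can determine the request that was made to Amazon DocumentDB (with MongoDB compatibility), the IP address from which the request was made, who made the request, when it was made, and additional details.

**Important**  
For certain management features, Amazon DocumentDB uses operational technology that is shared with Amazon Relational Database Service (Amazon RDS). Amazon DocumentDB console, AWS CLI, and API calls are logged as calls made to the Amazon RDS API.

To learn more about AWS CloudTrail, see the [AWS CloudTrail User Guide](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/).

## Amazon DocumentDB information in CloudTrail
<a name="logging-with-cloudtrail-info-available"></a>

CloudTrail is enabled on your AWS account when you create the account. When activity occurs in Amazon DocumentDB (with MongoDB compatibility), that activity is recorded in a CloudTrail event along with other AWS service events in **Event history**. You can view, search, and download recent events in your AWS account. For more information, see [Viewing events with CloudTrail event history](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/view-cloudtrail-events.html).

For an ongoing record of events in your AWS account, including events for Amazon DocumentDB (with MongoDB compatibility), create a trail. A trail enables CloudTrail to deliver log files to an Amazon S3 bucket. By default, when you create a trail in the console, the trail applies to all AWS Regions. The trail logs events from all Regions in the AWS partition and delivers the log files to the Amazon S3 bucket that you specify. Additionally, you can configure other AWS services to further analyze and act upon the event data collected in CloudTrail logs. For more information, see the following topics in the *AWS CloudTrail User Guide*:
+ [Overview for creating a trail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-create-and-update-a-trail.html)
+ [CloudTrail supported services and integrations](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-aws-service-specific-topics.html#cloudtrail-aws-service-specific-topics-integrations)
+ [Configuring Amazon SNS notifications for CloudTrail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/configure-sns-notifications-for-cloudtrail.html)
+ [Receiving CloudTrail log files from multiple Regions](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/receive-cloudtrail-log-files-from-multiple-regions.html)
+ [Receiving CloudTrail log files from multiple accounts](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-receive-logs-from-multiple-accounts.html)

Every event or log entry contains information about who generated the request. The identity information helps you determine the following:
+ Whether the request was made with root or user credentials.
+ Whether the request was made with temporary security credentials for a role or a federated user.
+ Whether the request was made by another AWS service.

For more information, see the [CloudTrail userIdentity element](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-event-reference-user-identity.html).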
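
Recent management events can also be read from the event history with the AWS CLI. Because Amazon DocumentDB calls are logged as Amazon RDS API calls, the following sketch filters on the event source `rds.amazonaws.com`; that value is an assumption to confirm in your own event history, and the results also include events from other services that share the Amazon RDS API. The function name is hypothetical.

```shell
# Hypothetical helper: prints "time<TAB>event name<TAB>user" for recent events
# recorded against the Amazon RDS API.
# Argument: maximum number of events to return (default 10).
recent_docdb_api_events() {
  aws cloudtrail lookup-events \
      --lookup-attributes AttributeKey=EventSource,AttributeValue=rds.amazonaws.com \
      --max-items "${1:-10}" \
      --query 'Events[*].[EventTime,EventName,Username]' \
      --output text
}

# Usage: recent_docdb_api_events 20
```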

# 分析 Amazon DocumentDB 操作
<a name="profiling"></a>

可以使用 Amazon DocumentDB（与 MongoDB 兼容）中的分析器来记录在您集群上执行的操作的执行时间和详细信息。对于监控集群上速度最慢的操作以帮助您提高单个查询的性能和整体集群性能，分析器非常有用。

默认情况下，分析器功能处于禁用状态。启用后，Profiler 会将花费超过客户定义的阈值（例如 100 毫秒）的操作记录到 Ama CloudWatch zon Logs 中。记录的详细信息包括分析的命令、时间、计划摘要和客户端元数据。将操作记录到 CloudWatch 日志后，您可以使用 CloudWatch Logs Insights 来分析、监控和存档您的 Amazon DocumentDB 分析数据。[常见查询](#profiling.common-queries) 部分中提供了常见的查询。

在启用时，分析器会使用集群中的其他资源。我们建议您从较高的阈值（例如，500 毫秒）开始，然后逐步降低该值以确定缓慢的操作。对于高吞吐量应用程序，从 50 毫秒阈值开始会导致集群性能问题。分析器在集群级别启用，并对集群中的所有实例和数据库执行分析。Amazon DocumentDB 会尽力将操作记录到亚马逊 CloudWatch 日志。

尽管 Amazon DocumentDB 不会为启用分析器收取任何额外费用，但您需要按标准费率支付日志使用费。 CloudWatch 有关 CloudWatch 日志定价的信息，请参阅 [Amazon CloudWatch 定价](https://aws.amazon.com/cloudwatch/pricing/)。

**Topics**
+ [支持的操作](#profiling.supported-commands)
+ [限制](#profiling.limitations)
+ [启用分析器](#profiling.enable-profiling)
+ [禁用分析器](#profiling.disable-profiling)
+ [禁用分析器日志导出](#profiling.disabling-logs-export)
+ [访问分析器日志](#profiling.accessing)
+ [常见查询](#profiling.common-queries)

## 支持的操作
<a name="profiling.supported-commands"></a>

Amazon DocumentDB 分析器支持以下操作：
+ `aggregate`
+ `count`
+ `delete`
+ `distinct`
+ `find` (`OP_QUERY` and command)
+ `findAndModify`
+ `insert`
+ `update`

## 限制
<a name="profiling.limitations"></a>

仅当查询的整个结果集能够容纳在一个批处理中，并且结果集小于 16MB（最大 BSON 大小）时，慢速查询分析器才能够生成分析器日志。大于 16MB 的结果集会自动拆分为多个批处理。

大多数驱动程序或 Shell 可能会设置一个较小的默认批处理大小。您可以在查询中指定批处理大小。为了捕获慢速查询日志，我们建议设置一个超过您预期结果集大小的批处理大小。如果不确定结果集大小，或者结果集大小不同，也可以将批处理大小设置为较大的数字（例如，100k）。

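However, using a larger batch size means that more results must be retrieved from the database before a response is sent to the client. For some queries, this can add latency before you receive results. If you don't plan to consume the entire result set, you might spend more I/Os processing the query only to discard the results.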

## 启用 Amazon DocumentDB 分析器
<a name="profiling.enable-profiling"></a>

在集群上启用分析器的过程包含三个步骤。确保所有步骤都已完成，否则分析日志将不会发送到 CloudWatch 日志。分析器在集群级别设置，对集群的所有数据库和实例执行分析。

**在集群上启用分析器**

1. 由于您无法修改默认集群参数组，请确保您有可用的自定义集群参数组。有关更多信息，请参阅 [创建 Amazon DocumentDB 集群参数组](cluster_parameter_groups-create.md)。

1. 使用可用的自定义集群参数组，修改以下参数：`profiler`、`profiler_threshold_ms` 和 `profiler_sampling_rate`。有关更多信息，请参阅 [修改 Amazon DocumentDB 集群参数组](cluster_parameter_groups-modify.md)。

1. 创建或修改您的集群以使用自定义集群参数组并启用将`profiler`日志导出到 CloudWatch 日志的功能。

以下各节介绍如何使用 AWS 管理控制台 和 AWS Command Line Interface (AWS CLI) 实现这些步骤。

------
#### [ Using the AWS 管理控制台 ]

1. 开始之前，请先创建一个 Amazon DocumentDB 集群和一个自定义集群参数组（如果您还没有）。有关更多信息，请参阅[创建 Amazon DocumentDB 集群参数组](cluster_parameter_groups-create.md)和[创建 Amazon DocumentDB 集群](db-cluster-create.md)。

1. 使用可用的自定义集群参数组，修改以下参数。有关更多信息，请参阅 [修改 Amazon DocumentDB 集群参数组](cluster_parameter_groups-modify.md)。
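   + `profiler`: Enables or disables query profiling. Permitted values are `enabled` and `disabled`. The default value is `disabled`. To enable profiling, set the value to `enabled`.
   + `profiler_threshold_ms`: When `profiler` is set to `enabled`, all commands that take longer than `profiler_threshold_ms` are logged to CloudWatch. Permitted values are `[50-INT_MAX]`. The default value is `100`.
   + `profiler_sampling_rate`: The fraction of slow operations that should be profiled or logged. Permitted values are `[0.0-1.0]`. The default value is `1.0`.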

1. 修改您的集群以使用自定义集群参数组，并将分析器日志导出设置为发布到 Amazon CloudWatch。

   1. 在导航窗格中，选择 **Clusters (集群)** 以将自定义参数组添加到集群。

   1. 选择要与您的参数组关联的集群名称左边的按钮。选择 **Actions (操作)**，然后选择 **Modify (修改)** 以修改您的集群。

   1. 在 **Cluster options (集群选项)** 下，选择上一步中的自定义参数组以将其添加到集群中。

   1. Under **Log exports**, choose **Profiler logs** to publish them to Amazon CloudWatch.

   1. 选择 **Continue (继续)** 以查看修改摘要。

   1. 在确认您的更改后，您可以立即应用这些更改，也可以在 **Scheduling of modifications (修改计划)** 下的下一个维护时段内应用这些更改。

   1. 选择 **Modify cluster (修改集群)** 以使用新参数组更新您的集群。

------
#### [ Using the AWS CLI ]

以下过程对集群 `sample-cluster` 上的所有支持操作启用分析器。

1. 在开始之前，请运行以下命令，并查看对于名称中不包含 `default` 且具有 `docdb3.6` 作为参数组系列的集群参数组的输出，以确保您拥有可用的自定义集群参数组。如果您没有非默认集群参数组，请参阅[创建 Amazon DocumentDB 集群参数组](cluster_parameter_groups-create.md)。

   ```
   aws docdb describe-db-cluster-parameter-groups \
       --query 'DBClusterParameterGroups[*].[DBClusterParameterGroupName,DBParameterGroupFamily]'
   ```

   In the following output, only `sample-parameter-group` meets both criteria.

   ```
   [
          [
              "default.docdb3.6",
              "docdb3.6"
          ],
          [
              "sample-parameter-group",
              "docdb3.6"
          ]
   ]
   ```

1. 使用您的自定义集群参数组，修改以下参数。
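   + `profiler`: Enables or disables query profiling. Permitted values are `enabled` and `disabled`. The default value is `disabled`. To enable profiling, set the value to `enabled`.
   + `profiler_threshold_ms`: When `profiler` is set to `enabled`, all commands that take longer than `profiler_threshold_ms` are logged to CloudWatch. Permitted values are `[50-INT_MAX]`. The default value is `100`.
   + `profiler_sampling_rate`: The fraction of slow operations that should be profiled or logged. Permitted values are `[0.0-1.0]`. The default value is `1.0`.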

   ```
   aws docdb modify-db-cluster-parameter-group \
       --db-cluster-parameter-group-name sample-parameter-group \
       --parameters ParameterName=profiler,ParameterValue=enabled,ApplyMethod=immediate \
                    ParameterName=profiler_threshold_ms,ParameterValue=100,ApplyMethod=immediate \
                    ParameterName=profiler_sampling_rate,ParameterValue=0.5,ApplyMethod=immediate
   ```

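1. Modify your Amazon DocumentDB cluster to use the `sample-parameter-group` custom cluster parameter group from the previous step, and use the `--cloudwatch-logs-export-configuration` parameter to add `profiler` to the enabled log types.

   The following code modifies the cluster `sample-cluster` to use the `sample-parameter-group` from the previous step and adds `profiler` to the enabled CloudWatch Logs exports.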

   ```
   aws docdb modify-db-cluster \
          --db-cluster-identifier sample-cluster \
          --db-cluster-parameter-group-name sample-parameter-group \
          --cloudwatch-logs-export-configuration '{"EnableLogTypes":["profiler"]}'
   ```

   此操作的输出将类似于下文。

   ```
   {
       "DBCluster": {
           "AvailabilityZones": [
               "us-east-1c",
               "us-east-1b",
               "us-east-1a"
           ],
           "BackupRetentionPeriod": 1,
           "DBClusterIdentifier": "sample-cluster",
           "DBClusterParameterGroup": "sample-parameter-group",
           "DBSubnetGroup": "default",
           "Status": "available",
           "EarliestRestorableTime": "2020-04-07T02:05:12.479Z",
           "Endpoint": "sample-cluster.node.us-east-1.docdb.amazonaws.com",
           "ReaderEndpoint": "sample-cluster.node.us-east-1.docdb.amazonaws.com",
           "MultiAZ": false,
           "Engine": "docdb",
           "EngineVersion": "3.6.0",
           "LatestRestorableTime": "2020-04-08T22:08:59.317Z",
           "Port": 27017,
           "MasterUsername": "test",
           "PreferredBackupWindow": "02:00-02:30",
           "PreferredMaintenanceWindow": "tue:09:50-tue:10:20",
           "DBClusterMembers": [
               {
                   "DBInstanceIdentifier": "sample-instance-1",
                   "IsClusterWriter": true,
                   "DBClusterParameterGroupStatus": "in-sync",
                   "PromotionTier": 1
               },
               {
                   "DBInstanceIdentifier": "sample-instance-2",
                   "IsClusterWriter": true,
                   "DBClusterParameterGroupStatus": "in-sync",
                   "PromotionTier": 1
               }
           ],
           "VpcSecurityGroups": [
               {
                   "VpcSecurityGroupId": "sg-abcd0123",
                   "Status": "active"
               }
           ],
           "HostedZoneId": "ABCDEFGHIJKLM",
           "StorageEncrypted": true,
           "KmsKeyId": "arn:aws:kms:us-east-1:<accountID>:key/sample-key",
           "DbClusterResourceId": "cluster-ABCDEFGHIJKLMNOPQRSTUVWXYZ",
           "DBClusterArn": "arn:aws:rds:us-east-1:<accountID>:cluster:sample-cluster",
           "AssociatedRoles": [],
           "ClusterCreateTime": "2020-01-10T22:13:38.261Z",
           "EnabledCloudwatchLogsExports": [
               "profiler"
           ],
           "DeletionProtection": true
       }
   }
   ```

------
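To confirm that the three steps above took effect, you can read the settings back with the AWS CLI. The following is a minimal sketch in bash, not part of the official procedure: the helper names `profiler_export_enabled` and `show_profiler_parameters` are hypothetical, and the cluster and parameter group names are the sample values used above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, resource names are the samples above.

# Succeeds (exit status 0) if "profiler" is among the cluster's enabled log exports.
profiler_export_enabled() {
    local cluster_id="$1"
    aws docdb describe-db-clusters \
        --db-cluster-identifier "$cluster_id" \
        --query 'DBClusters[0].EnabledCloudwatchLogsExports' \
        --output text | grep -qw profiler
}

# Prints the name and current value of every profiler-related parameter in a group.
show_profiler_parameters() {
    local group_name="$1"
    aws docdb describe-db-cluster-parameters \
        --db-cluster-parameter-group-name "$group_name" \
        --query "Parameters[?starts_with(ParameterName, 'profiler')].[ParameterName,ParameterValue]" \
        --output text
}
```

For example, `profiler_export_enabled sample-cluster && echo "export enabled"` checks the log export, and `show_profiler_parameters sample-parameter-group` lists `profiler`, `profiler_threshold_ms`, and `profiler_sampling_rate` with their current values.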

## 禁用 Amazon DocumentDB 分析器
<a name="profiling.disable-profiling"></a>

To disable the profiler, disable both the `profiler` parameter and the export of `profiler` logs to CloudWatch Logs.

### 禁用分析器
<a name="profiling.disable-profiler"></a>

You can disable the `profiler` parameter using either the AWS Management Console or the AWS CLI, as follows.

------
#### [ Using the AWS 管理控制台 ]

The following procedure uses the AWS Management Console to disable the Amazon DocumentDB `profiler`.

1. Sign in to the AWS Management Console, and open the Amazon DocumentDB console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. 在导航窗格中，选择**参数组**。然后选择您要在其上禁用分析器的集群参数组的名称。

1. 在生成的 **Cluster parameters (集群参数)** 页面中，选择 `profiler` 参数左侧的按钮，然后选择 **Edit (编辑)**。

1. 在 **Modify Profiler (修改分析器)** 对话框中，在列表中选择 `disabled`。

1. 选择 **Modify cluster parameter (修改集群参数)**。

------
#### [ Using the AWS CLI ]

To disable `profiler` on a cluster using the AWS CLI, modify the cluster parameter group as follows.

```
aws docdb modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name sample-parameter-group \
    --parameters ParameterName=profiler,ParameterValue=disabled,ApplyMethod=immediate
```

------

## 禁用分析器日志导出
<a name="profiling.disabling-logs-export"></a>

You can disable exporting `profiler` logs to CloudWatch Logs by using either the AWS Management Console or the AWS CLI, as follows.

------
#### [ Using the AWS 管理控制台 ]

The following procedure uses the AWS Management Console to disable Amazon DocumentDB from exporting logs to CloudWatch.

1. Open the Amazon DocumentDB console at [https://console.aws.amazon.com/docdb](https://console.aws.amazon.com/docdb).

1. 在导航窗格中，选择**集群**。选择要禁用导出日志的集群名称左侧的按钮。

1. 在 **Actions (操作)** 菜单上，选择 **Modify (修改)**。

1. 向下滚动到 **Log exports (日志导出)** 部分并取消选择 **Profiler logs (分析器日志)**。

1. 选择**继续**。

1. 检查更改，然后选择何时将该更改应用到集群：
   + **Apply during the next scheduled maintenance window (在下一个计划的维护时段内应用)**
   + **Apply immediately (立即应用)**

1. 选择**修改集群**。

------
#### [ Using the AWS CLI ]

The following code modifies the cluster `sample-cluster` and disables exporting profiler logs to CloudWatch.

**Example**  
对于 Linux、macOS 或 Unix：  

```
aws docdb modify-db-cluster \
   --db-cluster-identifier sample-cluster \
   --cloudwatch-logs-export-configuration '{"DisableLogTypes":["profiler"]}'
```
For Windows:  

```
aws docdb modify-db-cluster ^
   --db-cluster-identifier sample-cluster ^
   --cloudwatch-logs-export-configuration "{\"DisableLogTypes\":[\"profiler\"]}"
```
此操作的输出将类似于下文。  

```
{
    "DBCluster": {
        "AvailabilityZones": [
            "us-east-1c",
            "us-east-1b",
            "us-east-1a"
        ],
        "BackupRetentionPeriod": 1,
        "DBClusterIdentifier": "sample-cluster",
        "DBClusterParameterGroup": "sample-parameter-group",
        "DBSubnetGroup": "default",
        "Status": "available",
        "EarliestRestorableTime": "2020-04-08T02:05:17.266Z",
        "Endpoint": "sample-cluster.node.us-east-1.docdb.amazonaws.com",
        "ReaderEndpoint": "sample-cluster.node.us-east-1.docdb.amazonaws.com",
        "MultiAZ": false,
        "Engine": "docdb",
        "EngineVersion": "3.6.0",
        "LatestRestorableTime": "2020-04-09T05:14:44.356Z",
        "Port": 27017,
        "MasterUsername": "test",
        "PreferredBackupWindow": "02:00-02:30",
        "PreferredMaintenanceWindow": "tue:09:50-tue:10:20",
        "DBClusterMembers": [
            {
                "DBInstanceIdentifier": "sample-instance-1",
                "IsClusterWriter": true,
                "DBClusterParameterGroupStatus": "in-sync",
                "PromotionTier": 1
            },
            {
                "DBInstanceIdentifier": "sample-instance-2",
                "IsClusterWriter": true,
                "DBClusterParameterGroupStatus": "in-sync",
                "PromotionTier": 1
            }
        ],
        "VpcSecurityGroups": [
            {
                "VpcSecurityGroupId": "sg-abcd0123",
                "Status": "active"
            }
        ],
        "HostedZoneId": "ABCDEFGHIJKLM",
        "StorageEncrypted": true,
        "KmsKeyId": "arn:aws:kms:us-east-1:<accountID>:key/sample-key",
        "DbClusterResourceId": "cluster-ABCDEFGHIJKLMNOPQRSTUVWXYZ",
        "DBClusterArn": "arn:aws:rds:us-east-1:<accountID>:cluster:sample-cluster",
        "AssociatedRoles": [],
        "ClusterCreateTime": "2020-01-10T22:13:38.261Z",
        "DeletionProtection": true
    }
}
```

------

## 访问您的 Amazon DocumentDB 分析器日志
<a name="profiling.accessing"></a>

Use the following steps to access your profiler logs in Amazon CloudWatch.

1. 打开 CloudWatch 控制台，网址为[https://console.aws.amazon.com/cloudwatch/](https://console.aws.amazon.com/cloudwatch/)。

1. 确保您与 Amazon DocumentDB 集群位于同一区域。

1. 在导航窗格中，选择**日志**。

1. 要查找集群的分析器日志，请在列表中选择 `/aws/docdb/yourClusterName/profiler`。

   此时，每个实例名称的下方将显示该实例的分析日志。
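The same logs can also be reached from the command line. The sketch below assumes bash and the log group naming shown in the steps above (`/aws/docdb/yourClusterName/profiler`); the helper names are hypothetical, and `aws logs tail` is only available in AWS CLI version 2.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Builds the profiler log group name for a cluster.
profiler_log_group() {
    printf '/aws/docdb/%s/profiler\n' "$1"
}

# Lists the log streams in the group (one stream per instance).
list_profiler_streams() {
    aws logs describe-log-streams \
        --log-group-name "$(profiler_log_group "$1")" \
        --query 'logStreams[].logStreamName' \
        --output text
}

# Prints profiler log events from the last hour (AWS CLI v2 only).
tail_profiler_logs() {
    aws logs tail "$(profiler_log_group "$1")" --since 1h
}
```

For example, `list_profiler_streams sample-cluster` should print one stream name per instance in the cluster once profiler logs have been delivered.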

## 常见查询
<a name="profiling.common-queries"></a>

The following are common queries that you can use to analyze your profiled commands. For more information about CloudWatch Logs Insights, see [Analyzing log data with CloudWatch Logs Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html) and [Sample queries](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax-examples.html).

### 获取指定集合上最慢的 10 个操作
<a name="profiling.common-queries.slow-queries-on-collection"></a>

```
filter ns="test.foo" | sort millis desc | limit 10
```

### 获取集合上用时超过 60 毫秒的所有更新操作
<a name="profiling.common-queries.updates-gt-60-ms"></a>

```
filter millis > 60 and op = "update"
```

### 获取上个月最慢的 10 个操作
<a name="profiling.common-queries.slow-queries-last-month"></a>

```
sort millis desc | limit 10
```

### 获取具有 COLLSCAN 计划摘要的所有查询
<a name="profiling.common-queries.collscan-plan-summary"></a>

```
filter planSummary="COLLSCAN"
```
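The queries above contain no time filter; the time range is chosen separately when the query is run. As an illustration, the following bash sketch runs any of these queries from the AWS CLI over the last N days (30 by default, which matches the "last month" query). The helper names `epoch_days_ago` and `run_profiler_query` are hypothetical, and the log group name assumes the `/aws/docdb/<cluster>/profiler` convention described earlier.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Prints the Unix timestamp for N days before now.
epoch_days_ago() {
    echo $(( $(date +%s) - $1 * 86400 ))
}

# Runs a Logs Insights query against a cluster's profiler log group.
# Usage: run_profiler_query CLUSTER_ID QUERY [DAYS]
run_profiler_query() {
    local cluster_id="$1" query="$2" days="${3:-30}"
    local query_id status

    query_id="$(aws logs start-query \
        --log-group-name "/aws/docdb/${cluster_id}/profiler" \
        --start-time "$(epoch_days_ago "$days")" \
        --end-time "$(date +%s)" \
        --query-string "$query" \
        --query 'queryId' --output text)" || return 1

    # Queries run asynchronously, so poll until the query is no longer pending.
    while :; do
        status="$(aws logs get-query-results --query-id "$query_id" \
            --query 'status' --output text)" || return 1
        case "$status" in
            Scheduled|Running) sleep 1 ;;
            *) break ;;
        esac
    done

    aws logs get-query-results --query-id "$query_id"
}
```

For example, `run_profiler_query sample-cluster 'sort millis desc | limit 10' 30` would return the 10 slowest operations logged in the last 30 days.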

# 使用 Performance Insights 进行监控
<a name="performance-insights"></a>

Performance Insights 添加到现有的 Amazon DocumentDB 监控功能中，以展示您的集群性能并帮助您分析影响集群性能的任何问题。利用 Performance Insights 控制面板，您可以可视化数据库负载并按等待状态、查询语句、主机或应用来筛选负载。

**注意**  
Performance Insights 仅适用于亚马逊 DocumentDB 3.6、4.0、5.0 和 8.0 基于实例的集群。

**它有何用处？**
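+ Visualize database performance: Visualize the load to determine when and where the load is on the database.
+ Determine what is causing load on the database: Determine which queries, hosts, and applications are contributing to the load on your instance.
+ Determine when there is load on your database: Zoom in on the Performance Insights dashboard to focus on specific events, or zoom out to look at trends across a larger time span.
+ Alert on database load: Access new database load metrics automatically from CloudWatch, where you can monitor them alongside your other Amazon DocumentDB metrics and set alerts on them.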

**Amazon DocumentDB Performance Insights 有哪些局限性？**
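+ Performance Insights is not available in the AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions.
+ Performance Insights for Amazon DocumentDB retains up to 7 days of performance data.
+ Queries longer than 1,024 bytes are not aggregated in Performance Insights.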

**Topics**
+ [Performance Insights 概念](performance-insights-concepts.md)
+ [启用和禁用 Performance Insights](performance-insights-enabling.md)
+ [为 Performance Insights 配置访问策略](performance-insights-policies.md)
+ [使用 Performance Insights 控制面板分析指标](performance-insights-analyzing.md)
+ [使用 Performance Insights API 检索指标](performance-insights-metrics.md)
+ [Performance Insights 的亚马逊 CloudWatch 指标](performance-insights-cloudwatch.md)
+ [Performance Insights 的计数器指标](performance-insights-counter-metrics.md)

# Performance Insights 概念
<a name="performance-insights-concepts"></a>

**Topics**
+ [平均活动会话数](#performance-insights-concepts-sessions)
+ [Dimensions](#performance-insights-concepts-dimensions)
+ [最大 vCPU](#performance-insights-concepts-maxvcpu)

## 平均活动会话数
<a name="performance-insights-concepts-sessions"></a>

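Database load (DB load) measures the level of activity in your database. The key metric in Performance Insights is `DB Load`, which is collected every second. The unit for the `DB Load` metric is the *average active sessions (AAS)* for an Amazon DocumentDB instance.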

*活动*会话是已将作业提交到 Amazon DocumentDB 实例并且正在等待响应的连接。例如，如果您将查询提交到 Amazon DocumentDB 实例，则数据库会话在实例处理该查询时将处于活动状态。

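To obtain the average active sessions, Performance Insights samples the number of sessions concurrently running a query. The AAS is the total number of sessions divided by the total number of samples. The following table shows five consecutive samples of running queries.


| Sample | Number of sessions running query | AAS | Calculation | 
| --- | --- | --- | --- | 
|  1  |  2  |  2  |  2 sessions/1 sample  | 
|  2  |  0  |  1  |  2 sessions/2 samples  | 
|  3  |  4  |  2  |  6 sessions/3 samples  | 
|  4  |  0  |  1.5  |  6 sessions/4 samples  | 
|  5  |  4  |  2  |  10 sessions/5 samples  | 

In the preceding example, the DB load for the time interval from sample 1 to sample 5 is 2 AAS. An increase in DB load means that, on average, more sessions are running on the database.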
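The arithmetic in the table can be reproduced with a few lines of shell. This is only an illustration of the cumulative average described above, not how Performance Insights is implemented; the function name `running_aas` is hypothetical.

```shell
#!/usr/bin/env bash
# Prints the cumulative average active sessions (AAS) after each sample.
# Each argument is the number of sessions running a query in one sample.
running_aas() {
    printf '%s\n' "$@" | awk '{ total += $1; printf "%g\n", total / NR }'
}

# The five samples from the table: 2, 0, 4, 0, 4 sessions.
running_aas 2 0 4 0 4
```

Each output line corresponds to one row of the AAS column: the running session total divided by the number of samples taken so far.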

## Dimensions
<a name="performance-insights-concepts-dimensions"></a>

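The `DB Load` metric is different from other time-series metrics because you can break it into subcomponents called dimensions. You can think of dimensions as categories for the different characteristics of the `DB Load` metric. When you are diagnosing performance issues, the most useful dimensions are **wait states** and **top queries**.

**Wait states**  
A *wait state* causes a query statement to wait for a specific event to happen before it can continue running. For example, a query statement might wait until a locked resource is unlocked. By combining `DB Load` with wait states, you get a complete picture of the session state. The following are the Amazon DocumentDB wait states:


| Amazon DocumentDB wait state | Wait state description | 
| --- | --- | 
|  Latch  |  The Latch wait state occurs when the session is waiting to page the buffer pool. Paging in and out of the buffer pool might happen more often when the system frequently processes large queries or collection scans, or when the buffer pool is too small to hold the working set.  | 
| CPU |  The CPU wait state occurs when the session is waiting on CPU.  | 
|  CollectionLock  |  The CollectionLock wait state occurs when the session is waiting to acquire a lock on the collection. These events occur when there are DDL operations on the collection.  | 
| DocumentLock |  The DocumentLock wait state occurs when the session is waiting to acquire a lock on a document. A high number of concurrent writes to the same document increases DocumentLock wait states for that document.  | 
|  SystemLock  |  The SystemLock wait state occurs when the session is waiting on the system. This can occur when there are frequent long-running queries, long-running transactions, or high concurrency on the system.  | 
|  IO  |  The IO wait state occurs when the session is waiting for IO to complete.  | 
|  BufferLock  |  The BufferLock wait state occurs when the session is waiting to acquire a lock on a shared page in the buffer. BufferLock wait states can be prolonged if other processes hold open cursors on the requested pages.  | 
|  LowMemThrottle  |  The LowMemThrottle wait state occurs when the session is waiting because of heavy memory pressure on the Amazon DocumentDB instance. If this state persists for a long time, consider scaling up the instance to provide additional memory. For more information, see [Resource Governor](https://docs.aws.amazon.com/documentdb/latest/developerguide/how-it-works.html).  | 
|  BackgroundActivity  |  The BackgroundActivity wait state occurs when the session is waiting on internal system processes.  | 
|  Other  |  The Other wait state is an internal wait state. If this state persists for a long time, consider terminating the query. For more information, see [How do I find and terminate long running or blocked queries?](https://docs.aws.amazon.com/documentdb/latest/developerguide/user_diagnostics.html#user_diagnostics-query_terminating.html)  | 

**Top queries**  
Whereas wait states show bottlenecks, top queries show which queries contribute the most to DB load. For example, many queries might be running on the database at a given moment, but a single query might account for 99 percent of the DB load. In this case, the high load might indicate a problem with that query.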

## 最大 vCPU
<a name="performance-insights-concepts-maxvcpu"></a>

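In the dashboard, the **Database load** chart collects, aggregates, and displays session information. To see whether active sessions are exceeding the maximum CPU, look at their relationship to the **Max vCPU** line. The **Max vCPU** value is determined by the number of vCPU (virtual CPU) cores for your Amazon DocumentDB instance.

If the DB load is often above the **Max vCPU** line and the primary wait state is CPU, the CPU is overloaded. In this case, you might want to throttle connections to the instance, tune any queries with a high CPU load, or consider a larger instance class. Consistently high occurrences of any wait state indicate that there might be bottlenecks or resource contention issues to resolve. This can be true even if the DB load doesn't cross the **Max vCPU** line.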

# 启用和禁用 Performance Insights
<a name="performance-insights-enabling"></a>

要使用 Performance Insights，请在数据库实例中启用它。如果需要，您可以稍后将其禁用。启用和禁用 Performance Insights 不会导致停机、重新启动或故障转移。

性能详情代理占用数据库主机上有限的 CPU 和内存。当数据库负载较高时，代理将通过降低收集数据的频率来限制性能影响。

## 在创建集群时启用 Performance Insights
<a name="performance-insights-enabling-create-instance"></a>

在控制台中，您可以在创建或修改新数据库实例时启用或禁用 Performance Insights。

### 使用 AWS 管理控制台
<a name="create-instance-console"></a>

在控制台中，您可以在创建 Amazon DocumentDB 集群时启用 Performance Insights。在创建新 Amazon DocumentDB 集群时，通过在 **Performance Insights** 部分中选择**启用 Performance Insights** 以启用 Performance Insights。

**控制台说明**

1. 有关创建集群的说明，请参阅[创建 Amazon DocumentDB 集群](https://docs.aws.amazon.com/documentdb/latest/developerguide/db-cluster-create.html)中的说明。

1. 在 Performance Insights 部分中选择**启用 Performance Insights**。  
![\[“性能详情”部分，其中已选择“启用性能详情”。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/select-performance-insights.png)
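**Note**  
The Performance Insights data retention period is seven days.

   **AWS KMS key**: Specify your AWS KMS key. Performance Insights encrypts all potentially sensitive data using your AWS KMS key. Data is encrypted in flight and at rest. For more information, see Configuring an AWS KMS policy for Performance Insights.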

## 修改实例时启用和禁用
<a name="performance-insights-enabling-modify-instance"></a>

您也可以修改数据库实例以使用控制台或 AWS CLI启用或禁用 Performance Insights。

------
#### [ Using the AWS 管理控制台 ]

**控制台说明**

1. [登录 AWS 管理控制台，然后在 /docdb 上打开亚马逊文档数据库控制台。https://console.aws.amazon.com](https://console.aws.amazon.com/docdb)

1. 选择**集群**。

1. 选择一个数据库实例，然后选择**修改**。

1. 在 Performance Insights 部分，选择**启用 Performance Insights** 或**禁用 Performance Insights**。
**Note**  
If you choose **Enable Performance Insights**, you can specify your AWS KMS key. Performance Insights encrypts all potentially sensitive data using your AWS KMS key. Data is encrypted in flight and at rest. For more information, see [Encrypting Amazon DocumentDB data at rest](https://docs.aws.amazon.com/documentdb/latest/developerguide/encryption-at-rest.html).

1. 选择**继续**。

1. 对于**修改计划**，选择**立即应用**。如果您选择**在下一个计划的维护时段内应用**，则您的实例将忽略此设置并立即启用 Performance Insights。

1. 选择**修改实例**。

------
#### [ Using the AWS CLI ]

When you use the `create-db-instance` or `modify-db-instance` AWS CLI commands, you can enable Performance Insights by specifying `--enable-performance-insights`, or disable it by specifying `--no-enable-performance-insights`.

The following procedure describes how to enable or disable Performance Insights for a DB instance using the AWS CLI.


**AWS CLI instructions**

Call the `modify-db-instance` AWS CLI command and supply the following values:
+ `--db-instance-identifier`: The name of the DB instance
+ `--enable-performance-insights` to enable, or `--no-enable-performance-insights` to disable

**Example**  
The following example enables Performance Insights for `sample-db-instance`:  
For Linux, macOS, or Unix:  

```
aws docdb modify-db-instance \
    --db-instance-identifier sample-db-instance \
    --enable-performance-insights
```
For Windows:  

```
aws docdb modify-db-instance ^
    --db-instance-identifier sample-db-instance ^
    --enable-performance-insights
```

------
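The example above only shows enabling. As a small illustration of both directions, the following bash sketch wraps the same `modify-db-instance` call and selects the flag from an `on` or `off` argument. The function name `set_performance_insights` is hypothetical; the two flags are the ones described above.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name is hypothetical.
# Usage: set_performance_insights INSTANCE_ID on|off
set_performance_insights() {
    local instance_id="$1" state="$2" flag
    case "$state" in
        on)  flag="--enable-performance-insights" ;;
        off) flag="--no-enable-performance-insights" ;;
        *)   echo "usage: set_performance_insights INSTANCE_ID on|off" >&2
             return 2 ;;
    esac
    aws docdb modify-db-instance \
        --db-instance-identifier "$instance_id" \
        "$flag"
}
```

For example, `set_performance_insights sample-db-instance off` disables Performance Insights for the sample instance.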

# 为 Performance Insights 配置访问策略
<a name="performance-insights-policies"></a>

要访问 Performance Insights，您必须拥有 AWS Identity and Access Management （IAM）的相应权限。您可以使用以下选项来授予访问权限：
+ 将 `AmazonRDSPerformanceInsightsReadOnly` 托管式策略附加到权限集或角色。
+ 创建自定义 IAM policy 并将其附加到权限集或角色。

此外，如果您在启用 Performance Insights 时指定了客户托管密钥，请确保账户中的用户对 KMS 密钥具有 `kms:Decrypt` 和 `kms:GenerateDataKey` 权限。

**Note**  
For encryption at rest with AWS KMS keys and for security group management, Amazon DocumentDB leverages operational technology that is shared with [Amazon RDS](https://aws.amazon.com/rds).

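## Attaching the AmazonRDSPerformanceInsightsReadOnly policy to an IAM principal
<a name="USER_PerfInsights.access-control.IAM-principal"></a>

`AmazonRDSPerformanceInsightsReadOnly` is an AWS managed policy that grants access to all read-only operations of the Amazon DocumentDB Performance Insights API. Currently, all operations in this API are read-only. If you attach `AmazonRDSPerformanceInsightsReadOnly` to a permission set or role, the recipient can use Performance Insights along with other console features.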

## 为 Performance Insights 创建自定义 IAM policy
<a name="USER_PerfInsights.access-control.custom-policy"></a>

对于没有 `AmazonRDSPerformanceInsightsReadOnly` 策略的用户，您可以通过创建或修改用户托管 IAM policy 来授予对 Performance Insights 的访问权限。当您将策略附加到一个权限集或角色时，接收人可以使用 Performance Insights。

**创建自定义策略**

1. 使用 [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/) 打开 IAM 控制台。

1. 在导航窗格中，选择**策略**。

1. 选择 **Create policy (创建策略)**。

1. 在**创建策略**页面上，选择“JSON”选项卡。

1. Copy and paste the following text, replacing *us-east-1* with the name of your AWS Region and *111122223333* with your customer account number.

------
#### [ JSON ]

   ```
   {
       "Version":"2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": "rds:DescribeDBInstances",
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": "rds:DescribeDBClusters",
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": "pi:DescribeDimensionKeys",
               "Resource": "arn:aws:pi:us-east-1:111122223333:metrics/rds/*"
           },
           {
               "Effect": "Allow",
               "Action": "pi:GetDimensionKeyDetails",
               "Resource": "arn:aws:pi:us-east-1:111122223333:metrics/rds/*"
           },
           {
               "Effect": "Allow",
               "Action": "pi:GetResourceMetadata",
               "Resource": "arn:aws:pi:us-east-1:111122223333:metrics/rds/*"
           },
           {
               "Effect": "Allow",
               "Action": "pi:GetResourceMetrics",
               "Resource": "arn:aws:pi:us-east-1:111122223333:metrics/rds/*"
           },
           {
               "Effect": "Allow",
               "Action": "pi:ListAvailableResourceDimensions",
               "Resource": "arn:aws:pi:us-east-1:111122223333:metrics/rds/*"
           },
           {
               "Effect": "Allow",
               "Action": "pi:ListAvailableResourceMetrics",
               "Resource": "arn:aws:pi:us-east-1:111122223333:metrics/rds/*"
           }
       ]
   }
   ```

------

1. 选择**查看策略**。

1. 为策略提供名称并可以选择提供描述，然后选择**创建策略**。

现在，可以将策略附加到权限集或角色。以下过程假设您已经有一个可用于此目的的用户。

**将策略附加到用户**

1. 使用 [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/) 打开 IAM 控制台。

1. 在导航窗格中，选择 **Users**。

1. 从列表中选择现有用户。
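**Important**  
To use Performance Insights, make sure that you have access to Amazon DocumentDB in addition to the custom policy. For example, the **AmazonDocDBReadOnlyAccess** predefined policy provides read-only access to Amazon DocumentDB. For more information, see [Managing access using policies](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAM.html#security_iam_access-manage).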

1. 在 **Summary (摘要)** 页上，选择 **Add permissions (添加权限)**。

1. 选择**直接附加现有策略**。对于 **Search**，键入策略名称的前几个字符，如下所示。  
![\[选择一个策略\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/pi-add-permissions.png)

1. 选择策略，然后选择 **Next: Review**。

1. 选择 **Add permissions (添加权限)**。
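The two console procedures above can also be carried out with the AWS CLI. The following bash sketch assumes the JSON policy shown earlier has been saved to a local file; the function name, the file name `pi-policy.json`, the policy name, and the user name are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name and all argument values are hypothetical.
# Usage: create_and_attach_pi_policy POLICY_NAME POLICY_FILE USER_NAME
create_and_attach_pi_policy() {
    local policy_name="$1" policy_file="$2" user_name="$3" policy_arn

    # Create the customer managed policy and capture its ARN.
    policy_arn="$(aws iam create-policy \
        --policy-name "$policy_name" \
        --policy-document "file://${policy_file}" \
        --query 'Policy.Arn' --output text)" || return 1

    # Attach the new policy to the existing user.
    aws iam attach-user-policy \
        --user-name "$user_name" \
        --policy-arn "$policy_arn"
}
```

For example: `create_and_attach_pi_policy docdb-pi-access pi-policy.json sample-user`.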

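## Configuring an AWS KMS policy for Performance Insights
<a name="USER_PerfInsights.access-control.cmk-policy"></a>

Performance Insights uses an AWS KMS key to encrypt sensitive data. When you enable Performance Insights through the API or the console, you have the following options:
+ Choose the default AWS managed key.

  Amazon DocumentDB uses the AWS managed key for your new DB instance. Amazon DocumentDB creates an AWS managed key for your AWS account. Your AWS account has a different AWS managed key for Amazon DocumentDB for each AWS Region.
+ Choose a customer managed key.

  If you specify a customer managed key, users in your account that call the Performance Insights API need the `kms:Decrypt` and `kms:GenerateDataKey` permissions on the KMS key. You can configure these permissions through IAM policies. However, we recommend that you manage these permissions through your KMS key policy. For more information, see [Using key policies in AWS KMS](https://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html).

**Example**  
The following example key policy shows how to add statements to your KMS key policy. These statements allow access to Performance Insights. Depending on how you use AWS KMS, you might want to change some restrictions. Before adding the statements to your policy, remove all comments.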
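The example policy itself does not appear on this page. As a stand-in, the following bash sketch prints a minimal key policy statement that grants only the two permissions named above (`kms:Decrypt` and `kms:GenerateDataKey`) to a single principal. It is an assumption-based illustration, not the original example: the function name, the `Sid`, and the role ARN are hypothetical, and a production policy would typically add conditions that narrow when the key can be used.

```shell
#!/usr/bin/env bash
# Sketch only: prints one key policy statement for the given IAM principal.
# The Sid and the principal ARN are hypothetical placeholders.
pi_key_policy_statement() {
    local principal_arn="$1"
    cat <<EOF
{
    "Sid": "AllowPerformanceInsightsAccess",
    "Effect": "Allow",
    "Principal": { "AWS": "${principal_arn}" },
    "Action": [ "kms:Decrypt", "kms:GenerateDataKey" ],
    "Resource": "*"
}
EOF
}

pi_key_policy_statement "arn:aws:iam::111122223333:role/sample-role"
```

The printed statement would be added to the `Statement` array of the existing key policy rather than replacing it.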

# 使用 Performance Insights 控制面板分析指标
<a name="performance-insights-analyzing"></a>

Performance Insights 控制面板包含帮助您分析和排查性能问题的数据库性能信息。在主控制面板页面上，可以查看有关数据库负载（DB 负载）的信息。您可以按维度（例如等待状态或查询）对数据库负载进行“切片”。

**Topics**
+ [Performance Insights 控制面板概览](performance-insights-dashboard-overview.md)
+ [打开 Performance Insights 控制面板](performance-insights-dashboard-opening.md)
+ [通过等待状态分析数据库负载](performance-insights-analyzing-db-load.md)
+ [主要查询选项卡概览](performance-insights-top-queries.md)
+ [放大数据库负载图表](performance-insights-zoom-db-load.md)

# Performance Insights 控制面板概览
<a name="performance-insights-dashboard-overview"></a>

与 Performance Insights 进行交互的最简单方式即为控制面板。以下示例显示了 Amazon DocumentDB 实例的控制面板。默认情况下，Performance Insights 控制面板将显示最近一小时的数据。

![\[性能详情控制面板，其中显示了 Amazon DocumentDB 实例随时间推移的 CPU 利用率和数据库负载。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/overview-dashboard.png)


控制面板分为以下几个部分：

1. **计数器指标**：显示特定性能计数器指标的数据。

1. **数据库负载**：显示数据库负载与**最大 vCPU** 线表示的数据库实例容量的比较情况。

1.  **主要维度**：显示对数据库负载影响最大的主要维度。这些维度包括 `waits`、`queries`、`hosts`、`databases` 和 `applications`。

**Topics**
+ [计数器指标图表](#performance-insights-overview-metrics)
+ [数据库负载图表](#performance-insights-overview-db-load-chart)
+ [主要维度表](#performance-insights-overview-top-dimensions)

## 计数器指标图表
<a name="performance-insights-overview-metrics"></a>

使用计数器指标，您可以自定义 Performance Insights 控制面板来包括最多 10 个其他图表。这些图表显示了所选的数十个操作系统指标。您可将此信息与数据库负载相关联，以帮助识别和分析性能问题。

**计数器指标**图表显示了性能计数器的数据。

![\[计数器指标图表，其中显示了随时间推移的 CPU 利用率。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/counter-metrics.png)


要更改性能计数器，请选择**管理指标**。您可以选择多个 **OS 指标**，如以下屏幕截图所示。要查看任何指标的详细信息，请将鼠标悬停在相应指标名称上。

![\[性能详情控制面板指标选择界面，其中具有操作系统指标选项。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/overview-os-metrics.png)


## 数据库负载图表
<a name="performance-insights-overview-db-load-chart"></a>

**数据库负载**图表显示数据库负载与**最大 vCPU** 线表示的实例容量的比较情况。预设情况下，堆叠折线图将以每单位时间的平均活动会话数表示数据库负载。数据库负载按等待状态进行切片（分组）。

![\[数据库负载图表，其中显示了随时间推移的平均活动会话数，在接近末尾时出现 CPU 使用率激增。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/database-load.png)


**按维度切片的数据库负载**  
您可以选择按任何受支持维度分组的活动会话显示负载。下图显示了 Amazon DocumentDB 实例的维度。

![\[图表，其中显示了数据库负载，下拉列表中显示了各种“切片依据”选项。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/database-load-sliced.png)


**维度项目的数据库负载详细信息**  
要查看维度中数据库负载项目的详细信息，请将光标悬停在相应项目名称上。下图显示了查询语句的详细信息。

![\[条形图，其中显示了数据库负载，将鼠标悬停在项目名称上时会显示其他详细信息。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/database-load-details.png)


要在图例中查看任何项目在选定时间段内的详细信息，请将鼠标悬停在相应项目上。

![\[条形图，其中显示了数据库负载，将鼠标悬停在条形上时会显示其他详细信息。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/database-load-hover.png)


## 主要维度表
<a name="performance-insights-overview-top-dimensions"></a>

**主要维度表**将按不同的维度切割数据库负载。维度是数据库负载不同特征的类别或“切片依据”。如果维度为查询，则**主要查询**显示了对数据库负载影响最大的查询语句。

请选择以下任何一个维度选项卡。

![\[“排名靠前的查询维度”选项卡，其中显示了两个排名靠前的查询。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/top-dimensions.png)


下表简要说明了每个选项卡。


| 选项卡 | 说明 | 
| --- | --- | 
|  主要等待  |   数据库后端正在等待的事件  | 
|  主要查询  |  当前正在运行的查询语句  | 
|  主要主机  |  所连接客户端的主机 IP 和端口  | 
|  主要数据库  |  客户端所连接的数据库的名称  | 
|  主要应用程序  |  连接到数据库的应用程序的名称  | 

要了解如何使用**主要查询**选项卡分析查询，请参阅 [主要查询选项卡概览](performance-insights-top-queries.md)。

# 打开 Performance Insights 控制面板
<a name="performance-insights-dashboard-opening"></a>

**要在 AWS 管理控制台中查看 Performance Insights 控制面板，请使用以下步骤：**

1. 打开 Performance Insights 控制台[https://console.aws.amazon.com/docdb/](https://console.aws.amazon.com/docdb/home#performance-insights)。

1. 选择一个数据库实例。将为该 Amazon DocumentDB 实例显示 Performance Insights 控制面板。

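   For Amazon DocumentDB instances with Performance Insights enabled, you can also reach the dashboard by choosing the **Sessions** item in the list of instances. Under **Current activity**, the **Sessions** item shows the database load in average active sessions over the last five minutes. The bar graphically shows the load. When the bar is empty, the instance is idle. As the load increases, the bar fills with blue. When the load passes the number of virtual CPUs (vCPUs) on the instance class, the bar turns red, indicating a potential bottleneck.  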
![\[“集群”页面，其中显示了 Amazon DocumentDB 区域集群以及每个集群实例的 CPU 和当前活动。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/opening-clusters.png)

1. （可选）通过选择右上角的按钮来选择不同的时间间隔。例如，要将间隔更改为 1 小时，请选择 **1 小时**。  
![\[时间间隔按钮范围从五分钟到一周不等。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/opening-time.png)

   在以下屏幕截图中，数据库负载间隔为 1 小时。  
![\[条形图，其中显示了以平均活动会话数衡量的数据库负载。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/opening-db-load.png)

1. 要自动刷新数据，请启用**自动刷新**。  
![\[“自动刷新”按钮已启用，显示在各时间间隔按钮旁边。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/opening-auto-refresh.png)

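   The Performance Insights dashboard automatically refreshes with new data. The refresh rate depends on the amount of data displayed:
   + 5 minutes refreshes every 5 seconds.
   + 1 hour refreshes every minute.
   + 5 hours refreshes every minute.
   + 24 hours refreshes every 5 minutes.
   + 1 week refreshes every hour.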

# 通过等待状态分析数据库负载
<a name="performance-insights-analyzing-db-load"></a>

如果**数据库负载（DB 负载）**图表显示了一个瓶颈，您可以找出负载的来源。为此，请查看**数据库负载**图表下方的主要负载项目。选择特定项目 (如查询或应用) 以深入了解该项目并查看有关该项目的详细信息。

按等待状态和主要查询分组的数据库负载通常可以提供对性能问题的最深入了解。按等待状态分组的数据库负载显示了数据库中是否存在任何资源瓶颈或并发瓶颈。在这种情况下，“主要负载项目”表的**主要查询**选项卡显示了增大该负载的查询。

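A typical workflow for diagnosing performance issues is as follows:

1. Review the **Database load** chart and see whether there are any incidents of database load exceeding the **Max vCPU** line.

1. If there are, look at the **Database load** chart and identify which wait state or wait states are primarily responsible.

1. Identify the digest queries causing the load by seeing which of the queries on the **Top queries** tab of the top load items table contribute most to those wait states. You can identify these by the **Load by Wait (AAS)** column.

1. Choose one of these digest queries in the **Top queries** tab to expand it and see the child queries that it is composed of.

You can also see which hosts or applications contribute the most load by selecting **Top hosts** or **Top applications**, respectively. Application names are specified in the connection string to the Amazon DocumentDB instance. `Unknown` indicates that the application field was not specified.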

例如，在下面的控制面板中，**CPU** 等待状态占大部分数据库负载。选择**主要查询**下的排名靠前的查询会将数据库负载图表的范围限定为重点关注选择查询贡献的最大负载。

![\[数据库负载图表，其中显示了 CPU 使用率峰值。相应的“排名靠前的查询”选项卡显示了对等待状态影响最大的查询。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/db-load-1.png)


![\[数据库负载图表，其中显示了对等待状态影响最大的查询的 CPU 使用率峰值。相应的“排名靠前的查询”选项卡显示了该查询的子查询。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/db-load-2.png)


# 主要查询选项卡概览
<a name="performance-insights-top-queries"></a>

By default, the **Top queries** tab shows the 25 queries that contribute the most to database load. You can analyze the query text to help tune your queries.

**Topics**
+ [查询摘要](#performance-insights-top-queries-digests)
+ [按等待状态排列的负载 (AAS)](#performance-insights-top-queries-aas)
+ [查看详细的查询信息](#performance-insights-top-queries-query-info)
+ [访问语句查询文本](#performance-insights-top-queries-accessing-text)
+ [查看和下载语句查询文本](#performance-insights-top-queries-viewing-downloading)

## 查询摘要
<a name="performance-insights-top-queries-digests"></a>

*查询摘要*是多个结构上相似但可能具有不同文本值的实际查询的组合。摘要用问号替换硬编码值。例如，查询摘要可能如下所示：

```
{"find":"customerscollection","filter":{"FirstName":"?"},"sort":{"key":{"$numberInt":"?"}},"limit":{"$numberInt":"?"}}
```

此摘要可能包含以下子查询：

```
{"find":"customerscollection","filter":{"FirstName":"Karrie"},"sort":{"key":{"$numberInt":"1"}},"limit":{"$numberInt":"3"}}
{"find":"customerscollection","filter":{"FirstName":"Met"},"sort":{"key":{"$numberInt":"1"}},"limit":{"$numberInt":"3"}}
{"find":"customerscollection","filter":{"FirstName":"Rashin"},"sort":{"key":{"$numberInt":"1"}},"limit":{"$numberInt":"3"}}
```

要查看摘要中的文字查询语句，请选择查询，然后选择加号 (`+`)。在下面的屏幕截图中，选定的查询是摘要。

![\[“排名靠前的查询”表，显示了扩展查询摘要，其中已选择一个子查询。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/top-queries-literal.png)


**注意**  
查询摘要将相似的查询语句进行分组，但不会编辑敏感信息。

## 按等待状态排列的负载 (AAS)
<a name="performance-insights-top-queries-aas"></a>

在**主要查询**中，**按等待状态排列的负载 (AAS)** 列说明了与每个主要负载项目关联的数据库负载的百分比。此列按当前在**数据库负载图表**中选择的分组方式反映该项目的负载。例如，您可以按等待状态对**数据库负载**图表进行分组。在这种情况下，系统将对 **DB Load by Waits (按等待状态排列的数据库负载)** 栏进行大小调整、分段和颜色编码，以显示该查询在导致给定等待状态方面所起的作用大小，它还会显示哪些等待状态正在影响选定的查询。

![\[条形图，其中显示了按照 CPU、IO 和锁存等待状态分组的数据库负载。相应的表显示了基于按等待状态排列的负载的排名靠前的查询。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/top-queries-aas.png)


## 查看详细的查询信息
<a name="performance-insights-top-queries-query-info"></a>

在**主要查询**表中，您可以打开一条*摘要语句*以查看其信息。信息将显示在底部窗格中。

![\[“排名靠前的查询”表，其中显示了所选的查询语句，并在下方显示了其查询信息。\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/top-queries-detailed.png)


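The following types of identifiers (IDs) are associated with query statements:

1. **Support query ID**: A hash value of the query ID. This value is only for referencing a query ID when you are working with AWS Support. AWS Support doesn't have access to your actual query IDs and query text.

1. **Support digest ID**: A hash value of the digest ID. This value is only for referencing a digest ID when you are working with AWS Support. AWS Support doesn't have access to your actual digest IDs and query text.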

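## Accessing statement query text
<a name="performance-insights-top-queries-accessing-text"></a>

By default, each row in the **Top queries** table shows 500 bytes of query text for each query statement. When a digest statement exceeds 500 bytes, you can view more text by opening the statement in the Performance Insights dashboard. In this case, the maximum length of the displayed query is 1 KB. If you view a full query statement, you can also choose **Download**.

## Viewing and downloading statement query text
<a name="performance-insights-top-queries-viewing-downloading"></a>

In the Performance Insights dashboard, you can view or download query text.

**To view more query text in the Performance Insights dashboard**

1. Open the Amazon DocumentDB console at: [https://console.aws.amazon.com/docdb/](https://console.aws.amazon.com/docdb/)

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard is displayed for that DB instance.

   Query statements with text larger than 500 bytes look like the following image.  
![\[Top queries table with a child query selected.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/top-queries-statement.png)

1. Examine the query information section to view more of the query text.  
![\[Query information section showing the full text of the selected query.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/top-queries-query-text.png)

The Performance Insights dashboard can display up to 1 KB for each full query statement.

**Note**  
To copy or download the query statement, disable pop-up blockers.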

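# Zooming in on the DB Load chart
<a name="performance-insights-zoom-db-load"></a>

You can use other features of the Performance Insights user interface to help you analyze performance data.

**Click-and-Drag Zoom In**  
In the Performance Insights interface, you can choose a small portion of the load chart and zoom in on the detail.

![\[Bar chart showing database load, with a portion highlighted for zooming in.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/pi-zoom-1.png)


To zoom in on a portion of the load chart, choose the start time and drag to the end of the time period you want. As you do this, the selected area is highlighted. When you release the mouse, the load chart zooms in on the selected region, and the **Top *items*** table is recalculated.

![\[Database load bar chart showing the zoomed-in portion, with the corresponding top waits table below.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/pi-zoom-2.png)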


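# Retrieving metrics with the Performance Insights API
<a name="performance-insights-metrics"></a>

When Performance Insights is enabled, the API provides visibility into instance performance. Amazon CloudWatch Logs provides the authoritative source for vended monitoring metrics for AWS services.

Performance Insights offers a domain-specific view of database load measured as average active sessions (AAS). To API consumers, this metric looks like a two-dimensional time-series dataset. The time dimension of the data provides database load data for each time point in the queried time range. Each time point decomposes the overall load in relation to the requested dimensions, such as `Query`, `Wait-state`, `Application`, or `Host`, as measured at that time point.

Amazon DocumentDB Performance Insights monitors your Amazon DocumentDB instance so that you can analyze and troubleshoot database performance. One way to view Performance Insights data is in the AWS Management Console. Performance Insights also provides a public API so that you can query your own data. You can use the API to do the following:
+ Offload data into a database
+ Add Performance Insights data to existing monitoring dashboards
+ Build monitoring tools

To use the Performance Insights API, enable Performance Insights on one of your Amazon DocumentDB instances. For information about enabling Performance Insights, see [Enabling and disabling Performance Insights](performance-insights-enabling.md). For more information about the Performance Insights API, see the [Performance Insights API Reference](https://docs.aws.amazon.com/performance-insights/latest/APIReference/Welcome.html).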

The Performance Insights API provides the following operations.


|  Performance Insights operation  |  AWS CLI command  |  Description  | 
| --- | --- | --- | 
| [DescribeDimensionKeys](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_DescribeDimensionKeys.html) | [describe-dimension-keys](https://docs.aws.amazon.com/cli/latest/reference/pi/describe-dimension-keys.html) |  Retrieves the top N dimension keys for a metric for a specific time period.  | 
| [GetDimensionKeyDetails](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_GetDimensionKeyDetails.html) | [get-dimension-key-details](https://docs.aws.amazon.com/cli/latest/reference/pi/get-dimension-key-details.html) |  Retrieves the attributes of the specified dimension group for a DB instance or data source. For example, if you specify a query ID and the dimension details are available, `GetDimensionKeyDetails` retrieves the full text of the dimension `db.query.statement` associated with this ID. This operation is useful because `GetResourceMetrics` and `DescribeDimensionKeys` don't support retrieval of large query statement text.  | 
| [GetResourceMetadata](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_GetResourceMetadata.html) | [get-resource-metadata](https://docs.aws.amazon.com/cli/latest/reference/pi/get-resource-metadata.html) |  Retrieves the metadata for different features. For example, the metadata might indicate that a feature is turned on or off on a specific DB instance.  | 
| [GetResourceMetrics](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_GetResourceMetrics.html) | [get-resource-metrics](https://docs.aws.amazon.com/cli/latest/reference/pi/get-resource-metrics.html) |  Retrieves Performance Insights metrics for a set of data sources over a time period. You can provide specific dimension groups and dimensions, and provide aggregation and filtering criteria for each group.  | 
| [ListAvailableResourceDimensions](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_ListAvailableResourceDimensions.html) | [list-available-resource-dimensions](https://docs.aws.amazon.com/cli/latest/reference/pi/list-available-resource-dimensions.html) |  Retrieves the dimensions that can be queried for each specified metric type on a specified instance.  | 
| [ListAvailableResourceMetrics](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_ListAvailableResourceMetrics.html) | [list-available-resource-metrics](https://docs.aws.amazon.com/cli/latest/reference/pi/list-available-resource-metrics.html) |  Retrieves all available metrics of the specified metric types that can be queried for a specified DB instance.  | 

**Topics**
+ [AWS CLI for Performance Insights](#performance-insights-metrics-CLI)
+ [Retrieving time-series metrics](#performance-insights-metrics-time-series)
+ [AWS CLI examples for Performance Insights](#performance-insights-metrics-api-examples)

## AWS CLI for Performance Insights
<a name="performance-insights-metrics-CLI"></a>

You can view Performance Insights data using the AWS CLI. To view help for the Performance Insights AWS CLI commands, enter the following on the command line.

```
aws pi help
```

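If you don't have the AWS CLI installed, see [Installing the AWS Command Line Interface](https://docs.aws.amazon.com/cli/latest/userguide/installing.html) in the *AWS CLI User Guide* for information about installing it.

## Retrieving time-series metrics
<a name="performance-insights-metrics-time-series"></a>

The `GetResourceMetrics` operation retrieves one or more time-series metrics from the Performance Insights data. `GetResourceMetrics` requires a metric and a time period, and returns a response that contains a list of data points.

For example, the AWS Management Console uses `GetResourceMetrics` to populate the **Counter Metrics** chart and the **Database Load** chart, as shown in the following image.

![\[Counter Metrics and Database Load charts\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/perf-insights-api-charts.png)


All metrics returned by `GetResourceMetrics` are standard time-series metrics, with the exception of `db.load`. This metric is displayed in the **Database Load** chart. The `db.load` metric is different from the other time-series metrics because you can break it into subcomponents called *dimensions*. In the previous image, `db.load` is broken down and grouped by the wait states that make up the `db.load`.

**Note**  
`GetResourceMetrics` can also return the `db.sampleload` metric, but the `db.load` metric is appropriate in most cases.

For information about the counter metrics returned by `GetResourceMetrics`, see [Counter metrics for Performance Insights](performance-insights-counter-metrics.md).

The following calculations are supported for the metrics:
+ Average: The average value of the metric over a period of time. Append `.avg` to the metric name.
+ Minimum: The minimum value of the metric over a period of time. Append `.min` to the metric name.
+ Maximum: The maximum value of the metric over a period of time. Append `.max` to the metric name.
+ Sum: The sum of the metric values over a period of time. Append `.sum` to the metric name.
+ Sample count: The number of times the metric was collected over a period of time. Append `.sample_count` to the metric name.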

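For example, assume that a metric is collected for 300 seconds (5 minutes), and that the metric is collected one time each minute. The values for each minute are 1, 2, 3, 4, and 5. In this case, the following calculations are returned:
+ Average: 3
+ Minimum: 1
+ Maximum: 5
+ Sum: 15
+ Sample count: 5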
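The five calculations can be reproduced locally to check the arithmetic. The following is a minimal sketch, not part of the Performance Insights API: the helper name `summarize` is hypothetical, and `os.cpuUtilization.user` is used only as a sample metric name to show how the suffixes are appended.

```python
def summarize(metric, samples):
    """Return the five supported calculations, keyed by suffixed metric name."""
    return {
        f"{metric}.avg": sum(samples) / len(samples),
        f"{metric}.min": min(samples),
        f"{metric}.max": max(samples),
        f"{metric}.sum": sum(samples),
        f"{metric}.sample_count": len(samples),
    }


# One sample per minute over a 300-second period, as in the example above.
result = summarize("os.cpuUtilization.user", [1, 2, 3, 4, 5])
for name, value in result.items():
    print(name, value)
```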

For information about using the `get-resource-metrics` AWS CLI command, see [get-resource-metrics](https://docs.aws.amazon.com/cli/latest/reference/pi/get-resource-metrics.html).

For the `--metric-queries` option, specify one or more queries that you want to get results for. Each query consists of a mandatory `Metric` parameter and optional `GroupBy` and `Filter` parameters. The following is an example of a `--metric-queries` option specification.

```
{
   "Metric": "string",
   "GroupBy": {
     "Group": "string",
     "Dimensions": ["string", ...],
     "Limit": integer
   },
   "Filter": {"string": "string"
     ...}
}
```
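If you generate the `--metric-queries` value programmatically, a small builder keeps the structure consistent with the specification above. The following sketch is an illustration only: the function name `metric_query` and its parameters are hypothetical, and it only assembles the JSON document rather than calling any AWS API.

```python
import json


def metric_query(metric, group=None, dimensions=None, limit=None, filters=None):
    """Assemble one query: Metric is required, GroupBy and Filter are optional."""
    query = {"Metric": metric}
    if group is not None:
        group_by = {"Group": group}
        if dimensions:
            group_by["Dimensions"] = list(dimensions)
        if limit is not None:
            group_by["Limit"] = int(limit)
        query["GroupBy"] = group_by
    if filters:
        query["Filter"] = dict(filters)
    return query


# A list of queries, serialized in the form accepted by --metric-queries.
queries = [metric_query("db.load.avg", group="db.wait_state", limit=7)]
print(json.dumps(queries, indent=4))
```

The printed JSON can be saved to a file and passed to the command with the `file://` prefix.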

## AWS CLI examples for Performance Insights
<a name="performance-insights-metrics-api-examples"></a>

The following examples show how to use the AWS CLI for Performance Insights.

**Topics**
+ [Retrieving counter metrics](#performance-insights-metrics-api-examples.CounterMetrics)
+ [Retrieving the DB load average for top wait states](#performance-insights-metrics-api-examples.DBLoadAverage)
+ [Retrieving the DB load average for top queries](#performance-insights-metrics-api-examples.topquery)
+ [Retrieving the DB load average filtered by query](#performance-insights-metrics-api-examples.DBLoadAverageByQuery)

### Retrieving counter metrics
<a name="performance-insights-metrics-api-examples.CounterMetrics"></a>

The following screenshot shows two counter metrics charts in the AWS Management Console.

![\[Counter metrics charts.\]](http://docs.aws.amazon.com/zh_cn/documentdb/latest/developerguide/images/performance-insights/perf-insights-api-counters-charts.png)


The following example shows how to gather the same data that the AWS Management Console uses to generate the two counter metrics charts.

For Linux, macOS, or Unix:

```
aws pi get-resource-metrics \
   --service-type DOCDB \
   --identifier db-ID \
   --start-time 2022-03-13T08:00:00Z \
   --end-time   2022-03-13T09:00:00Z \
   --period-in-seconds 60 \
   --metric-queries '[{"Metric": "os.cpuUtilization.user.avg"  },
                      {"Metric": "os.cpuUtilization.idle.avg"}]'
```

For Windows:

```
aws pi get-resource-metrics ^
   --service-type DOCDB ^
   --identifier db-ID ^
   --start-time 2022-03-13T08:00:00Z ^
   --end-time   2022-03-13T09:00:00Z ^
   --period-in-seconds 60 ^
   --metric-queries '[{"Metric": "os.cpuUtilization.user.avg"  },
                      {"Metric": "os.cpuUtilization.idle.avg"}]'
```

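You can also make a command easier to read by specifying a file for the `--metric-queries` option. The following example uses a file called query.json for the option. The file has the following contents.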

```
[
    {
        "Metric": "os.cpuUtilization.user.avg"
    },
    {
        "Metric": "os.cpuUtilization.idle.avg"
    }
]
```

Run the following command to use the file.

For Linux, macOS, or Unix:

```
aws pi get-resource-metrics \
   --service-type DOCDB \
   --identifier db-ID \
   --start-time 2022-03-13T08:00:00Z \
   --end-time   2022-03-13T09:00:00Z \
   --period-in-seconds 60 \
   --metric-queries file://query.json
```

For Windows:

```
aws pi get-resource-metrics ^
   --service-type DOCDB ^
   --identifier db-ID ^
   --start-time 2022-03-13T08:00:00Z ^
   --end-time   2022-03-13T09:00:00Z ^
   --period-in-seconds 60 ^
   --metric-queries file://query.json
```

The preceding example specifies the following values for the options:
+ `--service-type`: `DOCDB` for Amazon DocumentDB
+ `--identifier`: The resource ID for the DB instance
+ `--start-time` and `--end-time`: The ISO 8601 `DateTime` values for the period to query, with multiple supported formats

It queries for a one-hour time range:
+ `--period-in-seconds`: `60` for a per-minute query
+ `--metric-queries`: An array of two queries, each for just one metric.

  The metric name uses dots to classify the metric in a useful category, with the final element being a function. In the example, the function is `avg` for each query. As with Amazon CloudWatch, the supported functions are `min`, `max`, `total`, and `avg`.

The response looks similar to the following.

```
{
    "AlignedStartTime": "2022-03-13T08:00:00+00:00",
    "AlignedEndTime": "2022-03-13T09:00:00+00:00",
    "Identifier": "db-NQF3TTMFQ3GTOKIMJODMC3KQQ4",
    "MetricList": [
        {
            "Key": {
                "Metric": "os.cpuUtilization.user.avg"
            },
            "DataPoints": [
                {
                    "Timestamp": "2022-03-13T08:01:00+00:00", //Minute1
                    "Value": 3.6
                },
                {
                    "Timestamp": "2022-03-13T08:02:00+00:00", //Minute2
                    "Value": 2.6
                },
                //.... 60 datapoints for the os.cpuUtilization.user.avg metric
            ]
        },
        {
            "Key": {
                "Metric": "os.cpuUtilization.idle.avg"
            },
            "DataPoints": [
                {
                    "Timestamp": "2022-03-13T08:01:00+00:00",
                    "Value": 92.7
                },
                {
                    "Timestamp": "2022-03-13T08:02:00+00:00",
                    "Value": 93.7
                },
                //.... 60 datapoints for the os.cpuUtilization.idle.avg metric
            ]
        }
    ] //end of MetricList
} //end of response
```

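The response has an `Identifier`, `AlignedStartTime`, and `AlignedEndTime`. Because the `--period-in-seconds` value was `60`, the start and end times have been aligned to the minute. If `--period-in-seconds` was `3600`, the start and end times would have been aligned to the hour.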
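The alignment can be approximated locally to predict the boundaries that a request will cover. The following sketch rests on an assumption that is not stated in this guide: the start time is rounded down and the end time is rounded up to a multiple of the period, counted from the Unix epoch in UTC. The function name `align` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def align(start, end, period_seconds):
    """Round start down and end up to period boundaries (assumed behavior)."""
    period = timedelta(seconds=period_seconds)
    # Floor division of timedeltas yields a whole number of periods.
    aligned_start = EPOCH + ((start - EPOCH) // period) * period
    # Negating before and after the floor division rounds the end time up.
    aligned_end = EPOCH - ((EPOCH - end) // period) * period
    return aligned_start, aligned_end


start = datetime(2022, 3, 13, 8, 0, 30, tzinfo=timezone.utc)
end = datetime(2022, 3, 13, 8, 59, 10, tzinfo=timezone.utc)
print(align(start, end, 60))
print(align(start, end, 3600))
```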

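The `MetricList` in the response has a number of entries, each with a `Key` and a `DataPoints` entry. Each `DataPoint` has a `Timestamp` and a `Value`. Each `DataPoints` list has 60 data points because the queries are for per-minute data over an hour, with `Timestamp1/Minute1`, `Timestamp2/Minute2`, and so on, up to `Timestamp60/Minute60`.

Because the query is for two different counter metrics, there are two elements in the response `MetricList`.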
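Once the response is parsed as JSON, the structure described above maps naturally onto a dictionary keyed by metric name. The following sketch uses a shortened copy of the sample response (two data points per metric instead of 60), and the helper name `series_by_metric` is hypothetical.

```python
# Shortened copy of the sample response: two data points per metric.
response = {
    "AlignedStartTime": "2022-03-13T08:00:00+00:00",
    "AlignedEndTime": "2022-03-13T09:00:00+00:00",
    "MetricList": [
        {
            "Key": {"Metric": "os.cpuUtilization.user.avg"},
            "DataPoints": [
                {"Timestamp": "2022-03-13T08:01:00+00:00", "Value": 3.6},
                {"Timestamp": "2022-03-13T08:02:00+00:00", "Value": 2.6},
            ],
        },
        {
            "Key": {"Metric": "os.cpuUtilization.idle.avg"},
            "DataPoints": [
                {"Timestamp": "2022-03-13T08:01:00+00:00", "Value": 92.7},
                {"Timestamp": "2022-03-13T08:02:00+00:00", "Value": 93.7},
            ],
        },
    ],
}


def series_by_metric(resp):
    """Map each metric name to its list of (timestamp, value) pairs."""
    return {
        entry["Key"]["Metric"]: [
            (point["Timestamp"], point["Value"]) for point in entry["DataPoints"]
        ]
        for entry in resp["MetricList"]
    }


series = series_by_metric(response)
# A one-hour range at a 60-second period yields 3600 // 60 points per metric.
expected_points = 3600 // 60
print(sorted(series), expected_points)
```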

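### Retrieving the DB load average for top wait states
<a name="performance-insights-metrics-api-examples.DBLoadAverage"></a>

The following example is the same query that the AWS Management Console uses to generate a stacked area line graph. This example retrieves the `db.load.avg` for the last hour, with the load divided according to the top seven wait states. The command is the same as the command in [Retrieving counter metrics](#performance-insights-metrics-api-examples.CounterMetrics). However, the query.json file has the following contents.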

```
[
    {
        "Metric": "db.load.avg",
        "GroupBy": { "Group": "db.wait_state", "Limit": 7 }
    }
]
```

Run the following command.

For Linux, macOS, or Unix:

```
aws pi get-resource-metrics \
   --service-type DOCDB \
   --identifier db-ID \
   --start-time 2022-03-13T08:00:00Z \
   --end-time   2022-03-13T09:00:00Z \
   --period-in-seconds 60 \
   --metric-queries file://query.json
```

For Windows:

```
aws pi get-resource-metrics ^
   --service-type DOCDB ^
   --identifier db-ID ^
   --start-time 2022-03-13T08:00:00Z ^
   --end-time   2022-03-13T09:00:00Z ^
   --period-in-seconds 60 ^
   --metric-queries file://query.json
```

The example specifies the metric of `db.load.avg` and a `GroupBy` of the top seven wait states. For details about valid values for this example, see [DimensionGroup](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_DimensionGroup.html) in the *Performance Insights API Reference*.

The response looks similar to the following.

```
{
    "AlignedStartTime": "2022-04-04T06:00:00+00:00",
    "AlignedEndTime": "2022-04-04T06:15:00+00:00",
    "Identifier": "db-NQF3TTMFQ3GTOKIMJODMC3KQQ4",
    "MetricList": [
        {//A list of key/datapoints
            "Key": {
                //A Metric with no dimensions. This is the total db.load.avg
                "Metric": "db.load.avg"
            },
            "DataPoints": [
                //Each list of datapoints has the same timestamps and same number of items
                {
                    "Timestamp": "2022-04-04T06:01:00+00:00",//Minute1
                    "Value": 0.0
                },
                {
                    "Timestamp": "2022-04-04T06:02:00+00:00",//Minute2
                    "Value": 0.0
                },
                //... 60 datapoints for the total db.load.avg key
                ]
        },
        {
            "Key": {
                //Another key. This is db.load.avg broken down by CPU
                "Metric": "db.load.avg",
                "Dimensions": {
                    "db.wait_state.name": "CPU"
                }
            },
            "DataPoints": [
                {
                    "Timestamp": "2022-04-04T06:01:00+00:00",//Minute1
                    "Value": 0.0
                },
                {
                    "Timestamp": "2022-04-04T06:02:00+00:00",//Minute2
                    "Value": 0.0
                },
                //... 60 datapoints for the CPU key
            ]
        },//... In total we have 3 key/datapoints entries, 1) total, 2-3) Top Wait States
    ] //end of MetricList
} //end of response
```

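In this response, there are three entries in the `MetricList`. There is one entry for the total `db.load.avg`, and two entries, one for each of the top wait states that `db.load.avg` is divided by. Because there is a grouping dimension (unlike the first example), there must be one key for each grouping of the metric. There can't be only one key for each metric, as in the basic counter metric use case.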
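Because each grouping has its own key, a consumer typically indexes the entries by dimension value. The following sketch averages the data points for each wait state. The data point values are made up for illustration (the sample response above contains only `0.0`), and the helper name `load_by_wait_state` is hypothetical.

```python
# Hypothetical data point values; only the structure follows the sample response.
response = {
    "MetricList": [
        {
            "Key": {"Metric": "db.load.avg"},
            "DataPoints": [
                {"Timestamp": "2022-04-04T06:01:00+00:00", "Value": 1.5},
                {"Timestamp": "2022-04-04T06:02:00+00:00", "Value": 2.5},
            ],
        },
        {
            "Key": {
                "Metric": "db.load.avg",
                "Dimensions": {"db.wait_state.name": "CPU"},
            },
            "DataPoints": [
                {"Timestamp": "2022-04-04T06:01:00+00:00", "Value": 1.0},
                {"Timestamp": "2022-04-04T06:02:00+00:00", "Value": 2.0},
            ],
        },
    ]
}


def load_by_wait_state(resp):
    """Average the data points of each entry, keyed by wait state name."""
    averages = {}
    for entry in resp["MetricList"]:
        dimensions = entry["Key"].get("Dimensions", {})
        # The entry without dimensions is the total db.load.avg.
        label = dimensions.get("db.wait_state.name", "total")
        values = [p["Value"] for p in entry["DataPoints"] if p.get("Value") is not None]
        averages[label] = sum(values) / len(values) if values else 0.0
    return averages


averages = load_by_wait_state(response)
print(averages)
```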

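### Retrieving the DB load average for top queries
<a name="performance-insights-metrics-api-examples.topquery"></a>

The following example breaks down the database load (`db.load.avg`) by the top 10 query statements. There are two different groups for query statements:
+ `db.query`: The full query statement, such as `{"find":"customers","filter":{"FirstName":"Jesse"},"sort":{"key":{"$numberInt":"1"}}}`
+ `db.query_tokenized`: The tokenized query statement, such as `{"find":"customers","filter":{"FirstName":"?"},"sort":{"key":{"$numberInt":"?"}},"limit":{"$numberInt":"?"}}`

When analyzing database performance, it can be useful to consider query statements that differ only by their parameters as one logical item. So, you can use `db.query_tokenized` when querying. However, especially when you're interested in `explain()`, it is sometimes more useful to examine full query statements with parameters. There is a parent-child relationship between tokenized and full queries, with multiple full queries (children) grouped under the same tokenized query (parent).

The command in this example is similar to the command in [Retrieving the DB load average for top wait states](#performance-insights-metrics-api-examples.DBLoadAverage). However, the query.json file has the following contents.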

```
[
    {
        "Metric": "db.load.avg",
        "GroupBy": { "Group": "db.query_tokenized", "Limit": 10 }
    }
]
```

The following example uses `db.query_tokenized`.

For Linux, macOS, or Unix:

```
aws pi get-resource-metrics \
   --service-type DOCDB \
   --identifier db-ID \
   --start-time 2022-03-13T08:00:00Z \
   --end-time   2022-03-13T09:00:00Z \
   --period-in-seconds 3600 \
   --metric-queries file://query.json
```

For Windows:

```
aws pi get-resource-metrics ^
   --service-type DOCDB ^
   --identifier db-ID ^
   --start-time 2022-03-13T08:00:00Z ^
   --end-time   2022-03-13T09:00:00Z ^
   --period-in-seconds 3600 ^
   --metric-queries file://query.json
```

This example queries a one-hour time range with a `--period-in-seconds` value of `3600` (one hour).

The example specifies the metric of `db.load.avg` and a `GroupBy` of the top 10 tokenized queries. For details about valid values for this example, see [DimensionGroup](https://docs.aws.amazon.com/performance-insights/latest/APIReference/API_DimensionGroup.html) in the *Performance Insights API Reference*.

The response looks similar to the following.

```
{
    "AlignedStartTime": "2022-04-04T06:00:00+00:00",
    "AlignedEndTime": "2022-04-04T06:15:00+00:00",
    "Identifier": "db-NQF3TTMFQ3GTOKIMJODMC3KQQ4",
    "MetricList": [
        {//A list of key/datapoints
            "Key": {
                "Metric": "db.load.avg"
            },
            "DataPoints": [
                //... 60 datapoints for the total db.load.avg key
                ]
        },
        {
            "Key": {//Next key are the top tokenized queries
                "Metric": "db.load.avg",
                "Dimensions": {
                    "db.query_tokenized.db_id": "pi-1064184600",
                    "db.query_tokenized.id": "77DE8364594EXAMPLE",
                    "db.query_tokenized.statement": "{\"find\":\"customers\",\"filter\":{\"FirstName\":\"?\"},\"sort\":{\"key\":{\"$numberInt\":\"?\"}},\"limit\":{\"$numberInt\":\"?\"},\"$db\":\"myDB\",\"$readPreference\":{\"mode\":\"primary\"}}"
                }
            },
            "DataPoints": [
            //... 60 datapoints 
            ]
        },
        // In total 11 entries, 10 Keys of top tokenized queries, 1 total key 
    ] //End of MetricList
} //End of response
```

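This response has 11 entries in the `MetricList` (1 total, 10 top tokenized queries). Each entry has its own `DataPoints` list, and the number of data points in it depends on the queried time range and the `--period-in-seconds` value.

For tokenized queries, there are three entries in each dimensions list:
+ `db.query_tokenized.statement`: The tokenized query statement.
+ `db.query_tokenized.db_id`: The synthetic ID that Performance Insights generates for you. This example returns the `pi-1064184600` synthetic ID.
+ `db.query_tokenized.id`: The ID of the query inside Performance Insights.

  In the AWS Management Console, this ID is called the Support ID. It's named this because the ID is data that AWS Support can examine to help you troubleshoot an issue with your database. AWS takes the security and privacy of your data extremely seriously, and almost all data is stored encrypted with your AWS KMS key. Therefore, nobody inside AWS can look at this data. In the preceding example, both the `tokenized.statement` and the `tokenized.db_id` are stored encrypted. If you have an issue with your database, AWS Support can help you by referencing the Support ID.

When querying, it might be convenient to specify a `Group` in `GroupBy`. However, for finer-grained control over the data that's returned, specify the list of dimensions. For example, if all that you need is the `db.query_tokenized.statement`, you can add a `Dimensions` attribute to the query.json file.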

```
[
    {
        "Metric": "db.load.avg",
        "GroupBy": {
            "Group": "db.query_tokenized",
            "Dimensions":["db.query_tokenized.statement"],
            "Limit": 10
        }
    }
]
```

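### Retrieving the DB load average filtered by query
<a name="performance-insights-metrics-api-examples.DBLoadAverageByQuery"></a>

The corresponding API query in this example is similar to the command in [Retrieving the DB load average for top queries](#performance-insights-metrics-api-examples.topquery). However, the query.json file has the following contents.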

```
[
 {
        "Metric": "db.load.avg",
        "GroupBy": { "Group": "db.wait_state", "Limit": 5  }, 
        "Filter": { "db.query_tokenized.id": "AKIAIOSFODNN7EXAMPLE" }
    }
]
```

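In this response, all values are filtered according to the contribution of the tokenized query `AKIAIOSFODNN7EXAMPLE` specified in the query.json file. The keys might also follow a different order than in a query without a filter, because they are the top five wait states that affected the filtered query.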

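# Amazon CloudWatch metrics for Performance Insights
<a name="performance-insights-cloudwatch"></a>

Performance Insights automatically publishes metrics to Amazon CloudWatch. The same data can be queried from Performance Insights, but having the metrics in CloudWatch makes it easy to add CloudWatch alarms. It also makes it easy to add the metrics to existing CloudWatch dashboards.


| Metric | Description | 
| --- | --- | 
|  DBLoad  |  The number of active sessions for Amazon DocumentDB. Typically, you want the data for the average number of active sessions. In Performance Insights, this data is queried as `db.load.avg`.  | 
|  DBLoadCPU  |  The number of active sessions where the wait state type is CPU. In Performance Insights, this data is queried as `db.load.avg`, filtered by the wait state type `CPU`.  | 
|  DBLoadNonCPU  |  The number of active sessions where the wait state type is not CPU.  | 

**Note**  
These metrics are published to CloudWatch only if there is load on the DB instance.

You can examine these metrics using the CloudWatch console, the AWS CLI, or the CloudWatch API.

For example, you can get the statistics for the `DBLoad` metric by running the [get-metric-statistics](https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-statistics.html) command.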

```
aws cloudwatch get-metric-statistics \
    --region ap-south-1 \
    --namespace AWS/DocDB \
    --metric-name DBLoad  \
    --period 360 \
    --statistics Average \
    --start-time 2022-03-14T08:00:00Z \
    --end-time 2022-03-14T09:00:00Z \
    --dimensions Name=DBInstanceIdentifier,Value=documentdbinstance
```

This example generates output similar to the following.

```
{
    "Datapoints": [
        {
            "Timestamp": "2022-03-14T08:42:00Z", 
            "Average": 1.0, 
            "Unit": "None"
        }, 
        {
            "Timestamp": "2022-03-14T08:24:00Z", 
            "Average": 2.0, 
            "Unit": "None"
        }, 
        {
            "Timestamp": "2022-03-14T08:54:00Z", 
            "Average": 6.0, 
            "Unit": "None"
        }, 
        {
            "Timestamp": "2022-03-14T08:36:00Z", 
            "Average": 5.7, 
            "Unit": "None"
        }, 
        {
            "Timestamp": "2022-03-14T08:06:00Z", 
            "Average": 4.0, 
            "Unit": "None"
        }, 
        {
            "Timestamp": "2022-03-14T08:00:00Z", 
            "Average": 5.2, 
            "Unit": "None"
        }
    ], 
    "Label": "DBLoad"
}
```
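The data points in this output are not in chronological order, so it is usually worth sorting them before charting or comparing values. The following sketch copies the data points from the sample output and orders them by timestamp. The variable names are illustrative only.

```python
from datetime import datetime

# Data points copied from the sample output above (unordered).
datapoints = [
    {"Timestamp": "2022-03-14T08:42:00Z", "Average": 1.0},
    {"Timestamp": "2022-03-14T08:24:00Z", "Average": 2.0},
    {"Timestamp": "2022-03-14T08:54:00Z", "Average": 6.0},
    {"Timestamp": "2022-03-14T08:36:00Z", "Average": 5.7},
    {"Timestamp": "2022-03-14T08:06:00Z", "Average": 4.0},
    {"Timestamp": "2022-03-14T08:00:00Z", "Average": 5.2},
]


def parse_timestamp(text):
    """Parse an ISO 8601 UTC timestamp that ends in "Z"."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


ordered = sorted(datapoints, key=lambda point: parse_timestamp(point["Timestamp"]))
peak = max(datapoints, key=lambda point: point["Average"])

for point in ordered:
    print(point["Timestamp"], point["Average"])
print("Peak DBLoad:", peak["Average"], "at", peak["Timestamp"])
```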

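You can use the `DB_PERF_INSIGHTS` metric math function in the CloudWatch console to query Amazon DocumentDB Performance Insights counter metrics. The `DB_PERF_INSIGHTS` function also includes the `DBLoad` metric at sub-minute intervals. You can set CloudWatch alarms on these metrics. For more details on how to create an alarm, see [Create an alarm on Performance Insights counter metrics from an AWS database](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_alarm_database_performance_insights.html).

For more information about CloudWatch, see [What is Amazon CloudWatch?](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html) in the *Amazon CloudWatch User Guide*.

# Counter metrics for Performance Insights
<a name="performance-insights-counter-metrics"></a>

Counter metrics are operating system metrics in the Performance Insights dashboard. To help identify and analyze performance problems, you can correlate counter metrics with DB load.

## Performance Insights operating system counters
<a name="performance-insights-counter-metrics-counters"></a>

The following operating system counters are available with Amazon DocumentDB Performance Insights.


| Counter | Type | Metric | 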
| --- | --- | --- | 
| active | memory | os.memory.active | 
| buffers | memory | os.memory.buffers | 
| cached | memory | os.memory.cached | 
| dirty | memory | os.memory.dirty | 
| free | memory | os.memory.free | 
| inactive | memory | os.memory.inactive | 
| mapped | memory | os.memory.mapped | 
| pageTables | memory | os.memory.pageTables | 
| slab | memory | os.memory.slab | 
| total | memory | os.memory.total | 
| writeback | memory | os.memory.writeback | 
| idle | cpuUtilization | os.cpuUtilization.idle | 
| system | cpuUtilization | os.cpuUtilization.system | 
| total | cpuUtilization | os.cpuUtilization.total | 
| user | cpuUtilization | os.cpuUtilization.user | 
| wait | cpuUtilization | os.cpuUtilization.wait | 
| one | loadAverageMinute | os.loadAverageMinute.one | 
| fifteen | loadAverageMinute | os.loadAverageMinute.fifteen | 
| five | loadAverageMinute | os.loadAverageMinute.five | 
| cached | swap | os.swap.cached | 
| free | swap | os.swap.free | 
| in | swap | os.swap.in | 
| out | swap | os.swap.out | 
| total | swap | os.swap.total | 
| rx | network | os.network.rx | 
| tx | network | os.network.tx | 
| numVCPUs | general | os.general.numVCPUs |