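Recommended metrics - Amazon CloudWatch

Recommended metrics

The following table lists the recommended metrics for each component type.

Component type | Workload type | Recommended metrics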

EC2 instance (Windows Server)

Default/Custom

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

LogicalDisk % Free Space

Memory Available Mbytes

Active Directory

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

Memory Available Mbytes

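Database ==> Instances Database Cache % Hit

DirectoryServices DRA Pending Replication Operations

DirectoryServices DRA Pending Replication Synchronizations

DNS Recursive Query Failure/sec

LogicalDisk Avg. Disk Queue Length

Java application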

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

Memory Available Mbytes

java_lang_threading_threadcount

java_lang_classloading_loadedclasscount

java_lang_memory_heapmemoryusage_used

java_lang_memory_heapmemoryusage_committed

java_lang_operatingsystem_freephysicalmemorysize

java_lang_operatingsystem_freeswapspacesize

Microsoft IIS/.NET Web front-end

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

Memory Available Mbytes

.NET CLR Exceptions # of Exceps Thrown/Sec

.NET CLR Memory # Total Committed Bytes

.NET CLR Memory % Time in GC

ASP.NET Applications Requests in Application Queue

ASP.NET Requests Queued

ASP.NET Application Restarts

Microsoft SQL Server database tier

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

Memory Available Mbytes

Paging File % Usage

System Processor Queue Length

Network Interface Bytes Total/Sec

PhysicalDisk % Disk Time

SQLServer:Buffer Manager Buffer Cache Hit ratio

SQLServer:Buffer Manager Page Life Expectancy

SQLServer:General Statistics Processes Blocked

SQLServer:General Statistics User Connections

SQLServer:Locks Number of Deadlocks/Sec

SQLServer:SQL Statistics Batch Requests/Sec

MySQL

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

LogicalDisk % Free Space

Memory Available Mbytes

.NET worker pool/mid-tier

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

Memory Available Mbytes

.NET CLR Exceptions # of Exceps Thrown/Sec

.NET CLR Memory # Total Committed Bytes

.NET CLR Memory % Time in GC

.NET Core tier

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

Memory Available Mbytes

Oracle

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

LogicalDisk % Free Space

Memory Available Mbytes

Postgres

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

LogicalDisk % Free Space

Memory Available Mbytes

SharePoint

CPUUtilization

StatusCheckFailed

Processor % Processor Time

Memory % Committed Bytes In Use

Memory Available Mbytes

ASP.NET Applications Cache API Trims

ASP.NET Requests Rejected

ASP.NET Worker Process Restarts

Memory Pages/sec

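SharePoint Publishing Cache Publishing cache flushes / second

SharePoint Foundation Executing Time/Page Request

SharePoint Disk-Based Cache Total number of cache compactions

SharePoint Disk-Based Cache Blob Cache hit ratio

SharePoint Disk-Based Cache Blob Cache fill ratio

SharePoint Disk-Based Cache Blob cache flushes / second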

ASP.NET Requests Queued

ASP.NET Applications Requests in Application Queue

ASP.NET Application Restarts

LogicalDisk Avg. Disk sec/Write

LogicalDisk Avg. Disk sec/Read

Processor % Interrupt Time

EC2 instance (Linux Server)

Default/Custom

CPUUtilization

StatusCheckFailed

disk_used_percent

mem_used_percent

Java application

CPUUtilization

StatusCheckFailed

disk_used_percent

mem_used_percent

java_lang_threading_threadcount

java_lang_classloading_loadedclasscount

java_lang_memory_heapmemoryusage_used

java_lang_memory_heapmemoryusage_committed

java_lang_operatingsystem_freephysicalmemorysize

java_lang_operatingsystem_freeswapspacesize

.NET Core tier or SQL Server database tier

CPUUtilization

StatusCheckFailed

disk_used_percent

mem_used_percent

Oracle

CPUUtilization

StatusCheckFailed

disk_used_percent

mem_used_percent

Postgres

CPUUtilization

StatusCheckFailed

disk_used_percent

mem_used_percent

EC2 instance group

SAP HANA multi-node or single-node
  • hanadb_server_startup_time_variations_seconds

  • hanadb_level_5_alerts_count

  • hanadb_level_4_alerts_count

  • hanadb_out_of_memory_events_count

  • hanadb_max_trigger_read_ratio_percent

  • hanadb_max_trigger_write_ratio_percent

  • hanadb_log_switch_race_ratio_percent

  • hanadb_time_since_last_savepoint_seconds

  • hanadb_disk_usage_highlevel_percent

  • hanadb_current_allocation_limit_used_percent

  • hanadb_table_allocation_limit_used_percent

  • hanadb_cpu_usage_percent

  • hanadb_plan_cache_hit_ratio_percent

  • hanadb_last_data_backup_age_days

EBS volume

Any

VolumeReadBytes

VolumeWriteBytes

VolumeReadOps

VolumeWriteOps

VolumeQueueLength

VolumeThroughputPercentage

VolumeConsumedReadWriteOps

BurstBalance

Classic ELB

Any

HTTPCode_Backend_4XX

HTTPCode_Backend_5XX

Latency

SurgeQueueLength

UnHealthyHostCount

Application ELB

Any

HTTPCode_Target_4XX_Count

HTTPCode_Target_5XX_Count

TargetResponseTime

UnHealthyHostCount

RDS database instance

Any

CPUUtilization

ReadLatency

WriteLatency

BurstBalance

FailedSQLServerAgentJobsCount

RDS database cluster

Any

CPUUtilization

CommitLatency

DatabaseConnections

Deadlocks

FreeableMemory

NetworkThroughput

VolumeBytesUsed

Lambda function

Any

Duration

Errors

IteratorAge

ProvisionedConcurrencySpilloverInvocations

Throttles

SQS Queue

Any

ApproximateAgeOfOldestMessage

ApproximateNumberOfMessagesVisible

NumberOfMessagesSent

Amazon DynamoDB table

Any

SystemErrors

UserErrors

ConsumedReadCapacityUnits

ConsumedWriteCapacityUnits

ReadThrottleEvents

WriteThrottleEvents

ConditionalCheckFailedRequests

TransactionConflict

Amazon S3 bucket

Any

If a replication configuration with Replication Time Control (RTC) is enabled:

ReplicationLatency

BytesPendingReplication

OperationsPendingReplication

If request metrics are enabled:

5xxErrors

4xxErrors

BytesDownloaded

BytesUploaded

AWS Step Functions

Any
General
  • ExecutionThrottled

  • ExecutionsAborted

  • ProvisionedBucketSize

  • ProvisionedRefillRate

  • ConsumedCapacity

If the state machine type is EXPRESS or the log group level is OFF
  • ExecutionsFailed

  • ExecutionsTimedOut

If the state machine has Lambda functions
  • LambdaFunctionsFailed

  • LambdaFunctionsTimedOut

If the state machine has activities
  • ActivitiesFailed

  • ActivitiesTimedOut

  • ActivitiesHeartbeatTimedOut

If the state machine has service integrations
  • ServiceIntegrationsFailed

  • ServiceIntegrationsTimedOut

API Gateway REST API stage

Any
  • 4xxErrors

  • 5xxErrors

  • Latency

ECS cluster

Any

CpuUtilized

MemoryUtilized

NetworkRxBytes

NetworkTxBytes

RunningTaskCount

PendingTaskCount

StorageReadBytes

StorageWriteBytes

CPUReservation (EC2 launch type only)

CPUUtilization (EC2 launch type only)

MemoryReservation (EC2 launch type only)

MemoryUtilization (EC2 launch type only)

GPUReservation (EC2 launch type only)

instance_cpu_utilization (EC2 launch type only)

instance_filesystem_utilization (EC2 launch type only)

instance_memory_utilization (EC2 launch type only)

instance_network_total_bytes (EC2 launch type only)

Java application

CpuUtilized

MemoryUtilized

NetworkRxBytes

NetworkTxBytes

RunningTaskCount

PendingTaskCount

StorageReadBytes

StorageWriteBytes

CPUReservation (EC2 launch type only)

CPUUtilization (EC2 launch type only)

MemoryReservation (EC2 launch type only)

MemoryUtilization (EC2 launch type only)

GPUReservation (EC2 launch type only)

instance_cpu_utilization (EC2 launch type only)

instance_filesystem_utilization (EC2 launch type only)

instance_memory_utilization (EC2 launch type only)

instance_network_total_bytes (EC2 launch type only)

java_lang_threading_threadcount

java_lang_classloading_loadedclasscount

java_lang_memory_heapmemoryusage_used

java_lang_memory_heapmemoryusage_committed

java_lang_operatingsystem_freephysicalmemorysize

java_lang_operatingsystem_freeswapspacesize

ECS service

Any

CPUUtilization

MemoryUtilization

CpuUtilized

MemoryUtilized

NetworkRxBytes

NetworkTxBytes

RunningTaskCount

PendingTaskCount

StorageReadBytes

StorageWriteBytes

Java application

CPUUtilization

MemoryUtilization

CpuUtilized

MemoryUtilized

NetworkRxBytes

NetworkTxBytes

RunningTaskCount

PendingTaskCount

StorageReadBytes

StorageWriteBytes

java_lang_threading_threadcount

java_lang_classloading_loadedclasscount

java_lang_memory_heapmemoryusage_used

java_lang_memory_heapmemoryusage_committed

java_lang_operatingsystem_freephysicalmemorysize

java_lang_operatingsystem_freeswapspacesize

EKS cluster

Any

cluster_failed_node_count

node_cpu_reserved_capacity

node_cpu_utilization

node_filesystem_utilization

node_memory_reserved_capacity

node_memory_utilization

node_network_total_bytes

pod_cpu_reserved_capacity

pod_cpu_utilization

pod_cpu_utilization_over_pod_limit

pod_memory_reserved_capacity

pod_memory_utilization

pod_memory_utilization_over_pod_limit

pod_network_rx_bytes

pod_network_tx_bytes

Java application

cluster_failed_node_count

node_cpu_reserved_capacity

node_cpu_utilization

node_filesystem_utilization

node_memory_reserved_capacity

node_memory_utilization

node_network_total_bytes

pod_cpu_reserved_capacity

pod_cpu_utilization

pod_cpu_utilization_over_pod_limit

pod_memory_reserved_capacity

pod_memory_utilization

pod_memory_utilization_over_pod_limit

pod_network_rx_bytes

pod_network_tx_bytes

java_lang_threading_threadcount

java_lang_classloading_loadedclasscount

java_lang_memory_heapmemoryusage_used

java_lang_memory_heapmemoryusage_committed

java_lang_operatingsystem_freephysicalmemorysize

java_lang_operatingsystem_freeswapspacesize

Kubernetes cluster on EC2

Any

cluster_failed_node_count

node_cpu_reserved_capacity

node_cpu_utilization

node_filesystem_utilization

node_memory_reserved_capacity

node_memory_utilization

node_network_total_bytes

pod_cpu_reserved_capacity

pod_cpu_utilization

pod_cpu_utilization_over_pod_limit

pod_memory_reserved_capacity

pod_memory_utilization

pod_memory_utilization_over_pod_limit

pod_network_rx_bytes

pod_network_tx_bytes

Java application

cluster_failed_node_count

node_cpu_reserved_capacity

node_cpu_utilization

node_filesystem_utilization

node_memory_reserved_capacity

node_memory_utilization

node_network_total_bytes

pod_cpu_reserved_capacity

pod_cpu_utilization

pod_cpu_utilization_over_pod_limit

pod_memory_reserved_capacity

pod_memory_utilization

pod_memory_utilization_over_pod_limit

pod_network_rx_bytes

pod_network_tx_bytes

java_lang_threading_threadcount

java_lang_classloading_loadedclasscount

java_lang_memory_heapmemoryusage_used

java_lang_memory_heapmemoryusage_committed

java_lang_operatingsystem_freephysicalmemorysize

java_lang_operatingsystem_freeswapspacesize
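
As a rough illustration of how the table above might be used, the following sketch stores two of its rows in a dictionary and builds a dictionary shaped like the request parameters of the CloudWatch PutMetricAlarm API for one of the listed metrics. The helper names (`recommended_metrics`, `alarm_parameters`), the alarm naming scheme, the queue name `example-queue`, and every threshold, period, and statistic are assumptions made for this example; they are not recommendations from this page. No AWS call is made.

```python
# Excerpt of the table above, keyed by component type. Both components
# list a single workload type, so the workload type is omitted here.
RECOMMENDED_METRICS = {
    "SQS Queue": [
        "ApproximateAgeOfOldestMessage",
        "ApproximateNumberOfMessagesVisible",
        "NumberOfMessagesSent",
    ],
    "Application ELB": [
        "HTTPCode_Target_4XX_Count",
        "HTTPCode_Target_5XX_Count",
        "TargetResponseTime",
        "UnHealthyHostCount",
    ],
}


def recommended_metrics(component_type):
    """Return a copy of the metric names for a row, or [] if it is not in the excerpt."""
    return list(RECOMMENDED_METRICS.get(component_type, []))


def alarm_parameters(metric_name, namespace, dimensions, threshold):
    """Build a dict shaped like PutMetricAlarm request parameters.

    Statistic, period, and evaluation periods are placeholders chosen
    for this sketch only.
    """
    return {
        "AlarmName": f"{metric_name}-alarm",  # hypothetical naming scheme
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "Statistic": "Maximum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


if __name__ == "__main__":
    for name in recommended_metrics("SQS Queue"):
        print(name)
    # "example-queue" and the 600-second threshold are hypothetical values.
    print(alarm_parameters(
        "ApproximateAgeOfOldestMessage",
        "AWS/SQS",
        {"QueueName": "example-queue"},
        600,
    ))
```

The resulting dictionary could be passed to an SDK client as keyword arguments, but suitable thresholds depend entirely on the workload.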

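The following table lists the recommended processes and process metrics for each component type. CloudWatch Application Insights does not recommend process monitoring for processes that are not running on an instance.

Component type | Workload type | Recommended processes | Recommended metrics

EC2 instance (Windows Server)

Microsoft IIS/.NET Web front-end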

w3wp

procstat cpu_usage,

procstat memory_rss,

procstat memory_vms,

procstat read_bytes,

procstat write_bytes

Microsoft SQL Server database tier

SQLAgent

procstat cpu_usage,

procstat memory_rss,

procstat memory_vms,

procstat read_bytes,

procstat write_bytes

sqlservr

procstat cpu_usage,

procstat memory_rss,

procstat memory_vms,

procstat read_bytes,

procstat write_bytes

sqlwriter

procstat cpu_usage,

procstat memory_rss

ReportingServicesService

procstat cpu_usage,

procstat memory_rss

MsDtsServr

procstat cpu_usage,

procstat memory_rss,

procstat memory_vms,

procstat read_bytes,

procstat write_bytes

Msmdsrv

procstat cpu_usage,

procstat memory_rss,

procstat memory_vms,

procstat read_bytes,

procstat write_bytes

.NET worker pool/mid-tier

w3wp

procstat cpu_usage,

procstat memory_rss,

procstat memory_vms,

procstat read_bytes,

procstat write_bytes

.NET Core tier

w3wp

procstat cpu_usage,

procstat memory_rss,

procstat memory_vms,

procstat read_bytes,

procstat write_bytes
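
For readers who want to see what collecting these process metrics could look like outside of Application Insights, the sketch below generates a `procstat` section in the style of the CloudWatch agent configuration file from the process names and measurements in the table above. It assumes the agent's `procstat` entries accept an `exe` selector and a `measurement` list; the helper names `procstat_section` and `agent_config` are made up for this example, and Application Insights may configure monitoring differently.

```python
import json

# Measurement sets copied from the table above (Windows Server rows).
FULL = ["cpu_usage", "memory_rss", "memory_vms", "read_bytes", "write_bytes"]
BASIC = ["cpu_usage", "memory_rss"]

# Process name -> recommended procstat measurements, as listed in the table.
PROCESS_METRICS = {
    "w3wp": FULL,
    "SQLAgent": FULL,
    "sqlservr": FULL,
    "sqlwriter": BASIC,
    "ReportingServicesService": BASIC,
    "MsDtsServr": FULL,
    "Msmdsrv": FULL,
}


def procstat_section(process_names):
    """Return one procstat entry per process, selecting it by executable name."""
    return [
        {"exe": name, "measurement": list(PROCESS_METRICS[name])}
        for name in process_names
    ]


def agent_config(process_names):
    """Wrap the procstat entries in the metrics section of an agent config."""
    return {
        "metrics": {
            "metrics_collected": {"procstat": procstat_section(process_names)}
        }
    }


if __name__ == "__main__":
    # Two of the SQL Server database tier processes from the table.
    print(json.dumps(agent_config(["sqlservr", "sqlwriter"]), indent=2))
```

Keeping the measurement lists in one mapping makes it easy to see that `sqlwriter` and `ReportingServicesService` are listed with only two measurements, while the other processes are listed with five.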