Metrics analyzed by AWS Compute Optimizer

After you opt in, AWS Compute Optimizer analyzes the specifications of your running resources, such as vCPUs, memory, and storage, along with their CloudWatch metrics from the last 14 days. If you activate the enhanced infrastructure metrics recommendation preference, Compute Optimizer analyzes up to 93 days of metrics for your resources.
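As a rough illustration of the two lookback periods described above, the following Python sketch computes the earliest point in time that could fall inside the analysis window. The function name and the `enhanced_metrics` flag are hypothetical and are not part of any Compute Optimizer API.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_LOOKBACK_DAYS = 14   # default lookback period stated above
ENHANCED_LOOKBACK_DAYS = 93  # upper bound with enhanced infrastructure metrics


def lookback_window(now, enhanced_metrics=False):
    """Return (start, end) of the longest window that could be analyzed.

    `enhanced_metrics` is a hypothetical flag standing in for the enhanced
    infrastructure metrics recommendation preference.
    """
    days = ENHANCED_LOOKBACK_DAYS if enhanced_metrics else DEFAULT_LOOKBACK_DAYS
    return now - timedelta(days=days), now


if __name__ == "__main__":
    start, end = lookback_window(datetime.now(timezone.utc), enhanced_metrics=True)
    print(f"Metrics between {start.isoformat()} and {end.isoformat()} are in scope")
```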

The analysis can take up to 24 hours to complete. When the analysis is complete, the findings are displayed on the dashboard page of the Compute Optimizer console. For more information, see Using the AWS Compute Optimizer dashboard.

Note
  • To generate recommendations for Amazon EC2 instances, Auto Scaling groups, Amazon EBS volumes, Lambda functions, and commercial software licenses, Compute Optimizer uses the maximum utilization point within each five-minute time interval over the lookback period. For ECS services on Fargate recommendations, Compute Optimizer uses the maximum utilization point within each one-minute time interval.

  • AWS might use your utilization data to help improve the overall quality of Compute Optimizer's recommendations. To stop AWS using your utilization data, contact AWS Support.
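The per-interval maximum described in the first note can be sketched in Python as follows. This is only an illustration of the bucketing idea, not Compute Optimizer's actual implementation. The function name and the `(timestamp, value)` sample format are assumptions made for the example.

```python
from datetime import datetime, timezone


def max_per_interval(samples, interval_seconds=300):
    """Group (datetime, value) samples into fixed intervals and keep the maximum.

    interval_seconds=300 mirrors the five-minute interval in the note;
    pass 60 for the one-minute interval used for ECS services on Fargate.
    Returns a dict mapping each interval's start (epoch seconds) to its maximum.
    """
    buckets = {}
    for timestamp, value in samples:
        epoch = int(timestamp.timestamp())
        bucket_start = epoch - epoch % interval_seconds
        if bucket_start not in buckets or value > buckets[bucket_start]:
            buckets[bucket_start] = value
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    samples = [
        (datetime(2024, 1, 1, 0, 0, 30, tzinfo=timezone.utc), 10.0),
        (datetime(2024, 1, 1, 0, 4, 59, tzinfo=timezone.utc), 55.0),
        (datetime(2024, 1, 1, 0, 5, 0, tzinfo=timezone.utc), 20.0),
    ]
    for start, peak in max_per_interval(samples).items():
        print(start, peak)
```

Keeping the maximum rather than the average means that a short spike inside an interval still counts toward the utilization that a recommendation has to accommodate.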

EC2 instance metrics

Metrics analyzed for EC2 instances

Compute Optimizer analyzes the following CloudWatch metrics of your EC2 instances, including instances that are part of Auto Scaling groups.

Metric Description
CPUUtilization

The percentage of allocated EC2 compute units that are in use on the instance. This metric identifies the processing power that's required to run an application on an instance.

MemoryUtilization

The percentage of memory that's used during the sample period. This metric identifies the memory that's required to run an application on an instance.

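Memory utilization metrics are analyzed only for instances that have the CloudWatch agent installed (see Enabling memory utilization with the CloudWatch agent) or that use external metrics ingestion (see Configure external metrics ingestion).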

GPUUtilization

The percentage of allocated GPUs that are currently in use on the instance.

Note

To allow Compute Optimizer to analyze the GPU utilization metric of your instances, install the CloudWatch agent on your instances. For more information, see Enabling NVIDIA GPU utilization with the CloudWatch agent.

GPUMemoryUtilization

The percentage of total GPU memory that's currently in use on the instance.

NetworkIn

The number of bytes that are received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to an instance.

NetworkOut

The number of bytes that are sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic from an instance.

NetworkPacketsIn

The number of packets that are received by the instance.

NetworkPacketsOut

The number of packets that are sent out by the instance.

DiskReadOps

The read operations per second of the instance store volume of the instance.

DiskWriteOps

The write operations per second of the instance store volume of the instance.

DiskReadBytes

The read bytes per second of the instance store volume of the instance.

DiskWriteBytes

The write bytes per second of the instance store volume of the instance.

VolumeReadBytes

The read bytes per second of EBS volumes attached to the instance. Displayed as KiBs in the console.

VolumeWriteBytes

The write bytes per second of EBS volumes attached to the instance. Displayed as KiBs in the console.

VolumeReadOps

The read operations per second of EBS volumes attached to the instance.

VolumeWriteOps

The write operations per second of EBS volumes attached to the instance.

For more information about instance metrics, see List the available CloudWatch metrics for your instances in the Amazon Elastic Compute Cloud User Guide. For more information about EBS volume metrics, see Amazon CloudWatch metrics for Amazon EBS in the Amazon Elastic Compute Cloud User Guide.

Enabling memory utilization with the CloudWatch agent

To have Compute Optimizer analyze the memory utilization metric of your instances, install the CloudWatch agent on your instances. Memory utilization data gives Compute Optimizer an additional measurement that further improves its recommendations. For more information about installing the CloudWatch agent, see Collecting Metrics and Logs from Amazon EC2 Instances and On-Premises Servers with the CloudWatch agent in the Amazon CloudWatch User Guide.

On Linux instances, Compute Optimizer analyzes the mem_used_percent metric in the CWAgent namespace, or the legacy MemoryUtilization metric in the System/Linux namespace. On Windows instances, Compute Optimizer analyzes the Available MBytes metric in the CWAgent namespace. If both the Available MBytes and Memory % Committed Bytes In Use metrics are configured in the CWAgent namespace, Compute Optimizer chooses Available MBytes as the primary memory metric to generate recommendations.

Note
  • We recommend that you configure the CWAgent namespace to use Available MBytes as your memory metric for Windows instances.

  • Compute Optimizer also supports the Available KBytes and Available Bytes metrics, and prioritizes both over the Memory % Committed Bytes In Use metric when generating recommendations for Windows instances.
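The selection rule for Windows memory metrics described above can be pictured as a priority list. The Python sketch below is illustrative only. The function name is hypothetical, the metric names are written as they appear in this section, and the relative order of Available MBytes, Available KBytes, and Available Bytes is an assumption: the text states only that Available MBytes is recommended and that all three rank above Memory % Committed Bytes In Use.

```python
# Priority order, highest first. The order among the three "Available"
# metrics is an assumption; the documentation only states that they all
# take precedence over "Memory % Committed Bytes In Use".
WINDOWS_MEMORY_PRIORITY = [
    "Available MBytes",
    "Available KBytes",
    "Available Bytes",
    "Memory % Committed Bytes In Use",
]


def pick_primary_memory_metric(configured_metrics):
    """Return the highest-priority supported metric, or None if none is configured."""
    configured = set(configured_metrics)
    for name in WINDOWS_MEMORY_PRIORITY:
        if name in configured:
            return name
    return None


if __name__ == "__main__":
    print(pick_primary_memory_metric(
        ["Memory % Committed Bytes In Use", "Available MBytes"]
    ))
```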

Additionally, the namespace must contain the InstanceId dimension. If the InstanceId dimension is missing or you overwrite it with a custom dimension name, Compute Optimizer can't collect memory utilization data for your instance. Namespaces and dimensions are defined in the CloudWatch agent configuration file. For more information, see Create the CloudWatch agent Configuration File in the Amazon CloudWatch User Guide.

Example: CloudWatch agent configuration for memory collection

{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}
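Before deploying an agent configuration, it can help to check it against the requirements described above for Linux instances. The following Python sketch is one possible way to do that. The function name is hypothetical, and the check is deliberately simplified: it assumes that the agent falls back to the CWAgent namespace when none is set, and it looks only for the exact measurement name mem_used_percent.

```python
def check_memory_config(config):
    """Return a list of problems that could prevent memory data collection.

    `config` is the parsed CloudWatch agent configuration (a dict).
    An empty list means that none of the checked problems were found.
    """
    problems = []
    metrics = config.get("metrics", {})

    # Assumption: the agent uses the CWAgent namespace when none is configured.
    if metrics.get("namespace", "CWAgent") != "CWAgent":
        problems.append("namespace is not CWAgent")

    # The InstanceId dimension must be present under its original name.
    if "InstanceId" not in metrics.get("append_dimensions", {}):
        problems.append("InstanceId dimension is missing")

    # Measurements can be plain strings or objects with a "name" key.
    measurements = (
        metrics.get("metrics_collected", {}).get("mem", {}).get("measurement", [])
    )
    names = {m.get("name") if isinstance(m, dict) else m for m in measurements}
    if "mem_used_percent" not in names:
        problems.append("mem_used_percent is not collected")

    return problems


if __name__ == "__main__":
    sample = {
        "metrics": {
            "namespace": "CWAgent",
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {"mem": {"measurement": ["mem_used_percent"]}},
        }
    }
    print(check_memory_config(sample) or "no problems found")
```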

Enabling NVIDIA GPU utilization with the CloudWatch agent

To allow Compute Optimizer to analyze the NVIDIA GPU utilization metric of your instances, do the following:

  1. Install the CloudWatch agent on your instances. For more information, see Installing the CloudWatch agent in the Amazon CloudWatch User Guide.

  2. Allow the CloudWatch agent to collect NVIDIA GPU metrics. For more information, see Collect NVIDIA GPU metrics in the Amazon CloudWatch User Guide.

Compute Optimizer analyzes the following NVIDIA GPU metrics:

  • nvidia_smi_utilization_gpu

  • nvidia_smi_memory_used

  • nvidia_smi_encoder_stats_session_count

  • nvidia_smi_encoder_stats_average_fps

  • nvidia_smi_encoder_stats_average_latency

  • nvidia_smi_temperature_gpu

The namespace must contain the InstanceId dimension and index dimensions. If the dimensions are missing or you overwrite them with a custom dimension name, Compute Optimizer can't collect GPU utilization data for your instance. Namespaces and dimensions are defined in the CloudWatch agent configuration file. For more information, see Create the CloudWatch agent Configuration File in the Amazon CloudWatch User Guide.
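As an illustration, the following Python sketch builds a CloudWatch agent configuration fragment that collects the six GPU metrics listed above. It assumes that the agent's nvidia_gpu section takes measurement names without the nvidia_smi_ prefix and that the agent adds the index dimension itself. Treat both as assumptions to verify against the CloudWatch agent documentation. The function name is hypothetical.

```python
import json

# GPU metrics that Compute Optimizer analyzes, as listed in this section.
GPU_METRICS = [
    "nvidia_smi_utilization_gpu",
    "nvidia_smi_memory_used",
    "nvidia_smi_encoder_stats_session_count",
    "nvidia_smi_encoder_stats_average_fps",
    "nvidia_smi_encoder_stats_average_latency",
    "nvidia_smi_temperature_gpu",
]
PREFIX = "nvidia_smi_"


def build_gpu_config(metric_names=GPU_METRICS):
    """Build an agent configuration fragment for NVIDIA GPU metrics.

    Assumption: measurement names in the "nvidia_gpu" section are the
    metric names with the "nvidia_smi_" prefix removed.
    """
    measurements = [
        name[len(PREFIX):] for name in metric_names if name.startswith(PREFIX)
    ]
    return {
        "metrics": {
            "namespace": "CWAgent",
            # Keep the InstanceId dimension under its original name.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {"nvidia_gpu": {"measurement": measurements}},
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_gpu_config(), indent=2))
```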

Configure external metrics ingestion

You can use the external metrics ingestion feature to configure AWS Compute Optimizer to ingest EC2 memory utilization metrics from one of four observability products: Datadog, Dynatrace, Instana, or New Relic. When you enable external metrics ingestion, Compute Optimizer analyzes your external EC2 memory utilization metrics in addition to your CPU, disk, network, IO, and throughput data to generate EC2 rightsizing recommendations. These recommendations can provide you with additional savings and enhanced performance. For more information, see External metrics ingestion.

EBS volume metrics

Compute Optimizer analyzes the following CloudWatch metrics of your EBS volumes.

Metric Description
VolumeReadBytes

The read bytes per second of the EBS volume.

VolumeWriteBytes

The write bytes per second of the EBS volume.

VolumeReadOps

The read operations per second of the EBS volume.

VolumeWriteOps

The write operations per second of the EBS volume.

For more information about these metrics, see Amazon CloudWatch metrics for Amazon EBS in the Amazon Elastic Compute Cloud User Guide.

Lambda function metrics

Compute Optimizer analyzes the following CloudWatch metrics of your Lambda functions.

Metric Description
Invocations

The number of times your function code is executed, including successful executions and executions that result in a function error.

Duration

The amount of time that your function code spends processing an event.

Errors

The number of invocations that result in a function error. Function errors include exceptions thrown by your code and exceptions thrown by the Lambda runtime. The runtime returns errors for issues such as timeouts and configuration errors.

Throttles

The number of invocation requests that are throttled.

For more information about these metrics, see Working with AWS Lambda function metrics in the AWS Lambda Developer Guide.

In addition to these metrics, Compute Optimizer analyzes the memory utilization of your function during the lookback period. For more information about memory utilization for Lambda functions, see Understanding AWS Lambda behavior using Amazon CloudWatch Logs Insights in the AWS Management & Governance Blog and Using Lambda Insights in CloudWatch in the AWS Lambda Developer Guide.

Metrics for Amazon ECS services on Fargate

Compute Optimizer analyzes the following CloudWatch and Amazon ECS utilization metrics of your Amazon ECS services on Fargate.

Metric Description
CPUUtilization

The percentage of CPU capacity that's used in the service.

MemoryUtilization

The percentage of memory that's used in the service.

For more information about these metrics, see Amazon ECS CloudWatch metrics in the Amazon ECS User Guide for AWS Fargate.

Metrics for commercial software licenses

Compute Optimizer analyzes the following metric to generate recommendations for commercial software licenses.

mssql_enterprise_features_used: the number of Microsoft SQL Server Enterprise edition features in use. The features are as follows:

  • More than 128 GB of memory for the buffer pool extension

  • More than 48 vCPUs

  • Always On availability groups with more than 1 database

  • Asynchronous commit replicas

  • Read-only replicas

  • Asynchronous database mirroring

  • tempdb memory-optimized metadata is enabled

  • R or Python extensions

  • Peer-to-peer replication

  • Resource Governor

RDS database metrics

Compute Optimizer analyzes the following CloudWatch metrics of your Amazon RDS DB and Aurora DB instances.

Amazon RDS

Compute Optimizer analyzes the following CloudWatch metrics of your Amazon RDS DB instances.

Metric Description
CPUUtilization

The percentage of allocated compute units that are in use on the DB instance. This metric identifies the processing power that's required to run an application on an instance.

DatabaseConnections

The number of client sessions that are connected to the DB instance.

NetworkReceiveThroughput

The incoming (receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.

NetworkTransmitThroughput

The outgoing (transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.

ReadIOPS

The average number of disk read I/O operations per second.

WriteIOPS

The average number of disk write I/O operations per second.

ReadThroughput

The average number of bytes read from disk per second.

WriteThroughput

The average number of bytes written to disk per second.

EBSIOBalance%

The percentage of I/O credits remaining in the burst bucket of your RDS database. This metric is available for basic monitoring only.

EBSByteBalance%

The percentage of throughput credits remaining in the burst bucket of your RDS database. This metric is available for basic monitoring only.

FreeStorageSpace

The amount of available storage space.

If you enabled Amazon RDS Performance Insights, Compute Optimizer also analyzes the following metrics of your Amazon RDS DB instance. To enable Performance Insights for your DB instances, see Turning Performance Insights on and off for Amazon RDS in the Amazon Relational Database Service User Guide.

Note

If Performance Insights isn’t enabled, Compute Optimizer doesn’t provide recommendations to reduce vCPU capacity.

Metric Description
DBLoad

The level of session activity in your database. For more information, see Database load in the Amazon Relational Database Service User Guide.

os.swap.in

The amount of memory, in kilobytes, swapped in from disk.

os.swap.out

The amount of memory, in kilobytes, swapped out to disk.

For more information about Amazon RDS metrics, see Metrics reference for Amazon RDS in the Amazon Relational Database Service User Guide.

Amazon Aurora

Compute Optimizer analyzes the following CloudWatch metrics of your Amazon Aurora DB instances.

Metric Description
CPUUtilization

The percentage of CPU used by an Aurora DB instance.

DatabaseConnections

The number of client network connections to the database instance.

NetworkReceiveThroughput

The amount of network throughput received from clients by each instance in the Aurora DB cluster. This throughput doesn't include network traffic between instances in the Aurora DB cluster and the cluster volume.

NetworkTransmitThroughput

The amount of network throughput sent to clients by each instance in the Aurora DB cluster. This throughput doesn't include network traffic between instances in the DB cluster and the cluster volume.

StorageNetworkReadThroughput

The amount of network throughput received from the Aurora storage subsystem by each instance in the DB cluster.

StorageNetworkWriteThroughput

The amount of network throughput sent to the Aurora storage subsystem by each instance in the Aurora DB cluster.

AuroraMemoryHealthState

Indicates the memory health state. A value of 0 equals NORMAL. A value of 10 equals RESERVED, which means that the server is approaching a critical level of memory usage.

Note

This metric applies to Aurora MySQL only.

AuroraMemoryNumDeclinedSqlTotal

The total number of queries declined as part of out-of-memory (OOM) avoidance.

Note

This metric applies to Aurora MySQL only.

AuroraMemoryNumKillConnTotal

The total number of connections closed as part of OOM avoidance.

Note

This metric applies to Aurora MySQL only.

AuroraMemoryNumKillQueryTotal

The total number of queries ended as part of OOM avoidance.

Note

This metric applies to Aurora MySQL only.

ReadIOPSEphemeralStorage

The average number of disk read I/O operations to Ephemeral NVMe storage.

Note

This metric applies to instances that support locally attached non-volatile memory express (NVMe) storage.

WriteIOPSEphemeralStorage

The average number of disk write I/O operations to Ephemeral NVMe storage.

Note

This metric applies to instances that support locally attached non-volatile memory express (NVMe) storage.

ReadIOPS

The average number of disk read I/O operations per second. Read and write operations are reported separately, in 1-minute intervals.

WriteIOPS

The number of Aurora storage write records generated per second. This is roughly the number of log records generated by the database. These records don't correspond to 8K page writes or to network packets sent.

For more information, see Amazon CloudWatch metrics for Amazon Aurora in the Amazon Aurora User Guide.
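The numeric values of AuroraMemoryHealthState described above can be decoded with a small lookup. The Python sketch below covers only the two values that this section documents (0 and 10). Any other value is reported as UNKNOWN, which is a choice made for this example rather than documented behavior. The class and function names are hypothetical.

```python
from enum import IntEnum


class AuroraMemoryHealthState(IntEnum):
    """Values documented in this section for AuroraMemoryHealthState."""
    NORMAL = 0
    RESERVED = 10  # the server is approaching a critical level of memory usage


def describe_memory_health(value):
    """Map a metric datapoint to a state name, or "UNKNOWN" if not listed here."""
    try:
        return AuroraMemoryHealthState(int(value)).name
    except ValueError:
        return "UNKNOWN"


if __name__ == "__main__":
    for datapoint in (0, 10.0, 5):
        print(datapoint, describe_memory_health(datapoint))
```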

If you enabled Performance Insights for Aurora, Compute Optimizer also analyzes the following metrics of your Aurora DB instances. To enable Performance Insights for Aurora, see Turning Performance Insights on and off for Aurora in the Amazon Aurora User Guide.

Metric Description
DBLoad

The number of active sessions for the database. Typically, you want the data for the average number of active sessions. In Performance Insights, this data is queried as db.load.avg.

os.memory.outOfMemoryKillCount

The number of OOM kills that happened over the last collection interval.

For more information about Aurora metrics, see Metrics reference for Amazon Aurora in the Amazon Aurora User Guide.