


# Query your Prometheus metrics
<a name="AMP-query"></a>

Now that metrics are being ingested into the workspace, you can query them.

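To create dashboards that visually represent your metrics, you can use a service such as Amazon Managed Grafana. Amazon Managed Grafana (or a standalone instance of Grafana) can build a graphical interface that shows your metrics in a variety of display styles. For more information about Amazon Managed Grafana, see the [Amazon Managed Grafana User Guide](https://docs.aws.amazon.com/grafana/latest/userguide/).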

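You can also use direct queries to run one-off queries, explore your data, or write your own applications that consume your metrics. Direct queries use the Amazon Managed Service for Prometheus API and the standard Prometheus query language, PromQL, to get data from your Prometheus workspace. For more information about PromQL and its syntax, see [Querying Prometheus](https://prometheus.io/docs/prometheus/latest/querying/basics/) in the Prometheus documentation.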
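As a minimal sketch of a direct query, the following Python snippet builds the URL for an instant query against the Prometheus-compatible `api/v1/query` endpoint of a workspace. The Region, the workspace ID `ws-EXAMPLE`, and the helper name `build_instant_query_url` are placeholders chosen for illustration. A real request must also be signed with AWS Signature Version 4, which this sketch leaves out.

```python
from urllib.parse import urlencode


def build_instant_query_url(region: str, workspace_id: str, promql: str) -> str:
    """Return the instant-query URL for a workspace (unsigned, illustration only)."""
    # Assumed endpoint layout: aps-workspaces.<region>.amazonaws.com/workspaces/<id>
    base = f"https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace_id}"
    # urlencode percent-encodes the PromQL expression so it is safe in a query string.
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical Region and workspace ID; replace them with your own values.
url = build_instant_query_url(
    "us-east-1", "ws-EXAMPLE", "sum by (job) (rate(http_requests_total[5m]))"
)
print(url)
```

The same pattern applies to range queries, which use the `api/v1/query_range` path with additional `start`, `end`, and `step` parameters.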

**Topics**
+ [PromQL cheat sheet](#promql-cheat-sheet)
+ [Basic selectors](#promql-basic-selectors)
+ [Range vector selectors](#promql-range-selectors)
+ [Aggregation operators](#promql-aggregation-operators)
+ [Common functions](#promql-functions)
+ [Binary operators](#promql-operators)
+ [Practical query examples](#promql-practical-examples)
+ [Secure the querying of your metrics](AMP-secure-querying.md)
+ [Set up Amazon Managed Grafana for use with Amazon Managed Service for Prometheus](AMP-amg.md)
+ [Set up Grafana open source or Grafana Enterprise for use with Amazon Managed Service for Prometheus](AMP-onboard-query-standalone-grafana.md)
+ [Query using Grafana running in an Amazon EKS cluster](AMP-onboard-query-grafana-7.3.md)
+ [Query using Prometheus-compatible APIs](AMP-onboard-query-APIs.md)
+ [Get query usage statistics for each query](AMP-stats.md)

## PromQL cheat sheet
<a name="promql-cheat-sheet"></a>

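Use this PromQL (Prometheus Query Language) cheat sheet as a quick reference when you query metrics in your Amazon Managed Service for Prometheus workspace. PromQL is a functional query language that lets you select and aggregate time series data in real time.

For more details about PromQL, see the [PromQL Cheat Sheet](https://promlabs.com/promql-cheat-sheet/) on the *PromLabs* website.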

## Basic selectors
<a name="promql-basic-selectors"></a>

Select time series by metric name and label matchers:

```
# Select all time series with the metric name http_requests_total
http_requests_total

# Select time series with specific label values
http_requests_total{job="prometheus", method="GET"}

# Use label matchers
http_requests_total{status_code!="200"}          # Not equal
http_requests_total{status_code=~"2.."}          # Regex match
http_requests_total{status_code!~"4.."}          # Negative regex match
```
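To make the four matcher types concrete, here is a rough Python model of how a single label matcher is evaluated against one series. It is a sketch, not the Prometheus implementation: the function name `label_matches` is made up for illustration, and Python's `re` module stands in for the RE2 syntax that Prometheus uses. It does reflect two PromQL rules: regular expressions are fully anchored, and a missing label is treated as an empty string.

```python
import re


def label_matches(labels: dict, name: str, op: str, value: str) -> bool:
    """Evaluate one matcher (=, !=, =~, !~) against a series' label set."""
    actual = labels.get(name, "")  # a missing label behaves like an empty value
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        # fullmatch models PromQL's anchoring: "2.." matches "200" but not "2000"
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op}")


series_labels = {"job": "prometheus", "method": "GET", "status_code": "200"}
print(label_matches(series_labels, "status_code", "=~", "2.."))
print(label_matches(series_labels, "status_code", "!~", "4.."))
```

Because of the anchoring, a pattern such as `2..` selects only three-character status codes that start with `2`.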

## Range vector selectors
<a name="promql-range-selectors"></a>

Select a range of samples over a period of time:

```
# Select 5 minutes of data
http_requests_total[5m]

# Time units: ms (milliseconds), s (seconds), m (minutes), h (hours), d (days), w (weeks), y (years)
cpu_usage[1h]
memory_usage[30s]
```
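The duration inside the square brackets defines how far back from the evaluation time samples are collected. The following Python sketch converts a single-unit duration such as `5m` into seconds, assuming the fixed unit lengths that PromQL uses (a day is always 24 hours, a week 7 days, a year 365 days). The helper name `duration_to_seconds` is hypothetical, and compound durations such as `1h30m` are deliberately not handled.

```python
import re

# Seconds per PromQL time unit (fixed lengths, no calendar awareness).
_UNIT_SECONDS = {
    "ms": 0.001,
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 604800,
    "y": 31536000,
}


def duration_to_seconds(text: str) -> float:
    """Convert a single-unit PromQL duration like '5m' or '30s' to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d|w|y)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


for example in ("5m", "1h", "30s"):
    print(example, duration_to_seconds(example))
```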

## Aggregation operators
<a name="promql-aggregation-operators"></a>

Aggregate data across multiple time series:

```
# Sum all values
sum(http_requests_total)

# Sum by specific labels
sum by (job) (http_requests_total)
sum without (instance) (http_requests_total)

# Other aggregation operators
avg(cpu_usage)                    # Average
min(response_time)               # Minimum
max(response_time)               # Maximum
count(up)                        # Count of series
stddev(cpu_usage)               # Standard deviation
```
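The grouping behavior of `sum by (...)` can be sketched in a few lines of Python: series are grouped by the labels named in the `by` clause, all other labels are dropped, and the values in each group are added. The function name `sum_by` and the sample data are invented for illustration only.

```python
from collections import defaultdict


def sum_by(series, by):
    """Group (labels, value) pairs by the given label names and sum each group."""
    totals = defaultdict(float)
    for labels, value in series:
        # The group key keeps only the requested labels; everything else is dropped.
        key = tuple((name, labels.get(name, "")) for name in by)
        totals[key] += value
    return dict(totals)


# Hypothetical instant-vector samples for http_requests_total.
samples = [
    ({"job": "api", "instance": "a"}, 3.0),
    ({"job": "api", "instance": "b"}, 4.0),
    ({"job": "db", "instance": "c"}, 5.0),
]
print(sum_by(samples, ["job"]))
```

`sum without (instance)` is the mirror image: it keeps every label except the ones listed, which gives the same grouping here because `job` is the only other label.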

## Common functions
<a name="promql-functions"></a>

Apply functions to transform your data:

```
# Rate of increase per second (for counters)
rate(http_requests_total[5m])

# Increase over time range
increase(http_requests_total[1h])

# Derivative (for gauges)
deriv(cpu_temperature[5m])

# Mathematical functions
abs(cpu_usage - 50)              # Absolute value
round(cpu_usage, 0.1)           # Round to nearest 0.1
sqrt(memory_usage)              # Square root

# Time functions
time()                          # Current Unix timestamp
hour()                          # Hour of day in UTC (0-23)
day_of_week()                   # Day of week in UTC (0-6, Sunday=0)
```
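To show why `rate()` is meant for counters, the following Python sketch computes a simplified per-second rate over the samples in a window and compensates for counter resets. It is an approximation under stated assumptions: the real `rate()` function also extrapolates to the edges of the window, which this sketch skips, and the name `simple_rate` is hypothetical.

```python
def simple_rate(samples):
    """Approximate per-second rate from (timestamp_seconds, counter_value) pairs."""
    if len(samples) < 2:
        return None  # at least two samples are needed to compute a rate
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value >= previous:
            increase += value - previous
        else:
            # The counter went down, so assume it restarted from zero.
            increase += value
        previous = value
    return increase / elapsed


steady = [(0, 100.0), (60, 160.0), (120, 220.0)]
with_reset = [(0, 100.0), (60, 10.0), (120, 40.0)]
print(simple_rate(steady))
print(simple_rate(with_reset))
```

In this simplified model, `increase()` over the same window is the rate multiplied by the window length in seconds.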

## Binary operators
<a name="promql-operators"></a>

Perform arithmetic, comparison, and logical operations:

```
# Arithmetic operators
cpu_usage + 10
memory_total - memory_available
disk_usage / disk_total * 100

# Comparison operators (filter series by default; add the bool modifier to return 0 or 1, for example cpu_usage > bool 80)
cpu_usage > 80
memory_usage < 1000
response_time >= 0.5

# Logical/set operators (and, or, unless)
(cpu_usage > 80) and (memory_usage > 1000)
(status_code == 200) or (status_code == 201)
```
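When a comparison operator is applied between an instant vector and a scalar, PromQL by default keeps only the series for which the comparison is true and preserves their values. With the `bool` modifier, every series is kept and its value becomes 1 or 0. The Python sketch below contrasts the two modes. The function name `greater_than` and the sample values are illustrative assumptions, and the sketch ignores that `bool` also drops the metric name.

```python
def greater_than(series, threshold, use_bool=False):
    """Model `vector > scalar` (filter mode) and `vector > bool scalar`."""
    if use_bool:
        # bool modifier: keep every series, replace the value with 1 or 0
        return {key: 1.0 if value > threshold else 0.0 for key, value in series.items()}
    # default: drop series that fail the comparison, keep the original values
    return {key: value for key, value in series.items() if value > threshold}


# Hypothetical cpu_usage values keyed by instance.
cpu_usage = {"host-a": 92.5, "host-b": 41.0, "host-c": 80.0}
print(greater_than(cpu_usage, 80))
print(greater_than(cpu_usage, 80, use_bool=True))
```

Filter mode is what makes expressions such as `cpu_usage > 80` useful in alerting rules: the rule fires only for the series that remain.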

## Practical query examples
<a name="promql-practical-examples"></a>

The following are common monitoring queries that you can use in your Amazon Managed Service for Prometheus workspace:

```
# CPU usage percentage
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage percentage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

# Request rate per second
sum(rate(http_requests_total[5m])) by (job)

# Error rate percentage
sum(rate(http_requests_total{status_code=~"5.."}[5m])) /
sum(rate(http_requests_total[5m])) * 100

# 95th percentile response time
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

# Top 5 instances by CPU usage
topk(5, avg by (instance) (cpu_usage))
```
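The percentile query relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by linear interpolation inside the bucket that contains the target rank. The Python sketch below illustrates that idea for classic histograms. It is a simplified model, not the Prometheus source: it assumes the buckets include a `+Inf` bucket, that the lowest bucket starts at 0, and that `q` lies in the range (0, 1]. The function name and the bucket data are made up for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) bucket pairs."""
    if not 0 < q <= 1:
        raise ValueError("this sketch only supports 0 < q <= 1")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        return math.nan  # a +Inf bucket is required
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations
    rank = q * total
    # Find the first bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    # Interpolate linearly between the bucket's lower and upper bounds.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Hypothetical request-duration buckets: (le in seconds, cumulative count).
duration_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.95, duration_buckets))
```

Because the result is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies you care about.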