Interpret results

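After you run a baseline processing job and obtain statistics and constraints for your dataset, you can run monitoring jobs that calculate statistics and list any violations encountered relative to the baseline constraints. By default, Amazon CloudWatch metrics are also reported in your account. For information about viewing monitoring results in Amazon SageMaker Studio, see Visualize results for real-time endpoints in Amazon SageMaker Studio.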

List executions

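The schedule starts monitoring jobs at the specified intervals. The following code lists the latest five executions. If you run this code right after creating an hourly schedule, the list of executions might be empty, and you might have to wait until you cross the hour boundary (in UTC) to see executions start. The following code includes the logic for waiting.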


import time

# List the executions that the schedule has started so far.
mon_executions = my_default_monitor.list_executions()
print("We created an hourly schedule above and it will kick off executions on the hour (plus a 0-20 minute buffer).\nWe will have to wait until we hit the hour...")

# Poll once a minute until the first execution appears.
while len(mon_executions) == 0:
    print("Waiting for the first execution to happen...")
    time.sleep(60)
    mon_executions = my_default_monitor.list_executions()
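
The waiting loop above runs forever if the schedule never starts an execution. As a minimal sketch, the same polling can be bounded with a timeout. The helper name `wait_for_first_execution` and its parameters are hypothetical and are not part of the SageMaker SDK; the helper only assumes that it is given a callable that returns a list.

```python
import time


def wait_for_first_execution(list_executions, poll_seconds=60,
                             timeout_seconds=2 * 60 * 60, sleep=time.sleep):
    """Poll list_executions() until it returns a non-empty list.

    Hypothetical helper: list_executions is any zero-argument callable
    that returns a list, for example my_default_monitor.list_executions.
    Raises TimeoutError once timeout_seconds of waiting has elapsed.
    """
    waited = 0
    executions = list_executions()
    while not executions:
        if waited >= timeout_seconds:
            raise TimeoutError(
                "No execution appeared after {} seconds".format(waited))
        print("Waiting for the first execution to happen...")
        sleep(poll_seconds)
        waited += poll_seconds
        executions = list_executions()
    return executions
```

With the monitor from the earlier steps, this could be called as `mon_executions = wait_for_first_execution(my_default_monitor.list_executions)`. The `sleep` parameter exists only so that the wait can be replaced in tests.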

Inspect a specific execution

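In the previous step, you picked up the latest completed or failed scheduled execution. You can explore what went right or wrong. The terminal states are: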

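Completed - The monitoring execution completed and no issues were found in the violations report.
CompletedWithViolations - The execution completed, but constraint violations were detected.
Failed - The monitoring execution failed, possibly because of a client error (for example, a role issue) or an infrastructure issue. To determine the cause, see FailureReason and ExitMessage.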


import time

latest_execution = mon_executions[-1]  # latest execution's index is -1, the previous one is -2, and so on
time.sleep(60)
latest_execution.wait(logs=False)

# Call describe() once and reuse the response.
latest_job = latest_execution.describe()
print("Latest execution status: {}".format(latest_job['ProcessingJobStatus']))
# ExitMessage may be absent from the response, so use .get() to avoid a KeyError.
print("Latest execution result: {}".format(latest_job.get('ExitMessage')))

if latest_job['ProcessingJobStatus'] != 'Completed':
    print("====STOP==== \n No completed executions to inspect further. Please wait until an execution completes or investigate previously reported failures.")


report_uri = latest_execution.output.destination
print('Report Uri: {}'.format(report_uri))
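
The three terminal states listed in this section call for different follow-up actions. The following sketch maps a status string to a short summary. The function name `summarize_execution` is hypothetical, and which response field carries the status depends on the API you call, so the status is passed in as a plain string rather than read from a specific response.

```python
def summarize_execution(status, failure_reason=None, exit_message=None):
    """Return a one-line summary for a monitoring execution status.

    Hypothetical helper: status is one of the terminal states described
    in this section; any other value is reported as not terminal.
    """
    if status == "Completed":
        return "ok: no violations reported"
    if status == "CompletedWithViolations":
        return "review: constraint violations detected"
    if status == "Failed":
        # Prefer FailureReason, then fall back to ExitMessage.
        details = failure_reason or exit_message or "no reason reported"
        return "failed: {}".format(details)
    return "not terminal: {}".format(status)
```

For example, `summarize_execution("Failed", failure_reason="role issue")` returns `"failed: role issue"`.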

List generated reports

Use the following code to list the generated reports.


from urllib.parse import urlparse

import boto3

# Split the report URI (s3://bucket/key-prefix) into bucket and key prefix.
s3uri = urlparse(report_uri)
report_bucket = s3uri.netloc
report_key = s3uri.path.lstrip('/')
print('Report bucket: {}'.format(report_bucket))
print('Report key: {}'.format(report_key))

s3_client = boto3.Session().client('s3')
result = s3_client.list_objects(Bucket=report_bucket, Prefix=report_key)
# 'Contents' is missing when no objects match the prefix, so default to an empty list.
report_files = [report_file.get("Key") for report_file in result.get('Contents', [])]
print("Found Report Files:")
print("\n ".join(report_files))
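
A single listing call returns at most 1,000 keys. As a sketch, the URI parsing and listing steps can be wrapped in two small functions that also follow pagination. The function names `split_s3_uri` and `list_report_keys` are hypothetical; `get_paginator("list_objects_v2")` is the boto3 S3 client paginator, and the client is passed in so that the functions do not create a session themselves.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/prefix URI into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Not an S3 URI: {}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


def list_report_keys(s3_client, report_uri):
    """Return every object key under report_uri, across all pages.

    Hypothetical helper: s3_client is expected to be a boto3 S3 client,
    for example boto3.Session().client('s3').
    """
    bucket, prefix = split_s3_uri(report_uri)
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' entry when it holds no objects.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

With the variables from the preceding code, this could be called as `report_files = list_report_keys(s3_client, report_uri)`.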

Violations report

If there are violations compared to the baseline, they are recorded in the violations report. Use the following code to list the violations.


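import pandas as pd

violations = my_default_monitor.latest_monitoring_constraint_violations()
# None disables column-width truncation; the older value -1 is deprecated in recent pandas.
pd.set_option('display.max_colwidth', None)
# pd.json_normalize replaces the deprecated pd.io.json.json_normalize.
constraints_df = pd.json_normalize(violations.body_dict["violations"])
constraints_df.head(10)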
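
When a report contains many violations, a count per check type can be easier to scan than the full table. The following is a minimal sketch that works on the same dictionary as `violations.body_dict`. It assumes that each entry under `"violations"` carries a `constraint_check_type` field; the function name and the sample data are hypothetical.

```python
from collections import Counter


def count_violations_by_type(violations_doc):
    """Count violations per constraint_check_type.

    Hypothetical helper: violations_doc is a dict with a "violations" list,
    such as violations.body_dict. Entries without a constraint_check_type
    field are counted under "unknown".
    """
    entries = violations_doc.get("violations", [])
    return Counter(
        entry.get("constraint_check_type", "unknown") for entry in entries)


# Hypothetical sample data, for illustration only.
sample = {
    "violations": [
        {"feature_name": "age", "constraint_check_type": "data_type_check"},
        {"feature_name": "zip", "constraint_check_type": "data_type_check"},
        {"feature_name": "income", "constraint_check_type": "completeness_check"},
    ]
}
counts = count_violations_by_type(sample)
for check_type, count in sorted(counts.items()):
    print("{}: {}".format(check_type, count))
```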

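This is available only for datasets that contain tabular data. The following schema files specify the statistics calculated and the violations monitored for.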

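Output files for tabular datasets

File name	Description
`statistics.json`	Contains columnar statistics for each feature in the dataset that is analyzed. See the schema of this file in the next topic. Note: This file is created only for data quality monitoring.
`constraint_violations.json`	Contains all violations found in the current dataset, compared to the baseline statistics file specified in the `baseline_statistics` path and the constraints file specified in the `baseline_constraints` path.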

By default, the Amazon SageMaker Model Monitor prebuilt containers save a set of Amazon CloudWatch metrics for each feature.

Container code can emit CloudWatch metrics in this location: /opt/ml/output/metrics/cloudwatch.
