Amazon SageMaker 模型卡 - Amazon SageMaker

Amazon SageMaker 模型卡

重要

Amazon SageMaker Model Card 与 SageMaker Model Registry 集成。如果要在模型注册中心注册模型,可以使用集成来添加审核信息。有关更多信息,请参阅 更新模型版本的详细信息

使用 Amazon SageMaker 模型卡将有关机器学习 (ML) 模型的关键细节记录在一个地方,从而简化治理和报告。模型卡可以帮助您在模型的整个生命周期中捕捉有关模型的关键信息,并实施负责任的人工智能实践。

编目详细信息,例如模型的预期用途和风险评级、训练详细信息和指标、评估结果和观测结果,以及额外的标注,例如注意事项、建议和自定义信息。通过创建模型卡,您可以执行以下操作:

  • 提供有关如何使用模型的指导。

  • 通过详细描述模型训练和性能,为审计活动提供支持。

  • 说明模型如何支持业务目标。

模型卡为记录哪些信息提供了规范性指导,并包含自定义信息字段。创建模型卡后,您可以将其导出为 PDF 或下载,以与相关的利益相关者共享。除批准状态更新外,对模型卡的任何编辑都会导致额外的模型卡版本,以便具有不可改变的模型更改记录。

先决条件

要开始使用 Amazon SageMaker 模型卡,您必须拥有创建、编辑、查看和导出模型卡的权限。

模型的预期用途

指定模型的预期用途有助于确保模型开发人员和用户获得负责任地训练或部署模型所需的信息。模型的预期用途应说明适合使用该模型的场景,以及不建议使用该模型的场景。

我们建议包括:

  • 模型的一般用途

  • 模型适用的使用案例

  • 模型不适用的使用案例

  • 开发模型时做出的假设

模型的预期用途不仅限于技术细节,还描述了应如何在生产中使用模型、适合使用模型的场景以及其他注意事项,例如模型使用的数据类型或开发过程中做出的任何假设。

风险评级

开发人员为具有不同风险级别的使用案例创建机器学习模型。例如,批准贷款申请的模型可能比检测电子邮件类别的模型风险更高。鉴于模型的风险状况各不相同,模型卡为您提供了一个对模型的风险评级进行分类的字段。

此风险评级可以是 unknownlowmediumhigh。使用这些风险评级字段来标注未知、低、中或高风险模型,并帮助您的组织遵守有关将某些模型投入生产的任何现有规则。

模型卡 JSON 架构

模型卡的评估详细信息必须以 JSON 格式提供。如果您有由 SageMaker ClarifySageMaker Model Monitor 生成的现有 JSON 格式评估报告,请将其上传到 Amazon S3 并提供 S3 URI 以自动解析评估指标。有关更多信息和示例报告,请参阅 Amazon SageMaker 模型治理 - 模型卡 示例笔记本中的示例指标文件夹。

使用 SageMaker Python SDK 创建模型卡时,模型内容必须采用模型卡 JSON 架构并以字符串形式提供。提供类似于以下示例的模型内容。

{ "$schema": "http://json-schema.org/draft-07/schema#", "$id": "http://json-schema.org/draft-07/schema#", "title": "SageMakerModelCardSchema", "description": "Internal model card schema for SageMakerRepositoryService without model_package_details", "version": "0.1.0", "type": "object", "additionalProperties": false, "properties": { "model_overview": { "description": "Overview about the model", "type": "object", "additionalProperties": false, "properties": { "model_description": { "description": "description of model", "type": "string", "maxLength": 1024 }, "model_creator": { "description": "Creator of model", "type": "string", "maxLength": 1024 }, "model_artifact": { "description": "Location of the model artifact", "type": "array", "maxContains": 15, "items": { "type": "string", "maxLength": 1024 } }, "algorithm_type": { "description": "Algorithm used to solve the problem", "type": "string", "maxLength": 1024 }, "problem_type": { "description": "Problem being solved with the model", "type": "string" }, "model_owner": { "description": "Owner of model", "type": "string", "maxLength": 1024 } } }, "intended_uses": { "description": "Intended usage of model", "type": "object", "additionalProperties": false, "properties": { "purpose_of_model": { "description": "Why the model was developed?", "type": "string", "maxLength": 2048 }, "intended_uses": { "description": "intended use cases", "type": "string", "maxLength": 2048 }, "factors_affecting_model_efficiency": { "type": "string", "maxLength": 2048 }, "risk_rating": { "description": "Risk rating for model card", "$ref": "#/definitions/risk_rating" }, "explanations_for_risk_rating": { "type": "string", "maxLength": 2048 } } }, "business_details": { "description": "Business details of model", "type": "object", "additionalProperties": false, "properties": { "business_problem": { "description": "What business problem does the model solve?", "type": "string", "maxLength": 2048 }, "business_stakeholders": { "description": "Business stakeholders", "type": "string", "maxLength": 2048 }, "line_of_business": { "type": "string", "maxLength": 2048 } } }, "training_details": { "description": "Overview about the training", "type": "object", "additionalProperties": false, "properties": { "objective_function": { "description": "the objective function the model will optimize for", "function": { "$ref": "#/definitions/objective_function" }, "notes": { "type": "string", "maxLength": 1024 } }, "training_observations": { "type": "string", "maxLength": 1024 }, "training_job_details": { "type": "object", "additionalProperties": false, "properties": { "training_arn": { "description": "SageMaker Training job arn", "type": "string", "maxLength": 1024 }, "training_datasets": { "description": "Location of the model datasets", "type": "array", "maxContains": 15, "items": { "type": "string", "maxLength": 1024 } }, "training_environment": { "type": "object", "additionalProperties": false, "properties": { "container_image": { "description": "SageMaker training image uri", "type": "array", "maxContains": 15, "items": { "type": "string", "maxLength": 1024 } } } }, "training_metrics": { "type": "array", "items": { "maxItems": 50, "$ref": "#/definitions/training_metric" } }, "user_provided_training_metrics": { "type": "array", "items": { "maxItems": 50, "$ref": "#/definitions/training_metric" } }, "hyper_parameters": { "type": "array", "items": { "maxItems": 100, "$ref": "#/definitions/training_hyper_parameter" } }, "user_provided_hyper_parameters": { "type": "array", "items": { "maxItems": 100, "$ref": "#/definitions/training_hyper_parameter" } } } } } }, "evaluation_details": { "type": "array", "default": [], "items": { "type": "object", "required": [ "name" ], "additionalProperties": false, "properties": { "name": { "type": "string", "pattern": ".{1,63}" }, "evaluation_observation": { "type": "string", "maxLength": 2096 }, "evaluation_job_arn": { "type": "string", "maxLength": 256 }, "datasets": { "type": "array", "items": { "type": "string", "maxLength": 1024 }, "maxItems": 10 }, "metadata": { "description": "additional attributes associated with the evaluation results", "type": "object", "additionalProperties": { "type": "string", "maxLength": 1024 } }, "metric_groups": { "type": "array", "default": [], "items": { "type": "object", "required": [ "name", "metric_data" ], "properties": { "name": { "type": "string", "pattern": ".{1,63}" }, "metric_data": { "type": "array", "items": { "anyOf": [ { "$ref": "#/definitions/simple_metric" }, { "$ref": "#/definitions/linear_graph_metric" }, { "$ref": "#/definitions/bar_chart_metric" }, { "$ref": "#/definitions/matrix_metric" } ] } } } } } } } }, "additional_information": { "additionalProperties": false, "type": "object", "properties": { "ethical_considerations": { "description": "Any ethical considerations that the author wants to provide", "type": "string", "maxLength": 2048 }, "caveats_and_recommendations": { "description": "Caveats and recommendations for people who might use this model in their applications.", "type": "string", "maxLength": 2048 }, "custom_details": { "type": "object", "additionalProperties": { "$ref": "#/definitions/custom_property" } } } } }, "definitions": { "source_algorithms": { "type": "array", "minContains": 1, "maxContains": 1, "items": { "type": "object", "additionalProperties": false, "required": [ "algorithm_name" ], "properties": { "algorithm_name": { "description": "The name of an algorithm that was used to create the model package. The algorithm must be either an algorithm resource in your SageMaker account or an algorithm in AWS Marketplace that you are subscribed to.", "type": "string", "maxLength": 170 }, "model_data_url": { "description": "The Amazon S3 path where the model artifacts, which result from model training, are stored.", "type": "string", "maxLength": 1024 } } } }, "inference_specification": { "type": "object", "additionalProperties": false, "required": [ "containers" ], "properties": { "containers": { "description": "Contains inference related information which were used to create model package.", "type": "array", "minContains": 1, "maxContains": 15, "items": { "type": "object", "additionalProperties": false, "required": [ "image" ], "properties": { "model_data_url": { "description": "The Amazon S3 path where the model artifacts, which result from model training, are stored.", "type": "string", "maxLength": 1024 }, "image": { "description": "Inference environment path. The Amazon EC2 Container Registry (Amazon ECR) path where inference code is stored.", "type": "string", "maxLength": 255 }, "nearest_model_name": { "description": "The name of a pre-trained machine learning benchmarked by Amazon SageMaker Inference Recommender model that matches your model.", "type": "string" } } } } } }, "risk_rating": { "description": "Risk rating of model", "type": "string", "enum": [ "High", "Medium", "Low", "Unknown" ] }, "custom_property": { "description": "Additional property in section", "type": "string", "maxLength": 1024 }, "objective_function": { "description": "objective function that training job is optimized for", "additionalProperties": false, "properties": { "function": { "type": "string", "enum": [ "Maximize", "Minimize" ] }, "facet": { "type": "string", "maxLength": 63 }, "condition": { "type": "string", "maxLength": 63 } } }, "training_metric": { "description": "training metric data", "type": "object", "required": [ "name", "value" ], "additionalProperties": false, "properties": { "name": { "type": "string", "pattern": ".{1,255}" }, "notes": { "type": "string", "maxLength": 1024 }, "value": { "type": "number" } } }, "training_hyper_parameter": { "description": "training hyper parameter", "type": "object", "required": [ "name" ], "additionalProperties": false, "properties": { "name": { "type": "string", "pattern": ".{1,255}" }, "value": { "type": "string", "pattern": ".{0,255}" } } }, "linear_graph_metric": { "type": "object", "required": [ "name", "type", "value" ], "additionalProperties": false, "properties": { "name": { "type": "string", "pattern": ".{1,255}" }, "notes": { "type": "string", "maxLength": 1024 }, "type": { "type": "string", "enum": [ "linear_graph" ] }, "value": { "anyOf": [ { "type": "array", "items": { "type": "array", "items": { "type": "number" }, "minItems": 2, "maxItems": 2 }, "minItems": 1 } ] }, "x_axis_name": { "$ref": "#/definitions/axis_name_string" }, "y_axis_name": { "$ref": "#/definitions/axis_name_string" } } }, "bar_chart_metric": { "type": "object", "required": [ "name", "type", "value" ], "additionalProperties": false, "properties": { "name": { "type": "string", "pattern": ".{1,255}" }, "notes": { "type": "string", "maxLength": 1024 }, "type": { "type": "string", "enum": [ "bar_chart" ] }, "value": { "anyOf": [ { "type": "array", "items": { "type": "number" }, "minItems": 1 } ] }, "x_axis_name": { "$ref": "#/definitions/axis_name_array" }, "y_axis_name": { "$ref": "#/definitions/axis_name_string" } } }, "matrix_metric": { "type": "object", "required": [ "name", "type", "value" ], "additionalProperties": false, "properties": { "name": { "type": "string", "pattern": ".{1,255}" }, "notes": { "type": "string", "maxLength": 1024 }, "type": { "type": "string", "enum": [ "matrix" ] }, "value": { "anyOf": [ { "type": "array", "items": { "type": "array", "items": { "type": "number" }, "minItems": 1, "maxItems": 20 }, "minItems": 1, "maxItems": 20 } ] }, "x_axis_name": { "$ref": "#/definitions/axis_name_array" }, "y_axis_name": { "$ref": "#/definitions/axis_name_array" } } }, "simple_metric": { "description": "metric data", "type": "object", "required": [ "name", "type", "value" ], "additionalProperties": false, "properties": { "name": { "type": "string", "pattern": ".{1,255}" }, "notes": { "type": "string", "maxLength": 1024 }, "type": { "type": "string", "enum": [ "number", "string", "boolean" ] }, "value": { "anyOf": [ { "type": "number" }, { "type": "string", "maxLength": 63 }, { "type": "boolean" } ] }, "x_axis_name": { "$ref": "#/definitions/axis_name_string" }, "y_axis_name": { "$ref": "#/definitions/axis_name_string" } } }, "axis_name_array": { "type": "array", "items": { "type": "string", "maxLength": 63 } }, "axis_name_string": { "type": "string", "maxLength": 63 } } }