Detecting labels in a video - Amazon Rekognition


Detecting labels in a video

Amazon Rekognition Video can detect labels (objects and concepts) in a video, along with the times that each label is detected. For SDK code examples, see Analyzing a video stored in an Amazon S3 bucket with Java or Python (SDK). For AWS CLI examples, see Analyzing a video with the AWS Command Line Interface.

Amazon Rekognition Video label detection is an asynchronous operation. To start detecting labels in a video, call StartLabelDetection.

Amazon Rekognition Video publishes the completion status of the video analysis to an Amazon Simple Notification Service (Amazon SNS) topic. If the video analysis succeeds, call GetLabelDetection to get the detected labels. For information about calling the video analysis API operations, see Calling Amazon Rekognition Video operations.
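As a rough end-to-end sketch, assuming boto3 and a video already uploaded to Amazon S3 (the bucket and object names below are placeholders), the asynchronous flow looks like this. For brevity the sketch polls GetLabelDetection instead of waiting on the Amazon SNS notification:

import time

import boto3

rekognition = boto3.client("rekognition")

# Start the asynchronous label detection job.
job = rekognition.start_label_detection(
    Video={"S3Object": {"Bucket": "amzn-s3-demo-bucket", "Name": "video.mp4"}}
)

# Poll until the job finishes (production code should use the SNS topic instead).
while True:
    result = rekognition.get_label_detection(JobId=job["JobId"])
    if result["JobStatus"] != "IN_PROGRESS":
        break
    time.sleep(5)

print(result["JobStatus"], "labels returned:", len(result.get("Labels", [])))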

StartLabelDetection request

The following example is a request for the StartLabelDetection operation. You provide the StartLabelDetection operation with a video stored in an Amazon S3 bucket. The example request JSON specifies the Amazon S3 bucket and video name, along with MinConfidence, Features, Settings, and NotificationChannel.

MinConfidence is the minimum confidence that Amazon Rekognition Video must have in the accuracy of a detected label, or of an instance bounding box (if one is detected), for it to be returned in the response.

With Features, you specify that you want GENERAL_LABELS returned as part of the response.

With Settings, you can filter the items returned for GENERAL_LABELS. For labels, you can use inclusion and exclusion filters, filtering by individual labels or by label categories:

  • LabelInclusionFilters – Used to specify which labels you want included in the response.

  • LabelExclusionFilters – Used to specify which labels you want excluded from the response.

  • LabelCategoryInclusionFilters – Used to specify which label categories you want included in the response.

  • LabelCategoryExclusionFilters – Used to specify which label categories you want excluded from the response.

You can also combine inclusion and exclusion filters as needed, excluding some labels or categories while including others.

NotificationChannel is the ARN of the Amazon SNS topic to which you want Amazon Rekognition Video to publish the completion status of the label detection operation. If you are using the AmazonRekognitionServiceRole permissions policy, the Amazon SNS topic name must begin with Rekognition.

The following is an example StartLabelDetection request in JSON form, including the filters:

{ "ClientRequestToken": "5a6e690e-c750-460a-9d59-c992e0ec8638", "JobTag": "5a6e690e-c750-460a-9d59-c992e0ec8638", "Video": { "S3Object": { "Bucket": "bucket", "Name": "video.mp4" } }, "Features": ["GENERAL_LABELS"], "MinConfidence": 75, "Settings": { "GeneralLabels": { "LabelInclusionFilters": ["Cat", "Dog"], "LabelExclusionFilters": ["Tiger"], "LabelCategoryInclusionFilters": ["Animals and Pets"], "LabelCategoryExclusionFilters": ["Popular Landmark"] } }, "NotificationChannel": { "RoleArn": "arn:aws:iam::012345678910:role/SNSAccessRole", "SNSTopicArn": "arn:aws:sns:us-east-1:012345678910:notification-topic", } }

GetLabelDetection operation response

GetLabelDetection returns an array (Labels) that contains information about the labels detected in the video. The array can be sorted either by time or by detected label, depending on what you specify in the SortBy parameter. You can also select how response items are aggregated by using the AggregateBy parameter.

The following example is a JSON response from GetLabelDetection. In the response, note the following:

  • Sort order – The array of labels returned is sorted by time. To sort by label instead, specify NAME in the SortBy input parameter for GetLabelDetection. If a label appears multiple times in the video, there are multiple instances of the LabelDetection element. The default sort order is TIMESTAMP, and the secondary sort order is NAME.

  • Label information – A LabelDetection array element contains a Label object, which contains the label name and the confidence that Amazon Rekognition has in the accuracy of the detected label. A Label object also includes the label's hierarchical taxonomy and bounding box information for common labels. Timestamp is the time, in milliseconds from the start of the video, at which the label was detected.

    Information about any categories or aliases associated with a label is also returned. For results aggregated by video SEGMENTS, the StartTimestampMillis, EndTimestampMillis, and DurationMillis structures are returned; they define the start time, end time, and duration of a segment, respectively.

  • Aggregation – Specifies how the results are aggregated when they are returned. The default is to aggregate by TIMESTAMPS. You can also choose to aggregate by SEGMENTS, which aggregates results over a time window. When aggregating by SEGMENTS, information about detected instances with bounding boxes is not returned; only the labels detected during the segments are returned.

  • Pagination information – The example shows one page of label detection information. You can specify how many LabelDetection objects to return in the MaxResults input parameter for GetLabelDetection. If more results than MaxResults exist, GetLabelDetection returns a token (NextToken) that is used to get the next page of results. For more information, see Getting Amazon Rekognition Video analysis results. A pagination sketch follows this list.

  • Video information – The response includes information about the video format (VideoMetadata) in each page of information returned by GetLabelDetection.
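For the pagination bullet above, the following is a minimal boto3 sketch that pages through all results for a completed job. The job ID is assumed to come from StartLabelDetection; the SortBy and AggregateBy values shown simply make the defaults explicit.

import boto3

rekognition = boto3.client("rekognition")

def get_all_labels(job_id, max_results=10):
    # Collect every LabelDetection element by following NextToken.
    labels = []
    next_token = None
    while True:
        kwargs = {
            "JobId": job_id,
            "MaxResults": max_results,
            "SortBy": "TIMESTAMP",        # default primary sort order
            "AggregateBy": "TIMESTAMPS",  # default aggregation
        }
        if next_token:
            kwargs["NextToken"] = next_token
        page = rekognition.get_label_detection(**kwargs)
        labels.extend(page["Labels"])
        next_token = page.get("NextToken")
        if not next_token:
            return labels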

The following is an example GetLabelDetection response in JSON form, aggregated by TIMESTAMPS:

{ "JobStatus": "SUCCEEDED", "LabelModelVersion": "3.0", "Labels": [ { "Timestamp": 1000, "Label": { "Name": "Car", "Categories": [ { "Name": "Vehicles and Automotive" } ], "Aliases": [ { "Name": "Automobile" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875, // Classification confidence "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 // Detection confidence } ] } }, { "Timestamp": 1000, "Label": { "Name": "Cup", "Categories": [ { "Name": "Kitchen and Dining" } ], "Aliases": [ { "Name": "Mug" } ], "Parents": [], "Confidence": 99.9364013671875, // Classification confidence "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 // Detection confidence } ] } }, { "Timestamp": 2000, "Label": { "Name": "Kangaroo", "Categories": [ { "Name": "Animals and Pets" } ], "Aliases": [ { "Name": "Wallaby" } ], "Parents": [ { "Name": "Mammal" } ], "Confidence": 99.9364013671875, "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567, }, "Confidence": 99.9364013671875 } ] } }, { "Timestamp": 4000, "Label": { "Name": "Bicycle", "Categories": [ { "Name": "Hobbies and Interests" } ], "Aliases": [ { "Name": "Bike" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875, "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 } ] } } ], "VideoMetadata": { "ColorRange": "FULL", "DurationMillis": 5000, "Format": "MP4", "FrameWidth": 1280, "FrameHeight": 720, "FrameRate": 24 } }

The following is an example GetLabelDetection response in JSON form, aggregated by SEGMENTS:

{ "JobStatus": "SUCCEEDED", "LabelModelVersion": "3.0", "Labels": [ { "StartTimestampMillis": 225, "EndTimestampMillis": 3578, "DurationMillis": 3353, "Label": { "Name": "Car", "Categories": [ { "Name": "Vehicles and Automotive" } ], "Aliases": [ { "Name": "Automobile" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875 // Maximum confidence score for Segment mode } }, { "StartTimestampMillis": 7578, "EndTimestampMillis": 12371, "DurationMillis": 4793, "Label": { "Name": "Kangaroo", "Categories": [ { "Name": "Animals and Pets" } ], "Aliases": [ { "Name": "Wallaby" } ], "Parents": [ { "Name": "Mammal" } ], "Confidence": 99.9364013671875 } }, { "StartTimestampMillis": 22225, "EndTimestampMillis": 22578, "DurationMillis": 2353, "Label": { "Name": "Bicycle", "Categories": [ { "Name": "Hobbies and Interests" } ], "Aliases": [ { "Name": "Bike" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875 } } ], "VideoMetadata": { "ColorRange": "FULL", "DurationMillis": 5000, "Format": "MP4", "FrameWidth": 1280, "FrameHeight": 720, "FrameRate": 24 } }

Transforming the GetLabelDetection response

When retrieving results with the GetLabelDetection API operation, you might need the response structure to mimic the older API response structure, in which both primary labels and aliases were contained in the same list.

The example JSON response in the previous section shows the current form of the API response from GetLabelDetection.

The following example shows the previous form of the GetLabelDetection API response:

{ "Labels": [ { "Timestamp": 0, "Label": { "Instances": [], "Confidence": 60.51791763305664, "Parents": [], "Name": "Leaf" } }, { "Timestamp": 0, "Label": { "Instances": [], "Confidence": 99.53411102294922, "Parents": [], "Name": "Human" } }, { "Timestamp": 0, "Label": { "Instances": [ { "BoundingBox": { "Width": 0.11109819263219833, "Top": 0.08098889887332916, "Left": 0.8881205320358276, "Height": 0.9073750972747803 }, "Confidence": 99.5831298828125 }, { "BoundingBox": { "Width": 0.1268676072359085, "Top": 0.14018426835536957, "Left": 0.0003282368124928324, "Height": 0.7993982434272766 }, "Confidence": 99.46029663085938 } ], "Confidence": 99.63411102294922, "Parents": [], "Name": "Person" } }, . . . { "Timestamp": 166, "Label": { "Instances": [], "Confidence": 73.6471176147461, "Parents": [ { "Name": "Clothing" } ], "Name": "Sleeve" } } ], "LabelModelVersion": "2.0", "JobStatus": "SUCCEEDED", "VideoMetadata": { "Format": "QuickTime / MOV", "FrameRate": 23.976024627685547, "Codec": "h264", "DurationMillis": 5005, "FrameHeight": 674, "FrameWidth": 1280 } }

If needed, you can transform the current response to follow the format of the older response. You can use the following sample code to transform the latest API response into the previous API response structure:

from copy import deepcopy

VIDEO_LABEL_KEY = "Labels"
LABEL_KEY = "Label"
ALIASES_KEY = "Aliases"
INSTANCE_KEY = "Instances"
NAME_KEY = "Name"

# Latest API response sample for AggregatedBy SEGMENTS
EXAMPLE_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [ { "Name": "Human" } ],
                "Categories": [ { "Name": "Person Description" } ],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [ { "Name": "Plant" } ],
                "Aliases": [],
                "Categories": [ { "Name": "Plants and Flowers" } ],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800
        },
    ]
}

# Output example after the transformation for AggregatedBy SEGMENTS
EXPECTED_EXPANDED_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [ { "Name": "Human" } ],
                "Categories": [ { "Name": "Person Description" } ],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [ { "Name": "Plant" } ],
                "Aliases": [],
                "Categories": [ { "Name": "Plants and Flowers" } ],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [ { "Name": "Person Description" } ],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666
        },
    ]
}

# Latest API response sample for AggregatedBy TIMESTAMPS
EXAMPLE_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095
                        },
                        "Confidence": 97.530106
                    },
                ],
                "Parents": [],
                "Aliases": [ { "Name": "Human" } ],
                "Categories": [ { "Name": "Person Description" } ],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [ { "Name": "Plant" } ],
                "Aliases": [],
                "Categories": [ { "Name": "Plants and Flowers" } ],
            },
        },
    ]
}

# Output example after the transformation for AggregatedBy TIMESTAMPS
EXPECTED_EXPANDED_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095
                        },
                        "Confidence": 97.530106
                    },
                ],
                "Parents": [],
                "Aliases": [ { "Name": "Human" } ],
                "Categories": [ { "Name": "Person Description" } ],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [ { "Name": "Plant" } ],
                "Aliases": [],
                "Categories": [ { "Name": "Plants and Flowers" } ],
            },
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [ { "Name": "Person Description" } ],
            },
        },
    ]
}


def expand_aliases(inferenceOutputsWithAliases):
    if VIDEO_LABEL_KEY in inferenceOutputsWithAliases:
        expandInferenceOutputs = []
        for segmentLabelDict in inferenceOutputsWithAliases[VIDEO_LABEL_KEY]:
            primaryLabelDict = segmentLabelDict[LABEL_KEY]
            if ALIASES_KEY in primaryLabelDict:
                for alias in primaryLabelDict[ALIASES_KEY]:
                    # Copy the primary label, rename it to the alias, and drop
                    # the fields that the older response did not carry.
                    aliasLabelDict = deepcopy(segmentLabelDict)
                    aliasLabelDict[LABEL_KEY][NAME_KEY] = alias[NAME_KEY]
                    del aliasLabelDict[LABEL_KEY][ALIASES_KEY]
                    if INSTANCE_KEY in aliasLabelDict[LABEL_KEY]:
                        del aliasLabelDict[LABEL_KEY][INSTANCE_KEY]
                    expandInferenceOutputs.append(aliasLabelDict)
        inferenceOutputsWithAliases[VIDEO_LABEL_KEY].extend(expandInferenceOutputs)
    return inferenceOutputsWithAliases


if __name__ == "__main__":
    segmentOutputWithExpandAliases = expand_aliases(EXAMPLE_SEGMENT_OUTPUT)
    assert segmentOutputWithExpandAliases == EXPECTED_EXPANDED_SEGMENT_OUTPUT

    timestampOutputWithExpandAliases = expand_aliases(EXAMPLE_TIMESTAMP_OUTPUT)
    assert timestampOutputWithExpandAliases == EXPECTED_EXPANDED_TIMESTAMP_OUTPUT