
Detecting labels in a video

Amazon Rekognition Video can detect labels (objects and concepts) in a video, along with the times at which they are detected. For SDK code examples, see Analyzing a video stored in an Amazon S3 bucket with Java or Python (SDK). For AWS CLI examples, see Analyzing a video with the AWS Command Line Interface.

Amazon Rekognition Video label detection is an asynchronous operation. To start detecting labels in a video, call StartLabelDetection.

Amazon Rekognition Video publishes the completion status of the video analysis to an Amazon Simple Notification Service topic. If the video analysis succeeds, call GetLabelDetection to get the detected labels. For information about calling the video analysis API operations, see Calling Amazon Rekognition Video operations.
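For orientation, the following is a minimal Python (boto3) sketch of this asynchronous flow. It polls GetLabelDetection for the job status rather than consuming the Amazon SNS notification, and the bucket and video names are placeholders:

import time

import boto3

# Minimal sketch of the asynchronous label detection flow. In production,
# receive the completion status from the Amazon SNS topic instead of polling.
rekognition = boto3.client("rekognition")

# Start the job; the bucket and object names below are placeholders.
job = rekognition.start_label_detection(
    Video={"S3Object": {"Bucket": "bucket", "Name": "video.mp4"}},
    Features=["GENERAL_LABELS"],
)

# Wait until the job leaves the IN_PROGRESS state.
result = rekognition.get_label_detection(JobId=job["JobId"])
while result["JobStatus"] == "IN_PROGRESS":
    time.sleep(5)
    result = rekognition.get_label_detection(JobId=job["JobId"])

if result["JobStatus"] == "SUCCEEDED":
    for detection in result["Labels"]:
        print(detection["Timestamp"], detection["Label"]["Name"])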

StartLabelDetection request

The following is an example of the StartLabelDetection operation request. You provide the StartLabelDetection operation with a video stored in an Amazon S3 bucket. The example request JSON specifies the Amazon S3 bucket and video name, along with MinConfidence, Features, Settings, and NotificationChannel.

MinConfidence is the minimum confidence that Amazon Rekognition Video must have in the accuracy of the detected label, or of an instance's bounding box (if detected), for it to be returned in the response.

With Features, you can specify that you want GENERAL_LABELS returned as part of the response.

With Settings, you can filter the items returned for GENERAL_LABELS. For labels, you can use inclusive and exclusive filters. You can filter by specific individual labels or by label category:

  • LabelInclusionFilters: Used to specify which labels to include in the response.

  • LabelExclusionFilters: Used to specify which labels to exclude from the response.

  • LabelCategoryInclusionFilters: Used to specify which label categories to include in the response.

  • LabelCategoryExclusionFilters: Used to specify which label categories to exclude from the response.

You can also combine inclusive and exclusive filters as needed, excluding some labels or categories while including others.

NotificationChannel is the ARN of the Amazon SNS topic to which you want Amazon Rekognition Video to publish the completion status of the label detection operation. If you are using the AmazonRekognitionServiceRole permissions policy, the Amazon SNS topic must have a topic name that begins with Rekognition.

The following is a sample StartLabelDetection request in JSON form, including filters:

{ "ClientRequestToken": "5a6e690e-c750-460a-9d59-c992e0ec8638", "JobTag": "5a6e690e-c750-460a-9d59-c992e0ec8638", "Video": { "S3Object": { "Bucket": "bucket", "Name": "video.mp4" } }, "Features": ["GENERAL_LABELS"], "MinConfidence": 75, "Settings": { "GeneralLabels": { "LabelInclusionFilters": ["Cat", "Dog"], "LabelExclusionFilters": ["Tiger"], "LabelCategoryInclusionFilters": ["Animals and Pets"], "LabelCategoryExclusionFilters": ["Popular Landmark"] } }, "NotificationChannel": { "RoleArn": "arn:aws:iam::012345678910:role/SNSAccessRole", "SNSTopicArn": "arn:aws:sns:us-east-1:012345678910:notification-topic", } }

GetLabelDetection operation response

GetLabelDetection returns an array (Labels) that contains information about the labels detected in the video. The array can be sorted either by time or by detected label when you specify the SortBy parameter. You can also select how response items are aggregated by using the AggregateBy parameter.

The following example is the JSON response of GetLabelDetection. In the response, note the following:

  • Sort order: The array of returned labels is sorted by time. To sort by label, specify NAME in the SortBy input parameter for GetLabelDetection. If a label appears multiple times in the video, there are multiple instances of the LabelDetection element. The default sort order is TIMESTAMP, and the secondary sort order is NAME.

  • Label information: A LabelDetection array element contains a Label object, which includes the label name and the confidence score Amazon Rekognition has in the accuracy of the detected label. A Label object also includes the hierarchical taxonomy for the label and bounding box information for common labels. Timestamp is the time at which the label was detected, measured in milliseconds from the start of the video.

    Information about any categories or aliases associated with a label is also returned. For results aggregated by video SEGMENTS, the StartTimestampMillis, EndTimestampMillis, and DurationMillis structures are returned, which define the start time, end time, and duration of a segment, respectively.

  • Aggregation: Specifies how results are aggregated when returned. The default is to aggregate by TIMESTAMPS. You can also choose to aggregate by SEGMENTS, which aggregates results over a time window. When aggregating by SEGMENTS, information about detected instances with bounding boxes is not returned; only labels detected during the segments are returned.

  • Pagination information: The example shows one page of label detection information. You can specify how many LabelDetection objects to return in the MaxResults input parameter for GetLabelDetection. If more results than MaxResults exist, GetLabelDetection returns a token (NextToken) used to get the next page of results; a pagination sketch follows this list. For more information, see Getting Amazon Rekognition Video analysis results.

  • Video information: The response includes information about the video format (VideoMetadata) in each page of information returned by GetLabelDetection.
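To illustrate the pagination behavior described in the list above, here is a minimal Python (boto3) sketch that passes NextToken back to GetLabelDetection until every page has been collected; the job ID is a placeholder:

import boto3

rekognition = boto3.client("rekognition")
job_id = "placeholder-job-id"  # Returned earlier by StartLabelDetection

# Collect all LabelDetection objects, one page of up to MaxResults at a time.
labels = []
kwargs = {"JobId": job_id, "MaxResults": 10, "SortBy": "TIMESTAMP"}
while True:
    page = rekognition.get_label_detection(**kwargs)
    labels.extend(page.get("Labels", []))
    if "NextToken" not in page:
        break
    kwargs["NextToken"] = page["NextToken"]

print(f"Retrieved {len(labels)} LabelDetection objects")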

The following is a sample GetLabelDetection response in JSON form, with aggregation by TIMESTAMPS:

{ "JobStatus": "SUCCEEDED", "LabelModelVersion": "3.0", "Labels": [ { "Timestamp": 1000, "Label": { "Name": "Car", "Categories": [ { "Name": "Vehicles and Automotive" } ], "Aliases": [ { "Name": "Automobile" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875, // Classification confidence "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 // Detection confidence } ] } }, { "Timestamp": 1000, "Label": { "Name": "Cup", "Categories": [ { "Name": "Kitchen and Dining" } ], "Aliases": [ { "Name": "Mug" } ], "Parents": [], "Confidence": 99.9364013671875, // Classification confidence "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 // Detection confidence } ] } }, { "Timestamp": 2000, "Label": { "Name": "Kangaroo", "Categories": [ { "Name": "Animals and Pets" } ], "Aliases": [ { "Name": "Wallaby" } ], "Parents": [ { "Name": "Mammal" } ], "Confidence": 99.9364013671875, "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567, }, "Confidence": 99.9364013671875 } ] } }, { "Timestamp": 4000, "Label": { "Name": "Bicycle", "Categories": [ { "Name": "Hobbies and Interests" } ], "Aliases": [ { "Name": "Bike" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875, "Instances": [ { "BoundingBox": { "Width": 0.26779675483703613, "Height": 0.8562285900115967, "Left": 0.3604024350643158, "Top": 0.09245597571134567 }, "Confidence": 99.9364013671875 } ] } } ], "VideoMetadata": { "ColorRange": "FULL", "DurationMillis": 5000, "Format": "MP4", "FrameWidth": 1280, "FrameHeight": 720, "FrameRate": 24 } }

The following is a sample GetLabelDetection response in JSON form, with aggregation by SEGMENTS:

{ "JobStatus": "SUCCEEDED", "LabelModelVersion": "3.0", "Labels": [ { "StartTimestampMillis": 225, "EndTimestampMillis": 3578, "DurationMillis": 3353, "Label": { "Name": "Car", "Categories": [ { "Name": "Vehicles and Automotive" } ], "Aliases": [ { "Name": "Automobile" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875 // Maximum confidence score for Segment mode } }, { "StartTimestampMillis": 7578, "EndTimestampMillis": 12371, "DurationMillis": 4793, "Label": { "Name": "Kangaroo", "Categories": [ { "Name": "Animals and Pets" } ], "Aliases": [ { "Name": "Wallaby" } ], "Parents": [ { "Name": "Mammal" } ], "Confidence": 99.9364013671875 } }, { "StartTimestampMillis": 22225, "EndTimestampMillis": 22578, "DurationMillis": 2353, "Label": { "Name": "Bicycle", "Categories": [ { "Name": "Hobbies and Interests" } ], "Aliases": [ { "Name": "Bike" } ], "Parents": [ { "Name": "Vehicle" } ], "Confidence": 99.9364013671875 } } ], "VideoMetadata": { "ColorRange": "FULL", "DurationMillis": 5000, "Format": "MP4", "FrameWidth": 1280, "FrameHeight": 720, "FrameRate": 24 } }

Transforming the GetLabelDetection response

When retrieving results with the GetLabelDetection API operation, you might need the response structure to mimic the older API response structure, in which both primary labels and aliases were contained in the same list.

The example JSON response found in the preceding section displays the current form of the API response from GetLabelDetection.

The following example shows the previous response from the GetLabelDetection API:

{ "Labels": [ { "Timestamp": 0, "Label": { "Instances": [], "Confidence": 60.51791763305664, "Parents": [], "Name": "Leaf" } }, { "Timestamp": 0, "Label": { "Instances": [], "Confidence": 99.53411102294922, "Parents": [], "Name": "Human" } }, { "Timestamp": 0, "Label": { "Instances": [ { "BoundingBox": { "Width": 0.11109819263219833, "Top": 0.08098889887332916, "Left": 0.8881205320358276, "Height": 0.9073750972747803 }, "Confidence": 99.5831298828125 }, { "BoundingBox": { "Width": 0.1268676072359085, "Top": 0.14018426835536957, "Left": 0.0003282368124928324, "Height": 0.7993982434272766 }, "Confidence": 99.46029663085938 } ], "Confidence": 99.63411102294922, "Parents": [], "Name": "Person" } }, . . . { "Timestamp": 166, "Label": { "Instances": [], "Confidence": 73.6471176147461, "Parents": [ { "Name": "Clothing" } ], "Name": "Sleeve" } } ], "LabelModelVersion": "2.0", "JobStatus": "SUCCEEDED", "VideoMetadata": { "Format": "QuickTime / MOV", "FrameRate": 23.976024627685547, "Codec": "h264", "DurationMillis": 5005, "FrameHeight": 674, "FrameWidth": 1280 } }

If needed, you can transform the current response to follow the format of the older response. You can use the following sample code to transform the latest API response into the previous API response structure:

from copy import deepcopy

VIDEO_LABEL_KEY = "Labels"
LABEL_KEY = "Label"
ALIASES_KEY = "Aliases"
INSTANCE_KEY = "Instances"
NAME_KEY = "Name"

# Latest API response sample for AggregatedBy SEGMENTS
EXAMPLE_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800,
        },
    ]
}

# Output example after the transformation for AggregatedBy SEGMENTS
EXPECTED_EXPANDED_SEGMENT_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
            "StartTimestampMillis": 6400,
            "EndTimestampMillis": 8200,
            "DurationMillis": 1800,
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [{"Name": "Person Description"}],
            },
            "StartTimestampMillis": 0,
            "EndTimestampMillis": 500666,
            "DurationMillis": 500666,
        },
    ]
}

# Latest API response sample for AggregatedBy TIMESTAMPS
EXAMPLE_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095,
                        },
                        "Confidence": 97.530106,
                    },
                ],
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
        },
    ]
}

# Output example after the transformation for AggregatedBy TIMESTAMPS
EXPECTED_EXPANDED_TIMESTAMP_OUTPUT = {
    "Labels": [
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Person",
                "Confidence": 97.530106,
                "Instances": [
                    {
                        "BoundingBox": {
                            "Height": 0.1549897,
                            "Width": 0.07747964,
                            "Top": 0.50858885,
                            "Left": 0.00018205095,
                        },
                        "Confidence": 97.530106,
                    },
                ],
                "Parents": [],
                "Aliases": [{"Name": "Human"}],
                "Categories": [{"Name": "Person Description"}],
            },
        },
        {
            "Timestamp": 6400,
            "Label": {
                "Name": "Leaf",
                "Confidence": 89.77790069580078,
                "Instances": [],
                "Parents": [{"Name": "Plant"}],
                "Aliases": [],
                "Categories": [{"Name": "Plants and Flowers"}],
            },
        },
        {
            "Timestamp": 0,
            "Label": {
                "Name": "Human",
                "Confidence": 97.530106,
                "Parents": [],
                "Categories": [{"Name": "Person Description"}],
            },
        },
    ]
}

def expand_aliases(inferenceOutputsWithAliases):
    """Append a copy of each label for every alias, mimicking the previous
    GetLabelDetection response structure in which aliases appeared as
    separate labels in the same list."""
    if VIDEO_LABEL_KEY in inferenceOutputsWithAliases:
        expandInferenceOutputs = []
        for segmentLabelDict in inferenceOutputsWithAliases[VIDEO_LABEL_KEY]:
            primaryLabelDict = segmentLabelDict[LABEL_KEY]
            if ALIASES_KEY in primaryLabelDict:
                for alias in primaryLabelDict[ALIASES_KEY]:
                    # Copy the primary label, rename it to the alias, and drop
                    # the fields that the older format did not carry on aliases.
                    aliasLabelDict = deepcopy(segmentLabelDict)
                    aliasLabelDict[LABEL_KEY][NAME_KEY] = alias[NAME_KEY]
                    del aliasLabelDict[LABEL_KEY][ALIASES_KEY]
                    if INSTANCE_KEY in aliasLabelDict[LABEL_KEY]:
                        del aliasLabelDict[LABEL_KEY][INSTANCE_KEY]
                    expandInferenceOutputs.append(aliasLabelDict)
        inferenceOutputsWithAliases[VIDEO_LABEL_KEY].extend(expandInferenceOutputs)
    return inferenceOutputsWithAliases

if __name__ == "__main__":
    segmentOutputWithExpandAliases = expand_aliases(EXAMPLE_SEGMENT_OUTPUT)
    assert segmentOutputWithExpandAliases == EXPECTED_EXPANDED_SEGMENT_OUTPUT
    timestampOutputWithExpandAliases = expand_aliases(EXAMPLE_TIMESTAMP_OUTPUT)
    assert timestampOutputWithExpandAliases == EXPECTED_EXPANDED_TIMESTAMP_OUTPUT