检测存储视频中的文本 - Amazon Rekognition

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

检测存储视频中的文本

Amazon Rekognition Video 可以在 Amazon S3 存储桶存储的视频中检测人脸并提供一些信息,例如:

  • 在视频中检测到人脸的次数。

  • 人脸被检测到时在视频帧中的位置。

  • 人脸标记,例如左眼的位置。

  • 其他属性如人脸属性指南页面所述。

存储视频中的 Amazon Rekognition Video 人脸检测是一个异步操作。要开始检测视频中的人脸,请致电StartFaceDetection。Amazon Rekognition Video 会将视频分析的完成状态发布到 Amazon Simple Notification Service (Amazon SNS) 主题。如果视频分析成功,则可以GetFaceDetection致电获取视频分析结果。有关启动视频分析和获取结果的详细信息,请参阅调用 Amazon Rekognition Video 操作

此过程在使用 Java 或 Python 分析存储在亚马逊 S3 存储桶中的视频 (SDK)(使用 Amazon Simple Queue Service (Amazon SQS) 队列获取视频分析请求的完成状态)中的代码的基础上进行了扩展。

检测存储在 Amazon S3 存储桶内的视频中的人脸 (SDK)
  1. 执行使用 Java 或 Python 分析存储在亚马逊 S3 存储桶中的视频 (SDK)

  2. 将以下代码添加到您在步骤 1 中创建的类 VideoDetect

    AWS CLI
    • 在以下代码示例中,将bucket-namevideo-name更改为您在步骤 2 中指定的 Amazon S3 存储桶名称和文件名。

    • region-name更改为您使用的 AWS 区域。将profile_name的值替换为您的开发人员资料的名称。

    • TopicARN 更改为您在 配置 Amazon Rekognition Video 的步骤 3 中创建的 Amazon SNS 主题的 ARN。

    • RoleARN 更改为您在 配置 Amazon Rekognition Video 的步骤 7 中创建的 IAM 服务角色的 ARN。

    aws rekognition start-face-detection --video "{"S3Object":{"Bucket":"Bucket-Name","Name":"Video-Name"}}" --notification-channel \ "{"SNSTopicArn":"Topic-ARN","RoleArn":"Role-ARN"}" --region region-name --profile profile-name

    如果您在 Windows 设备上访问 CLI,请使用双引号代替单引号,并用反斜杠(即 \)对内部双引号进行转义,以解决可能遇到的任何解析器错误。例如,请参阅以下内容:

    aws rekognition start-face-detection --video "{\"S3Object\":{\"Bucket\":\"Bucket-Name\",\"Name\":\"Video-Name\"}}" --notification-channel \ "{\"SNSTopicArn\":\"Topic-ARN\",\"RoleArn\":\"Role-ARN\"}" --region region-name --profile profile-name

    运行 StartFaceDetection 操作并获取任务 ID 号后,运行以下 GetFaceDetection 操作并提供任务 ID 号:

    aws rekognition get-face-detection --job-id job-id-number --profile profile-name
    Java
    //Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. //PDX-License-Identifier: MIT-0 (For details, see https://github.com/awsdocs/amazon-rekognition-developer-guide/blob/master/LICENSE-SAMPLECODE.) private static void StartFaceDetection(String bucket, String video) throws Exception{ NotificationChannel channel= new NotificationChannel() .withSNSTopicArn(snsTopicArn) .withRoleArn(roleArn); StartFaceDetectionRequest req = new StartFaceDetectionRequest() .withVideo(new Video() .withS3Object(new S3Object() .withBucket(bucket) .withName(video))) .withNotificationChannel(channel); StartFaceDetectionResult startLabelDetectionResult = rek.startFaceDetection(req); startJobId=startLabelDetectionResult.getJobId(); } private static void GetFaceDetectionResults() throws Exception{ int maxResults=10; String paginationToken=null; GetFaceDetectionResult faceDetectionResult=null; do{ if (faceDetectionResult !=null){ paginationToken = faceDetectionResult.getNextToken(); } faceDetectionResult = rek.getFaceDetection(new GetFaceDetectionRequest() .withJobId(startJobId) .withNextToken(paginationToken) .withMaxResults(maxResults)); VideoMetadata videoMetaData=faceDetectionResult.getVideoMetadata(); System.out.println("Format: " + videoMetaData.getFormat()); System.out.println("Codec: " + videoMetaData.getCodec()); System.out.println("Duration: " + videoMetaData.getDurationMillis()); System.out.println("FrameRate: " + videoMetaData.getFrameRate()); //Show faces, confidence and detection times List<FaceDetection> faces= faceDetectionResult.getFaces(); for (FaceDetection face: faces) { long seconds=face.getTimestamp()/1000; System.out.print("Sec: " + Long.toString(seconds) + " "); System.out.println(face.getFace().toString()); System.out.println(); } } while (faceDetectionResult !=null && faceDetectionResult.getNextToken() != null); }

    在函数 main 中,将以下行:

    StartLabelDetection(bucket, video); if (GetSQSMessageSuccess()==true) GetLabelDetectionResults();

    替换为:

    StartFaceDetection(bucket, video); if (GetSQSMessageSuccess()==true) GetFaceDetectionResults();
    Java V2

    此代码取自 AWS 文档 SDK 示例 GitHub 存储库。请在此处查看完整示例。

    //snippet-start:[rekognition.java2.recognize_video_faces.import] import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider; import software.amazon.awssdk.regions.Region; import software.amazon.awssdk.services.rekognition.RekognitionClient; import software.amazon.awssdk.services.rekognition.model.*; import java.util.List; //snippet-end:[rekognition.java2.recognize_video_faces.import] /** * Before running this Java V2 code example, set up your development environment, including your credentials. * * For more information, see the following documentation topic: * * https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html */ public class VideoDetectFaces { private static String startJobId =""; public static void main(String[] args) { final String usage = "\n" + "Usage: " + " <bucket> <video> <topicArn> <roleArn>\n\n" + "Where:\n" + " bucket - The name of the bucket in which the video is located (for example, (for example, myBucket). \n\n"+ " video - The name of video (for example, people.mp4). \n\n" + " topicArn - The ARN of the Amazon Simple Notification Service (Amazon SNS) topic. \n\n" + " roleArn - The ARN of the AWS Identity and Access Management (IAM) role to use. \n\n" ; if (args.length != 4) { System.out.println(usage); System.exit(1); } String bucket = args[0]; String video = args[1]; String topicArn = args[2]; String roleArn = args[3]; Region region = Region.US_EAST_1; RekognitionClient rekClient = RekognitionClient.builder() .region(region) .credentialsProvider(ProfileCredentialsProvider.create("profile-name")) .build(); NotificationChannel channel = NotificationChannel.builder() .snsTopicArn(topicArn) .roleArn(roleArn) .build(); StartFaceDetection(rekClient, channel, bucket, video); GetFaceResults(rekClient); System.out.println("This example is done!"); rekClient.close(); } // snippet-start:[rekognition.java2.recognize_video_faces.main] public static void StartFaceDetection(RekognitionClient rekClient, NotificationChannel channel, String bucket, String video) { try { S3Object s3Obj = S3Object.builder() .bucket(bucket) .name(video) .build(); Video vidOb = Video.builder() .s3Object(s3Obj) .build(); StartFaceDetectionRequest faceDetectionRequest = StartFaceDetectionRequest.builder() .jobTag("Faces") .faceAttributes(FaceAttributes.ALL) .notificationChannel(channel) .video(vidOb) .build(); StartFaceDetectionResponse startLabelDetectionResult = rekClient.startFaceDetection(faceDetectionRequest); startJobId=startLabelDetectionResult.jobId(); } catch(RekognitionException e) { System.out.println(e.getMessage()); System.exit(1); } } public static void GetFaceResults(RekognitionClient rekClient) { try { String paginationToken=null; GetFaceDetectionResponse faceDetectionResponse=null; boolean finished = false; String status; int yy=0 ; do{ if (faceDetectionResponse !=null) paginationToken = faceDetectionResponse.nextToken(); GetFaceDetectionRequest recognitionRequest = GetFaceDetectionRequest.builder() .jobId(startJobId) .nextToken(paginationToken) .maxResults(10) .build(); // Wait until the job succeeds while (!finished) { faceDetectionResponse = rekClient.getFaceDetection(recognitionRequest); status = faceDetectionResponse.jobStatusAsString(); if (status.compareTo("SUCCEEDED") == 0) finished = true; else { System.out.println(yy + " status is: " + status); Thread.sleep(1000); } yy++; } finished = false; // Proceed when the job is done - otherwise VideoMetadata is null VideoMetadata videoMetaData=faceDetectionResponse.videoMetadata(); System.out.println("Format: " + videoMetaData.format()); System.out.println("Codec: " + videoMetaData.codec()); System.out.println("Duration: " + videoMetaData.durationMillis()); System.out.println("FrameRate: " + videoMetaData.frameRate()); System.out.println("Job"); // Show face information List<FaceDetection> faces= faceDetectionResponse.faces(); for (FaceDetection face: faces) { String age = face.face().ageRange().toString(); String smile = face.face().smile().toString(); System.out.println("The detected face is estimated to be" + age + " years old."); System.out.println("There is a smile : "+smile); } } while (faceDetectionResponse !=null && faceDetectionResponse.nextToken() != null); } catch(RekognitionException | InterruptedException e) { System.out.println(e.getMessage()); System.exit(1); } } // snippet-end:[rekognition.java2.recognize_video_faces.main] }
    Python
    #Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. #PDX-License-Identifier: MIT-0 (For details, see https://github.com/awsdocs/amazon-rekognition-developer-guide/blob/master/LICENSE-SAMPLECODE.) # ============== Faces=============== def StartFaceDetection(self): response=self.rek.start_face_detection(Video={'S3Object': {'Bucket': self.bucket, 'Name': self.video}}, NotificationChannel={'RoleArn': self.roleArn, 'SNSTopicArn': self.snsTopicArn}) self.startJobId=response['JobId'] print('Start Job Id: ' + self.startJobId) def GetFaceDetectionResults(self): maxResults = 10 paginationToken = '' finished = False while finished == False: response = self.rek.get_face_detection(JobId=self.startJobId, MaxResults=maxResults, NextToken=paginationToken) print('Codec: ' + response['VideoMetadata']['Codec']) print('Duration: ' + str(response['VideoMetadata']['DurationMillis'])) print('Format: ' + response['VideoMetadata']['Format']) print('Frame rate: ' + str(response['VideoMetadata']['FrameRate'])) print() for faceDetection in response['Faces']: print('Face: ' + str(faceDetection['Face'])) print('Confidence: ' + str(faceDetection['Face']['Confidence'])) print('Timestamp: ' + str(faceDetection['Timestamp'])) print() if 'NextToken' in response: paginationToken = response['NextToken'] else: finished = True

    在函数 main 中,将以下行:

    analyzer.StartLabelDetection() if analyzer.GetSQSMessageSuccess()==True: analyzer.GetLabelDetectionResults()

    替换为:

    analyzer.StartFaceDetection() if analyzer.GetSQSMessageSuccess()==True: analyzer.GetFaceDetectionResults()
    注意

    如果您已运行使用 Java 或 Python 分析存储在亚马逊 S3 存储桶中的视频 (SDK)之外的视频示例,则要替换的函数名称将不同。

  3. 运行该代码。将显示有关在视频中检测到的人脸的信息。

GetFaceDetection 操作响应

GetFaceDetection 将返回一个数组 (Faces),其中包含有关在视频中检测到的人脸的信息。每次在视频中检测到人脸时,都会存在一个数组元素。FaceDetection返回的数组元素按时间顺序排列,从视频开始起计时,以毫秒为单位。

以下示例是来自 GetFaceDetection 的部分 JSON 响应。在响应中,请注意以下内容:

  • 边界框 – 人脸周围的边界框的坐标。

  • 置信度 - 边界框包含人脸的置信度级别。

  • 人脸标记 - 一组人脸标记。对于每个标记(例如,左眼、右眼和嘴),此响应将提供 x 坐标和 y 坐标。

  • 面部属性 — 一组面部属性,包括: AgeRange、胡子、情绪、眼镜、性别、 EyesOpen、胡子 MouthOpen、微笑和太阳镜。该值可以是不同的类型,例如布尔值类型(人员是否戴了太阳镜)或字符串(人员是男性还是女性)。此外,对于大多数属性,此响应还为属性提供检测到的值的置信度。请注意,虽然使用时支持 FaceOccluded 和 EyeDirection属性DetectFaces,但使用StartFaceDetection和分析视频时不支持这些属性GetFaceDetection

  • 时间戳 - 视频中检测到人脸的时间。

  • 分页信息 - 此示例显示一页人脸检测信息。您可以在 GetFaceDetectionMaxResults 输入参数中指定要返回的人员元素数量。如果存在的结果的数量超过了 MaxResults,则 GetFaceDetection 会返回一个令牌 (NextToken),用于获取下一页的结果。有关更多信息,请参阅 获取 Amazon Rekognition Video 分析结果

  • 视频信息 - 此响应包含有关由 VideoMetadata 返回的每页信息中的视频格式 (GetFaceDetection) 的信息。

  • 质量 - 描述人脸的亮度和锐度。

  • 姿势 - 描述人脸的旋转。

{ "Faces": [ { "Face": { "BoundingBox": { "Height": 0.23000000417232513, "Left": 0.42500001192092896, "Top": 0.16333332657814026, "Width": 0.12937499582767487 }, "Confidence": 99.97504425048828, "Landmarks": [ { "Type": "eyeLeft", "X": 0.46415066719055176, "Y": 0.2572723925113678 }, { "Type": "eyeRight", "X": 0.5068183541297913, "Y": 0.23705792427062988 }, { "Type": "nose", "X": 0.49765899777412415, "Y": 0.28383663296699524 }, { "Type": "mouthLeft", "X": 0.487221896648407, "Y": 0.3452930748462677 }, { "Type": "mouthRight", "X": 0.5142884850502014, "Y": 0.33167609572410583 } ], "Pose": { "Pitch": 15.966927528381348, "Roll": -15.547388076782227, "Yaw": 11.34195613861084 }, "Quality": { "Brightness": 44.80223083496094, "Sharpness": 99.95819854736328 } }, "Timestamp": 0 }, { "Face": { "BoundingBox": { "Height": 0.20000000298023224, "Left": 0.029999999329447746, "Top": 0.2199999988079071, "Width": 0.11249999701976776 }, "Confidence": 99.85971069335938, "Landmarks": [ { "Type": "eyeLeft", "X": 0.06842322647571564, "Y": 0.3010137975215912 }, { "Type": "eyeRight", "X": 0.10543643683195114, "Y": 0.29697132110595703 }, { "Type": "nose", "X": 0.09569807350635529, "Y": 0.33701086044311523 }, { "Type": "mouthLeft", "X": 0.0732642263174057, "Y": 0.3757539987564087 }, { "Type": "mouthRight", "X": 0.10589495301246643, "Y": 0.3722417950630188 } ], "Pose": { "Pitch": -0.5589138865470886, "Roll": -5.1093974113464355, "Yaw": 18.69594955444336 }, "Quality": { "Brightness": 43.052337646484375, "Sharpness": 99.68138885498047 } }, "Timestamp": 0 }, { "Face": { "BoundingBox": { "Height": 0.2177777737379074, "Left": 0.7593749761581421, "Top": 0.13333334028720856, "Width": 0.12250000238418579 }, "Confidence": 99.63436889648438, "Landmarks": [ { "Type": "eyeLeft", "X": 0.8005779385566711, "Y": 0.20915353298187256 }, { "Type": "eyeRight", "X": 0.8391435146331787, "Y": 0.21049551665782928 }, { "Type": "nose", "X": 0.8191410899162292, "Y": 0.2523227035999298 }, { "Type": "mouthLeft", "X": 0.8093273043632507, "Y": 0.29053622484207153 }, { "Type": "mouthRight", "X": 0.8366993069648743, "Y": 0.29101791977882385 } ], "Pose": { "Pitch": 3.165884017944336, "Roll": 1.4182015657424927, "Yaw": -11.151537895202637 }, "Quality": { "Brightness": 28.910892486572266, "Sharpness": 97.61507415771484 } }, "Timestamp": 0 }....... ], "JobStatus": "SUCCEEDED", "NextToken": "i7fj5XPV/fwviXqz0eag9Ow332Jd5G8ZGWf7hooirD/6V1qFmjKFOQZ6QPWUiqv29HbyuhMNqQ==", "VideoMetadata": { "Codec": "h264", "DurationMillis": 67301, "FileExtension": "mp4", "Format": "QuickTime / MOV", "FrameHeight": 1080, "FrameRate": 29.970029830932617, "FrameWidth": 1920 } }