Amazon A2I Output Data
When your machine learning workflow sends Amazon A2I a data object, a human loop is created and human reviewers receive a task to review that data object. The output data from each
human review task is stored in the Amazon Simple Storage Service (Amazon S3) output bucket you specify in your human
review workflow. In the path to the data, YYYY/MM/DD/hh/mm/ss represents the human loop creation date with year (YYYY), month (MM), and day (DD), and the creation time with hour (hh), minute (mm), and second (ss).
s3://customer-output-bucket-specified-in-flow-definition/flow-definition-name/YYYY/MM/DD/hh/mm/ss/human-loop-name/output.json
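As a minimal sketch of how the path layout above can be taken apart, the following Python splits an output location into its components. The names `OutputLocation` and `parse_output_uri` are hypothetical helpers, not part of any AWS SDK, and the pattern assumes zero-padded date and time segments, which the YYYY/MM/DD/hh/mm/ss layout suggests but does not guarantee.

```python
import re
from datetime import datetime
from typing import NamedTuple


class OutputLocation(NamedTuple):
    bucket: str
    flow_definition_name: str
    created: datetime  # human loop creation time as encoded in the path
    human_loop_name: str


# Assumption: every date and time segment is zero-padded to a fixed width.
_OUTPUT_URI = re.compile(
    r"^s3://(?P<bucket>[^/]+)/(?P<flow>[^/]+)/"
    r"(?P<ts>\d{4}/\d{2}/\d{2}/\d{2}/\d{2}/\d{2})/"
    r"(?P<loop>[^/]+)/output\.json$"
)


def parse_output_uri(uri: str) -> OutputLocation:
    """Split an A2I output path into bucket, flow definition, time, and loop name."""
    match = _OUTPUT_URI.match(uri)
    if match is None:
        raise ValueError(f"not an A2I output path: {uri!r}")
    created = datetime.strptime(match.group("ts"), "%Y/%m/%d/%H/%M/%S")
    return OutputLocation(
        match.group("bucket"),
        match.group("flow"),
        created,
        match.group("loop"),
    )


if __name__ == "__main__":
    # The bucket, flow definition, and loop names here are placeholders.
    print(parse_output_uri(
        "s3://amzn-s3-demo-bucket/my-flow/2020/06/18/17/08/26/my-loop/output.json"
    ))
```

The returned `created` value is a naive `datetime`, because the path itself carries no time zone information.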
The content of your output data depends on the task type (built-in or custom) and the type of workforce you use. Your output data always includes the response from the human worker. Additionally, output data may include metadata about the human loop, the human reviewer (worker), and the data object.
Use the following sections to learn more about Amazon A2I output data format for different task types and workforces.
Output Data From Built-In Task Types
Amazon A2I built-in task types include Amazon Textract and Amazon Rekognition. In addition to human responses, the output data from one of these tasks includes details about the reason the human loop was created and information about the integrated service used to create the human loop. Use the following table to learn more about the output data schema for all built-in task types. The value for each of these parameters depends on the service you use with Amazon A2I. Refer to the second table in this section for more information about these service-specific values.
Parameter | Value Type | Example Values | Description |
---|---|---|---|
awsManagedHumanLoopRequestSource | String | AWS/Rekognition/DetectModerationLabels/Image/V3 or AWS/Textract/AnalyzeDocument/Forms/V1 | The API operation and associated AWS service that requested that Amazon A2I create a human loop. This is the API operation you use to configure your Amazon A2I human loop. |
flowDefinitionArn | String | arn:aws:sagemaker:us-west-2: | The Amazon Resource Name (ARN) of the human review workflow (flow definition) used to create the human loop. |
humanAnswers | List of JSON objects | | A list of JSON objects that contain worker responses in answerContent. This object also contains submission details and, if a private workforce was used, worker metadata. To learn more, see Track Worker Activity. For human loop output data produced from Amazon Rekognition |
humanLoopName | String | | The name of the human loop. |
inputContent | JSON object | | The input content the AWS service sent to Amazon A2I when it requested a human loop be created. |
aiServiceRequest | JSON object | | The original request sent to the AWS service integrated with Amazon A2I. For example, if you use Amazon Rekognition with Amazon A2I, this includes the request made through the API operation |
aiServiceResponse | JSON object | | The full response from the AWS service. This is the data that is used to determine if a human review is required. This object may contain metadata about the data object that is not shared with human reviewers. |
selectedAiServiceResponse | JSON object | | The subset of the All data objects listed in |
humanTaskActivationConditionResults | JSON object | | A JSON object in |
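
The awsManagedHumanLoopRequestSource parameter can serve as a quick way to tell which integrated service produced an output file. The sketch below is illustrative only: `request_source` and `RequestSource` are hypothetical names, the `AWS/<service>/<operation>/...` layout is inferred from the two example values in the table rather than from a published format, and treating a missing key as "custom task type" is an assumption.

```python
from typing import Any, Dict, NamedTuple, Optional, Tuple


class RequestSource(NamedTuple):
    service: str               # for example "Rekognition" or "Textract"
    operation: str             # for example "DetectModerationLabels"
    remainder: Tuple[str, ...]  # any trailing segments, such as ("Image", "V3")


def request_source(output: Dict[str, Any]) -> Optional[RequestSource]:
    """Return the service and operation behind a built-in task, or None.

    Assumption: output without awsManagedHumanLoopRequestSource comes from
    a custom task type.
    """
    source = output.get("awsManagedHumanLoopRequestSource")
    if source is None:
        return None
    parts = source.split("/")
    # Assumed layout, inferred from the example values: AWS/<service>/<operation>/...
    if len(parts) < 3 or parts[0] != "AWS":
        raise ValueError(f"unexpected request source: {source!r}")
    return RequestSource(parts[1], parts[2], tuple(parts[3:]))


if __name__ == "__main__":
    print(request_source(
        {"awsManagedHumanLoopRequestSource": "AWS/Textract/AnalyzeDocument/Forms/V1"}
    ))
```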
Output Data From Custom Task Types
When you add Amazon A2I to a custom human review workflow, you see the following parameters in the output data returned from human review tasks.
Parameter | Value Type | Description |
---|---|---|
flowDefinitionArn | String | The Amazon Resource Name (ARN) of the human review workflow (flow definition) used to create the human loop. |
humanAnswers | List of JSON objects | A list of JSON objects that contain worker responses in answerContent. The value in this parameter is determined by the output received from your worker task template. If you are using a private workforce, worker metadata is included. To learn more, see Track Worker Activity. |
humanLoopName | String | The name of the human loop. |
inputContent | JSON object | The input content sent to Amazon A2I in the request to |
The following is an example of output data from a custom integration with Amazon A2I and Amazon Transcribe. In this example, the inputContent consists of:
- A path to an .mp4 file in Amazon S3 and the video title
- The transcription returned from Amazon Transcribe (parsed from Amazon Transcribe output data)
- A start and end time used by the worker task template to clip the .mp4 file and show workers a relevant portion of the video
{
  "flowDefinitionArn": "arn:aws:sagemaker:us-west-2:111122223333:flow-definition/flow-definition-name",
  "humanAnswers": [
    {
      "answerContent": {
        "transcription": "use lambda to turn your notebook"
      },
      "submissionTime": "2020-06-18T17:08:26.246Z",
      "workerId": "ef7294f850a3d9d1",
      "workerMetadata": {
        "identityData": {
          "identityProviderType": "Cognito",
          "issuer": "https://cognito-idp.us-west-2.amazonaws.com/us-west-2_111111",
          "sub": "c6aa8eb7-9944-42e9-a6b9-"
        }
      }
    }
  ],
  "humanLoopName": "111122223333human-loop-name",
  "inputContent": {
    "audioPath": "s3://amzn-s3-demo-bucket1/a2i_transcribe_demo/Fully-Managed Notebook Instances with Amazon SageMaker - a Deep Dive.mp4",
    "end_time": 950.27,
    "original_words": "but definitely use Lambda to turn your ",
    "start_time": 948.51,
    "video_title": "Fully-Managed Notebook Instances with Amazon SageMaker - a Deep Dive.mp4"
  }
}
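
Once an output.json file like the one above has been downloaded, the worker responses can be pulled out with a few lines of Python. This is a sketch rather than a prescribed method: `worker_answers` is a hypothetical helper, it works on the file's text (fetching the object from Amazon S3 is left out), and it assumes only the humanAnswers, answerContent, workerId, and submissionTime keys shown in the example. The keys inside answerContent depend on your worker task template.

```python
import json
from typing import Any, Dict, List


def worker_answers(output_json: str) -> List[Dict[str, Any]]:
    """Return one record per entry in humanAnswers.

    Each record keeps the worker ID, the submission time, and the raw
    answerContent object produced by the worker task template.
    """
    data = json.loads(output_json)
    records = []
    for answer in data.get("humanAnswers", []):
        records.append({
            "workerId": answer.get("workerId"),
            "submissionTime": answer.get("submissionTime"),
            "answer": answer.get("answerContent", {}),
        })
    return records


if __name__ == "__main__":
    # A trimmed-down stand-in for the contents of output.json.
    sample = json.dumps({
        "humanLoopName": "human-loop-name",
        "humanAnswers": [{
            "answerContent": {"transcription": "use lambda to turn your notebook"},
            "submissionTime": "2020-06-18T17:08:26.246Z",
            "workerId": "ef7294f850a3d9d1",
        }],
    })
    for record in worker_answers(sample):
        print(record["workerId"], record["answer"])
```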
Track Worker Activity
Amazon A2I provides information that you can use to track individual workers in task output data. To identify the worker that worked on the human review task, use the following from the output data in Amazon S3:
- The acceptanceTime is the time that the worker accepted the task. The format of this date and time stamp is YYYY-MM-DDTHH:MM:SS.mmmZ for the year (YYYY), month (MM), day (DD), hour (HH), minute (MM), second (SS), and millisecond (mmm). The date and time are separated by a T.
- The submissionTime is the time that the worker submitted their annotations using the Submit button. The format of this date and time stamp is YYYY-MM-DDTHH:MM:SS.mmmZ for the year (YYYY), month (MM), day (DD), hour (HH), minute (MM), second (SS), and millisecond (mmm). The date and time are separated by a T.
- timeSpentInSeconds reports the total time, in seconds, that a worker actively worked on that task. This metric does not include time when a worker paused or took a break.
- The workerId is unique to each worker.
- If you use a private workforce, in workerMetadata, you see the following.
  - The identityProviderType is the service used to manage the private workforce.
  - The issuer is the Amazon Cognito user pool or OpenID Connect (OIDC) Identity Provider (IdP) issuer associated with the work team assigned to this human review task.
  - A unique sub identifier refers to the worker. If you create a workforce using Amazon Cognito, you can retrieve details about this worker (such as the name or user name) associated with this ID using Amazon Cognito. To learn how, see Managing and Searching for User Accounts in the Amazon Cognito Developer Guide.
The following is an example of the output you may see if you use Amazon Cognito to create a private workforce. This is identified in the identityProviderType.
"submissionTime": "2020-12-28T18:59:58.321Z",
"acceptanceTime": "2020-12-28T18:59:15.191Z",
"timeSpentInSeconds": 40.543,
"workerId": "a12b3cdefg4h5i67",
"workerMetadata": {
  "identityData": {
    "identityProviderType": "Cognito",
    "issuer": "https://cognito-idp.aws-region.amazonaws.com/aws-region_123456789",
    "sub": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
  }
}
The following is an example of the output you may see if you use your own OIDC IdP to create a private workforce:
"workerMetadata": {
  "identityData": {
    "identityProviderType": "Oidc",
    "issuer": "https://example-oidc-ipd.com/adfs",
    "sub": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
  }
}
To learn more about using private workforces, see Private workforce.
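
The worker-tracking fields described in this section can be combined into a per-task summary. The following sketch assumes each entry in humanAnswers carries acceptanceTime, submissionTime, timeSpentInSeconds, and workerId as in the Amazon Cognito example, and that workerMetadata is absent when a private workforce is not used. `parse_timestamp` and `summarize_worker_activity` are hypothetical names. The trailing Z is read as UTC, following the ISO 8601 convention.

```python
from datetime import datetime, timezone
from typing import Any, Dict

# Matches YYYY-MM-DDTHH:MM:SS.mmmZ; %f accepts the three millisecond digits.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value: str) -> datetime:
    """Parse an acceptanceTime or submissionTime value as a UTC datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def summarize_worker_activity(answer: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize timing and identity for one entry of humanAnswers."""
    accepted = parse_timestamp(answer["acceptanceTime"])
    submitted = parse_timestamp(answer["submissionTime"])
    # Wall-clock time between accepting and submitting the task. This can
    # exceed timeSpentInSeconds, which excludes pauses and breaks.
    elapsed = (submitted - accepted).total_seconds()
    # Assumption: workerMetadata is present only for private workforces.
    identity = answer.get("workerMetadata", {}).get("identityData", {})
    return {
        "workerId": answer["workerId"],
        "elapsed_seconds": elapsed,
        "active_seconds": answer["timeSpentInSeconds"],
        "identity_provider": identity.get("identityProviderType"),
        "issuer": identity.get("issuer"),
        "sub": identity.get("sub"),
    }


if __name__ == "__main__":
    print(summarize_worker_activity({
        "submissionTime": "2020-12-28T18:59:58.321Z",
        "acceptanceTime": "2020-12-28T18:59:15.191Z",
        "timeSpentInSeconds": 40.543,
        "workerId": "a12b3cdefg4h5i67",
    }))
```

Comparing `elapsed_seconds` with `active_seconds` gives a rough sense of how much of the time between acceptance and submission was spent away from the task.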