

# Training data labeling using humans with Amazon SageMaker Ground Truth
<a name="sms"></a>

To train a machine learning model, you need a large, high-quality, labeled dataset. Ground Truth helps you build high-quality training datasets for your machine learning models. With Ground Truth, you can use workers from Amazon Mechanical Turk, a vendor company that you choose, or an internal, private workforce, along with machine learning, to create a labeled dataset. You can use the labeled dataset output from Ground Truth to train your own models. You can also use the output as a training dataset for an Amazon SageMaker AI model.

Depending on your ML application, you can choose from one of the Ground Truth built-in task types to have workers generate specific types of labels for your data. You can also build a custom labeling workflow to provide your own UI and tools to the workers labeling your data. To learn more about the Ground Truth built-in task types, see [Built-in Task Types](sms-task-types.md). To learn how to create a custom labeling workflow, see [Custom labeling workflows](sms-custom-templates.md).

To automate labeling of your training dataset, you can optionally use *automated data labeling*, a Ground Truth process that uses machine learning to decide which data needs to be labeled by humans. Automated data labeling can reduce the labeling time and manual effort required. For more information, see [Automate data labeling](sms-automated-labeling.md).

Use either pre-built or custom tools to assign the labeling tasks for your training dataset. A *labeling UI template* is a webpage that Ground Truth uses to present tasks and instructions to your workers. The SageMaker AI console provides built-in templates for labeling data. You can use these templates to get started, or you can build your own tasks and instructions by using our HTML 2.0 components. For more information, see [Custom labeling workflows](sms-custom-templates.md).

Use the workforce of your choice to label your dataset. You can choose your workforce from:
+ The Amazon Mechanical Turk workforce of over 500,000 independent contractors worldwide.
+ A private workforce that you create from your employees or contractors for handling data within your organization.
+ A vendor company that you can find in the AWS Marketplace that specializes in data labeling services.

For more information, see [Workforces](sms-workforce-management.md).

You store your datasets in Amazon S3 buckets. A bucket contains three things: the data to be labeled, an input manifest file that Ground Truth uses to read the data files, and an output manifest file that contains the results of the labeling job. For more information, see [Use input and output data](sms-data.md).
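For illustration, an input manifest is a JSON Lines file with one `source-ref` entry per data object. The following sketch builds and reads one locally; the bucket and file names are placeholders, not values from this guide:

```python
import json

# Each line of a Ground Truth input manifest is a standalone JSON object.
# "source-ref" points at one data object (here, an image) in Amazon S3.
# The bucket and key names below are placeholders.
image_keys = ["images/img_001.jpg", "images/img_002.jpg"]

with open("manifest-with-input-data.json", "w") as f:
    for key in image_keys:
        f.write(json.dumps({"source-ref": f"s3://example-bucket/{key}"}) + "\n")

# Read it back: one JSON object per line.
with open("manifest-with-input-data.json") as f:
    entries = [json.loads(line) for line in f]

print(entries[0]["source-ref"])  # → s3://example-bucket/images/img_001.jpg
```

You would upload a file like this to the same S3 bucket as your input data and reference it when creating the labeling job.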

Events from your labeling jobs appear in Amazon CloudWatch under the `/aws/sagemaker/LabelingJobs` group. CloudWatch uses the labeling job name as the name for the log stream.

## Are You a First-time User of Ground Truth?
<a name="what-first-time"></a>

If you are a first-time user of Ground Truth, we recommend that you do the following:

1. **Read [Getting started: Create a bounding box labeling job with Ground Truth](sms-getting-started.md)**—This section walks you through setting up your first Ground Truth labeling job.

1. **Explore other topics**—Depending on your needs, do the following:
   + **Explore built-in task types**— Use built-in task types to streamline the process of creating a labeling job. See [Built-in Task Types](sms-task-types.md) to learn more about Ground Truth built-in task types.
   + **Manage your labeling workforce**—Create new work teams and manage your existing workforce. For more information, see [Workforces](sms-workforce-management.md).
   + **Learn about streaming labeling jobs**— Create a streaming labeling job and send new dataset objects to workers in real time using a perpetually running labeling job. Workers continuously receive new data objects to label as long as the labeling job is active and new objects are being sent to it. To learn more, see [Ground Truth streaming labeling jobs](sms-streaming-labeling-job.md).

1. **To learn about the API operations available to automate Ground Truth, see the [SageMaker AI service](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_Operations_Amazon_SageMaker_Service.html) API reference.**

# Getting started: Create a bounding box labeling job with Ground Truth
<a name="sms-getting-started"></a>

To get started using Amazon SageMaker Ground Truth, follow the instructions in the following sections. The sections here explain how to use the console to create a bounding box labeling job, assign a public or private workforce, and send the labeling job to your workforce. You can also learn how to monitor the progress of a labeling job.

This video shows you how to set up and use Amazon SageMaker Ground Truth. (Length: 9:37)

[![AWS Videos](http://img.youtube.com/vi/_FPI6KjDlCI/0.jpg)](http://www.youtube.com/watch?v=_FPI6KjDlCI)


If you want to create a custom labeling workflow, see [Custom labeling workflows](sms-custom-templates.md) for instructions.

Before you create a labeling job, you must upload your dataset to an Amazon S3 bucket. For more information, see [Use input and output data](sms-data.md).

**Topics**
+ [Before You Begin](#sms-getting-started-step1)
+ [Create a Labeling Job](#sms-getting-started-step2)
+ [Select Workers](#sms-getting-started-step3)
+ [Configure the Bounding Box Tool](#sms-getting-started-step4)
+ [Monitoring Your Labeling Job](sms-getting-started-step5.md)

## Before You Begin
<a name="sms-getting-started-step1"></a>

Before you begin using the SageMaker AI console to create a labeling job, you must set up the dataset for use by doing the following:

1. Save two images at publicly available HTTP URLs. The images are used when creating instructions for completing a labeling task. The images should have an aspect ratio of around 2:1. For this exercise, the content of the images is not important.

1. Create an Amazon S3 bucket to hold the input and output files. The bucket must be in the same Region where you are running Ground Truth. Make a note of the bucket name because you use it during step 2.

   Ground Truth requires that all S3 buckets that contain labeling job input image data have a CORS policy attached. To learn more about this requirement, see [CORS Requirement for Input Image Data](sms-cors-update.md).

1. Create an IAM role, or let SageMaker AI create one for you, with the [AmazonSageMakerFullAccess](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonSageMakerFullAccess) IAM policy attached. For instructions, see [Creating IAM roles](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create.html). Then assign the following permissions policy to the user that is creating the labeling job:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "sagemakergroundtruth",
               "Effect": "Allow",
               "Action": [
                   "cognito-idp:CreateGroup",
                   "cognito-idp:CreateUserPool",
                   "cognito-idp:CreateUserPoolDomain",
                   "cognito-idp:AdminCreateUser",
                   "cognito-idp:CreateUserPoolClient",
                   "cognito-idp:AdminAddUserToGroup",
                   "cognito-idp:DescribeUserPoolClient",
                   "cognito-idp:DescribeUserPool",
                   "cognito-idp:UpdateUserPool"
               ],
               "Resource": "*"
           }
       ]
   }
   ```

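For reference, the CORS requirement mentioned in step 2 is typically satisfied by attaching a configuration to the input bucket that allows `GET` requests from any origin, similar to the following sketch (confirm the exact rules in [CORS Requirement for Input Image Data](sms-cors-update.md)):

```
[
    {
        "AllowedHeaders": [],
        "AllowedMethods": ["GET"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": []
    }
]
```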

## Create a Labeling Job
<a name="sms-getting-started-step2"></a>

In this step you use the console to create a labeling job. You tell Amazon SageMaker Ground Truth the Amazon S3 bucket where the manifest file is stored and configure the parameters for the job. For more information about storing data in an Amazon S3 bucket, see [Use input and output data](sms-data.md).

**To create a labeling job**

1. Open the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

1. From the left navigation, choose **Labeling jobs**.

1. Choose **Create labeling job** to start the job creation process.

1. In the **Job overview** section, provide the following information:
   + **Job name** – Give the labeling job a name that describes the job. This name is shown in your job list. The name must be unique in your account in an AWS Region.
   + **Label attribute name** – Leave this unchecked; the default value is the best option for this introductory job.
   + **Input data setup** – Select **Automated data setup**. This option automatically connects to your input data in S3.
   + **S3 location for input datasets** – Enter the S3 location where you added the images in step 1.
   + **S3 location for output datasets** – The location where your output data is written in S3.
   + **Data type** – Use the dropdown menu to select **Image**. Ground Truth uses all images found in the S3 location for input datasets as input for your labeling job.
   + **IAM role** – Create or choose an IAM role with the AmazonSageMakerFullAccess IAM policy attached.

1. In the **Task type** section, for the **Task category** field, choose **Image**. 

1. In the **Task selection** section, choose **Bounding box**.

1. Choose **Next** to move on to configuring your labeling job.

## Select Workers
<a name="sms-getting-started-step3"></a>

In this step you choose a workforce for labeling your dataset. We recommend that you create a private workforce to test Amazon SageMaker Ground Truth, using email addresses to invite the members of your workforce. Note that if you create a private workforce in this step, you won't be able to import an Amazon Cognito user pool later. If you want to create a private workforce using an Amazon Cognito user pool instead, see [Manage a Private Workforce (Amazon Cognito)](sms-workforce-management-private.md) and use the Mechanical Turk workforce for this tutorial.

**Tip**  
To learn about the other workforce options you can use with Ground Truth, see [Workforces](sms-workforce-management.md). 

**To create a private workforce:**

1. In the **Workers** section, choose **Private**.

1. If this is your first time using a private workforce, in the **Email addresses** field, enter up to 100 email addresses. The addresses must be separated by a comma. You should include your own email address so that you are part of the workforce and can see data object labeling tasks.

1. In the **Organization name** field, enter the name of your organization. This information is used to customize the email sent to invite a person to your private workforce. You can change the organization name after the user pool is created through the console.

1. In the **Contact email** field enter an email address that members of the workforce use to report problems with the task.

If you add yourself to the private workforce, you will receive an email that looks similar to the following. **Amazon, Inc.** is replaced by the organization you enter in step 3 of the preceding procedure. Select the link in the email to log in using the temporary password provided. If prompted, change your password. When you successfully log in, you see the worker portal where your labeling tasks appear.

![\[Example email invitation to work on a labeling project.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/worker_portal_invite.png)


**Tip**  
You can find the link to your private workforce's worker portal in the **Labeling workforces** section of the Ground Truth area of the SageMaker AI console. To see the link, select the **Private** tab. The link is under the **Labeling portal sign-in URL** header in **Private workforce summary**.

If you choose to use the Amazon Mechanical Turk workforce to label the dataset, you are charged for labeling tasks completed on the dataset.

**To use the Amazon Mechanical Turk workforce:**

1. In the **Workers** section, choose **Public**.

1. Set a **Price per task**.

1. If applicable, choose **The dataset does not contain adult content** to acknowledge that the sample dataset has no adult content. This information enables Amazon SageMaker Ground Truth to warn external workers on Mechanical Turk that they might encounter potentially offensive content in your dataset.

1. Choose the check box next to the following statement to acknowledge that the sample dataset does not contain any personally identifiable information (PII). This is a requirement to use Mechanical Turk with Ground Truth. If your input data does contain PII, use the private workforce for this tutorial. 

   **You understand and agree that the Amazon Mechanical Turk workforce consists of independent contractors located worldwide and that you should not share confidential information, personal information or protected health information with this workforce.**

## Configure the Bounding Box Tool
<a name="sms-getting-started-step4"></a>

Finally you configure the bounding box tool to give instructions to your workers. You can configure a task title that describes the task and provides high-level instructions for the workers. You can provide both quick instructions and full instructions. Quick instructions are displayed next to the image to be labeled. Full instructions contain detailed instructions for completing the task. In this example, you only provide quick instructions. You can see an example of full instructions by choosing **Full instructions** at the bottom of the section.

**To configure the bounding box tool**

1. In the **Task description** field, type brief instructions for the task. For example:

   **Draw a box around any *objects* in the image.**

   Replace *objects* with the name of an object that appears in your images.

1. In the **Labels** field, type a category name for the objects that the worker should draw a bounding box around. For example, if you are asking the worker to draw boxes around football players, you could use "Football Player" in this field.

1. The **Short instructions** section enables you to create instructions that are displayed on the page with the image that your workers are labeling. We suggest that you include an example of a correctly drawn bounding box and an example of an incorrectly drawn box. To create your own instructions, use these steps:

   1. Select the text between **GOOD EXAMPLE** and the image placeholder. Replace it with the following text:

      **Draw the box around the object with a small border.**

   1. Select the first image placeholder and delete it.

   1. Choose the image button and then enter the HTTPS URL of one of the images that you created in step 1. It is also possible to embed images directly in the short instructions section; however, this section has a quota of 100 kilobytes (including text). If your images and text exceed 100 kilobytes, you receive an error.

   1. Select the text between **BAD EXAMPLE** and the image placeholder. Replace it with the following text:

      **Don't make the bounding box too large or cut into the object.**

   1. Select the second image placeholder and delete it.

   1. Choose the image button and then enter the HTTPS URL of the other image that you created in step 1.

1. Select **Preview** to preview the worker UI. The preview opens in a new tab, so if your browser blocks pop-ups, you may need to manually allow the tab to open. When you add one or more annotations to the preview and then select **Submit**, you can see a preview of the output data that your annotations would create.

1. After you have configured and verified your instructions, select **Create** to create the labeling job.

If you used a private workforce, you can navigate to the worker portal that you logged into in [Select Workers](#sms-getting-started-step3) of this tutorial to see your labeling tasks. The tasks may take a few minutes to appear.

Now that you've created a labeling job, you can [monitor it, or stop it](sms-getting-started-step5.md).

# Monitoring Your Labeling Job
<a name="sms-getting-started-step5"></a>

After you create your labeling job, you see a list of all the jobs that you have created. You can use this list to monitor the status of your labeling jobs. The list has the following fields:
+ **Name** – The name that you assigned the job when you created it.
+ **Status** – The completion status of the job. The status can be one of Complete, Failed, In progress, or Stopped.
+ **Labeled objects/total** – Shows the total number of objects in the labeling job and how many of them have been labeled.
+ **Creation time** – The date and time that you created the job.

You can also clone, chain, or stop a job. Select a job and then select one of the following from the **Actions** menu:
+ **Clone** – Creates a new labeling job with the configuration copied from the selected job. You can clone a job when you want to make changes to the job and run it again. For example, you can clone a job that was sent to a private workforce so that you can send it to the Amazon Mechanical Turk workforce. Or you can clone a job to rerun it against a new dataset stored in the same location as the original job.
+ **Chain** – Creates a new labeling job that can build upon the data and models (if any) of a stopped, failed, or completed job. For more information about the use cases and how to use it, see [Chaining labeling jobs](sms-reusing-data.md).
+ **Stop** – Stops a running job. You cannot restart a stopped job. You can clone a job to start over or chain the job to continue from where it left off. Labels for any already labeled objects are written to the output file location. For more information, see [Labeling job output data](sms-data-output.md).
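The status shown in this list is also available programmatically through the `DescribeLabelingJob` API operation. The following is a sketch, not a definitive implementation: the job name is a placeholder, and `client` is assumed to be a boto3 SageMaker client (or any object exposing a compatible `describe_labeling_job` method, so you can exercise the function without an AWS connection):

```python
import time

# Terminal states reported by the DescribeLabelingJob API.
TERMINAL_STATES = {"Completed", "Failed", "Stopped"}

def wait_for_labeling_job(client, job_name, poll_seconds=30, max_polls=120):
    """Poll DescribeLabelingJob until the job reaches a terminal state.

    `client` is assumed to be a boto3 SageMaker client, or any object
    with a compatible describe_labeling_job method.
    """
    for _ in range(max_polls):
        response = client.describe_labeling_job(LabelingJobName=job_name)
        status = response["LabelingJobStatus"]
        counters = response.get("LabelCounters", {})
        labeled = counters.get("TotalLabeled", 0)
        total = labeled + counters.get("Unlabeled", 0)
        print(f"{job_name}: {status} ({labeled}/{total} objects labeled)")
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} did not reach a terminal state in time")
```

With a real client, you would pass `boto3.client('sagemaker')` and the labeling job name shown in the console.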

# Label Images
<a name="sms-label-images"></a>

Use Ground Truth to label images. Select one of the following built-in task types to learn more about that task type. Each page includes instructions to help you create a labeling job using that task type.

**Tip**  
To learn more about supported file types and input data quotas, see [Input data](sms-data-input.md).

**Topics**
+ [Classify image objects using a bounding box](sms-bounding-box.md)
+ [Identify image contents using semantic segmentation](sms-semantic-segmentation.md)
+ [Auto-Segmentation Tool](sms-auto-segmentation.md)
+ [Create an image classification job (Single Label)](sms-image-classification.md)
+ [Create an image classification job (Multi-label)](sms-image-classification-multilabel.md)
+ [Image Label Verification](sms-label-verification.md)

# Classify image objects using a bounding box
<a name="sms-bounding-box"></a>

The images used to train a machine learning model often contain more than one object. To classify and localize one or more objects within images, use the Amazon SageMaker Ground Truth bounding box labeling job task type. In this context, localization means the pixel-location of the bounding box. You create a bounding box labeling job using the Ground Truth section of the Amazon SageMaker AI console or the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation.

**Important**  
For this task type, if you create your own manifest file, use `"source-ref"` to identify the location of each image file in Amazon S3 that you want labeled. For more information, see [Input data](sms-data-input.md).
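For example, a minimal input manifest for three images is a JSON Lines file like the following (the bucket and key names here are placeholders):

```
{"source-ref": "s3://example-bucket/images/img_001.jpg"}
{"source-ref": "s3://example-bucket/images/img_002.jpg"}
{"source-ref": "s3://example-bucket/images/img_003.jpg"}
```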

## Creating a Bounding Box Labeling Job (Console)
<a name="sms-creating-bounding-box-labeling-job-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a bounding box labeling job in the SageMaker AI console. In Step 10, choose **Image** from the **Task category** dropdown menu, and choose **Bounding box** as the task type.

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create the labeling job with the console, you specify instructions to help workers complete the job and up to 50 labels that workers can choose from. 

![\[Gif showing how to draw a box around an object for a category.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/bb-sample.gif)


## Create a Bounding Box Labeling Job (API)
<a name="sms-creating-bounding-box-labeling-job-api"></a>

To create a bounding box labeling job, use the SageMaker API operation `CreateLabelingJob`. This operation is defined for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request: 
+ Pre-annotation Lambda functions for this task type end with `PRE-BoundingBox`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Annotation-consolidation Lambda functions for this task type end with `ACS-BoundingBox`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. All parameters in red should be replaced with your specifications and resources. 

```
response = client.create_labeling_job(
    LabelingJobName='example-bounding-box-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
                'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-BoundingBox',
        'TaskKeywords': [
            'Bounding Box',
        ],
        'TaskTitle': 'Bounding Box task',
        'TaskDescription': 'Draw bounding boxes around objects in an image',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-BoundingBox'
          }
        },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Provide a Template for Bounding Box Labeling Jobs
<a name="sms-create-labeling-job-bounding-box-api-template"></a>

If you create a labeling job using the API, you must supply a worker task template in `UiTemplateS3Uri`. Copy and modify the following template. Only modify the [https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-quick-instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-quick-instructions), [https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-full-instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-full-instructions), and `header`. Upload this template to S3, and provide the S3 URI for this file in `UiTemplateS3Uri`.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-bounding-box
    name="boundingBox"
    src="{{ task.input.taskObject | grant_read_access }}"
    header="please draw box"
    labels="{{ task.input.labels | to_json | escape }}"
  >

    <full-instructions header="Bounding box instructions">
      <ol><li><strong>Inspect</strong> the image.</li>
      <li><strong>Determine</strong> if the specified label is visible in the picture.</li>
      <li><strong>Outline</strong> each instance of the specified label in the image using the provided “Box” tool.</li></ol>
      <ul><li>Boxes should fit tightly around each object.</li>
      <li>Do not include parts of the object that are overlapping or that cannot be seen, even though you think you can interpolate the whole shape.</li>
      <li>Avoid including shadows.</li>
      <li>If the target is off screen, draw the box up to the edge of the image.</li></ul>
    </full-instructions>
  
    <short-instructions>
      <h3><span style="color: rgb(0, 138, 0);">Good example</span></h3>
      <p>Enter description of a correct bounding box label and add images</p>
      <h3><span style="color: rgb(230, 0, 0);">Bad example</span></h3>
      <p>Enter description of an incorrect bounding box label and add images</p>
    </short-instructions>
  
  </crowd-bounding-box>
</crowd-form>
```

## Bounding Box Output Data
<a name="sms-bounding-box-output-data"></a>

Once you have created a bounding box labeling job, your output data will be located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

For example, the output manifest file of a successfully completed single-class bounding box task will contain the following: 

```
[
  {
    "boundingBox": {
      "boundingBoxes": [
        {
          "height": 2832,
          "label": "bird",
          "left": 681,
          "top": 599,
          "width": 1364
        }
      ],
      "inputImageProperties": {
        "height": 3726,
        "width": 2662
      }
    }
  }
]
```

The `boundingBoxes` parameter identifies the location of the bounding box drawn around an object identified as a "bird", relative to the top-left corner of the image, which is taken to be the (0,0) pixel-coordinate. In the previous example, **`left`** and **`top`** identify the location of the pixel in the top-left corner of the bounding box relative to the top-left corner of the image. The dimensions of the bounding box are identified with **`height`** and **`width`**. The `inputImageProperties` parameter gives the pixel-dimensions of the original input image.
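To make the geometry concrete, the following sketch parses the example record above and derives the box's bottom-right corner from `left`, `top`, `width`, and `height` (assuming the same top-left (0,0) origin):

```python
import json

# The example single-class output record from above.
record = json.loads("""
[{"boundingBox": {
    "boundingBoxes": [{"height": 2832, "label": "bird",
                       "left": 681, "top": 599, "width": 1364}],
    "inputImageProperties": {"height": 3726, "width": 2662}}}]
""")

box = record[0]["boundingBox"]["boundingBoxes"][0]
image = record[0]["boundingBox"]["inputImageProperties"]

# (left, top) is the box's top-left corner; derive the bottom-right corner.
right = box["left"] + box["width"]    # 681 + 1364 = 2045
bottom = box["top"] + box["height"]   # 599 + 2832 = 3431

# Sanity check: the box fits inside the input image's pixel dimensions.
assert right <= image["width"] and bottom <= image["height"]
print(box["label"], (box["left"], box["top"]), (right, bottom))
```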

When you use the bounding box task type, you can create single- and multi-class bounding box labeling jobs. The output manifest file of a successfully completed multi-class bounding box labeling job will contain the following: 

```
[
  {
    "boundingBox": {
      "boundingBoxes": [
        {
          "height": 938,
          "label": "squirrel",
          "left": 316,
          "top": 218,
          "width": 785
        },
        {
          "height": 825,
          "label": "rabbit",
          "left": 1930,
          "top": 2265,
          "width": 540
        },
        {
          "height": 1174,
          "label": "bird",
          "left": 748,
          "top": 2113,
          "width": 927
        },
        {
          "height": 893,
          "label": "bird",
          "left": 1333,
          "top": 847,
          "width": 736
        }
      ],
      "inputImageProperties": {
        "height": 3726,
        "width": 2662
      }
    }
  }
]
```
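A quick way to summarize a record like the one above is to count boxes per label. The following sketch abbreviates the example record to the fields it uses:

```python
from collections import Counter

# The boundingBoxes array from the multi-class example record above.
boxes = [
    {"height": 938,  "label": "squirrel", "left": 316,  "top": 218,  "width": 785},
    {"height": 825,  "label": "rabbit",   "left": 1930, "top": 2265, "width": 540},
    {"height": 1174, "label": "bird",     "left": 748,  "top": 2113, "width": 927},
    {"height": 893,  "label": "bird",     "left": 1333, "top": 847,  "width": 736},
]

# Count how many boxes were drawn for each label.
counts = Counter(box["label"] for box in boxes)
print(dict(counts))  # → {'squirrel': 1, 'rabbit': 1, 'bird': 2}
```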

To learn more about the output manifest file that results from a bounding box labeling job, see [Bounding box job output](sms-data-output.md#sms-output-box).

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md).

# Identify image contents using semantic segmentation
<a name="sms-semantic-segmentation"></a>

To identify the contents of an image at the pixel level, use an Amazon SageMaker Ground Truth semantic segmentation labeling task. When assigned a semantic segmentation labeling job, workers classify pixels in the image into a set of predefined labels or classes. Ground Truth supports single and multi-class semantic segmentation labeling jobs. You create a semantic segmentation labeling job using the Ground Truth section of the Amazon SageMaker AI console or the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. 

Images that contain large numbers of objects that need to be segmented require more time. To help workers (from a private or vendor workforce) label these objects in less time and with greater accuracy, Ground Truth provides an AI-assisted auto-segmentation tool. For information, see [Auto-Segmentation Tool](sms-auto-segmentation.md).

**Important**  
For this task type, if you create your own manifest file, use `"source-ref"` to identify the location of each image file in Amazon S3 that you want labeled. For more information, see [Input data](sms-data-input.md).

## Creating a Semantic Segmentation Labeling Job (Console)
<a name="sms-creating-ss-labeling-job-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a semantic segmentation labeling job in the SageMaker AI console. In Step 10, choose **Image** from the **Task category** dropdown menu, and choose **Semantic segmentation** as the task type.

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create the labeling job with the console, you specify instructions to help workers complete the job and labels that workers can choose from. 

![\[Gif showing an example on how to create a semantic segmentation labeling job in the SageMaker AI console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/semantic_segmentation_sample.gif)


## Create a Semantic Segmentation Labeling Job (API)
<a name="sms-creating-ss-labeling-job-api"></a>

To create a semantic segmentation labeling job, use the SageMaker API operation `CreateLabelingJob`. This operation is defined for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request: 
+ Pre-annotation Lambda functions for this task type end with `PRE-SemanticSegmentation`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Annotation-consolidation Lambda functions for this task type end with `ACS-SemanticSegmentation`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. Replace the placeholder parameter values with your own specifications and resources. 

```
response = client.create_labeling_job(
    LabelingJobName='example-semantic-segmentation-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
                'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-SemanticSegmentation',
        'TaskKeywords': [
            'Semantic Segmentation',
        ],
        'TaskTitle': 'Semantic segmentation task',
        'TaskDescription': 'For each category provided, segment out each relevant object using the color associated with that category',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-SemanticSegmentation'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Provide a Template for Semantic Segmentation Labeling Jobs
<a name="sms-create-labeling-job-ss-api-template"></a>

If you create a labeling job using the API, you must supply a worker task template in `UiTemplateS3Uri`. Copy and modify the following template. Only modify the [short instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-quick-instructions), [full instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-full-instructions), and `header`. 

Upload this template to S3, and provide the S3 URI for this file in `UiTemplateS3Uri`.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-semantic-segmentation
    name="crowd-semantic-segmentation"
    src="{{ task.input.taskObject | grant_read_access }}"
    header="Please segment out all pedestrians."
    labels="{{ task.input.labels | to_json | escape }}"
  >
    <full-instructions header="Segmentation instructions">
      <ol><li><strong>Read</strong> the task carefully and inspect the image.</li>
      <li><strong>Read</strong> the options and review the examples provided to understand more about the labels.</li>
      <li><strong>Choose</strong> the appropriate label that best suits an object and paint that object using the tools provided.</li></ol>
    </full-instructions>
    <short-instructions>
      <h2><span style="color: rgb(0, 138, 0);">Good example</span></h2>
      <p>Enter description to explain a correctly done segmentation</p>
      <p><br></p><h2><span style="color: rgb(230, 0, 0);">Bad example</span></h2>
      <p>Enter description of an incorrectly done segmentation</p>
    </short-instructions>
  </crowd-semantic-segmentation>
</crowd-form>
```

## Semantic Segmentation Output Data
<a name="sms-ss-ouput-data"></a>

Once you have created a semantic segmentation labeling job, your output data will be located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md). 

To see an example of an output manifest file for a semantic segmentation labeling job, see [3D point cloud semantic segmentation output](sms-data-output.md#sms-output-point-cloud-segmentation).
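As a sketch of reading that output, each line of the augmented output manifest is a JSON object; for semantic segmentation, the label attribute typically carries a `-ref` suffix and points to the PNG mask in Amazon S3. The field names and URIs below are illustrative, so check the output data reference for your job's exact schema:

```python
import json

# Illustrative output-manifest line; actual field names depend on the
# LabelAttributeName you chose ("label" here) and on the job configuration.
line = ('{"source-ref": "s3://your-bucket/images/0001.png", '
        '"label-ref": "s3://your-bucket/output/masks/0001.png"}')

record = json.loads(line)
image_uri, mask_uri = record["source-ref"], record["label-ref"]
```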

# Auto-Segmentation Tool
<a name="sms-auto-segmentation"></a>

Image segmentation is the process of dividing an image into multiple segments, or sets of labeled pixels. In Amazon SageMaker Ground Truth, the process of identifying all pixels that fall under a given label involves applying a colored filler, or "mask", over those pixels. Some labeling job tasks contain images with a large number of objects that need to be segmented. To help workers label these objects in less time and with greater accuracy, Ground Truth provides an auto-segmentation tool for segmentation tasks assigned to private and vendor workforces. This tool uses a machine learning model to automatically segment individual objects in the image with minimal worker input. Workers can refine the mask generated by the auto-segmentation tool using other tools found in the worker console. This helps workers complete image segmentation tasks faster and more accurately, resulting in lower cost and higher label quality. The following page gives information about the tool and its availability.

**Note**  
The auto-segmentation tool is available for segmentation tasks that are sent to a private workforce or vendor workforce. It isn't available for tasks sent to the public workforce (Amazon Mechanical Turk). 

## Tool Preview
<a name="sms-auto-segment-tool-preview"></a>

When workers are assigned a labeling job that provides the auto-segmentation tool, they are provided with detailed instructions on how to use the tool. For example, a worker might see the following in the worker console: 

![\[Example UI with instructions on how to use the tool in the worker console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/semantic-segmentation.gif)


Workers can use **View full instructions** to learn how to use the tool. Workers need to place a point on each of the four extreme points (top-most, bottom-most, left-most, and right-most) of the object of interest, and the tool automatically generates a mask for the object. Workers can further refine the mask using the other tools provided, or by using the auto-segment tool on smaller portions of the object that were missed. 

## Tool Availability
<a name="sms-auto-segment-tool-availability"></a>

The auto-segmentation tool automatically appears in your workers' consoles if you create a semantic segmentation labeling job using the Amazon SageMaker AI console. While creating a semantic segmentation job in the SageMaker AI console, you can preview the tool when you create worker instructions. To learn how to create a semantic segmentation labeling job in the SageMaker AI console, see [Getting started: Create a bounding box labeling job with Ground Truth](sms-getting-started.md). 

If you are creating a custom instance segmentation labeling job in the SageMaker AI console or creating an instance- or semantic-segmentation labeling job using the Ground Truth API, you need to create a custom task template to design your worker console and instructions. To include the auto-segmentation tool in your worker console, ensure that the following conditions are met in your custom task template:
+ For semantic segmentation labeling jobs created using the API, the `<crowd-semantic-segmentation>` tag is present in the task template. For custom instance segmentation labeling jobs, the `<crowd-instance-segmentation>` tag is present in the task template.
+ The task is assigned to a private workforce or vendor workforce. 
+ The images to be labeled are Amazon Simple Storage Service (Amazon S3) objects that have been presigned so that workers can access them. This is true if the task template includes the `grant_read_access` filter. For information about the `grant_read_access` filter, see [Adding automation with Liquid](sms-custom-templates-step2-automate.md).

The following is an example of a custom task template for a custom instance segmentation labeling job, which includes the `<crowd-instance-segmentation/>` tag and the `grant_read_access` Liquid filter.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-instance-segmentation
    name="crowd-instance-segmentation"
    src="{{ task.input.taskObject | grant_read_access }}"
    labels="['Car','Road']"
  >
    <full-instructions header="Segmentation instructions">
      Segment each instance of each class of objects in the image. 
    </full-instructions>

    <short-instructions>
      <p>Segment each instance of each class of objects in the image.</p>

      <h3 style="color: green">GOOD EXAMPLES</h3>
      <img src="path/to/image.jpg" style="width: 100%">
      <p>Good because A, B, C.</p>

      <h3 style="color: red">BAD EXAMPLES</h3>
      <img src="path/to/image.jpg" style="width: 100%">
      <p>Bad because X, Y, Z.</p>
    </short-instructions>
  </crowd-instance-segmentation>
</crowd-form>
```

# Create an image classification job (Single Label)
<a name="sms-image-classification"></a>

Use an Amazon SageMaker Ground Truth image classification labeling task when you need workers to classify images using predefined labels that you specify. Workers are shown images and are asked to choose one label for each image. You can create an image classification labeling job using the Ground Truth section of the Amazon SageMaker AI console or the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. 

**Important**  
For this task type, if you create your own manifest file, use `"source-ref"` to identify the location of each image file in Amazon S3 that you want labeled. For more information, see [Input data](sms-data-input.md).

## Create an Image Classification Labeling Job (Console)
<a name="sms-creating-image-classification-console"></a>

You can follow the instructions [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create an image classification labeling job in the SageMaker AI console. In Step 10, choose **Image** from the **Task category** drop down menu, and choose **Image Classification (Single Label)** as the task type. 

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create the labeling job with the console, you specify instructions to help workers complete the job and labels that workers can choose from. 

![\[Example worker UI for labeling tasks, provided by Ground Truth.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/image-classification-example.png)


## Create an Image Classification Labeling Job (API)
<a name="sms-creating-image-classification-api"></a>

To create an image classification labeling job, use the SageMaker API operation `CreateLabelingJob`. This operation is available in all AWS SDKs. To see a list of language-specific SDKs that support this operation, review the **See Also** section of [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request: 
+ Pre-annotation Lambda functions for this task type end with `PRE-ImageMultiClass`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Annotation-consolidation Lambda functions for this task type end with `ACS-ImageMultiClass`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. Replace the placeholder parameter values with your own specifications and resources. 

```
response = client.create_labeling_job(
    LabelingJobName='example-image-classification-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
                'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-ImageMultiClass',
        'TaskKeywords': [
            'Image classification',
        ],
        'TaskTitle': 'Image classification task',
        'TaskDescription': 'Carefully inspect the image and classify it by selecting one label from the categories provided.',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-ImageMultiClass'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Provide a Template for Image Classification Labeling Jobs
<a name="worker-template-image-classification"></a>

If you create a labeling job using the API, you must supply a worker task template in `UiTemplateS3Uri`. Copy and modify the following template. Only modify the [short instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-quick-instructions), [full instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-full-instructions), and `header`. 

Upload this template to S3, and provide the S3 URI for this file in `UiTemplateS3Uri`.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-image-classifier
    name="crowd-image-classifier"
    src="{{ task.input.taskObject | grant_read_access }}"
    header="please classify"
    categories="{{ task.input.labels | to_json | escape }}"
  >
    <full-instructions header="Image classification instructions">
      <ol><li><strong>Read</strong> the task carefully and inspect the image.</li>
      <li><strong>Read</strong> the options and review the examples provided to understand more about the labels.</li>
      <li><strong>Choose</strong> the appropriate label that best suits the image.</li></ol>
    </full-instructions>
    <short-instructions>
      <h3><span style="color: rgb(0, 138, 0);">Good example</span></h3>
      <p>Enter description to explain the correct label to the workers</p>
      <h3><span style="color: rgb(230, 0, 0);">Bad example</span></h3><p>Enter description of an incorrect label</p>
    </short-instructions>
  </crowd-image-classifier>
</crowd-form>
```

## Image Classification Output Data
<a name="sms-image-classification-output-data"></a>

Once you have created an image classification labeling job, your output data will be located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md). 

To see an example of an output manifest file from an image classification labeling job, see [Classification job output](sms-data-output.md#sms-output-class).
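For a sense of what a single classification record might look like, the following parses one illustrative augmented-manifest line. The field names assume a `LabelAttributeName` of `label` (metadata lands under `"<LabelAttributeName>-metadata"`); confirm the exact schema against the output data reference:

```python
import json

# Illustrative line from an image classification output manifest.
line = ('{"source-ref": "s3://your-bucket/images/0001.png", "label": 1, '
        '"label-metadata": {"class-name": "cat", "confidence": 0.95}}')

record = json.loads(line)
class_name = record["label-metadata"]["class-name"]
confidence = record["label-metadata"]["confidence"]
```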

# Create an image classification job (Multi-label)
<a name="sms-image-classification-multilabel"></a>

Use an Amazon SageMaker Ground Truth multi-label image classification labeling task when you need workers to classify multiple objects in an image. For example, the following image features a dog and a cat. You can use multi-label image classification to associate the labels "dog" and "cat" with this image. The following page gives information about creating a multi-label image classification labeling job.

![\[Photo by Anusha Barwa on Unsplash\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/dog-cat-photo.jpg)


When working on a multi-label image classification task, workers should choose all applicable labels, but must choose at least one. When creating a job using this task type, you can provide up to 50 label categories. 

When creating a labeling job in the console, Ground Truth doesn't provide a "none" category for when none of the labels applies to an image. To provide this option to workers, include a label similar to "none" or "other" when you create a multi-label image classification job. 
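One way to provide that option, sketched here with illustrative category names, is to append a fallback label when you build the label category configuration file (the `document-version` value follows the documented label category file format):

```python
import json

# Illustrative categories plus an explicit fallback so workers can mark
# images that match none of the real labels.
categories = ["dog", "cat", "bird", "none of the above"]
label_category_config = {
    "document-version": "2018-11-28",
    "labels": [{"label": name} for name in categories],
}

# Multi-label jobs allow up to 50 categories, including the fallback.
assert len(label_category_config["labels"]) <= 50
```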

To restrict workers to choosing a single label for each image, use the [Create an image classification job (Single Label)](sms-image-classification.md) task type.

**Important**  
For this task type, if you create your own manifest file, use `"source-ref"` to identify the location of each image file in Amazon S3 that you want labeled. For more information, see [Input data](sms-data-input.md).

## Create a Multi-Label Image Classification Labeling Job (Console)
<a name="sms-creating-multilabel-image-classification-console"></a>

You can follow the instructions [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a multi-label image classification labeling job in the SageMaker AI console. In Step 10, choose **Image** from the **Task category** drop down menu, and choose **Image Classification (Multi-label)** as the task type. 

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create a labeling job in the console, you specify instructions to help workers complete the job and labels that workers can choose from. 

![\[Example worker UI for labeling tasks, provided by Ground Truth.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/image-classification-multilabel-example.png)


## Create a Multi-Label Image Classification Labeling Job (API)
<a name="sms-create-multi-select-image-classification-job-api"></a>

To create a multi-label image classification labeling job, use the SageMaker API operation `CreateLabelingJob`. This operation is available in all AWS SDKs. To see a list of language-specific SDKs that support this operation, review the **See Also** section of [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request: 
+ Pre-annotation Lambda functions for this task type end with `PRE-ImageMultiClassMultiLabel`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Annotation-consolidation Lambda functions for this task type end with `ACS-ImageMultiClassMultiLabel`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. Replace the placeholder parameter values with your own specifications and resources. 

```
response = client.create_labeling_job(
    LabelingJobName='example-multi-label-image-classification-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
                'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-ImageMultiClassMultiLabel',
        'TaskKeywords': [
            'Image Classification',
        ],
        'TaskTitle': 'Multi-label image classification task',
        'TaskDescription': 'Select all labels that apply to the images shown',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-ImageMultiClassMultiLabel'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Provide a Template for Multi-label Image Classification
<a name="sms-custom-template-multi-image-label-classification"></a>

If you create a labeling job using the API, you must supply a worker task template in `UiTemplateS3Uri`. Copy and modify the following template. Only modify the [short instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-quick-instructions), [full instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-full-instructions), and `header`. 

Upload this template to S3, and provide the S3 URI for this file in `UiTemplateS3Uri`.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-image-classifier-multi-select
    name="crowd-image-classifier-multi-select"
    src="{{ task.input.taskObject | grant_read_access }}"
    header="Please identify all classes in image"
    categories="{{ task.input.labels | to_json | escape }}"
  >
    <full-instructions header="Multi Label Image classification instructions">
      <ol><li><strong>Read</strong> the task carefully and inspect the image.</li>
      <li><strong>Read</strong> the options and review the examples provided to understand more about the labels.</li>
       <li><strong>Choose</strong> the appropriate labels that best suit the image.</li></ol>
    </full-instructions>
    <short-instructions>
      <h3><span style="color: rgb(0, 138, 0);">Good example</span></h3>
      <p>Enter description to explain the correct label to the workers</p>
      <h3><span style="color: rgb(230, 0, 0);">Bad example</span></h3>
      <p>Enter description of an incorrect label</p>
   </short-instructions>
  </crowd-image-classifier-multi-select>
</crowd-form>
```

## Multi-label Image Classification Output Data
<a name="sms-image-classification-multi-output-data"></a>

Once you have created a multi-label image classification labeling job, your output data will be located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md). 

To see an example of output manifest files for multi-label image classification labeling job, see [Multi-label classification job output](sms-data-output.md#sms-output-multi-label-classification).
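In a multi-label output line, the label attribute typically holds a list of class indices that the metadata maps back to class names. The line below is illustrative (key names such as `"class-map"` should be confirmed against the output reference):

```python
import json

# Illustrative multi-label output-manifest line, assuming a
# LabelAttributeName of "label".
line = ('{"source-ref": "s3://your-bucket/images/0001.png", "label": [0, 2], '
        '"label-metadata": {"class-map": {"0": "dog", "2": "cat"}}}')

record = json.loads(line)
class_map = record["label-metadata"]["class-map"]

# Resolve each applied label index to its class name.
applied_labels = [class_map[str(i)] for i in record["label"]]
```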

# Image Label Verification
<a name="sms-label-verification"></a>

Building a highly accurate training dataset for your machine learning (ML) algorithm is an iterative process. Typically, you review and continuously adjust your labels until you are satisfied that they accurately represent the ground truth, or what is directly observable in the real world. You can use an Amazon SageMaker Ground Truth image label verification task to direct workers to review a dataset's labels and improve label accuracy. Workers can indicate if the existing labels are correct or rate label quality. They can also add comments to explain their reasoning. Amazon SageMaker Ground Truth supports label verification for [Classify image objects using a bounding box](sms-bounding-box.md) and [Identify image contents using semantic segmentation](sms-semantic-segmentation.md) labels. You create an image label verification labeling job using the Ground Truth section of the Amazon SageMaker AI console or the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. 

Ground Truth provides a worker console similar to the following for labeling tasks. When you create the labeling job with the console, you can modify the images and content that are shown. To learn how to create a labeling job using the Ground Truth console, see [Create a Labeling Job (Console)](sms-create-labeling-job-console.md).

![\[Example worker console for labeling tasks, provided by Ground Truth.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/label-verification-example.png)


You can create a label verification labeling job using the SageMaker AI console or API. To learn how to create a labeling job using the Ground Truth API operation `CreateLabelingJob`, see [Create a Labeling Job (API)](sms-create-labeling-job-api.md).

# Text labeling with Ground Truth
<a name="sms-label-text"></a>

Use Ground Truth to label text. Ground Truth supports labeling text for named entity recognition, single label text classification, and multi-label text classification. The following topics give information about these built-in task types, as well as instructions to help you create a labeling job using that task type.

**Tip**  
To learn more about supported file types and input data quotas, see [Input data](sms-data-input.md).

**Topics**
+ [Extract text information using named entity recognition](sms-named-entity-recg.md)
+ [Categorize text with text classification (Single Label)](sms-text-classification.md)
+ [Categorize text with text classification (Multi-label)](sms-text-classification-multilabel.md)

# Extract text information using named entity recognition
<a name="sms-named-entity-recg"></a>

To extract information from unstructured text and classify it into predefined categories, use an Amazon SageMaker Ground Truth named entity recognition (NER) labeling task. Traditionally, NER involves sifting through text data to locate noun phrases, called *named entities*, and categorizing each with a label, such as "person," "organization," or "brand." You can broaden this task to label longer spans of text and categorize those sequences with predefined labels that you specify. You can create a named entity recognition labeling job using the Ground Truth section of the Amazon SageMaker AI console or the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation.

When tasked with a named entity recognition labeling job, workers apply your labels to specific words or phrases within a larger text block. They choose a label, then apply it by using the cursor to highlight the part of the text to which the label applies. The Ground Truth named entity recognition tool supports overlapping annotations, in-context label selection, and multi-label selection for a single highlight. Also, workers can use their keyboards to quickly select labels.

**Important**  
If you manually create an input manifest file, use `"source"` to identify the text that you want labeled. For more information, see [Input data](sms-data-input.md).
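As a minimal sketch of the difference from image tasks, a text manifest embeds the document itself under `"source"` rather than pointing to Amazon S3 with `"source-ref"`. The example text is illustrative:

```python
import json

# Hypothetical documents to label -- replace with your own text.
documents = ["Amazon SageMaker is a machine learning service."]

# Each manifest line embeds the text directly under "source".
manifest_lines = [json.dumps({"source": text}) for text in documents]
```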

## Create a Named Entity Recognition Labeling Job (Console)
<a name="sms-creating-ner-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a named entity recognition labeling job in the SageMaker AI console. In Step 10, choose **Text** from the **Task category** dropdown menu, and choose **Named entity recognition** as the task type. 

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create the labeling job with the console, you specify instructions to help workers complete the job and labels that workers can choose from. 

![\[Gif showing how to create a named entity recognition labeling job in the SageMaker AI console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/nertool.gif)


## Create a Named Entity Recognition Labeling Job (API)
<a name="sms-creating-ner-api"></a>

To create a named entity recognition labeling job, use the SageMaker API operation `CreateLabelingJob`. This API defines this operation for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request:
+ Pre-annotation Lambda functions for this task type end with `PRE-NamedEntityRecognition`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Annotation-consolidation Lambda functions for this task type end with `ACS-NamedEntityRecognition`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 
+ You must provide the following ARN for `[HumanTaskUiArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html#sagemaker-Type-UiConfig-HumanTaskUiArn)`:

  ```
  arn:aws:sagemaker:aws-region:394669845002:human-task-ui/NamedEntityRecognition
  ```

  Replace `aws-region` with the AWS Region you use to create the labeling job. For example, use `us-west-1` if you create a labeling job in US West (N. California).
+ Provide worker instructions in the label category configuration file using the `instructions` parameter. You can use plain strings or HTML markup in the `shortInstruction` and `fullInstruction` fields. For more details, see [Provide Worker Instructions in a Label Category Configuration File](#worker-instructions-ner).

  ```
  "instructions": {"shortInstruction":"<h1>Add header</h1><p>Add Instructions</p>", "fullInstruction":"<p>Add additional instructions.</p>"}
  ```
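Because only the Region segment of the `HumanTaskUiArn` changes, you can assemble it with a small helper (the function name is illustrative; `394669845002` is the AWS-owned account from the ARN above):

```python
# Illustrative helper: build the NamedEntityRecognition human task UI ARN
# for a given Region. 394669845002 is the AWS-owned account shown above.
def ner_human_task_ui_arn(region):
    return ("arn:aws:sagemaker:" + region +
            ":394669845002:human-task-ui/NamedEntityRecognition")

print(ner_human_task_ui_arn("us-west-1"))
```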

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. Replace the placeholder values with your own specifications and resources. 

```
response = client.create_labeling_job(
    LabelingJobName='example-ner-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation'|'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'HumanTaskUiArn': 'arn:aws:sagemaker:us-east-1:394669845002:human-task-ui/NamedEntityRecognition'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-NamedEntityRecognition',
        'TaskKeywords': [
            'Named entity Recognition',
        ],
        'TaskTitle': 'Named entity Recognition task',
        'TaskDescription': 'Apply the labels provided to specific words or phrases within the larger text block.',
        'NumberOfHumanWorkersPerDataObject': 1,
        'TaskTimeLimitInSeconds': 28800,
        'TaskAvailabilityLifetimeInSeconds': 864000,
        'MaxConcurrentTaskCount': 1000,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-NamedEntityRecognition'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Provide Worker Instructions in a Label Category Configuration File
<a name="worker-instructions-ner"></a>

You must provide worker instructions in the label category configuration file you identify with the `LabelCategoryConfigS3Uri` parameter in `CreateLabelingJob`. You can use these instructions to provide details about the task you want workers to perform and help them use the tool efficiently.

You provide short and long instructions using `shortInstruction` and `fullInstruction` in the `instructions` parameter, respectively. To learn more about these instruction types, see [Create instruction pages](sms-creating-instruction-pages.md).

The following is an example of a label category configuration file with instructions that can be used for a named entity recognition labeling job.

```
{
  "document-version": "2018-11-28",
  "labels": [
    {
      "label": "label1",
      "shortDisplayName": "L1"
    },
    {
      "label": "label2",
      "shortDisplayName": "L2"
    },
    {
      "label": "label3",
      "shortDisplayName": "L3"
    },
    {
      "label": "label4",
      "shortDisplayName": "L4"
    },
    {
      "label": "label5",
      "shortDisplayName": "L5"
    }
  ],
  "instructions": {
    "shortInstruction": "<p>Enter description of the labels that workers have 
                        to choose from</p><br><p>Add examples to help workers understand the label</p>",
    "fullInstruction": "<ol>
                        <li><strong>Read</strong> the text carefully.</li>
                        <li><strong>Highlight</strong> words, phrases, or sections of the text.</li>
                        <li><strong>Choose</strong> the label that best matches what you have highlighted.</li>
                        <li>To <strong>change</strong> a label, choose highlighted text and select a new label.</li>
                        <li>To <strong>remove</strong> a label from highlighted text, choose the X next to the 
                        abbreviated label name on the highlighted text.</li>
                        <li>You can select all of a previously highlighted text, but not a portion of it.</li>
                        </ol>"
  }
}
```
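A configuration like the one above can also be generated programmatically before you upload it to the S3 location you pass in `LabelCategoryConfigS3Uri` (the label names, instructions, and file name below are placeholders):

```python
import json

# Placeholder labels; replace with your own categories.
labels = ["person", "organization", "brand", "location", "other"]

config = {
    "document-version": "2018-11-28",
    "labels": [
        {"label": name, "shortDisplayName": name[:3].upper()} for name in labels
    ],
    "instructions": {
        "shortInstruction": "<p>Highlight each entity and choose its label.</p>",
        "fullInstruction": "<p>Read the text, highlight spans, then pick the best label.</p>",
    },
}

# Write the file that you then upload to S3 as label-categories.json.
with open("label-categories.json", "w") as f:
    json.dump(config, f, indent=2)
```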

## Named Entity Recognition Output Data
<a name="sms-ner-output-data"></a>

Once you have created a named entity recognition labeling job, your output data will be located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md). 
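The output manifest is itself a JSON Lines file, with one JSON object per labeled input. The following sketch reads one line; the record shown is illustrative only, not the exact NER output schema, so consult the output data page for the authoritative fields:

```python
import json

# Illustrative output-manifest line; consult "Labeling job output data"
# for the exact fields that Ground Truth emits for NER jobs.
line = ('{"source": "Jane joined AnyCompany in 2020.", '
        '"label": {"annotations": {"entities": ['
        '{"startOffset": 0, "endOffset": 4, "label": "person"}]}}}')

record = json.loads(line)
for entity in record["label"]["annotations"]["entities"]:
    # Recover the highlighted span from the character offsets.
    span = record["source"][entity["startOffset"]:entity["endOffset"]]
    print(span, "->", entity["label"])
```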

# Categorize text with text classification (Single Label)
<a name="sms-text-classification"></a>

To categorize articles and text into predefined categories, use text classification. For example, you can use text classification to identify the sentiment conveyed in a review or the emotion underlying a section of text. Use Amazon SageMaker Ground Truth text classification to have workers sort text into categories that you define. You create a text classification labeling job using the Ground Truth section of the Amazon SageMaker AI console or the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. 

**Important**  
If you manually create an input manifest file, use `"source"` to identify the text that you want labeled. For more information, see [Input data](sms-data-input.md).

## Create a Text Classification Labeling Job (Console)
<a name="sms-creating-text-classification-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a text classification labeling job in the SageMaker AI console. In Step 10, choose **Text** from the **Task category** dropdown menu, and choose **Text Classification (Single Label)** as the task type. 

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create the labeling job with the console, you specify instructions to help workers complete the job and labels that workers can choose from. 

![\[Gif showing how to create a text classification labeling job in the SageMaker AI console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/single-label-text.gif)


## Create a Text Classification Labeling Job (API)
<a name="sms-creating-text-classification-api"></a>

To create a text classification labeling job, use the SageMaker API operation `CreateLabelingJob`. This API defines this operation for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request: 
+ Pre-annotation Lambda functions for this task type end with `PRE-TextMultiClass`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Annotation-consolidation Lambda functions for this task type end with `ACS-TextMultiClass`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. Replace the placeholder values with your own specifications and resources. 

```
response = client.create_labeling_job(
    LabelingJobName='example-text-classification-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation'|'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-TextMultiClass',
        'TaskKeywords': [
            'Text classification',
        ],
        'TaskTitle': 'Text classification task',
        'TaskDescription': 'Carefully read and classify this text using the categories provided.',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-TextMultiClass'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Provide a Template for Text Classification Labeling Jobs
<a name="worker-template-text-classification"></a>

If you create a labeling job using the API, you must supply a worker task template in `UiTemplateS3Uri`. Copy and modify the following template. Only modify the [short instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-quick-instructions), [full instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-full-instructions), and `header`. 

Upload this template to S3, and provide the S3 URI for this file in `UiTemplateS3Uri`.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-classifier
    name="crowd-classifier"
    categories="{{ task.input.labels | to_json | escape }}"
    header="classify text"
  >
    <classification-target style="white-space: pre-wrap">
      {{ task.input.taskObject }}
    </classification-target>
    <full-instructions header="Classifier instructions">
      <ol><li><strong>Read</strong> the text carefully.</li>
      <li><strong>Read</strong> the examples to understand more about the options.</li>
      <li><strong>Choose</strong> the appropriate labels that best suit the text.</li></ol>
    </full-instructions>
    <short-instructions>
      <p>Enter description of the labels that workers have to choose from</p>
      <p><br></p><p><br></p><p>Add examples to help workers understand the label</p>
      <p><br></p><p><br></p><p><br></p><p><br></p><p><br></p>
    </short-instructions>
  </crowd-classifier>
  </crowd-form>
```

## Text Classification Output Data
<a name="sms-text-classification-output-data"></a>

Once you have created a text classification labeling job, your output data will be located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md). 

To see an example of an output manifest file from a text classification labeling job, see [Classification job output](sms-data-output.md#sms-output-class).

# Categorize text with text classification (Multi-label)
<a name="sms-text-classification-multilabel"></a>

To categorize articles and text into multiple predefined categories, use the multi-label text classification task type. For example, you can use this task type to identify more than one emotion conveyed in text. The following sections give information about how to create a multi-label text classification task from the console and API.

When working on a multi-label text classification task, workers should choose all applicable labels, but must choose at least one. When creating a job using this task type, you can provide up to 50 label categories. 

Amazon SageMaker Ground Truth doesn't provide a "none" category for when none of the labels applies. To provide this option to workers, include a label similar to "none" or "other" when you create a multi-label text classification job. 

To restrict workers to choosing a single label for each document or text selection, use the [Categorize text with text classification (Single Label)](sms-text-classification.md) task type. 
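The two constraints above, at most 50 label categories and an explicit fallback label, can be checked before you create the job (the category names below are placeholders):

```python
# Placeholder categories for a multi-label text classification job.
labels = ["joy", "anger", "sadness", "surprise", "none of the above"]

# Ground Truth allows up to 50 label categories for this task type.
assert len(labels) <= 50, "too many label categories"

# Include a fallback so workers can answer when no other label applies.
assert any("none" in label for label in labels), "add a 'none'/'other' label"

print("label list OK:", len(labels), "categories")
```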

**Important**  
If you manually create an input manifest file, use `"source"` to identify the text that you want labeled. For more information, see [Input data](sms-data-input.md).

## Create a Multi-Label Text Classification Labeling Job (Console)
<a name="sms-creating-multilabel-text-classification-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a multi-label text classification labeling job in the Amazon SageMaker AI console. In Step 10, choose **Text** from the **Task category** dropdown menu, and choose **Text Classification (Multi-label)** as the task type. 

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create the labeling job with the console, you specify instructions to help workers complete the job and labels that workers can choose from. 

![\[Gif showing how to create a multi-label text classification labeling job in the Amazon SageMaker AI console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/multi-label-text.gif)


## Create a Multi-Label Text Classification Labeling Job (API)
<a name="sms-creating-multilabel-text-classification-api"></a>

To create a multi-label text classification labeling job, use the SageMaker API operation `CreateLabelingJob`. This API defines this operation for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request: 
+ Pre-annotation Lambda functions for this task type end with `PRE-TextMultiClassMultiLabel`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Annotation-consolidation Lambda functions for this task type end with `ACS-TextMultiClassMultiLabel`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. Replace the placeholder values with your own specifications and resources. 

```
response = client.create_labeling_job(
    LabelingJobName='example-multi-label-text-classification-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation'|'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/custom-worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-TextMultiClassMultiLabel',
        'TaskKeywords': [
            'Text Classification',
        ],
        'TaskTitle': 'Multi-label text classification task',
        'TaskDescription': 'Select all labels that apply to the text shown',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-TextMultiClassMultiLabel'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Create a Template for Multi-label Text Classification
<a name="custom-template-multi-label-text-classification"></a>

If you create a labeling job using the API, you must supply a worker task template in `UiTemplateS3Uri`. Copy and modify the following template. Only modify the [short instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-quick-instructions), [full instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-creating-instruction-pages.html#sms-creating-full-instructions), and `header`. 

Upload this template to S3, and provide the S3 URI for this file in `UiTemplateS3Uri`.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-classifier-multi-select
    name="crowd-classifier-multi-select"
    categories="{{ task.input.labels | to_json | escape }}"
    header="Please identify all classes in the below text"
  >
    <classification-target style="white-space: pre-wrap">
      {{ task.input.taskObject }}
    </classification-target>
    <full-instructions header="Classifier instructions">
      <ol><li><strong>Read</strong> the text carefully.</li>
      <li><strong>Read</strong> the examples to understand more about the options.</li>
      <li><strong>Choose</strong> the appropriate labels that best suit the text.</li></ol>
    </full-instructions>
    <short-instructions>
      <p>Enter description of the labels that workers have to choose from</p>
      <p><br></p>
      <p><br></p><p>Add examples to help workers understand the label</p>
      <p><br></p><p><br></p><p><br></p><p><br></p><p><br></p>
    </short-instructions>
  </crowd-classifier-multi-select>
  </crowd-form>
```

To learn how to create a custom template, see [Custom labeling workflows](sms-custom-templates.md). 

## Multi-label Text Classification Output Data
<a name="sms-text-classification-multi-select-output-data"></a>

Once you have created a multi-label text classification labeling job, your output data will be located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md). 

To see an example of output manifest files for a multi-label text classification labeling job, see [Multi-label classification job output](sms-data-output.md#sms-output-multi-label-classification).

# Videos and video frame labeling
<a name="sms-video"></a>

You can use Ground Truth to classify videos and annotate video frames (still images extracted from videos) using one of the three built-in video task types. These task types streamline the process of creating video and video frame labeling jobs using the Amazon SageMaker AI console, API, and language-specific SDKs. 
+ Video clip classification – Enable workers to classify videos into categories you specify. For example, you can use this task type to have workers categorize videos into topics like sports, comedy, music, and education. To learn more, see [Classify videos](sms-video-classification.md).
+ Video frame labeling jobs – Enable workers to annotate video frames extracted from a video using bounding boxes, polylines, polygons, or keypoint annotation tools. Ground Truth offers two built-in task types to label video frames:
  + *Video frame object detection*: Enable workers to identify and locate objects in video frames. 
  + *Video frame object tracking*: Enable workers to track the movement of objects across video frames.
  + *Video frame adjustment jobs*: Have workers adjust labels, label category attributes, and frame attributes from a previous video frame object detection or object tracking labeling job.
  + *Video frame verification jobs*: Have workers verify labels, label category attributes, and frame attributes from a previous video frame object detection or object tracking labeling job.

  If you have video files, you can use the Ground Truth automatic frame extraction tool to extract video frames from your videos. To learn more, see [Video Frame Input Data](sms-video-frame-input-data-overview.md).

**Tip**  
To learn more about supported file types and input data quotas, see [Input data](sms-data-input.md).

**Topics**
+ [Classify videos](sms-video-classification.md)
+ [Video frames](sms-video-task-types.md)
+ [Worker Instructions](sms-video-worker-instructions.md)

# Classify videos
<a name="sms-video-classification"></a>

Use an Amazon SageMaker Ground Truth video classification labeling task when you need workers to classify videos using predefined labels that you specify. Workers are shown videos and are asked to choose one label for each video. You create a video classification labeling job using the Ground Truth section of the Amazon SageMaker AI console or the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. 

Your video files must be encoded in a format that is supported by the browser used by the work team that labels your data. We recommend that you use the worker UI preview to verify that all video file formats in your input manifest file display correctly. You can communicate supported browsers to your workers through the worker instructions. To see supported file formats, see [Supported data formats](sms-supported-data-formats.md).

**Important**  
For this task type, if you create your own manifest file, use `"source-ref"` to identify the location of each video file in Amazon S3 that you want labeled. For more information, see [Input data](sms-data-input.md).
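For example, each line of a video classification input manifest points to one video object in Amazon S3 (the bucket and key names below are placeholders):

```
{"source-ref": "s3://amzn-s3-demo-bucket/videos/video1.mp4"}
{"source-ref": "s3://amzn-s3-demo-bucket/videos/video2.mp4"}
```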



## Create a Video Classification Labeling Job (Console)
<a name="sms-creating-video-classification-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a video classification labeling job in the SageMaker AI console. In step 10, choose **Video** from the **Task category** dropdown list, and choose **Video Classification** as the task type. 

Ground Truth provides a worker UI similar to the following for labeling tasks. When you create a labeling job in the console, you specify instructions to help workers complete the job and labels from which workers can choose. 

![\[Gif showing how to create a video classification labeling job in the SageMaker AI console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/vid_classification.gif)


## Create a Video Classification Labeling Job (API)
<a name="sms-creating-video-classification-api"></a>

This section covers details you need to know when you create a labeling job using the SageMaker API operation `CreateLabelingJob`. This API defines this operation for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html).

Follow the instructions on [Create a Labeling Job (API)](sms-create-labeling-job-api.md) and do the following while you configure your request: 
+ Use a pre-annotation Lambda function that ends with `PRE-VideoClassification`. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) . 
+ Use an annotation-consolidation Lambda function that ends with `ACS-VideoClassification`. To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. 

```
response = client.create_labeling_job(
    LabelingJobName='example-video-classification-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation'|'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:region:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-VideoClassification',
        'TaskKeywords': [
            'Video Classification',
        ],
        'TaskTitle': 'Video classification task',
        'TaskDescription': 'Select a label to classify this video',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-VideoClassification'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

### Provide a Template for Video Classification
<a name="sms-custom-template-video-classification"></a>

If you create a labeling job using the API, you must supply a worker task template in `UiTemplateS3Uri`. Copy the following template and modify the `short-instructions`, `full-instructions`, and `header` elements. Upload the template to Amazon S3, and provide its Amazon S3 URI in `UiTemplateS3Uri`.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

              <crowd-form>
                  <crowd-classifier
                    name="crowd-classifier"
                    categories="{{ task.input.labels | to_json | escape }}"
                    header="Please classify video"
                  >
                    <classification-target>
                       <video width="100%" controls>
                        <source src="{{ task.input.taskObject | grant_read_access }}" type="video/mp4"/>
                        <source src="{{ task.input.taskObject | grant_read_access }}" type="video/webm"/>
                        <source src="{{ task.input.taskObject | grant_read_access }}" type="video/ogg"/>
                      Your browser does not support the video tag.
                      </video>
                    </classification-target>
                    <full-instructions header="Video classification instructions">
                      <ol><li><strong>Read</strong> the task carefully and inspect the video.</li>
                        <li><strong>Read</strong> the options and review the examples provided to understand more about the labels.</li>
                        <li><strong>Choose</strong> the appropriate label that best suits the video.</li></ol>
                    </full-instructions>
                    <short-instructions>
                      <h3><span style="color: rgb(0, 138, 0);">Good example</span></h3>
                        <p>Enter description to explain the correct label to the workers</p>
                        <p><img src="https://d7evko5405gb7.cloudfront.net/fe4fed9b-660c-4477-9294-2c66a15d6bbe/src/images/quick-instructions-example-placeholder.png" style="max-width:100%"></p>
                        <h3><span style="color: rgb(230, 0, 0);">Bad example</span></h3>
                        <p>Enter description of an incorrect label</p>
                        <p><img src="https://d7evko5405gb7.cloudfront.net/fe4fed9b-660c-4477-9294-2c66a15d6bbe/src/images/quick-instructions-example-placeholder.png" style="max-width:100%"></p>
                    </short-instructions>
                  </crowd-classifier>
              </crowd-form>
```
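After customizing the template, you upload it to Amazon S3 and pass its URI as `UiTemplateS3Uri`. The following sketch shows one way to do that with Boto3; the bucket name, object key, and local file name are illustrative placeholders, not values defined by this guide.

```python
def template_s3_uri(bucket, key):
    """Build the s3:// URI to pass as UiTemplateS3Uri in CreateLabelingJob."""
    return f"s3://{bucket}/{key}"

def upload_worker_template(local_path, bucket, key):
    """Upload a worker task template to Amazon S3 (requires AWS credentials)."""
    import boto3  # deferred so template_s3_uri is usable without AWS installed
    boto3.client("s3").upload_file(local_path, bucket, key)
    return template_s3_uri(bucket, key)

# Example with placeholder names:
# ui_template_s3_uri = upload_worker_template(
#     "worker-task-template.html",
#     "amzn-s3-demo-bucket",
#     "templates/worker-task-template.html",
# )
```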

## Video Classification Output Data
<a name="sms-vido-classification-output-data"></a>

Once you have created a video classification labeling job, your output data is located in the Amazon S3 bucket specified in the `S3OutputPath` parameter when using the API or in the **Output dataset location** field of the **Job overview** section of the console. 

To learn more about the output manifest file generated by Ground Truth and the file structure that Ground Truth uses to store your output data, see [Labeling job output data](sms-data-output.md). 

To see an example of output manifest files for video classification labeling jobs, see [Classification job output](sms-data-output.md#sms-output-class).
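If you created the job with the API, you can also look up the output location programmatically from the `DescribeLabelingJob` response. The following sketch extracts the `S3OutputPath` field; the live Boto3 call is shown in comments, and a stand-in response dict (not real output) is used for the demonstration.

```python
def output_s3_path(describe_response):
    """Extract the S3 output path from a DescribeLabelingJob response."""
    return describe_response["OutputConfig"]["S3OutputPath"]

# Live call (requires AWS credentials and an existing labeling job):
# import boto3
# client = boto3.client("sagemaker")
# response = client.describe_labeling_job(
#     LabelingJobName="example-video-classification-labeling-job")
# print(output_s3_path(response))

# Stand-in response illustrating only the field used above:
fake_response = {"OutputConfig": {"S3OutputPath": "s3://amzn-s3-demo-bucket/path/output"}}
print(output_s3_path(fake_response))
```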

 

# Video frames
<a name="sms-video-task-types"></a>

You can use Ground Truth built-in video frame task types to have workers annotate video frames using bounding boxes, polylines, polygons, or keypoints. A *video frame* is a single still image extracted from a video; workers annotate sequences of video frames.

If you do not have video frames, you can provide video files (MP4 files) and use the Ground Truth automated frame extraction tool to extract video frames. To learn more, see [Provide Video Files](sms-point-cloud-video-input-data.md#sms-point-cloud-video-frame-extraction).

You can use the following built-in video task types to create video frame labeling jobs using the Amazon SageMaker AI console, API, and language-specific SDKs.
+ **Video frame object detection** – Use this task type when you want workers to identify and locate objects in sequences of video frames. You provide a list of categories, and workers can select one category at a time and annotate objects to which the category applies in all frames. For example, you can use this task to ask workers to identify and localize various objects in a scene, such as cars, bikes, and pedestrians.
+ **Video frame object tracking** – Use this task type when you want workers to track the movement of instances of objects across sequences of video frames. When a worker adds an annotation to a single frame, that annotation is associated with a unique instance ID. The worker adds annotations associated with the same ID in all other frames to identify the same object or person. For example, a worker can track the movement of a vehicle across a sequence of video frames by drawing bounding boxes associated with the same ID around the vehicle in each frame in which it appears. 

Use the following topics to learn more about these built-in task types and how to create a labeling job using each task type. See [Task types](sms-video-overview.md#sms-video-frame-tools) to learn more about the annotation tools (bounding boxes, polylines, polygons, and keypoints) available for these task types.

Before you create a labeling job, we recommend that you review [Video frame labeling job reference](sms-video-overview.md).

**Topics**
+ [Identify objects using video frame object detection](sms-video-object-detection.md)
+ [Track objects in video frames using video frame object tracking](sms-video-object-tracking.md)
+ [Video frame labeling job reference](sms-video-overview.md)

# Identify objects using video frame object detection
<a name="sms-video-object-detection"></a>

You can use the video frame object detection task type to have workers identify and locate objects in a sequence of video frames (images extracted from a video) using bounding box, polyline, polygon, or keypoint *annotation tools*. The tool you choose defines the video frame task type you create. For example, you can use a bounding box video frame object detection task type to ask workers to identify and localize various objects in a series of video frames, such as cars, bikes, and pedestrians. You can create a video frame object detection labeling job using the Amazon SageMaker AI Ground Truth console, the SageMaker API, and language-specific AWS SDKs. To learn more, see [Create a Video Frame Object Detection Labeling Job](#sms-video-od-create-labeling-job) and select your preferred method. See [Task types](sms-video-overview.md#sms-video-frame-tools) to learn more about the annotation tools you can choose from when you create a labeling job.

Ground Truth provides a worker UI and tools to complete your labeling job tasks: [Preview the Worker UI](#sms-video-od-worker-ui).

You can create a job to adjust annotations created in a video object detection labeling job using the video object detection adjustment task type. To learn more, see [Create Video Frame Object Detection Adjustment or Verification Labeling Job](#sms-video-od-adjustment).

## Preview the Worker UI
<a name="sms-video-od-worker-ui"></a>

Ground Truth provides workers with a web user interface (UI) to complete your video frame object detection annotation tasks. You can preview and interact with the worker UI when you create a labeling job in the console. If you are a new user, we recommend that you create a labeling job through the console using a small input dataset to preview the worker UI and ensure your video frames, labels, and label attributes appear as expected. 

The UI provides workers with the following assistive labeling tools to complete your object detection tasks:
+ For all tasks, workers can use the **Copy to next** and **Copy to all** features to copy an annotation to the next frame or to all subsequent frames, respectively. 
+ For tasks that include the bounding box tools, workers can use a **Predict next** feature to draw a bounding box in a single frame, and then have Ground Truth predict the location of boxes with the same label in all other frames. Workers can then make adjustments to correct predicted box locations. 

The following video shows how a worker might use the worker UI with the bounding box tool to complete your object detection tasks.

![\[Gif showing how a worker can use the bounding box tool for their object detection tasks.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/kitti-od-general-labeling-job.gif)


## Create a Video Frame Object Detection Labeling Job
<a name="sms-video-od-create-labeling-job"></a>

You can create a video frame object detection labeling job using the SageMaker AI console or the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) API operation. 

This section assumes that you have reviewed the [Video frame labeling job reference](sms-video-overview.md) and have chosen the type of input data and the input dataset connection you are using. 

### Create a Labeling Job (Console)
<a name="sms-video-od-create-labeling-job-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a video frame object detection job in the SageMaker AI console. In step 10, choose **Video - Object detection** from the **Task category** dropdown list. Select the task type you want by selecting one of the cards in **Task selection**.

![\[Gif showing how to create a video frame object detection job in the SageMaker AI console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/task-type-vod.gif)


### Create a Labeling Job (API)
<a name="sms-video-od-create-labeling-job-api"></a>

You create an object detection labeling job using the SageMaker API operation `CreateLabelingJob`. This operation is available in all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). 

[Create a Labeling Job (API)](sms-create-labeling-job-api.md) provides an overview of the `CreateLabelingJob` operation. Follow these instructions and do the following while you configure your request: 
+ You must enter an ARN for `HumanTaskUiArn`. Use `arn:aws:sagemaker:<region>:394669845002:human-task-ui/VideoObjectDetection`. Replace `<region>` with the AWS Region in which you are creating the labeling job. 

  Do not include an entry for the `UiTemplateS3Uri` parameter. 
+ Your [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName) must end in `-ref`. For example, `video-od-labels-ref`. 
+ Your input manifest file must be a video frame sequence manifest file. You can create this manifest file using the SageMaker AI console, or create it manually and upload it to Amazon S3. For more information, see [Input Data Setup](sms-video-data-setup.md). 
+ You can only use private or vendor work teams to create video frame object detection labeling jobs. 
+ You specify your labels, label category and frame attributes, the task type, and worker instructions in a label category configuration file. Specify the task type (bounding boxes, polylines, polygons, or keypoints) using `annotationType` in your label category configuration file. See [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md) to learn how to create this file. 
+ You need to provide pre-defined ARNs for the pre-annotation and post-annotation (ACS) Lambda functions. These ARNs are specific to the AWS Region you use to create your labeling job. 
  + To find the pre-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn). Use the Region in which you are creating your labeling job to find the correct ARN that ends with `PRE-VideoObjectDetection`. 
  + To find the post-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). Use the Region in which you are creating your labeling job to find the correct ARN that ends with `ACS-VideoObjectDetection`. 
+ The number of workers specified in `NumberOfHumanWorkersPerDataObject` must be `1`. 
+ Automated data labeling is not supported for video frame labeling jobs. Do not specify values for parameters in `[LabelingJobAlgorithmsConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelingJobAlgorithmsConfig)`. 
+ Video frame object detection labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs in `TaskTimeLimitInSeconds` (up to 7 days, or 604,800 seconds). 
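To make the label category configuration file from the list above more concrete, the following sketch writes a minimal file. The exact schema, including the supported `annotationType` values, is defined in [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md); treat the field names and the `"BoundingBox"` value here as assumptions to verify against that page.

```python
import json

# Illustrative only: confirm field names against the label category
# configuration file reference. "BoundingBox" stands in for one of the
# supported annotation types (bounding boxes, polylines, polygons, keypoints).
label_category_config = {
    "annotationType": "BoundingBox",
    "labels": [{"label": "Car"}, {"label": "Pedestrian"}, {"label": "Bike"}],
    "instructions": {
        "shortInstruction": "Draw a box around each object.",
        "fullInstruction": "Draw a tight box around every car, pedestrian, and bike in each frame.",
    },
}

# Write the file locally; upload it to Amazon S3 and pass its URI
# as LabelCategoryConfigS3Uri in CreateLabelingJob.
with open("label-categories.json", "w") as f:
    json.dump(label_category_config, f, indent=2)
```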

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. 

```
response = client.create_labeling_job(
    LabelingJobName='example-video-od-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://amzn-s3-demo-bucket/path/video-frame-sequence-input-manifest.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
                'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://amzn-s3-demo-bucket/prefix/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/prefix/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:us-east-1:*:workteam/private-crowd/*',
        'UiConfig': {
            'HumanTaskUiArn': 'arn:aws:sagemaker:us-east-1:394669845002:human-task-ui/VideoObjectDetection'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-VideoObjectDetection',
        'TaskKeywords': [
            'Video Frame Object Detection',
        ],
        'TaskTitle': 'Video frame object detection task',
        'TaskDescription': 'Classify and identify the location of objects and people in video frames',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-VideoObjectDetection'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

## Create Video Frame Object Detection Adjustment or Verification Labeling Job
<a name="sms-video-od-adjustment"></a>

You can create an adjustment and verification labeling job using the Ground Truth console or `CreateLabelingJob` API. To learn more about adjustment and verification labeling jobs, and to learn how to create one, see [Label verification and adjustment](sms-verification-data.md).

## Output Data Format
<a name="sms-video-od-output-data"></a>

When you create a video frame object detection labeling job, tasks are sent to workers. When these workers complete their tasks, labels are written to the Amazon S3 output location you specified when you created the labeling job. To learn about the video frame object detection output data format, see [Video frame object detection output](sms-data-output.md#sms-output-video-object-detection). If you are a new user of Ground Truth, see [Labeling job output data](sms-data-output.md) to learn more about the Ground Truth output data format. 

# Track objects in video frames using video frame object tracking
<a name="sms-video-object-tracking"></a>

You can use the video frame object tracking task type to have workers track the movement of objects in a sequence of video frames (images extracted from a video) using bounding box, polyline, polygon, or keypoint *annotation tools*. The tool you choose defines the video frame task type you create. For example, you can use a bounding box video frame object tracking task type to ask workers to track the movement of objects, such as cars, bikes, and pedestrians, by drawing boxes around them. 

You provide a list of categories, and each annotation that a worker adds to a video frame is identified as an *instance* of that category using an instance ID. For example, if you provide the label category car, the first car that a worker annotates will have the instance ID car:1. The second car the worker annotates will have the instance ID car:2. To track an object's movement, the worker adds annotations associated with the same instance ID around the object in all frames. 

You can create a video frame object tracking labeling job using the Amazon SageMaker AI Ground Truth console, the SageMaker API, and language-specific AWS SDKs. To learn more, see [Create a Video Frame Object Tracking Labeling Job](#sms-video-ot-create-labeling-job) and select your preferred method. See [Task types](sms-video-overview.md#sms-video-frame-tools) to learn more about the annotation tools you can choose from when you create a labeling job.

Ground Truth provides a worker UI and tools to complete your labeling job tasks: [Preview the Worker UI](#sms-video-ot-worker-ui).

You can create a job to adjust annotations created in a video object tracking labeling job using the video object tracking adjustment task type. To learn more, see [Create a Video Frame Object Tracking Adjustment or Verification Labeling Job](#sms-video-ot-adjustment).

## Preview the Worker UI
<a name="sms-video-ot-worker-ui"></a>

Ground Truth provides workers with a web user interface (UI) to complete your video frame object tracking annotation tasks. You can preview and interact with the worker UI when you create a labeling job in the console. If you are a new user, we recommend that you create a labeling job through the console using a small input dataset to preview the worker UI and ensure your video frames, labels, and label attributes appear as expected. 

The UI provides workers with the following assistive labeling tools to complete your object tracking tasks:
+ For all tasks, workers can use the **Copy to next** and **Copy to all** features to copy an annotation with the same unique ID to the next frame or to all subsequent frames, respectively. 
+ For tasks that include the bounding box tools, workers can use a **Predict next** feature to draw a bounding box in a single frame, and then have Ground Truth predict the location of boxes with the same unique ID in all other frames. Workers can then make adjustments to correct predicted box locations. 

The following video shows how a worker might use the worker UI with the bounding box tool to complete your object tracking tasks.

![\[Gif showing how a worker can use the bounding box tool with the predict next feature.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/ot_predict_next.gif)


## Create a Video Frame Object Tracking Labeling Job
<a name="sms-video-ot-create-labeling-job"></a>

You can create a video frame object tracking labeling job using the SageMaker AI console or the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) API operation. 

This section assumes that you have reviewed the [Video frame labeling job reference](sms-video-overview.md) and have chosen the type of input data and the input dataset connection you are using. 

### Create a Labeling Job (Console)
<a name="sms-video-ot-create-labeling-job-console"></a>

You can follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to learn how to create a video frame object tracking job in the SageMaker AI console. In step 10, choose **Video - Object tracking** from the **Task category** dropdown list. Select the task type you want by selecting one of the cards in **Task selection**.

![\[Gif showing how to create a video frame object tracking job in the SageMaker AI console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/task-type-vot.gif)


### Create a Labeling Job (API)
<a name="sms-video-ot-create-labeling-job-api"></a>

You create an object tracking labeling job using the SageMaker API operation `CreateLabelingJob`. This operation is available in all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). 

[Create a Labeling Job (API)](sms-create-labeling-job-api.md) provides an overview of the `CreateLabelingJob` operation. Follow these instructions and do the following while you configure your request: 
+ You must enter an ARN for `HumanTaskUiArn`. Use `arn:aws:sagemaker:<region>:394669845002:human-task-ui/VideoObjectTracking`. Replace `<region>` with the AWS Region in which you are creating the labeling job. 

  Do not include an entry for the `UiTemplateS3Uri` parameter. 
+ Your [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName) must end in `-ref`. For example, `ot-labels-ref`. 
+ Your input manifest file must be a video frame sequence manifest file. You can create this manifest file using the SageMaker AI console, or create it manually and upload it to Amazon S3. For more information, see [Input Data Setup](sms-video-data-setup.md). If you create a streaming labeling job, the input manifest file is optional. 
+ You can only use private or vendor work teams to create video frame object tracking labeling jobs.
+ You specify your labels, label category and frame attributes, the task type, and worker instructions in a label category configuration file. Specify the task type (bounding boxes, polylines, polygons, or keypoints) using `annotationType` in your label category configuration file. See [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md) to learn how to create this file. 
+ You need to provide pre-defined ARNs for the pre-annotation and post-annotation (ACS) Lambda functions. These ARNs are specific to the AWS Region you use to create your labeling job. 
  + To find the pre-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn). Use the Region in which you are creating your labeling job to find the correct ARN that ends with `PRE-VideoObjectTracking`. 
  + To find the post-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). Use the Region in which you are creating your labeling job to find the correct ARN that ends with `ACS-VideoObjectTracking`. 
+ The number of workers specified in `NumberOfHumanWorkersPerDataObject` must be `1`. 
+ Automated data labeling is not supported for video frame labeling jobs. Do not specify values for parameters in `[LabelingJobAlgorithmsConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelingJobAlgorithmsConfig)`. 
+ Video frame object tracking labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs in `TaskTimeLimitInSeconds` (up to 7 days, or 604,800 seconds). 
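As noted in the list above, the `HumanTaskUiArn` is identical in every Region except for the Region name, so you can assemble it with a simple substitution. The following sketch builds that ARN; the pre-annotation and post-annotation Lambda ARNs are not constructed this way, because their account IDs vary by Region (use the linked references to find those).

```python
def human_task_ui_arn(region, task_type="VideoObjectTracking"):
    """Build the HumanTaskUiArn for a built-in video frame labeling job UI.
    The 394669845002 account is the fixed account given for these built-in UIs."""
    return f"arn:aws:sagemaker:{region}:394669845002:human-task-ui/{task_type}"

print(human_task_ui_arn("us-east-1"))
# arn:aws:sagemaker:us-east-1:394669845002:human-task-ui/VideoObjectTracking
```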

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job in the US East (N. Virginia) Region. 

```
response = client.create_labeling_job(
    LabelingJobName='example-video-ot-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://amzn-s3-demo-bucket/path/video-frame-sequence-input-manifest.json'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
                'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://amzn-s3-demo-bucket/prefix/file-to-store-output-data',
        'KmsKeyId': 'string'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/prefix/label-categories.json',
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:us-east-1:*:workteam/private-crowd/*',
        'UiConfig': {
            'HumanTaskUiArn': 'arn:aws:sagemaker:us-east-1:394669845002:human-task-ui/VideoObjectTracking'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-VideoObjectTracking',
        'TaskKeywords': [
            'Video Frame Object Tracking',
        ],
        'TaskTitle': 'Video frame object tracking task',
        'TaskDescription': 'Track the location of objects and people across video frames',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-VideoObjectTracking'
        },
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

## Create a Video Frame Object Tracking Adjustment or Verification Labeling Job
<a name="sms-video-ot-adjustment"></a>

You can create an adjustment and verification labeling job using the Ground Truth console or `CreateLabelingJob` API. To learn more about adjustment and verification labeling jobs, and to learn how to create one, see [Label verification and adjustment](sms-verification-data.md).

## Output Data Format
<a name="sms-video-ot-output-data"></a>

When you create a video frame object tracking labeling job, tasks are sent to workers. When these workers complete their tasks, labels are written to the Amazon S3 output location you specified when you created the labeling job. To learn about the video frame object tracking output data format, see [Video frame object tracking output](sms-data-output.md#sms-output-video-object-tracking). If you are a new user of Ground Truth, see [Labeling job output data](sms-data-output.md) to learn more about the Ground Truth output data format. 

# Video frame labeling job reference
<a name="sms-video-overview"></a>

Use this page to learn about the object detection and object tracking video frame labeling jobs. The information on this page applies to both of these built-in task types. 

The video frame labeling job is unique because of the following:
+ You can either provide data objects that are ready to be annotated (video frames), or you can provide video files and have Ground Truth automatically extract video frames. 
+ Workers have the ability to save work as they go. 
+ You cannot use the Amazon Mechanical Turk workforce to complete your labeling tasks. 
+ Ground Truth provides a worker UI, as well as assistive and basic labeling tools, to help workers complete your tasks. You do not need to provide a worker task template. 

Use the following topics to learn more about video frame labeling jobs.

**Topics**
+ [Input data](#sms-video-input-overview)
+ [Job completion times](#sms-video-job-completion-times)
+ [Task types](#sms-video-frame-tools)
+ [Workforces](#sms-video-workforces)
+ [Worker user interface (UI)](#sms-video-worker-task-ui)
+ [Video frame job permission requirements](#sms-security-permission-video-frame)

## Input data
<a name="sms-video-input-overview"></a>

The video frame labeling job uses *sequences* of video frames. A single sequence is a series of images that have been extracted from a single video. You can either provide your own sequences of video frames, or have Ground Truth automatically extract video frame sequences from your video files. To learn more, see [Provide Video Files](sms-point-cloud-video-input-data.md#sms-point-cloud-video-frame-extraction).

Ground Truth uses sequence files to identify all images in a single sequence. All of the sequences that you want to include in a single labeling job are identified in an input manifest file. Each sequence is used to create a single worker task. You can automatically create sequence files and an input manifest file using Ground Truth automatic data setup. To learn more, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md). 

To learn how to manually create sequence files and an input manifest file, see [Create a Video Frame Input Manifest File](sms-video-manual-data-setup.md#sms-video-create-manifest). 
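For illustration, a sequence file and a corresponding input manifest line might look like the following sketch. The bucket name, prefix, and file names are placeholders, and the key names shown are illustrative; see the pages linked above for the authoritative schema.

```
{
    "seq-no": 1,
    "prefix": "s3://amzn-s3-demo-bucket/sequences/seq1/",
    "number-of-frames": 3,
    "frames": [
        {"frame-no": 0, "frame": "frame_0000.jpeg"},
        {"frame-no": 1, "frame": "frame_0001.jpeg"},
        {"frame-no": 2, "frame": "frame_0002.jpeg"}
    ]
}
```

Each line of the input manifest file then identifies one sequence file, which becomes one worker task:

```
{"source-ref": "s3://amzn-s3-demo-bucket/sequences/seq1.json"}
```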

## Job completion times
<a name="sms-video-job-completion-times"></a>

Video and video frame labeling jobs can take workers hours to complete. You can set the total amount of time that workers can work on each task when you create a labeling job. The maximum time you can set for workers to work on tasks is 7 days. The default value is 3 days. 

We strongly recommend that you create tasks that workers can complete within 12 hours. Workers must keep the worker UI open while working on a task. They can save work as they go and Ground Truth saves their work every 15 minutes.

When using the SageMaker AI `CreateLabelingJob` API operation, set the total time a task is available to workers in the `TaskTimeLimitInSeconds` parameter of `HumanTaskConfig`.

When you create a labeling job in the console, you can specify this time limit when you select your workforce type and your work team.
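As an example, the following partial `HumanTaskConfig` fragment of a `CreateLabelingJob` request sets the task time limit to the recommended maximum of 12 hours (43,200 seconds). The title, description, and worker count are example values only, and the other required `HumanTaskConfig` fields are omitted for brevity.

```
{
    "TaskTimeLimitInSeconds": 43200,
    "TaskTitle": "Video object tracking",
    "TaskDescription": "Track each vehicle across all frames in the sequence",
    "NumberOfHumanWorkersPerDataObject": 1
}
```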

## Task types
<a name="sms-video-frame-tools"></a>

When you create a video object tracking or video object detection labeling job, you specify the type of annotation that you want workers to create while working on your labeling task. The annotation type determines the type of output data Ground Truth returns and defines the *task type* for your labeling job. 

If you are creating a labeling job using the API operation [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html), you specify the task type using the label category configuration file parameter `annotationType`. To learn more, see [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md).

The following task types are available for both video object tracking and video object detection labeling jobs: 
+ **Bounding box** – Workers are provided with tools to create bounding box annotations. A bounding box is a box that a worker draws around an object to identify the pixel location and label of that object in the frame. 
+ **Polyline** – Workers are provided with tools to create polyline annotations. A polyline is defined by a series of ordered x, y coordinates. Each point added to the polyline is connected to the previous point by a line. The polyline does not have to be closed (the start point and end point do not have to be the same), and there are no restrictions on the angles formed between lines. 
+ **Polygon** – Workers are provided with tools to create polygon annotations. A polygon is a closed shape defined by a series of ordered x, y coordinates. Each point added to the polygon is connected to the previous point by a line, and there are no restrictions on the angles formed between lines. Two lines (sides) of the polygon cannot cross. The start and end point of a polygon must be the same. 
+ **Keypoint** – Workers are provided with tools to create keypoint annotations. A keypoint is a single point associated with an x, y coordinate in the video frame.
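As a sketch, a minimal label category configuration file that selects one of these task types might look like the following. The `annotationType` value corresponds to the task type (for example, `BoundingBox`, `Polyline`, `Polygon`, or `Keypoint`); the labels and instructions are placeholders. See the label category configuration reference linked above for the authoritative schema.

```
{
    "annotationType": "BoundingBox",
    "labels": [
        {"label": "car"},
        {"label": "pedestrian"}
    ],
    "instructions": {
        "shortInstruction": "Draw a box around each object.",
        "fullInstruction": "Draw a tight box around every car and pedestrian in each frame."
    }
}
```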

## Workforces
<a name="sms-video-workforces"></a>

When you create a video frame labeling job, you need to specify a work team to complete your annotation tasks. You can choose a work team from a private workforce of your own workers, or from a vendor workforce that you select in the AWS Marketplace. You cannot use the Amazon Mechanical Turk workforce for video frame labeling jobs. 

To learn more about vendor workforces, see [Subscribe to vendor workforces](sms-workforce-management-vendor.md).

To learn how to create and manage a private workforce, see [Private workforce](sms-workforce-private.md).

## Worker user interface (UI)
<a name="sms-video-worker-task-ui"></a>

Ground Truth provides a worker user interface (UI), tools, and assistive labeling features to help workers complete your video labeling tasks. You can preview the worker UI when you create a labeling job in the console.

When you create a labeling job using the API operation `CreateLabelingJob`, you must provide an ARN provided by Ground Truth in the `HumanTaskUiArn` parameter of [UiConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html#sagemaker-Type-UiConfig-UiTemplateS3Uri) to specify the worker UI for your task type. You can use `HumanTaskUiArn` with the SageMaker AI [RenderUiTemplate](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_RenderUiTemplate.html) API operation to preview the worker UI. 
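For example, the `UiConfig` portion of a `CreateLabelingJob` request might look like the following sketch. The ARN shown is a placeholder pattern only; use the actual ARN that Ground Truth provides for your task type and Region.

```
{
    "UiConfig": {
        "HumanTaskUiArn": "arn:aws:sagemaker:us-east-1:123456789012:human-task-ui/VideoObjectTracking"
    }
}
```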

You provide worker instructions, labels, and optionally, attributes that workers can use to provide more information about labels and video frames. These attributes are referred to as label category attributes and frame attributes respectively. They are all displayed in the worker UI.

### Label category and frame attributes
<a name="sms-video-label-attributes"></a>

When you create a video object tracking or video object detection labeling job, you can add one or more *label category attributes* and *frame attributes*:
+ **Label category attribute** – A list of options (strings), a free-form text box, or a numeric field associated with one or more labels. It is used by workers to provide metadata about a label. 
+ **Frame attribute** – A list of options (strings), a free-form text box, or a numeric field that appears on each video frame a worker is sent to annotate. It is used by workers to provide metadata about video frames. 

Additionally, you can use label and frame attributes to have workers verify labels in a video frame label verification job. 

Use the following sections to learn more about these attributes. To learn how to add label category and frame attributes to a labeling job, use the **Create Labeling Job** sections on the [task type page](sms-video-task-types.md) of your choice.

#### Label category attributes
<a name="sms-video-label-category-attributes"></a>

Add label category attributes to labels to give workers the ability to provide more information about the annotations they create. A label category attribute can be added to an individual label or to all labels. When a label category attribute is applied to all labels, it is referred to as a *global label category attribute*. 

For example, if you add the label category *car*, you might also want to capture additional data about your labeled cars, such as whether they are occluded, or their size. You can capture this metadata using label category attributes. In this example, if you add the attribute *occluded* to the *car* label category, you can assign the values *partial*, *completely*, and *no* to the *occluded* attribute and enable workers to select one of these options. 
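In a label category configuration file, this *occluded* example might be sketched as follows. The attribute schema shown here (`name`, `type`, `enum`) is illustrative; check the label category configuration reference for the exact key names.

```
{
    "labels": [
        {
            "label": "car",
            "categoryAttributes": [
                {
                    "name": "occluded",
                    "type": "string",
                    "enum": ["partial", "completely", "no"]
                }
            ]
        }
    ]
}
```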

When you create a label verification job, you add label category attributes to each label you want workers to verify.

#### Frame level attributes
<a name="sms-video-frame-attributes"></a>

Add frame attributes to give workers the ability to provide more information about individual video frames. Each frame attribute you add appears on all frames. 

For example, you can add a number-frame attribute to have workers identify the number of objects they see in a particular frame. 

In another example, you may want to provide a free-form text box to give workers the ability to provide an answer to a question. 
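As a sketch, the two frame attribute examples above might be declared in a label category configuration file like the following. The key names are illustrative, and the attribute names and descriptions are placeholders; check the label category configuration reference for the exact schema.

```
{
    "frameAttributes": [
        {
            "name": "object-count",
            "description": "How many objects do you see in this frame?",
            "type": "number"
        },
        {
            "name": "frame-notes",
            "description": "Describe any issues with this frame.",
            "type": "string"
        }
    ]
}
```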

When you create a label verification job, you can add one or more frame attributes to ask workers to provide feedback on all labels in a video frame.

### Worker instructions
<a name="sms-video-worker-instructions-general"></a>

You can provide worker instructions to help your workers complete your video frame labeling tasks. You might want to cover the following topics when writing your instructions: 
+ Best practices and things to avoid when annotating objects.
+ The label category attributes provided (for object detection and object tracking tasks) and how to use them.
+ How to save time while labeling by using keyboard shortcuts. 

You can add your worker instructions using the SageMaker AI console while creating a labeling job. If you create a labeling job using the API operation `CreateLabelingJob`, you specify worker instructions in your label category configuration file. 

In addition to your instructions, Ground Truth provides a link to help workers navigate and use the worker portal. View these instructions by selecting the task type on [Worker Instructions](sms-video-worker-instructions.md).

### Declining tasks
<a name="sms-decline-task-video"></a>

Workers are able to decline tasks. 

Workers might decline a task if the instructions are not clear, input data is not displaying correctly, or they encounter some other issue with the task. If the number of workers per dataset object ([NumberOfHumanWorkersPerDataObject](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-NumberOfHumanWorkersPerDataObject)) all decline the task, the data object is marked as expired and is not sent to additional workers.

## Video frame job permission requirements
<a name="sms-security-permission-video-frame"></a>

When you create a video frame labeling job, in addition to the permission requirements found in [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md), you must add a CORS policy to your S3 bucket that contains your input manifest file. 

### CORS permission policy for your S3 bucket
<a name="sms-permissions-add-cors-video-frame"></a>

When you create a video frame labeling job, you specify buckets in S3 where your input data and manifest file are located and where your output data will be stored. These buckets may be the same. You must attach the following Cross-origin resource sharing (CORS) policy to your input and output buckets. If you use the Amazon S3 console to add the policy to your bucket, you must use the JSON format.

**JSON**

```
[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "GET",
            "HEAD",
            "PUT"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": [
            "Access-Control-Allow-Origin"
        ],
        "MaxAgeSeconds": 3000
    }
]
```

**XML**

```
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>HEAD</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <ExposeHeader>Access-Control-Allow-Origin</ExposeHeader>
    <AllowedHeader>*</AllowedHeader>
</CORSRule>
</CORSConfiguration>
```

To learn how to add a CORS policy to an S3 bucket, see [How do I add cross-domain resource sharing with CORS?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-cors-configuration.html) in the *Amazon Simple Storage Service User Guide*.

# Worker Instructions
<a name="sms-video-worker-instructions"></a>

This topic provides an overview of the Ground Truth worker portal and the tools available to complete your video frame labeling task. First, select the type of task you are working on from **Topics**.

**Important**  
It is recommended that you complete your task using a Google Chrome or Firefox web browser. 

For adjustment jobs, select the original labeling job task type that produced the labels you are adjusting. Review and adjust the labels in your task as needed.

**Topics**
+ [Navigate the UI](sms-video-worker-instructions-worker-ui-ot.md)
+ [Bulk Edit Label and Frame Attributes](sms-video-frame-worker-instructions-ot-bulk-edit.md)
+ [Tool Guide](sms-video-worker-instructions-tool-guide.md)
+ [Icons Guide](sms-video-worker-instructions-ot-icons.md)
+ [Shortcuts](sms-video-worker-instructions-ot-hot-keys.md)
+ [Understand Release, Stop and Resume, and Decline Task Options](sms-video-worker-instructions-skip-reject-ot.md)
+ [Saving Your Work and Submitting](sms-video-worker-instructions-saving-work-ot.md)
+ [Video Frame Object Tracking Tasks](sms-video-ot-worker-instructions.md)
+ [Video Frame Object Detection Tasks](sms-video-od-worker-instructions.md)

# Navigate the UI
<a name="sms-video-worker-instructions-worker-ui-ot"></a>

You can navigate between video frames using the navigation bar in the bottom-left corner of your UI. 

Use the play button to automatically move through the entire sequence of frames. 

Use the next frame and previous frame buttons to move forward or back one frame at a time. You can also input a frame number to navigate to that frame. 



The following video demonstrates how to navigate between video frames. 

![\[Gif showing how to navigate between video frames.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/nav_video_ui.gif)


You can zoom in to and out of all video frames. Once you have zoomed into a video frame, you can move around in that frame using the move icon. When you set a new view in a single video frame by zooming and moving within that frame, all video frames are set to the same view. You can reset all video frames to their original view using the fit screen icon. For additional view options, see [Icons Guide](sms-video-worker-instructions-ot-icons.md). 

When you are in the worker UI, you see the following menus:
+ **Instructions** – Review these instructions before starting your task. Additionally, select **More instructions** to review any detailed instructions provided for your task. 
+ **Shortcuts** – Use this menu to view keyboard shortcuts that you can use to navigate video frames and use the tools provided. 
+ **Help** – Use this option to refer back to this documentation. 

# Bulk Edit Label and Frame Attributes
<a name="sms-video-frame-worker-instructions-ot-bulk-edit"></a>

You can bulk edit label attributes and frame attributes (attributes). 

When you bulk edit an attribute, you specify one or more ranges of frames that you want to apply the edit to. The attribute you select is edited in all frames in that range, including the start and end frames you specify. When you bulk edit label attributes, the range you specify *must* contain the label that the label attribute is attached to. If you specify frames that do not contain this label, you will receive an error. 

To bulk edit an attribute you *must* specify the desired value for the attribute first. For example, if you want to change an attribute from *Yes* to *No*, you must select *No*, and then perform the bulk edit. 

You can also specify a new value for an attribute that has not been filled in and then use the bulk edit feature to fill in that value in multiple frames. To do this, select the desired value for the attribute and complete the following procedure. 

**To bulk edit a label or attribute:**

1. Use your mouse to right-click the attribute you want to bulk edit.

1. Specify the range of frames you want to apply the bulk edit to using a dash (`-`) in the text box. For example, if you want to apply the edit to frames one through ten, enter `1-10`. If you want to apply the edit to frames two through five, eight through ten, and twenty, enter `2-5,8-10,20`.

1. Select **Confirm**.

If you get an error message, verify that you entered a valid range and that the label associated with the label attribute you are editing (if applicable) exists in all frames specified.

You can quickly add a label to all previous or subsequent frames using the **Duplicate to previous frames** and **Duplicate to next frames** options in the **Label** menu at the top of your screen. 

# Tool Guide
<a name="sms-video-worker-instructions-tool-guide"></a>

Your task will include one or more tools. The tool provided dictates the type of annotations you will create to identify and track objects. Use the following table to learn more about each tool provided. 


****  

| Tool | Icon | Action | Description | 
| --- | --- | --- | --- | 
|  Bounding box  |  ![\[The Bounding box icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Bounding%20Box.png)  |  Add a bounding box annotation.  |  Choose this icon to add a bounding box. Each bounding box you add is associated with the category you choose from the Label category drop down menu. Select the bounding box or its associated label to adjust it.   | 
| Predict next |  ![\[The Predict next icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/PredictNext.png)  |  Predict bounding boxes in the next frame.  |  Select a bounding box, and then choose this icon to predict the location of that box in the next frame. You can select the icon multiple times in a row to automatically predict the location of the box in multiple frames. For example, select this icon 5 times to predict the location of a bounding box in the next 5 frames.   | 
|  Keypoint  |  ![\[The Keypoint icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Keypoint.png)  |  Add a keypoint annotation.  |  Choose this icon to add a keypoint. Click on an object in the image to place the keypoint at that location.  Each keypoint you add is associated with the category you choose from the Label category drop down menu. Select a keypoint or its associated label to adjust it.   | 
|  Polyline  |  ![\[The Polyline icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/polyline.png)  |  Add a polyline annotation.  |  Choose this icon to add a polyline. To add a polyline, continuously click around the object of interest to add new points. To stop drawing a polyline, select the last point that you placed a second time (this point will be green), or press **Enter** on your keyboard.  Each point added to the polyline is connected to the previous point by a line. The polyline does not have to be closed (the start point and end point do not have to be the same) and there are no restrictions on the angles formed between lines.  Each polyline you add is associated with the category you choose from the Label category drop down menu. Select the polyline or its associated label to adjust it.   | 
|  Polygon  |  ![\[The Polygon icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Polygon.png)  |  Add a polygon annotation.  |  Choose this icon to add a polygon. To add a polygon, continuously click around the object of interest to add new points. To stop drawing the polygon, select the start point (this point will be green).  A polygon is a closed shape defined by a series of points that you place. Each point added to the polygon is connected to the previous point by a line and there are no restrictions on the angles formed between lines. The start and end point must be the same.  Each polygon you add is associated with the category you choose from the Label category drop down menu. Select the polygon or its associated label to adjust it.   | 
|  Copy to Next  |  ![\[The Copy to Next icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/copy_to_next.png)  |  Copy annotations to the next frame.   |  If one or more annotations are selected in the current frame, those annotations are copied to the next frame. If no annotations are selected, all annotations in the current frame will be copied to the next frame.   | 
|  Copy to All  |  ![\[The Copy to All icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/copy_to_all.png)  |  Copy annotations to all subsequent frames.  |  If one or more annotations are selected in the current frame, those annotations are copied to all subsequent frames. If no annotations are selected, all annotations in the current frame will be copied to all subsequent frames.   | 

# Icons Guide
<a name="sms-video-worker-instructions-ot-icons"></a>

Use this table to learn about the icons you see in your UI. You can automatically select some of these icons using the keyboard shortcuts found in the **Shortcuts** menu. 


| Icon | Action  | Description | 
| --- | --- | --- | 
|  ![\[The Brightness icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Brightness.png)  |  brightness  |  Choose this icon to adjust the brightness of all video frames.   | 
|  ![\[The Contrast icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Contrast.png)  |  contrast  |  Choose this icon to adjust the contrast of all video frames.   | 
|  ![\[The Zoom in icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Zoom-in.png)  |  zoom in  |  Choose this icon to zoom into all of the video frames.  | 
|  ![\[The Zoom out icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Zoom-out.png)  |  zoom out  |  Choose this icon to zoom out of all of the video frames.   | 
|  ![\[The Move icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Move.png)  |  move screen  |  After you've zoomed into a video frame, choose this icon to move around in that video frame. You can move around the video frame using your mouse by clicking and dragging the frame in the direction you want it to move. This changes the view in all video frames.  | 
|  ![\[The Fit screen icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Fit%20screen.png)  | fit screen |  Reset all video frames to their original position.   | 
|  ![\[The Undo icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Undo.png)  | undo |  Undo an action. You can use this icon to remove a bounding box that you just added, or to undo an adjustment you made to a bounding box.   | 
|  ![\[The Redo icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Redo.png)  | redo | Redo an action that was undone using the undo icon. | 
|  ![\[The Delete labels icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Delete.png)  | delete label | Delete a label. This will delete the bounding box associated with the label in a single frame.  | 
|  ![\[The Show or hide label icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Show_Hide.png)  | show or hide label | Select this icon to show a label that has been hidden. If this icon has a slash through it, select it to hide the label.  | 
|  ![\[The Edit icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/Edit.png)  | edit label | Select this icon to open the Edit instance menu. Use this menu to edit a label category, ID, and to add or edit label attributes.  | 

# Shortcuts
<a name="sms-video-worker-instructions-ot-hot-keys"></a>

The keyboard shortcuts listed in the **Shortcuts** menu can help you quickly select icons, undo and redo annotations, and use tools to add and edit annotations. For example, once you add a bounding box, you can use **P** to quickly predict the location of that box in subsequent frames. 

Before you start your task, it is recommended that you review the **Shortcuts** menu and become acquainted with these commands.

# Understand Release, Stop and Resume, and Decline Task Options
<a name="sms-video-worker-instructions-skip-reject-ot"></a>

When you open the labeling task, three buttons on the top right allow you to decline the task (**Decline task**), release it (**Release task**), and stop and resume it at a later time (**Stop and resume later**). The following list describes what happens when you select one of these options:
+ **Decline task**: You should only decline a task if something is wrong with the task, such as unclear video frame images or an issue with the UI. If you decline a task, you will not be able to return to the task.
+ **Release task**: Use this option to release a task and allow others to work on it. When you release a task, you lose all work done on that task, and other workers on your team can pick it up. If enough workers pick up the task, you may not be able to return to it. When you select this button and then select **Confirm**, you are returned to the worker portal. If the task is still available, its status will be **Available**. If other workers pick it up, it will disappear from your portal. 
+ **Stop and resume later**: You can use the **Stop and resume later** button to stop working and return to the task at a later time. You should use the **Save** button to save your work before you select **Stop and resume later**. When you select this button and then select **Confirm**, you are returned to the worker portal, and the task status is **Stopped**. You can select the same task to resume work on it. 

  Be aware that the person that creates your labeling tasks specifies a time limit by which all tasks must be completed. If you do not return to and complete this task within that time limit, it will expire and your work will not be submitted. Contact your administrator for more information. 

![\[Gif showing the locations of Decline task, Release task, and Stop and resume later in the UI.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/reject-decline-task.gif)


# Saving Your Work and Submitting
<a name="sms-video-worker-instructions-saving-work-ot"></a>

You should periodically save your work using the **Save** button. Ground Truth automatically saves your work every 15 minutes. 

When you open a task, you must complete your work on it before pressing **Submit**. 

# Video Frame Object Tracking Tasks
<a name="sms-video-ot-worker-instructions"></a>

Video frame object tracking tasks require you to track the movement of objects across video frames. A video frame is a still image from a video scene. You can use the worker UI to navigate between video frames and use the tools provided to identify unique objects and track their movement from one frame to the next. Use the following topics to learn how to navigate your worker UI, use the tools provided, and complete your task. 

It is recommended that you complete your task using a Google Chrome or Firefox web browser. 

**Important**  
If you see annotations have already been added to one or more video frames when you open your task, adjust those annotations and add additional annotations as needed. 

**Topics**
+ [Your Task](sms-video-worker-instructions-ot-task.md)

# Your Task
<a name="sms-video-worker-instructions-ot-task"></a>

When you work on a video frame object tracking task, you need to select a category from the **Label category** menu on the right side of your worker portal to start annotating. After you've chosen a category, use the tools provided to annotate the objects that the category applies to. This annotation will be associated with a unique label ID that should only be used for that object. Use this same label ID to create additional annotations for the same object in all of the video frames that it appears in. Refer to [Tool Guide](sms-video-worker-instructions-tool-guide.md) to learn more about the tools provided.

After you've added a label, you may see a downward pointing arrow next to the label in the **Labels** menu. Select this arrow and then select one option for each label attribute you see to provide more information about that label.

You may see frame attributes under the **Labels** menu. These attributes will appear on each frame in your task. Use these attribute prompts to enter additional information about each frame. 

![\[Example frame attribute prompt.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/frame-attributes.png)


After you've added a label, you can quickly add and edit a label category attribute value by using the downward pointing arrow next to the label in the **Labels** menu. If you select the pencil icon next to the label in the **Labels** menu, the **Edit instance** menu will appear. You can edit the label ID, label category, and label category attributes using this menu. 

![\[Gif showing how you can edit the annotation for labels in the frame.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/kitti-ot-general.gif)


To edit an annotation, select the label of the annotation that you want to edit in the **Labels** menu or select the annotation in the frame. When you edit or delete an annotation, the action will only modify the annotation in a single frame. 

If you are working on a task that includes a bounding box tool, use the predict next icon to predict the location of all bounding boxes that you have drawn in a frame in the next frame. If you select a single box and then select the predict next icon, only that box will be predicted in the next frame. If you have not added any boxes to the current frame, you will receive an error. You must add at least one box to the frame before using this feature. 

After you've used the predict next icon, review the location of each box in the next frame and make adjustments to the box location and size if necessary. 

The following graphic demonstrates how to use the predict next tool:

![\[Gif showing how you can adjust the predicted boxes for the next frame.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/kitti-ot-predict-next.gif)


For all other tools, you can use the **Copy to next** and **Copy to all** tools to copy your annotations to the next or all frames respectively. 

# Video Frame Object Detection Tasks
<a name="sms-video-od-worker-instructions"></a>

Video frame object detection tasks require you to classify and identify the location of objects in video frames using annotations. A video frame is a still image from a video scene. You can use the worker UI to navigate between video frames and create annotations to identify objects of interest. Use the following topics to learn how to navigate your worker UI, use the tools provided, and complete your task. 

It is recommended that you complete your task using a Google Chrome or Firefox web browser. 

**Important**  
If you see annotations have already been added to one or more video frames when you open your task, adjust those annotations and add additional annotations as needed. 

**Topics**
+ [Your Task](sms-video-worker-instructions-od-task.md)

# Your Task
<a name="sms-video-worker-instructions-od-task"></a>

When you work on a video frame object detection task, you need to select a category from the **Label category** menu on the right side of your worker portal to start annotating. After you've chosen a category, draw annotations around objects that this category applies to. To learn more about the tools you see in your worker UI, refer to the [Tool Guide](sms-video-worker-instructions-tool-guide.md).

After you've added a label, you may see a downward pointing arrow next to the label in the **Labels** menu. Select this arrow and then select one option for each label attribute you see to provide more information about that label.

![\[Gif showing how a worker can use the bounding box tool for their object detection tasks.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/kitti-od-general-labeling-job.gif)


You may see frame attributes under the **Labels** menu. These attributes will appear on each frame in your task. Use these attribute prompts to enter additional information about each frame. 

![\[Example frame attribute prompt.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/frame-attributes.png)


To edit an annotation, select the label of the annotation that you want to edit in the **Labels** menu or select the annotation in the frame. When you edit or delete an annotation, the action will only modify the annotation in a single frame. 

If you are working on a task that includes a bounding box tool, use the predict next icon to predict the location of all bounding boxes that you have drawn in a frame in the next frame. If you select a single box and then select the predict next icon, only that box will be predicted in the next frame. If you have not added any boxes to the current frame, you will receive an error. You must add at least one box to the frame before using this feature. 

**Note**  
The predict next feature will not overwrite manually created annotations. It will only add annotations. If you use predict next and as a result have more than one bounding box around a single object, delete all but one box. Each object should only be identified with a single box. 

After you've used the predict next icon, review the location of each box in the next frame and make adjustments to the box location and size if necessary. 

The following graphic demonstrates how to use the predict next tool:

![\[Gif showing how a worker can adjust the predicted boxes in the next frame.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/video/kitti-video-od.gif)


For all other tools, you can use the **Copy to next** and **Copy to all** tools to copy your annotations to the next or all frames respectively. 

# Use Ground Truth to Label 3D Point Clouds
<a name="sms-point-cloud"></a>

Create a 3D point cloud labeling job to have workers label objects in 3D point clouds generated from 3D sensors like Light Detection and Ranging (LiDAR) sensors and depth cameras, or generated from 3D reconstruction by stitching images captured by an agent like a drone. 

## 3D Point Clouds
<a name="sms-point-cloud-define"></a>

Point clouds are made up of three-dimensional (3D) visual data that consists of points. Each point is described using three coordinates, typically `x`, `y`, and `z`. To add color or variations in point intensity to the point cloud, points may be described with additional attributes, such as `i` for intensity or values for the red (`r`), green (`g`), and blue (`b`) 8-bit color channels. When you create a Ground Truth 3D point cloud labeling job, you can provide point cloud and, optionally, sensor fusion data. 
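
Conceptually, such a point cloud can be modeled as a numeric array with one row per point. The following sketch is illustrative only, not one of Ground Truth's accepted raw input formats, and shows points carrying `x`, `y`, `z`, `i`, `r`, `g`, and `b` values:

```python
import numpy as np

# Illustrative only: a tiny point cloud as an N x 7 array, where each row
# holds x, y, z coordinates, an intensity i, and 8-bit r, g, b color values.
points = np.array([
    [1.2, -0.5, 0.3, 0.8, 120, 110, 95],
    [1.3, -0.4, 0.3, 0.6, 118, 112, 97],
    [0.9,  0.1, 0.2, 0.9, 200, 198, 190],
])

xyz = points[:, :3]        # 3D coordinates of each point
intensity = points[:, 3]   # per-point intensity
rgb = points[:, 4:7]       # per-point color channels

print(xyz.shape)  # (3, 3)
```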

The following image shows a single, 3D point cloud scene rendered by Ground Truth and displayed in the semantic segmentation worker UI.

![\[Gif showing how workers can use the 3D point cloud and 2D image together to paint objects.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/semantic_seg/ss_paint_sf.gif)


### LiDAR
<a name="sms-point-cloud-data-types-lidar"></a>

A Light Detection and Ranging (LiDAR) sensor is a common type of sensor used to collect measurements that are used to generate point cloud data. LiDAR is a remote sensing method that uses light in the form of a pulsed laser to measure the distances of objects from the sensor. You can provide 3D point cloud data generated from a LiDAR sensor for a Ground Truth 3D point cloud labeling job using the raw data formats described in [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md).

### Sensor Fusion
<a name="sms-point-cloud-data-types-sensor"></a>

Ground Truth 3D point cloud labeling jobs include a sensor fusion feature that supports video camera sensor fusion for all task types. Some sensors come with multiple LiDAR devices and video cameras that capture images and associate them with a LiDAR frame. To help annotators visually complete your tasks with high confidence, you can use the Ground Truth sensor fusion feature to project annotations (labels) from a 3D point cloud to 2D camera images, and vice versa, using the 3D scanner (such as LiDAR) extrinsic matrix and the camera extrinsic and intrinsic matrices. To learn more, see [Sensor Fusion](sms-point-cloud-sensor-fusion-details.md#sms-point-cloud-sensor-fusion).

## Label 3D Point Clouds
<a name="sms-point-cloud-annotation-define"></a>

Ground Truth provides a user interface (UI) and tools that workers use to label or *annotate* 3D point clouds. When you use the object detection or semantic segmentation task types, workers can annotate a single point cloud frame. When you use object tracking, workers annotate a sequence of frames. You can use object tracking to track object movement across all frames in a sequence. 

The following demonstrates how a worker would use the Ground Truth worker portal and tools to annotate a 3D point cloud for an object detection task. For similar visual examples of other task types, see [3D Point Cloud Task types](sms-point-cloud-task-types.md).

![\[Gif showing how a worker can annotate a 3D point cloud in the Ground Truth worker portal.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_detection/ot_basic_tools.gif)


### Assistive Labeling Tools for Point Cloud Annotation
<a name="sms-point-cloud-assistive-labeling-tools"></a>

Ground Truth offers assistive labeling tools to help workers complete your point cloud annotation tasks faster and with more accuracy. For details about assistive labeling tools that are included in the worker UI for each task type, [select a task type](sms-point-cloud-task-types.md) and refer to the **View the Worker Task Interface** section of that page.

## Next Steps
<a name="sms-point-cloud-next-steps-getting-started"></a>

You can create six types of tasks when you use Ground Truth 3D point cloud labeling jobs. Use the topics in [3D Point Cloud Task types](sms-point-cloud-task-types.md) to learn more about these *task types* and to learn how to create a labeling job using the task type of your choice. 

3D point cloud labeling jobs are different from other Ground Truth labeling modalities. Before creating a labeling job, we recommend that you read [3D point cloud labeling jobs overview](sms-point-cloud-general-information.md). Additionally, review input data quotas in [3D Point Cloud and Video Frame Labeling Job Quotas](input-data-limits.md#sms-input-data-quotas-other).

**Important**  
If you use a notebook instance created before June 5th, 2020 to run this notebook, you must stop and restart that notebook instance for the notebook to work. 

**Topics**
+ [3D Point Clouds](#sms-point-cloud-define)
+ [Label 3D Point Clouds](#sms-point-cloud-annotation-define)
+ [Next Steps](#sms-point-cloud-next-steps-getting-started)
+ [3D Point Cloud Task types](sms-point-cloud-task-types.md)
+ [3D point cloud labeling jobs overview](sms-point-cloud-general-information.md)
+ [Worker instructions](sms-point-cloud-worker-instructions.md)

# 3D Point Cloud Task types
<a name="sms-point-cloud-task-types"></a>

You can use the Ground Truth 3D point cloud labeling modality for a variety of use cases. The following list briefly describes each 3D point cloud task type. For additional details and instructions on how to create a labeling job using a specific task type, select the task type name to see its task type page. 
+ [3D point cloud object detection](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-object-detection.html) – Use this task type when you want workers to locate and classify objects in a 3D point cloud by adding and fitting 3D cuboids around objects. 
+ [3D point cloud object tracking](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-object-tracking.html) – Use this task type when you want workers to add and fit 3D cuboids around objects to track their movement across a sequence of 3D point cloud frames. For example, you can use this task type to ask workers to track the movement of vehicles across multiple point cloud frames.
+ [3D point cloud semantic segmentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-semantic-segmentation.html) – Use this task type when you want workers to create a point-level semantic segmentation mask by painting objects in a 3D point cloud using different colors where each color is assigned to one of the classes you specify. 
+ 3D point cloud adjustment task types – Each of the task types above has an associated *adjustment* task type that you can use to audit and adjust annotations generated from a 3D point cloud labeling job. Refer to the task type page of the associated type to learn how to create an adjustment labeling job for that task. 

# Classify objects in a 3D point cloud with object detection
<a name="sms-point-cloud-object-detection"></a>

Use this task type when you want workers to classify objects in a 3D point cloud by drawing 3D cuboids around objects. For example, you can use this task type to ask workers to identify different types of objects in a point cloud, such as cars, bikes, and pedestrians. The following page gives important information about the labeling job, as well as steps to create one.

For this task type, the *data object* that workers label is a single point cloud frame. Ground Truth renders a 3D point cloud using point cloud data you provide. You can also provide camera data to give workers more visual information about scenes in the frame, and to help workers draw 3D cuboids around objects. 

Ground Truth provides workers with tools to annotate objects with 9 degrees of freedom (x,y,z,rx,ry,rz,l,w,h) in three dimensions in both 3D scene and projected side views (top, side, and back). If you provide sensor fusion information (like camera data), when a worker adds a cuboid to identify an object in the 3D point cloud, the cuboid shows up and can be modified in the 2D images. After a cuboid has been added, all edits made to that cuboid in the 2D or 3D scene are projected into the other view.
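
To make the 9 degrees of freedom concrete, the following sketch models a cuboid as a simple data structure. This is illustrative only and is not Ground Truth's annotation schema:

```python
from dataclasses import dataclass

# Illustrative sketch (not Ground Truth's output format): the 9 degrees of
# freedom of a 3D cuboid -- center position, rotation, and dimensions.
@dataclass
class Cuboid:
    x: float   # center position
    y: float
    z: float
    rx: float  # rotation about the x axis (roll), in radians
    ry: float  # rotation about the y axis (pitch), in radians
    rz: float  # rotation about the z axis (yaw), in radians
    l: float   # length
    w: float   # width
    h: float   # height

    def volume(self) -> float:
        return self.l * self.w * self.h

# A cuboid fitted around a car, rotated about 90 degrees from the x axis.
car = Cuboid(x=10.0, y=2.5, z=0.9, rx=0.0, ry=0.0, rz=1.57,
             l=4.5, w=1.8, h=1.5)
print(car.volume())
```

Workers adjust all nine parameters when fitting a cuboid tightly around an object in the 3D scene and the projected side views.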

You can create a job to adjust annotations created in a 3D point cloud object detection labeling job using the 3D point cloud object detection adjustment task type. 

If you are a new user of the Ground Truth 3D point cloud labeling modality, we recommend you review [3D point cloud labeling jobs overview](sms-point-cloud-general-information.md). This labeling modality is different from other Ground Truth task types, and this page provides an overview of important details you should be aware of when creating a 3D point cloud labeling job.

**Topics**
+ [View the Worker Task Interface](#sms-point-cloud-object-detection-worker-ui)
+ [Create a 3D Point Cloud Object Detection Labeling Job](#sms-point-cloud-object-detection-create-labeling-job)
+ [Create a 3D Point Cloud Object Detection Adjustment or Verification Labeling Job](#sms-point-cloud-object-detection-adjustment-verification)
+ [Output Data Format](#sms-point-cloud-object-detection-output-data)

## View the Worker Task Interface
<a name="sms-point-cloud-object-detection-worker-ui"></a>

Ground Truth provides workers with a web portal and tools to complete your 3D point cloud object detection annotation tasks. When you create the labeling job, you provide the Amazon Resource Name (ARN) for a pre-built Ground Truth worker UI in the `HumanTaskUiArn` parameter. When you create a labeling job using this task type in the console, this worker UI is automatically used. You can preview and interact with the worker UI when you create a labeling job in the console. If you are a new user, it is recommended that you create a labeling job using the console to ensure your label attributes, point cloud frames, and if applicable, images, appear as expected. 

The following is a GIF of the 3D point cloud object detection worker task interface. If you provide camera data for sensor fusion in the world coordinate system, images are matched up with scenes in the point cloud frame. These images appear in the worker portal as shown in the following GIF. 

![\[Gif showing how a worker can annotate a 3D point cloud in the Ground Truth worker portal.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_detection/ot_basic_tools.gif)


Workers can navigate in the 3D scene using their keyboard and mouse. They can:
+ Double-click specific objects in the point cloud to zoom in to them.
+ Use a mouse scroller or trackpad to zoom in and out of the point cloud.
+ Use the keyboard arrow keys, or the Q, E, A, and D keys, to move up, down, left, and right. Use the W and S keys to zoom in and out. 

Once a worker places a cuboid in the 3D scene, a side panel appears with the three projected views: top, side, and back. These views show points in and around the placed cuboid and help workers refine cuboid boundaries in that area. Workers can zoom in and out of each view using their mouse. 

The following video demonstrates movements around the 3D point cloud and in the side-view. 

![\[Gif showing movements around the 3D point cloud and the side-view.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_detection/navigate_od_worker_ui.gif)


Additional view options and features are available in the **View** menu in the worker UI. See the [worker instruction page](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-worker-instructions-object-detection) for a comprehensive overview of the Worker UI. 

**Assistive Labeling Tools**  
Ground Truth helps workers annotate 3D point clouds faster and more accurately using machine learning and computer vision powered assistive labeling tools for 3D point cloud object detection tasks. The following assistive labeling tools are available for this task type:
+ **Snapping** – Workers can add a cuboid around an object and use a keyboard shortcut or menu option to have Ground Truth's autofit tool snap the cuboid tightly around the object. 
+ **Set to ground** – After a worker adds a cuboid to the 3D scene, the worker can automatically snap the cuboid to the ground. For example, the worker can use this feature to snap a cuboid to the road or sidewalk in the scene. 
+ **Multi-view labeling** – After a worker adds a 3D cuboid to the 3D scene, a side panel displays front, side, and top perspectives to help the worker adjust the cuboid tightly around the object. In all of these views, the cuboid includes an arrow that indicates the orientation, or heading of the object. When the worker adjusts the cuboid, the adjustment will appear in real time on all of the views (that is, 3D, top, side, and front). 
+ **Sensor fusion** – If you provide data for sensor fusion, workers can adjust annotations in the 3D scenes and in 2D images, and the annotations will be projected into the other view in real time. Additionally, workers will have the option to view the direction the camera is facing and the camera frustum.
+ **View options** – Enables workers to easily hide or view cuboids, label text, a ground mesh, and additional point attributes like color or intensity. Workers can also choose between perspective and orthogonal projections. 

## Create a 3D Point Cloud Object Detection Labeling Job
<a name="sms-point-cloud-object-detection-create-labeling-job"></a>

You can create a 3D point cloud labeling job using the SageMaker AI console or API operation, [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). To create a labeling job for this task type you need the following: 
+ A single-frame input manifest file. To learn how to create this type of manifest file, see [Create a Point Cloud Frame Input Manifest File](sms-point-cloud-single-frame-input-data.md). If you are a new user of Ground Truth 3D point cloud labeling modalities, you may also want to review [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md). 
+ A work team from a private or vendor workforce. You cannot use Amazon Mechanical Turk for 3D point cloud labeling jobs. To learn how to create workforces and work teams, see [Workforces](sms-workforce-management.md).

Additionally, make sure that you have reviewed and satisfied the requirements in [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md). 

Use one of the following sections to learn how to create a labeling job using the console or an API. 

### Create a Labeling Job (Console)
<a name="sms-point-cloud-object-detection-create-labeling-job-console"></a>

Follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to create a 3D point cloud object detection labeling job in the SageMaker AI console. While you are creating your labeling job, be aware of the following: 
+ Your input manifest file must be a single-frame manifest file. For more information, see [Create a Point Cloud Frame Input Manifest File](sms-point-cloud-single-frame-input-data.md). 
+ Optionally, you can provide label category and frame attributes. Workers can assign one or more of these attributes to annotations to provide more information about that object. For example, you might want to use the attribute *occluded* to have workers identify when an object is partially obstructed.
+ Automated data labeling and annotation consolidation are not supported for 3D point cloud labeling tasks.
+ 3D point cloud object detection labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs when you select your work team (up to 7 days, or 604800 seconds). 

### Create a Labeling Job (API)
<a name="sms-point-cloud-object-detection-create-labeling-job-api"></a>

This section covers details you need to know when you create a labeling job using the SageMaker API operation `CreateLabelingJob`. This operation is available in all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). 

[Create a Labeling Job (API)](sms-create-labeling-job-api.md), provides an overview of the `CreateLabelingJob` operation. Follow these instructions and do the following while you configure your request: 
+ You must enter an ARN for `HumanTaskUiArn`. Use `arn:aws:sagemaker:<region>:394669845002:human-task-ui/PointCloudObjectDetection`. Replace `<region>` with the AWS Region you are creating the labeling job in. 

  There should not be an entry for the `UiTemplateS3Uri` parameter. 
+ Your input manifest file must be a single-frame manifest file. For more information, see [Create a Point Cloud Frame Input Manifest File](sms-point-cloud-single-frame-input-data.md). 
+ You specify your labels, label category and frame attributes, and worker instructions in a label category configuration file. To learn how to create this file, see [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md). 
+ You need to provide pre-defined ARNs for the pre-annotation and post-annotation (ACS) Lambda functions. These ARNs are specific to the AWS Region you use to create your labeling job. 
  + To find the pre-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN. For example, if you are creating your labeling job in us-east-1, the ARN will be `arn:aws:lambda:us-east-1:432418664414:function:PRE-3DPointCloudObjectDetection`. 
  + To find the post-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN. For example, if you are creating your labeling job in us-east-1, the ARN will be `arn:aws:lambda:us-east-1:432418664414:function:ACS-3DPointCloudObjectDetection`. 
+ The number of workers specified in `NumberOfHumanWorkersPerDataObject` must be `1`. 
+ Automated data labeling is not supported for 3D point cloud labeling jobs. You should not specify values for parameters in `[LabelingJobAlgorithmsConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelingJobAlgorithmsConfig)`.
+ 3D point cloud object detection labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs in `TaskTimeLimitInSeconds` (up to 7 days, or 604,800 seconds). 
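
Putting the points above together, the request might look like the following boto3 sketch. The bucket paths, account IDs, IAM role ARN, and work team ARN are placeholders you must replace; the worker UI and Lambda ARNs shown are the us-east-1 values described above.

```python
# Hypothetical sketch of a CreateLabelingJob request for the 3D point cloud
# object detection task type. Replace the placeholder S3 paths, role ARN, and
# work team ARN with your own resources before calling the API.
region = "us-east-1"

params = {
    "LabelingJobName": "point-cloud-object-detection-demo",
    "LabelAttributeName": "od-labels",
    "InputConfig": {
        "DataSource": {
            "S3DataSource": {
                # Placeholder: your single-frame input manifest file
                "ManifestS3Uri": "s3://amzn-s3-demo-bucket/manifest.json"
            }
        }
    },
    # Placeholder: where Ground Truth writes output data
    "OutputConfig": {"S3OutputPath": "s3://amzn-s3-demo-bucket/output/"},
    # Placeholder: an IAM role with Ground Truth permissions
    "RoleArn": "arn:aws:iam::111122223333:role/GroundTruthRole",
    # Placeholder: your label category configuration file
    "LabelCategoryConfigS3Uri": "s3://amzn-s3-demo-bucket/label-categories.json",
    "HumanTaskConfig": {
        # Placeholder: a private or vendor work team ARN
        "WorkteamArn": f"arn:aws:sagemaker:{region}:111122223333:workteam/private-crowd/my-team",
        "UiConfig": {
            # Pre-built worker UI for this task type; do not set UiTemplateS3Uri
            "HumanTaskUiArn": f"arn:aws:sagemaker:{region}:394669845002:human-task-ui/PointCloudObjectDetection"
        },
        "PreHumanTaskLambdaArn": f"arn:aws:lambda:{region}:432418664414:function:PRE-3DPointCloudObjectDetection",
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": f"arn:aws:lambda:{region}:432418664414:function:ACS-3DPointCloudObjectDetection"
        },
        "NumberOfHumanWorkersPerDataObject": 1,  # must be 1 for this task type
        "TaskTimeLimitInSeconds": 604800,        # maximum: 7 days
        "TaskTitle": "3D point cloud object detection",
        "TaskDescription": "Draw 3D cuboids around objects in the point cloud",
    },
}

# import boto3
# sagemaker = boto3.client("sagemaker", region_name=region)
# sagemaker.create_labeling_job(**params)
```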

## Create a 3D Point Cloud Object Detection Adjustment or Verification Labeling Job
<a name="sms-point-cloud-object-detection-adjustment-verification"></a>

You can create an adjustment or verification labeling job using the Ground Truth console or `CreateLabelingJob` API. To learn more about adjustment and verification labeling jobs, and to learn how to create one, see [Label verification and adjustment](sms-verification-data.md).

When you create an adjustment labeling job, your input data to the labeling job can include labels, and yaw, pitch, and roll measurements from a previous labeling job or external source. In the adjustment job, pitch and roll are visualized in the worker UI, but cannot be modified. Yaw is adjustable. 

Ground Truth uses Tait-Bryan angles with the following intrinsic rotations to visualize yaw, pitch, and roll in the worker UI. First, rotation is applied to the vehicle according to the z-axis (yaw). Next, the rotated vehicle is rotated according to the intrinsic y'-axis (pitch). Finally, the vehicle is rotated according to the intrinsic x''-axis (roll). 
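
The rotation order described above can be sketched as follows: yaw is applied first, then pitch about the rotated y axis, then roll about the twice-rotated x axis. For intrinsic rotations in this order, the matrices compose as `Rz @ Ry @ Rx`.

```python
import numpy as np

# Minimal sketch of the intrinsic z-y'-x'' (yaw, pitch, roll) Tait-Bryan
# convention described above. Angles are in radians.
def rotation_matrix(yaw: float, pitch: float, roll: float) -> np.ndarray:
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])    # yaw about z
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])    # pitch about y'
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])    # roll about x''
    return Rz @ Ry @ Rx

# A 90-degree yaw turns the vehicle's forward (x) axis onto the y axis.
R = rotation_matrix(np.pi / 2, 0.0, 0.0)
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 3))  # → [0. 1. 0.]
```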

## Output Data Format
<a name="sms-point-cloud-object-detection-output-data"></a>

When you create a 3D point cloud object detection labeling job, tasks are sent to workers. When these workers complete their tasks, labels are written to the Amazon S3 bucket you specified when you created the labeling job. The output data format determines what you see in your Amazon S3 bucket when your labeling job status ([LabelingJobStatus](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeLabelingJob.html#API_DescribeLabelingJob_ResponseSyntax)) is `Completed`. 

If you are a new user of Ground Truth, see [Labeling job output data](sms-data-output.md) to learn more about the Ground Truth output data format. To learn about the 3D point cloud object detection output data format, see [3D point cloud object detection output](sms-data-output.md#sms-output-point-cloud-object-detection). 

# Understand the 3D point cloud object tracking task type
<a name="sms-point-cloud-object-tracking"></a>

Use this task type when you want workers to add and fit 3D cuboids around objects to track their movement across 3D point cloud frames. For example, you can use this task type to ask workers to track the movement of vehicles across multiple point cloud frames. 

For this task type, the data object that workers label is a sequence of point cloud frames. A *sequence* is defined as a temporal series of point cloud frames. Ground Truth renders a series of 3D point cloud visualizations using a sequence you provide and workers can switch between these 3D point cloud frames in the worker task interface. 

Ground Truth provides workers with tools to annotate objects with 9 degrees of freedom (x,y,z,rx,ry,rz,l,w,h) in three dimensions in both 3D scene and projected side views (top, side, and back). When a worker draws a cuboid around an object, that cuboid is given a unique ID, for example `Car:1` for one car in the sequence and `Car:2` for another. Workers use that ID to label the same object in multiple frames.

You can also provide camera data to give workers more visual information about scenes in the frame, and to help workers draw 3D cuboids around objects. When a worker adds a 3D cuboid to identify an object in either the 2D image or the 3D point cloud, the cuboid shows up in the other view. 

You can adjust annotations created in a 3D point cloud object tracking labeling job using the 3D point cloud object tracking adjustment task type. 

If you are a new user of the Ground Truth 3D point cloud labeling modality, we recommend you review [3D point cloud labeling jobs overview](sms-point-cloud-general-information.md). This labeling modality is different from other Ground Truth task types, and this page provides an overview of important details you should be aware of when creating a 3D point cloud labeling job.

The following topics explain how to create a 3D point cloud object tracking job, show what the worker task interface looks like (what workers see when they work on this task), and provide an overview of the output data you get when workers complete their tasks. The final topic provides useful information for creating object tracking adjustment or verification labeling jobs.

**Topics**
+ [Create a 3D point cloud object tracking labeling job](sms-point-cloud-object-tracking-create-labeling-job.md)
+ [View the worker task interface for a 3D point cloud object tracking task](sms-point-cloud-object-tracking-worker-ui.md)
+ [Output data for a 3D point cloud object tracking labeling job](sms-point-cloud-object-tracking-output-data.md)
+ [Information for creating a 3D point cloud object tracking adjustment or verification labeling job](sms-point-cloud-object-tracking-adjustment-verification.md)

# Create a 3D point cloud object tracking labeling job
<a name="sms-point-cloud-object-tracking-create-labeling-job"></a>

You can create a 3D point cloud labeling job using the SageMaker AI console or API operation, [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). To create a labeling job for this task type you need the following: 
+ A sequence input manifest file. To learn how to create this type of manifest file, see [Create a Point Cloud Sequence Input Manifest](sms-point-cloud-multi-frame-input-data.md). If you are a new user of Ground Truth 3D point cloud labeling modalities, we recommend that you review [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md). 
+ A work team from a private or vendor workforce. You cannot use Amazon Mechanical Turk for 3D point cloud labeling jobs. To learn how to create workforces and work teams, see [Workforces](sms-workforce-management.md).

Additionally, make sure that you have reviewed and satisfied the requirements in [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md). 

To learn how to create a labeling job using the console or an API, see the following sections. 

## Create a labeling job (console)
<a name="sms-point-cloud-object-tracking-create-labeling-job-console"></a>

Follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to create a 3D point cloud object tracking labeling job in the SageMaker AI console. While you are creating your labeling job, be aware of the following: 
+ Your input manifest file must be a sequence manifest file. For more information, see [Create a Point Cloud Sequence Input Manifest](sms-point-cloud-multi-frame-input-data.md). 
+ Optionally, you can provide label category attributes. Workers can assign one or more of these attributes to annotations to provide more information about that object. For example, you might want to use the attribute *occluded* to have workers identify when an object is partially obstructed.
+ Automated data labeling and annotation consolidation are not supported for 3D point cloud labeling tasks. 
+ 3D point cloud object tracking labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs when you select your work team (up to 7 days, or 604800 seconds). 

## Create a labeling job (API)
<a name="sms-point-cloud-object-tracking-create-labeling-job-api"></a>

This section covers details you need to know when you create a labeling job using the SageMaker API operation `CreateLabelingJob`. This operation is available in all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). 

[Create a Labeling Job (API)](sms-create-labeling-job-api.md) provides an overview of the `CreateLabelingJob` operation. Follow these instructions and do the following while you configure your request: 
+ You must enter an ARN for `HumanTaskUiArn`. Use `arn:aws:sagemaker:<region>:394669845002:human-task-ui/PointCloudObjectTracking`. Replace `<region>` with the AWS Region you are creating the labeling job in. 

  There should not be an entry for the `UiTemplateS3Uri` parameter. 
+ Your [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName) must end in `-ref`. For example, `ot-labels-ref`. 
+ Your input manifest file must be a point cloud frame sequence manifest file. For more information, see [Create a Point Cloud Sequence Input Manifest](sms-point-cloud-multi-frame-input-data.md). 
+ You specify your labels, label category and frame attributes, and worker instructions in a label category configuration file. To learn how to create this file, see [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md). 
+ You need to provide pre-defined ARNs for the pre-annotation and post-annotation (ACS) Lambda functions. These ARNs are specific to the AWS Region you use to create your labeling job. 
  + To find the pre-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN that ends with `PRE-3DPointCloudObjectTracking`. 
  + To find the post-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN that ends with `ACS-3DPointCloudObjectTracking`. 
+ The number of workers specified in `NumberOfHumanWorkersPerDataObject` must be `1`. 
+ Automated data labeling is not supported for 3D point cloud labeling jobs. You should not specify values for parameters in `[LabelingJobAlgorithmsConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelingJobAlgorithmsConfig)`. 
+ 3D point cloud object tracking labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs in `TaskTimeLimitInSeconds` (up to 7 days, or 604,800 seconds). 
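The request values listed above can be assembled programmatically. The following sketch builds the `HumanTaskConfig` portion of a `CreateLabelingJob` request for this task type. The work team ARN is a placeholder, and the Lambda ARNs shown are example us-east-1 values following the patterns above; look up the correct ARNs for your own Region in the API reference.

```python
# Sketch: the HumanTaskConfig portion of a CreateLabelingJob request for a
# 3D point cloud object tracking job. The work team ARN is a placeholder, and
# the Lambda ARNs are example us-east-1 values -- confirm the ARNs for your
# own Region in the API reference.
REGION = "us-east-1"
LAMBDA_ACCOUNT = "432418664414"  # us-east-1; this account ID differs by Region

human_task_config = {
    "WorkteamArn": f"arn:aws:sagemaker:{REGION}:111122223333:workteam/private-crowd/my-team",
    "UiConfig": {
        # Pre-built worker UI for this task type; do not set UiTemplateS3Uri.
        "HumanTaskUiArn": f"arn:aws:sagemaker:{REGION}:394669845002:human-task-ui/PointCloudObjectTracking"
    },
    "PreHumanTaskLambdaArn": f"arn:aws:lambda:{REGION}:{LAMBDA_ACCOUNT}:function:PRE-3DPointCloudObjectTracking",
    "AnnotationConsolidationConfig": {
        "AnnotationConsolidationLambdaArn": f"arn:aws:lambda:{REGION}:{LAMBDA_ACCOUNT}:function:ACS-3DPointCloudObjectTracking"
    },
    "TaskTitle": "3D point cloud object tracking",
    "TaskDescription": "Track objects across point cloud frames",
    "NumberOfHumanWorkersPerDataObject": 1,  # should be 1 for this task type
    "TaskTimeLimitInSeconds": 604800,        # maximum: 7 days
}
```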

# View the worker task interface for a 3D point cloud object tracking task
<a name="sms-point-cloud-object-tracking-worker-ui"></a>

Ground Truth provides workers with a web portal and tools to complete your 3D point cloud object tracking annotation tasks. When you create the labeling job, you provide the Amazon Resource Name (ARN) for a pre-built Ground Truth UI in the `HumanTaskUiArn` parameter. When you create a labeling job using this task type in the console, this UI is automatically used. You can preview and interact with the worker UI when you create a labeling job in the console. If you are a new user, we recommend that you create a labeling job using the console to ensure your label attributes, point cloud frames, and if applicable, images, appear as expected. 

The following GIF shows the 3D point cloud object tracking worker task interface and demonstrates how the worker can navigate the point cloud frames in the sequence. The annotating tools are a part of the worker task interface. They are not available in the preview interface. 

![\[Gif showing how the worker can navigate the point cloud frames in the sequence.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_tracking/nav_frames.gif)


Once workers add a single cuboid, that cuboid is replicated in all frames of the sequence with the same ID. When workers adjust the cuboid in another frame, Ground Truth interpolates the movement of that object and adjusts all cuboids between the manually adjusted frames. The following GIF demonstrates this interpolation feature. In the navigation bar on the bottom-left, red areas indicate manually adjusted frames. 

![\[Gif showing how the location of a cuboid is inferred in in-between frames.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_tracking/label-interpolation.gif)


If you provide camera data for sensor fusion, images are matched with scenes in point cloud frames. These images appear in the worker portal alongside the 3D scene. 

Workers can navigate in the 3D scene using their keyboard and mouse. They can:
+ Double click on specific objects in the point cloud to zoom into them.
+ Use a mouse-scroller or trackpad to zoom in and out of the point cloud.
+ Use the keyboard arrow keys, or the Q, E, A, and D keys, to move up, down, left, and right. Use the W and S keys to zoom in and out. 

Once a worker places a cuboid in the 3D scene, a side panel appears with three projected views: top, side, and back. These views show points in and around the placed cuboid and help workers refine cuboid boundaries in that area. Workers can zoom in and out of each of those views using their mouse. 

The following video demonstrates movements around the 3D point cloud and in the side-view. 

![\[Gif showing movements around the 3D point cloud showing a street scene.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_tracking/nav_general_UI.gif)


Additional view options and features are available. See the [worker instruction page](https://docs.aws.amazon.com//sagemaker/latest/dg/sms-point-cloud-worker-instructions-object-tracking.html) for a comprehensive overview of the Worker UI. 

## Worker tools
<a name="sms-point-cloud-object-tracking-worker-tools"></a>

Workers can navigate through the 3D point cloud by zooming in and out, and moving in all directions around the cloud using the mouse and keyboard shortcuts. If workers click on a point in the point cloud, the UI automatically zooms into that area. Workers can use various tools to draw 3D cuboids around objects. For more information, see **Assistive Labeling Tools**. 

After workers have placed a 3D cuboid in the point cloud, they can adjust these cuboids to fit tightly around cars using a variety of views: directly in the 3D cuboid, in a side-view featuring three zoomed-in perspectives of the point cloud around the box, and if you include images for sensor fusion, directly in the 2D image. 

View options enable workers to easily hide or view label text, a ground mesh, and additional point attributes. Workers can also choose between perspective and orthogonal projections. 

**Assistive Labeling Tools**  
Ground Truth helps workers annotate 3D point clouds faster and more accurately with assistive labeling tools powered by UX design, machine learning, and computer vision for 3D point cloud object tracking tasks. The following assistive labeling tools are available for this task type:
+ **Label autofill** – When a worker adds a cuboid to a frame, a cuboid with the same dimensions and orientation is automatically added to all frames in the sequence. 
+ **Label interpolation** – After a worker has labeled a single object in two frames, Ground Truth uses those annotations to interpolate the movement of that object between those two frames. Label interpolation can be turned on and off.
+ **Bulk label and attribute management** – Workers can add, delete, and rename annotations, label category attributes, and frame attributes in bulk. 
  + Workers can manually delete annotations for a given object before or after a frame. For example, a worker can delete all labels for an object after frame 10 if that object is no longer located in the scene after that frame. 
  + If a worker accidentally bulk deletes all annotations for an object, they can add them back. For example, if a worker deletes all annotations for an object before frame 100, they can bulk add them to those frames. 
  + Workers can rename a label in one frame and all 3D cuboids assigned that label are updated with the new name across all frames. 
  + Workers can use bulk editing to add or edit label category attributes and frame attributes in multiple frames.
+ **Snapping** – Workers can add a cuboid around an object and use a keyboard shortcut or menu option to have Ground Truth's autofit tool snap the cuboid tightly around the object's boundaries. 
+ **Fit to ground** – After a worker adds a cuboid to the 3D scene, the worker can automatically snap the cuboid to the ground. For example, the worker can use this feature to snap a cuboid to the road or sidewalk in the scene. 
+ **Multi-view labeling** – After a worker adds a 3D cuboid to the 3D scene, a side panel displays the front and two side perspectives to help the worker adjust the cuboid tightly around the object. Workers can adjust the annotation in either the 3D point cloud or the side panel, and the adjustments appear in the other views in real time. 
+ **Sensor fusion** – If you provide data for sensor fusion, workers can adjust annotations in the 3D scenes and in 2D images, and the annotations will be projected into the other view in real time. 
+ **Auto-merge cuboids** – Workers can automatically merge two cuboids across all frames if they determine that cuboids with different labels actually represent a single object. 
+ **View options** – Enables workers to easily hide or view label text, a ground mesh, and additional point attributes like color or intensity. Workers can also choose between perspective and orthogonal projections. 

# Output data for a 3D point cloud object tracking labeling job
<a name="sms-point-cloud-object-tracking-output-data"></a>

When you create a 3D point cloud object tracking labeling job, tasks are sent to workers. When these workers complete their tasks, their annotations are written to the Amazon S3 bucket you specified when you created the labeling job. The output data format determines what you see in your Amazon S3 bucket when your labeling job status ([LabelingJobStatus](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeLabelingJob.html#API_DescribeLabelingJob_ResponseSyntax)) is `Completed`. 

If you are a new user of Ground Truth, see [Labeling job output data](sms-data-output.md) to learn more about the Ground Truth output data format. To learn about the 3D point cloud object tracking output data format, see [3D point cloud object tracking output](sms-data-output.md#sms-output-point-cloud-object-tracking). 
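From code, you can check the job status and locate the output with the `DescribeLabelingJob` API. The following sketch uses boto3 (the job name is hypothetical); once the status is `Completed`, the output manifest location is reported in `LabelingJobOutput.OutputDatasetS3Uri`.

```python
# Sketch: checking a labeling job's status and retrieving the output dataset
# location once the job is Completed. The job name is hypothetical; boto3 is
# imported only when no client is supplied, so the helper is easy to stub.
def get_output_if_complete(job_name, client=None):
    if client is None:
        import boto3  # assumes AWS credentials are configured
        client = boto3.client("sagemaker")
    resp = client.describe_labeling_job(LabelingJobName=job_name)
    status = resp["LabelingJobStatus"]
    if status != "Completed":
        return status, None  # e.g. InProgress, Failed, or Stopped
    return status, resp["LabelingJobOutput"]["OutputDatasetS3Uri"]
```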

# Information for creating a 3D point cloud object tracking adjustment or verification labeling job
<a name="sms-point-cloud-object-tracking-adjustment-verification"></a>

You can create an adjustment and verification labeling job using the Ground Truth console or `CreateLabelingJob` API. To learn more about adjustment and verification labeling jobs, and to learn how to create one, see [Label verification and adjustment](sms-verification-data.md).

When you create an adjustment labeling job, your input data to the labeling job can include labels, and yaw, pitch, and roll measurements from a previous labeling job or external source. In the adjustment job, pitch and roll are visualized in the worker UI but cannot be modified; yaw is adjustable. 

Ground Truth uses Tait-Bryan angles with the following intrinsic rotations to visualize yaw, pitch and roll in the worker UI. First, rotation is applied to the vehicle according to the z-axis (yaw). Next, the rotated vehicle is rotated according to the intrinsic y'-axis (pitch). Finally, the vehicle is rotated according to the intrinsic x''-axis (roll). 
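As a sketch of that rotation order: an intrinsic z-y'-x'' (yaw, pitch, roll) sequence is equivalent to the extrinsic matrix product Rz(yaw)·Ry(pitch)·Rx(roll), which the following pure-Python snippet composes.

```python
# Sketch: composing the intrinsic z-y'-x'' (yaw, pitch, roll) rotations used
# to visualize cuboid orientation. An intrinsic z-y'-x'' sequence equals the
# extrinsic product Rz(yaw) @ Ry(pitch) @ Rx(roll).
import math

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def orientation(yaw, pitch, roll):
    # Yaw about z, then pitch about the new y', then roll about the new x''.
    return matmul(matmul(rot_z(yaw), rot_y(pitch)), rot_x(roll))
```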

# Understand the 3D point cloud semantic segmentation task type
<a name="sms-point-cloud-semantic-segmentation"></a>

Semantic segmentation involves classifying individual points of a 3D point cloud into pre-specified categories. Use this task type when you want workers to create a point-level semantic segmentation mask for 3D point clouds. For example, if you specify the classes `car`, `pedestrian`, and `bike`, workers select one class at a time and paint all of the points in the point cloud that the class applies to with that class's color. 

For this task type, the data object that workers label is a single point cloud frame. Ground Truth generates a 3D point cloud visualization using point cloud data you provide. You can also provide camera data to give workers more visual information about scenes in the frame, and to help workers paint objects. When a worker paints an object in either the 2D image or the 3D point cloud, the paint shows up in the other view. 

You can also adjust or verify annotations created in a 3D point cloud semantic segmentation labeling job using the 3D point cloud semantic segmentation adjustment or verification task type. To learn more about adjustment and verification labeling jobs, and to learn how to create one, see [Label verification and adjustment](sms-verification-data.md).

If you are a new user of the Ground Truth 3D point cloud labeling modality, we recommend you review [3D point cloud labeling jobs overview](sms-point-cloud-general-information.md). This labeling modality is different from other Ground Truth task types, and this topic provides an overview of important details you should be aware of when creating a 3D point cloud labeling job.

The following topics explain how to create a 3D point cloud semantic segmentation job, show what the worker task interface looks like (what workers see when they work on this task), and provide an overview of the output data you get when workers complete their tasks.

**Topics**
+ [Create a 3D point cloud semantic segmentation labeling job](sms-point-cloud-semantic-segmentation-create-labeling-job.md)
+ [View the worker task interface for a 3D point cloud semantic segmentation job](sms-point-cloud-semantic-segmentation-worker-ui.md)
+ [Output data for a 3D point cloud semantic segmentation job](sms-point-cloud-semantic-segmentation-input-data.md)

# Create a 3D point cloud semantic segmentation labeling job
<a name="sms-point-cloud-semantic-segmentation-create-labeling-job"></a>

You can create a 3D point cloud labeling job using the SageMaker AI console or API operation, [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). To create a labeling job for this task type you need the following: 
+ A single-frame input manifest file. To learn how to create this type of manifest file, see [Create a Point Cloud Frame Input Manifest File](sms-point-cloud-single-frame-input-data.md). If you are a new user of Ground Truth 3D point cloud labeling modalities, we recommend that you review [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md). 
+ A work team from a private or vendor workforce. You cannot use Amazon Mechanical Turk workers for 3D point cloud labeling jobs. To learn how to create workforces and work teams, see [Workforces](sms-workforce-management.md).
+ A label category configuration file. For more information, see [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md). 
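For orientation, each line of a single-frame point cloud input manifest pairs one frame file with optional metadata, as in the following sketch. The bucket, key, and metadata values are placeholders; the full schema (including ego-vehicle pose and sensor-fusion fields) is described in the linked manifest topic.

```json
{"source-ref": "s3://amzn-s3-demo-bucket/frames/frame1.txt", "source-ref-metadata": {"format": "text/xyzi", "unix-timestamp": 1566861644.759115}}
```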

Additionally, make sure that you have reviewed and satisfied the [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md). 

Use one of the following sections to learn how to create a labeling job using the console or an API. 

## Create a labeling job (console)
<a name="sms-point-cloud-semantic-segmentation-console"></a>

Follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to create a 3D point cloud semantic segmentation labeling job in the SageMaker AI console. While you are creating your labeling job, be aware of the following: 
+ Your input manifest file must be a single-frame manifest file. For more information, see [Create a Point Cloud Frame Input Manifest File](sms-point-cloud-single-frame-input-data.md). 
+ Automated data labeling and annotation consolidation are not supported for 3D point cloud labeling tasks. 
+ 3D point cloud semantic segmentation labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs when you select your work team (up to 7 days, or 604,800 seconds). 

## Create a labeling job (API)
<a name="sms-point-cloud-semantic-segmentation-api"></a>

This section covers details you need to know when you create a labeling job using the SageMaker API operation `CreateLabelingJob`. This API defines this operation for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). 

The page, [Create a Labeling Job (API)](sms-create-labeling-job-api.md), provides an overview of the `CreateLabelingJob` operation. Follow these instructions and do the following while you configure your request: 
+ You must enter an ARN for `HumanTaskUiArn`. Use `arn:aws:sagemaker:<region>:394669845002:human-task-ui/PointCloudSemanticSegmentation`. Replace `<region>` with the AWS Region you are creating the labeling job in. 

  There should not be an entry for the `UiTemplateS3Uri` parameter. 
+ Your [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName) must end in `-ref`. For example, `ss-labels-ref`. 
+ Your input manifest file must be a single-frame manifest file. For more information, see [Create a Point Cloud Frame Input Manifest File](sms-point-cloud-single-frame-input-data.md). 
+ You specify your labels and worker instructions in a label category configuration file. See [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md) to learn how to create this file. 
+ You need to provide pre-defined ARNs for the pre-annotation and post-annotation (ACS) Lambda functions. These ARNs are specific to the AWS Region you use to create your labeling job. 
  + To find the pre-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN. For example, if you are creating your labeling job in us-east-1, the ARN will be `arn:aws:lambda:us-east-1:432418664414:function:PRE-3DPointCloudSemanticSegmentation`. 
  + To find the post-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN. For example, if you are creating your labeling job in us-east-1, the ARN will be `arn:aws:lambda:us-east-1:432418664414:function:ACS-3DPointCloudSemanticSegmentation`. 
+ The number of workers specified in `NumberOfHumanWorkersPerDataObject` should be `1`. 
+ Automated data labeling is not supported for 3D point cloud labeling jobs. You should not specify values for parameters in `[LabelingJobAlgorithmsConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelingJobAlgorithmsConfig)`. 
+ 3D point cloud semantic segmentation labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs in `TaskTimeLimitInSeconds` (up to 7 days, or 604,800 seconds). 
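Putting the points above together, the following sketch assembles a `CreateLabelingJob` request for this task type. The bucket names, role ARN, and work team ARN are placeholders; the Lambda ARNs are the us-east-1 examples given above.

```python
# Sketch: a CreateLabelingJob request for 3D point cloud semantic segmentation.
# Bucket names, role ARN, and work team ARN are placeholders; the Lambda ARNs
# are the us-east-1 examples given above. Note there is no
# LabelingJobAlgorithmsConfig (automated labeling is unsupported) and no
# UiTemplateS3Uri (the pre-built UI is referenced by ARN instead).
REGION = "us-east-1"

request = {
    "LabelingJobName": "pc-semantic-segmentation-demo",
    "LabelAttributeName": "ss-labels-ref",  # must end in -ref
    "InputConfig": {
        "DataSource": {
            "S3DataSource": {
                "ManifestS3Uri": "s3://amzn-s3-demo-bucket/manifests/single-frame.manifest"
            }
        }
    },
    "OutputConfig": {"S3OutputPath": "s3://amzn-s3-demo-bucket/output/"},
    "RoleArn": "arn:aws:iam::111122223333:role/GroundTruthExecutionRole",
    "LabelCategoryConfigS3Uri": "s3://amzn-s3-demo-bucket/config/label-categories.json",
    "HumanTaskConfig": {
        "WorkteamArn": f"arn:aws:sagemaker:{REGION}:111122223333:workteam/private-crowd/my-team",
        "UiConfig": {
            "HumanTaskUiArn": f"arn:aws:sagemaker:{REGION}:394669845002:human-task-ui/PointCloudSemanticSegmentation"
        },
        "PreHumanTaskLambdaArn": f"arn:aws:lambda:{REGION}:432418664414:function:PRE-3DPointCloudSemanticSegmentation",
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": f"arn:aws:lambda:{REGION}:432418664414:function:ACS-3DPointCloudSemanticSegmentation"
        },
        "TaskTitle": "3D point cloud semantic segmentation",
        "TaskDescription": "Paint each point in the frame with its class",
        "NumberOfHumanWorkersPerDataObject": 1,
        "TaskTimeLimitInSeconds": 604800,
    },
}
# To submit: boto3.client("sagemaker").create_labeling_job(**request)
```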

# View the worker task interface for a 3D point cloud semantic segmentation job
<a name="sms-point-cloud-semantic-segmentation-worker-ui"></a>

Ground Truth provides workers with a web portal and tools to complete your 3D point cloud semantic segmentation annotation tasks. When you create the labeling job, you provide the Amazon Resource Name (ARN) for a pre-built Ground Truth UI in the `HumanTaskUiArn` parameter. When you create a labeling job using this task type in the console, this UI is automatically used. You can preview and interact with the worker UI when you create a labeling job in the console. If you are a new user, we recommend that you create a labeling job using the console to ensure your label attributes, point cloud frames, and if applicable, images, appear as expected. 

The following GIF shows the 3D point cloud semantic segmentation worker task interface. If you provide camera data for sensor fusion, images are matched with scenes in the point cloud frame and appear in the worker portal as shown. Workers can paint objects in either the 3D point cloud or the 2D image, and the paint appears in the corresponding location in the other medium. 

![\[Gif showing how workers can use the 3D point cloud and 2D image together to paint objects.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/semantic_seg/ss_paint_sf.gif)


Workers can navigate in the 3D scene using their keyboard and mouse. They can:
+ Double click on specific objects in the point cloud to zoom into them.
+ Use a mouse-scroller or trackpad to zoom in and out of the point cloud.
+ Use the keyboard arrow keys, or the Q, E, A, and D keys, to move up, down, left, and right. Use the W and S keys to zoom in and out. 

The following video demonstrates movements around the 3D point cloud. Workers can hide and re-expand all side views and menus. In this GIF, the side-views and menus have been collapsed. 

![\[Gif showing how workers can move around the 3D point cloud.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/semantic_seg/ss_nav_worker_portal.gif)


The following GIF demonstrates how a worker can label multiple objects quickly, refine painted objects using the Unpaint option and then view only points that have been painted. 

![\[Gif showing how a worker can label multiple objects.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/semantic_seg/ss-view-options.gif)


Additional view options and features are available. See the [worker instruction page](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-worker-instructions-semantic-segmentation.html) for a comprehensive overview of the Worker UI. 

**Worker Tools**  
Workers can navigate through the 3D point cloud by zooming in and out, and moving in all directions around the cloud using the mouse and keyboard shortcuts. When you create a semantic segmentation job, workers have the following tools available to them: 
+ A paint brush to paint and unpaint objects. Workers paint objects by selecting a label category and then painting in the 3D point cloud. Workers unpaint objects by selecting the Unpaint option from the label category menu and using the paint brush to erase paint. 
+ A polygon tool that workers can use to select and paint an area in the point cloud. 
+ A background paint tool, which enables workers to paint behind objects they have already annotated without altering the original annotations. For example, workers might use this tool to paint the road after painting all of the cars on the road. 
+ View options that enable workers to easily hide or view label text, a ground mesh, and additional point attributes like color or intensity. Workers can also choose between perspective and orthogonal projections. 

# Output data for a 3D point cloud semantic segmentation job
<a name="sms-point-cloud-semantic-segmentation-input-data"></a>

When you create a 3D point cloud semantic segmentation labeling job, tasks are sent to workers. When these workers complete their tasks, their annotations are written to the Amazon S3 bucket you specified when you created the labeling job. The output data format determines what you see in your Amazon S3 bucket when your labeling job status ([LabelingJobStatus](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeLabelingJob.html#API_DescribeLabelingJob_ResponseSyntax)) is `Completed`. 

If you are a new user of Ground Truth, see [Labeling job output data](sms-data-output.md) to learn more about the Ground Truth output data format. To learn about the 3D point cloud semantic segmentation output data format, see [3D point cloud semantic segmentation output](sms-data-output.md#sms-output-point-cloud-segmentation). 

# Understand the 3D-2D point cloud object tracking task type
<a name="sms-point-cloud-3d-2d-object-tracking"></a>

Use this task type when you want workers to link 3D point cloud annotations with 2D image annotations, and to link 2D image annotations across multiple cameras. Currently, Ground Truth supports cuboids for annotation in a 3D point cloud and bounding boxes for annotation in 2D videos. For example, you can use this task type to ask workers to link the movement of a vehicle in a 3D point cloud with its movement in a 2D video. Using 3D-2D linking, you can correlate point cloud data (like the distance of a cuboid) with video data (a bounding box) for up to 8 cameras.

Ground Truth provides workers with tools to annotate cuboids in a 3D point cloud and bounding boxes in up to 8 cameras using the same annotation UI. Workers can also link bounding boxes for the same object across different cameras. For example, a bounding box in camera1 can be linked to a bounding box in camera2. This lets you correlate an object across multiple cameras using a unique ID. 

**Note**  
Currently, SageMaker AI does not support creating a 3D-2D linking job using the console. To create a 3D-2D linking job using the SageMaker API, see [Create a labeling job (API)](sms-3d-2d-point-cloud-object-tracking-create-labeling-job.md#sms-point-cloud-3d-2d-object-tracking-create-labeling-job-api). 

The following topics explain how to create a 3D-2D point cloud object tracking labeling job, show what the worker task interface looks like (what workers see when they work on this task), and provide an overview of the output data you get when workers complete their tasks.

**Topics**
+ [Create a 3D-2D point cloud object tracking labeling job](sms-3d-2d-point-cloud-object-tracking-create-labeling-job.md)
+ [View the worker task interface for a 3D-2D object tracking labeling job](sms-point-cloud-3d-2d-object-tracking-worker-ui.md)
+ [Output data for a 3D-2D object tracking labeling job](sms-point-cloud-3d-2d-object-tracking-output-data.md)

# Create a 3D-2D point cloud object tracking labeling job
<a name="sms-3d-2d-point-cloud-object-tracking-create-labeling-job"></a>

You can create a 3D-2D point cloud labeling job using the SageMaker API operation, [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). To create a labeling job for this task type you need the following: 
+ A work team from a private or vendor workforce. You cannot use Amazon Mechanical Turk for 3D point cloud labeling jobs. To learn how to create workforces and work teams, see [Workforces](sms-workforce-management.md).
+ Add a CORS policy to an S3 bucket that contains input data in the Amazon S3 console. To set the required CORS headers on the S3 bucket that contains your input images in the S3 console, follow the directions detailed in [CORS Permission Requirement](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-cors-update.html).
+ Additionally, make sure that you have reviewed and satisfied the [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md). 
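As a starting point, a minimal CORS configuration for the input bucket might look like the following sketch; confirm the exact required configuration against the CORS Permission Requirement page linked above.

```json
[
    {
        "AllowedHeaders": [],
        "AllowedMethods": ["GET"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": []
    }
]
```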

To learn how to create a labeling job using the API, see the following sections. 

## Create a labeling job (API)
<a name="sms-point-cloud-3d-2d-object-tracking-create-labeling-job-api"></a>

This section covers details you need to know when you create a 3D-2D object tracking labeling job using the SageMaker API operation `CreateLabelingJob`. This API defines this operation for all AWS SDKs. To see a list of language-specific SDKs supported for this operation, review the **See Also** section of [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). 

[Create a Labeling Job (API)](sms-create-labeling-job-api.md) provides an overview of the `CreateLabelingJob` operation. Follow these instructions and do the following while you configure your request: 
+ You must enter an ARN for `HumanTaskUiArn`. Use `arn:aws:sagemaker:<region>:394669845002:human-task-ui/PointCloudObjectTracking`. Replace `<region>` with the AWS Region you are creating the labeling job in. 

  There should not be an entry for the `UiTemplateS3Uri` parameter. 
+ Your [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName) must end in `-ref`. For example, `ot-labels-ref`. 
+ Your input manifest file must be a point cloud frame sequence manifest file. For more information, see [Create a Point Cloud Sequence Input Manifest](sms-point-cloud-multi-frame-input-data.md). You also need to provide a label category configuration file, described in the following **Input data format** section.
+ You need to provide pre-defined ARNs for the pre-annotation and post-annotation (ACS) Lambda functions. These ARNs are specific to the AWS Region you use to create your labeling job. 
  + To find the pre-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN that ends with `PRE-3DPointCloudObjectTracking`. 
  + To find the post-annotation Lambda ARN, refer to [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). Use the Region you are creating your labeling job in to find the correct ARN that ends with `ACS-3DPointCloudObjectTracking`. 
+ The number of workers specified in `NumberOfHumanWorkersPerDataObject` should be `1`. 
+ Automated data labeling is not supported for 3D point cloud labeling jobs. You should not specify values for parameters in `[LabelingJobAlgorithmsConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelingJobAlgorithmsConfig)`. 
+ 3D-2D object tracking labeling jobs can take multiple hours to complete. You can specify a longer time limit for these labeling jobs in `TaskTimeLimitInSeconds` (up to 7 days, or 604,800 seconds). 

**Note**  
After you have successfully created a 3D-2D object tracking job, it shows up on the console under labeling jobs. The task type for the job is displayed as **Point Cloud Object Tracking**.

## Input data format
<a name="sms-point-cloud-3d-2d-object-tracking-input-data"></a>

You can create a 3D-2D object tracking job using the SageMaker API operation, [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). To create a labeling job for this task type you need the following:
+ A sequence input manifest file. To learn how to create this type of manifest file, see [Create a Point Cloud Sequence Input Manifest](sms-point-cloud-multi-frame-input-data.md). If you are a new user of Ground Truth 3D point cloud labeling modalities, we recommend that you review [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md). 
+ A label category configuration file specifying your labels, label category and frame attributes, and worker instructions. See [Create a Labeling Category Configuration File with Label Category and Frame Attributes](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-label-cat-config-attributes.html) to learn how to create this file. The following is an example of a label category configuration file for a 3D-2D object tracking job.

  ```
  {
      "document-version": "2020-03-01",
      "categoryGlobalAttributes": [
          {
              "name": "Occlusion",
              "description": "global attribute that applies to all label categories",
              "type": "string",
              "enum":[
                  "Partial",
                  "Full"
              ]
          }
      ],
      "labels":[
          {
              "label": "Car",
              "attributes": [
                  {
                      "name": "Type",
                      "type": "string",
                      "enum": [
                          "SUV",
                          "Sedan"
                      ]
                  } 
              ]
          },
          {
              "label": "Bus",
              "attributes": [
                  {
                      "name": "Size",
                      "type": "string",
                      "enum": [
                          "Large",
                          "Medium",
                          "Small"
                      ]
                  }
              ]
          }
      ],
      "instructions": {
          "shortIntroduction": "Draw a tight cuboid around objects after you select a category.",
          "fullIntroduction": "<p>Use this area to add more detailed worker instructions.</p>"
      },
      "annotationType": [
          {
              "type": "BoundingBox"
          },
          {
              "type": "Cuboid"
          }
      ]
  }
  ```
**Note**  
You must provide both `BoundingBox` and `Cuboid` as `annotationType` values in the label category configuration file to create a 3D-2D object tracking job. 
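
The configuration file above can also be assembled programmatically before you upload it to Amazon S3. The following is a minimal sketch, not an official SDK helper; the function name is an assumption for illustration.

```python
# A minimal sketch (not an official SDK helper) that builds a label category
# configuration like the example above as a Python dict, so it can be
# validated before you upload it to Amazon S3.
def build_label_category_config(labels):
    return {
        "document-version": "2020-03-01",
        "labels": [{"label": name} for name in labels],
        "instructions": {
            "shortIntroduction": "Draw a tight cuboid around objects after you select a category.",
            "fullIntroduction": "<p>Use this area to add more detailed worker instructions.</p>",
        },
        # Both annotation types are required for a 3D-2D object tracking job.
        "annotationType": [{"type": "BoundingBox"}, {"type": "Cuboid"}],
    }

config = build_label_category_config(["Car", "Bus"])
assert {a["type"] for a in config["annotationType"]} == {"BoundingBox", "Cuboid"}
```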

# View the worker task interface for a 3D-2D object tracking labeling job
<a name="sms-point-cloud-3d-2d-object-tracking-worker-ui"></a>

Ground Truth provides workers with a web portal and tools to complete your 3D-2D object tracking annotation tasks. When you create a labeling job for this task type using the API, you provide the Amazon Resource Name (ARN) of a pre-built Ground Truth UI in the `HumanTaskUiArn` parameter. You can preview and interact with the worker UI when you create a labeling job through the API. The annotation tools are part of the worker task interface and are not available in the preview interface. The following image demonstrates the worker task interface used for the 3D-2D point cloud object tracking annotation task.

![\[The worker task interface used for the 3D-2D point cloud object tracking annotation task.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms-sensor-fusion.png)


Interpolation is enabled by default. After a worker adds a single cuboid, that cuboid is replicated in all frames of the sequence with the same ID. If the worker adjusts the cuboid in another frame, Ground Truth interpolates the movement of that object and adjusts all cuboids between the manually adjusted frames. Additionally, in the camera view section, a cuboid can be shown with a projection (using the B button to toggle labels in the camera view) that provides the worker with a reference from the camera images. The accuracy of the cuboid-to-image projection depends on the accuracy of the calibrations captured in the extrinsic and intrinsic data.

If you provide camera data for sensor fusion, images are matched up with scenes in point cloud frames. Note that the camera data should be time synchronized with the point cloud data to ensure an accurate depiction of point cloud to imagery over each frame in the sequence as shown in the following image.

![\[The manifest file, the worker portal with point cloud data and the camera data.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/3d_2d_link_ss.png)


The manifest file holds the extrinsic and intrinsic data and the pose to allow the cuboid projection on the camera image to be shown by using the **P button**.

Workers can navigate the 3D scene using their keyboard and mouse. They can:
+ Double click on specific objects in the point cloud to zoom into them.
+ Use a mouse-scroller or trackpad to zoom in and out of the point cloud.
+ Use the keyboard arrow keys, or the Q, E, A, and D keys, to move up, down, left, and right. Use the W and S keys to zoom in and out. 

After a worker places a cuboid in the 3D scene, a side panel appears with three projected views: top, side, and front. These views show points in and around the placed cuboid and help workers refine cuboid boundaries in that area. Workers can zoom in and out of each of those views using their mouse.

To link a cuboid with a 2D bounding box, the worker first selects the cuboid and then draws a corresponding bounding box on any of the camera views. This links the cuboid and the bounding box with a common name and unique ID.

The worker can also draw a bounding box first, select it, and then draw the corresponding cuboid to link them.

Additional view options and features are available. See the [worker instruction page](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-worker-instructions-object-tracking.html) for a comprehensive overview of the Worker UI. 

## Worker tools
<a name="sms-point-cloud-object-tracking-worker-tools"></a>

Workers can navigate through the 3D point cloud by zooming in and out, and moving in all directions around the cloud using the mouse and keyboard shortcuts. If workers click a point in the point cloud, the UI automatically zooms into that area. Workers can use various tools to draw 3D cuboids around objects. For more information, see **Assistive Labeling Tools** in the following discussion. 

After workers have placed a 3D cuboid in the point cloud, they can adjust these cuboids to fit tightly around cars using a variety of views: directly in the 3D point cloud, in a side-view featuring three zoomed-in perspectives of the point cloud around the box, and if you include images for sensor fusion, directly in the 2D image. 

Additional view options enable workers to easily hide or view label text, a ground mesh, and additional point attributes. Workers can also choose between perspective and orthogonal projections. 

**Assistive Labeling Tools**  
Ground Truth helps workers annotate 3D point clouds faster and more accurately using UX, machine learning and computer vision powered assistive labeling tools for 3D point cloud object tracking tasks. The following assistive labeling tools are available for this task type:
+ **Label autofill** – When a worker adds a cuboid to a frame, a cuboid with the same dimensions, orientation and xyz position is automatically added to all frames in the sequence. 
+ **Label interpolation** – After a worker has labeled a single object in two frames, Ground Truth uses those annotations to interpolate the movement of that object between the frames. Label interpolation can be turned on and off; it is on by default. For example, if a worker working with 5 frames adds a cuboid in frame 2, it is copied to all 5 frames. If the worker then makes adjustments in frame 4, frames 2 and 4 act as two points through which a line is fit, and the cuboid is interpolated in frames 1, 3, and 5.
+ **Bulk label and attribute management** – Workers can add, delete, and rename annotations, label category attributes, and frame attributes in bulk.
  + Workers can manually delete annotations for a given object before and after a frame, or in all frames. For example, a worker can delete all labels for an object after frame 10 if that object is no longer located in the scene after that frame. 
  + If a worker accidentally bulk deletes all annotations for an object, they can add them back. For example, if a worker deletes all annotations for an object before frame 100, they can bulk add them back to those frames. 
  + Workers can rename a label in one frame and all 3D cuboids assigned that label are updated with the new name across all frames. 
  + Workers can use bulk editing to add or edit label category attributes and frame attributes in multiple frames.
+ **Snapping** – Workers can add a cuboid around an object and use a keyboard shortcut or menu option to have Ground Truth's autofit tool snap the cuboid tightly around the object's boundaries. 
+ **Fit to ground** – After a worker adds a cuboid to the 3D scene, the worker can automatically snap the cuboid to the ground. For example, the worker can use this feature to snap a cuboid to the road or sidewalk in the scene. 
+ **Multi-view labeling** – After a worker adds a 3D cuboid to the 3D scene, a side panel displays a front and two side perspectives to help the worker adjust the cuboid tightly around the object. Workers can adjust annotations in the 3D point cloud or in the side panel, and the adjustments appear in the other views in real time. 
+ **Sensor fusion** – If you provide data for sensor fusion, workers can adjust annotations in the 3D scenes and in 2D images, and the annotations are projected into the other view in real time. To learn more about the data for sensor fusion, see [Understand Coordinate Systems and Sensor Fusion](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-sensor-fusion-details.html#sms-point-cloud-sensor-fusion).
+ **Auto-merge cuboids** – Workers can automatically merge two cuboids across all frames if they determine that cuboids with different labels actually represent a single object. 
+ **View options** – Enables workers to easily hide or view label text, a ground mesh, and additional point attributes like color or intensity. Workers can also choose between perspective and orthogonal projections. 

# Output data for a 3D-2D object tracking labeling job
<a name="sms-point-cloud-3d-2d-object-tracking-output-data"></a>

When you create a 3D-2D object tracking labeling job, tasks are sent to workers. When these workers complete their tasks, their annotations are written to the Amazon S3 bucket you specified when you created the labeling job. The output data format determines what you see in your Amazon S3 bucket when your labeling job status ([LabelingJobStatus](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeLabelingJob.html#API_DescribeLabelingJob_ResponseSyntax)) is `Completed`. 

If you are a new user of Ground Truth, see [Labeling job output data](sms-data-output.md) to learn more about the Ground Truth output data format. To learn about the 3D-2D point cloud object tracking output data format, see [3D-2D point cloud object tracking output](sms-data-output.md#sms-output-3d-2d-point-cloud-object-tracking). 

# 3D point cloud labeling jobs overview
<a name="sms-point-cloud-general-information"></a>

This topic provides an overview of the unique features of a Ground Truth 3D point cloud labeling job. You can use 3D point cloud labeling jobs to have workers label objects in a 3D point cloud generated from 3D sensors such as LiDAR and depth cameras, or generated through 3D reconstruction by stitching together images captured by an agent like a drone. 

## Job pre-processing time
<a name="sms-point-cloud-job-creation-time"></a>

When you create a 3D point cloud labeling job, you need to provide an [input manifest file](sms-point-cloud-input-data.md). The input manifest file can be:
+ A *frame input manifest file* that has a single point cloud frame on each line. 
+ A *sequence input manifest file* that has a single sequence on each line. A sequence is defined as a temporal series of point cloud frames. 

For both types of manifest files, *job pre-processing time* (that is, the time before Ground Truth starts sending tasks to your workers) depends on the total number and size of point cloud frames you provide in your input manifest file. For frame input manifest files, this is the number of lines in your manifest file. For sequence manifest files, this is the number of frames in each sequence multiplied by the total number of sequences, or lines, in your manifest file. 

Additionally, the number of points per point cloud and the number of fused sensor data objects (like images) factor into job pre-processing times. On average, Ground Truth can pre-process 200 point cloud frames in approximately 5 minutes. If you create a 3D point cloud labeling job with a large number of point cloud frames, you might experience longer job pre-processing times. For example, if you create a sequence input manifest file with 4 point cloud sequences, and each sequence contains 200 point clouds, Ground Truth pre-processes 800 point clouds and so your job pre-processing time might be around 20 minutes. During this time, your labeling job status is `InProgress`. 
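
The arithmetic above can be sketched as a rough estimator. This is only an illustration of the ~200 frames per 5 minutes average quoted above; the function name is an assumption, not part of any AWS API, and actual pre-processing times vary with point counts and fused sensor data.

```python
# Rough pre-processing estimate based on the average of ~200 point cloud
# frames per 5 minutes quoted above. Illustrative only; actual times vary.
def estimated_preprocessing_minutes(num_sequences, frames_per_sequence,
                                    frames_per_5_min=200):
    total_frames = num_sequences * frames_per_sequence
    return 5 * total_frames / frames_per_5_min

# 4 sequences x 200 frames = 800 frames -> about 20 minutes
print(estimated_preprocessing_minutes(4, 200))
```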

While your 3D point cloud labeling job is pre-processing, you receive CloudWatch messages notifying you of the status of your job. To identify these messages, search for `3D_POINT_CLOUD_PROCESSING_STATUS` in your labeling job logs. 

For **frame input manifest files**, your CloudWatch logs will have a message similar to the following:

```
{
    "labeling-job-name": "example-point-cloud-labeling-job",
    "event-name": "3D_POINT_CLOUD_PROCESSING_STATUS",
    "event-log-message": "datasetObjectId from: 0 to 10, status: IN_PROGRESS"
}
```

The event log message, `datasetObjectId from: 0 to 10, status: IN_PROGRESS`, identifies the number of frames from your input manifest that have been processed. You receive a new message each time a frame has been processed. For example, after a single frame has been processed, you receive another message that says `datasetObjectId from: 1 to 10, status: IN_PROGRESS`. 

For **sequence input manifest files**, your CloudWatch logs will have a message similar to the following:

```
{
    "labeling-job-name": "example-point-cloud-labeling-job",
    "event-name": "3D_POINT_CLOUD_PROCESSING_STATUS",
    "event-log-message": "datasetObjectId: 0, status: IN_PROGRESS"
}
```

The event log message, `datasetObjectId: 0, status: IN_PROGRESS`, identifies the sequence from your input manifest that is being processed. You receive a new message each time a sequence has been processed. For example, after a single sequence has been processed, you receive a message that says `datasetObjectId: 1, status: IN_PROGRESS` as the next sequence begins processing. 
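
If you monitor these logs programmatically, the two message formats can be distinguished with a simple parser. This is an illustrative sketch, not an AWS utility; the function name and returned keys are assumptions.

```python
import re

# Illustrative parser (not an AWS utility) for the two
# 3D_POINT_CLOUD_PROCESSING_STATUS event-log-message formats shown above.
def parse_status_message(message):
    # Frame input manifest format: "datasetObjectId from: 0 to 10, status: ..."
    m = re.match(r"datasetObjectId from: (\d+) to (\d+), status: (\w+)", message)
    if m:
        return {"from": int(m.group(1)), "to": int(m.group(2)), "status": m.group(3)}
    # Sequence input manifest format: "datasetObjectId: 0, status: ..."
    m = re.match(r"datasetObjectId: (\d+), status: (\w+)", message)
    if m:
        return {"id": int(m.group(1)), "status": m.group(2)}
    return None

print(parse_status_message("datasetObjectId from: 1 to 10, status: IN_PROGRESS"))
```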

## Job completion times
<a name="sms-point-cloud-job-completion-times"></a>

3D point cloud labeling jobs can take workers hours to complete. You can set the total amount of time that workers can work on each task when you create a labeling job. The maximum time you can set for workers to work on tasks is 7 days. The default value is 3 days. 

It is strongly recommended that you create tasks that workers can complete within 12 hours. Workers must keep the worker UI open while working on a task. They can save work as they go and Ground Truth will save their work every 15 minutes.

When using the SageMaker AI `CreateLabelingJob` API operation, set the total time a task is available to workers in the `TaskTimeLimitInSeconds` parameter of `HumanTaskConfig`. 

When you create a labeling job in the console, you can specify this time limit when you select your workforce type and your work team.
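
The time limit described above can be sketched as a fragment of the `CreateLabelingJob` request. `TaskTimeLimitInSeconds` is the real parameter name in `HumanTaskConfig`; the work team ARN below is a placeholder, and a complete request requires additional parameters not shown here.

```python
# Fragment of a CreateLabelingJob HumanTaskConfig showing where the task time
# limit is set. The work team ARN is a placeholder; a full request needs more
# parameters than shown here.
SECONDS_PER_DAY = 24 * 60 * 60

human_task_config = {
    "WorkteamArn": "arn:aws:sagemaker:us-east-1:111122223333:workteam/private-crowd/example-team",
    "TaskTimeLimitInSeconds": 3 * SECONDS_PER_DAY,  # default is 3 days; maximum is 7 days (604,800 seconds)
}

assert human_task_config["TaskTimeLimitInSeconds"] <= 7 * SECONDS_PER_DAY
```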

## Workforces
<a name="sms-point-cloud-workforces"></a>

When you create a 3D point cloud labeling job, you need to specify a work team that will complete your point cloud annotation tasks. You can choose a work team from a private workforce of your own workers, or from a vendor workforce that you select in the AWS Marketplace. You cannot use the Amazon Mechanical Turk workforce for 3D point cloud labeling jobs. 

To learn more about vendor workforce, see [Subscribe to vendor workforces](sms-workforce-management-vendor.md).

To learn how to create and manage a private workforce, see [Private workforce](sms-workforce-private.md).

## Worker user interface (UI)
<a name="sms-point-cloud-worker-task-ui"></a>

Ground Truth provides a worker user interface (UI), tools, and assistive labeling features to help workers complete your 3D point cloud labeling tasks. 

You can preview the worker UI when you create a labeling job in the console.

When you create a labeling job using the API operation `CreateLabelingJob`, you must provide an ARN provided by Ground Truth in the parameter [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html#sagemaker-Type-UiConfig-UiTemplateS3Uri](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html#sagemaker-Type-UiConfig-UiTemplateS3Uri) to specify the worker UI for your task type. You can use `HumanTaskUiArn` with the SageMaker AI [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_RenderUiTemplate.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_RenderUiTemplate.html) API operation to preview the worker UI. 
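
As a sketch, the `UiConfig` fragment might look like the following. The ARN shown is illustrative only; find the exact `HumanTaskUiArn` for your task type and Region on the task type page.

```python
# Sketch of the UiConfig fragment that selects a pre-built 3D point cloud
# worker UI. The ARN below is illustrative; look up the exact HumanTaskUiArn
# for your task type and Region on the task type page.
ui_config = {
    "HumanTaskUiArn": "arn:aws:sagemaker:us-east-1:123456789012:human-task-ui/PointCloudObjectTracking"
}
# To preview the worker UI, pass this ARN to the SageMaker RenderUiTemplate
# API operation (for example, the boto3 SageMaker client's
# render_ui_template method).
```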

You provide worker instructions, labels, and optionally, label category attributes that are displayed in the worker UI.

### Label category attributes
<a name="sms-point-cloud-label-and-frame-attributes"></a>

When you create a 3D point cloud object tracking or object detection labeling job, you can add one or more *label category attributes*. You can add *frame attributes* to all 3D point cloud task types: 
+ **Label category attribute** – A list of options (strings), a free form text box, or a numeric field associated with one or more labels. It is used by workers to provide metadata about a label. 
+ **Frame attribute** – A list of options (strings), a free form text box, or a numeric field that appears on each point cloud frame a worker is sent to annotate. It is used by workers to provide metadata about frames. 

Additionally, you can use label and frame attributes to have workers verify labels in a 3D point cloud label verification job. 

Use the following sections to learn more about these attributes. To learn how to add label category and frame attributes to a labeling job, use the **Create Labeling Job** section on the [task type page](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-task-types) of your choice.

#### Label category attributes
<a name="sms-point-cloud-label-attributes"></a>

Add label category attributes to labels to give workers the ability to provide more information about the annotations they create. A label category attribute is added to an individual label, or to all labels. When a label category attribute is applied to all labels it is referred to as a *global label category attribute*. 

For example, if you add the label category *car*, you might also want to capture additional data about your labeled cars, such as whether they are occluded and what size they are. You can capture this metadata using label category attributes. In this example, if you add the attribute *occluded* to the *car* label category, you can assign the values *partial*, *completely*, and *no* to the *occluded* attribute, and workers can select one of these options. 

When you create a label verification job, you add label category attributes to each label that you want workers to verify.
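
The *occluded* example above maps to a `labels` entry in the label category configuration format shown earlier on this page. The attribute values here follow that example; treat them as an illustration.

```python
# Illustrative "labels" entry implementing the occluded example above, in the
# label category configuration format shown earlier on this page.
car_label = {
    "label": "car",
    "attributes": [
        {
            "name": "occluded",
            "type": "string",
            "enum": ["partial", "completely", "no"],  # options workers choose from
        }
    ],
}
```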

#### Frame attributes
<a name="sms-point-cloud-frame-attributes"></a>

Add frame attributes to give workers the ability to provide more information about individual point cloud frames. You can specify up to 10 frame attributes, and these attributes will appear on all frames.

For example, you can add a frame attribute that allows workers to enter a number. You may want to use this attribute to have workers identify the number of objects they see in a particular frame. 

In another example, you may want to provide a free-form text box to give workers the ability to provide a free form answer to a question.

When you create a label verification job, you can add one or more frame attributes to ask workers to provide feedback on all labels in a point cloud frame. 
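
The examples above can be sketched as `frameAttributes` entries in a label category configuration file. The attribute names and descriptions below are assumptions for illustration; check the label category configuration file documentation for the exact schema.

```python
# Illustrative frameAttributes entries for a label category configuration
# file. The numeric attribute implements the object-count example above; the
# names and descriptions are assumptions for illustration.
frame_attributes = [
    {
        "name": "object count",
        "description": "How many objects do you see in this frame?",
        "type": "number",
    },
    {
        "name": "feedback",
        "description": "Describe any issues you see with the labels in this frame.",
        "type": "string",
    },
]

assert len(frame_attributes) <= 10  # you can specify up to 10 frame attributes
```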

### Worker instructions
<a name="sms-point-cloud-worker-instructions-general"></a>

You can provide worker instructions to help your workers complete your point cloud labeling tasks. Your instructions might include the following:
+ Best practices and things to avoid when annotating objects.
+ An explanation of the label category attributes provided (for object detection and object tracking tasks), and how to use them.
+ Advice on how to save time while labeling by using keyboard shortcuts. 

You can add your worker instructions using the SageMaker AI console while creating a labeling job. If you create a labeling job using the API operation `CreateLabelingJob`, you specify worker instructions in your label category configuration file. 

In addition to your instructions, Ground Truth provides a link to help workers navigate and use the worker portal. View these instructions by selecting the task type on [Worker instructions](sms-point-cloud-worker-instructions.md). 

### Declining tasks
<a name="sms-decline-task-point-cloud"></a>

Workers are able to decline tasks. 

Workers might decline a task if the instructions are not clear, if the input data does not display correctly, or if they encounter some other issue with the task. If the number of workers who decline a task reaches the number of workers per dataset object ([https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-NumberOfHumanWorkersPerDataObject](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-NumberOfHumanWorkersPerDataObject)), the data object is marked as expired and is not sent to additional workers.

# 3D point cloud labeling job permission requirements
<a name="sms-security-permission-3d-point-cloud"></a>

When you create a 3D point cloud labeling job, in addition to the permission requirements found in [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md), you must add a CORS policy to your S3 bucket that contains your input manifest file. 

## Add a CORS permission policy to S3 bucket
<a name="sms-permissions-execution-role"></a>

When you create a 3D point cloud labeling job, you specify buckets in S3 where your input data and manifest file are located and where your output data will be stored. These buckets may be the same. You must attach the following Cross-origin resource sharing (CORS) policy to your input and output buckets. If you use the Amazon S3 console to add the policy to your bucket, you must use the JSON format.

**JSON**

```
[
        {
            "AllowedHeaders": [
                "*"
            ],
            "AllowedMethods": [
                "GET",
                "HEAD",
                "PUT"
            ],
            "AllowedOrigins": [
                "*"
            ],
            "ExposeHeaders": [
                "Access-Control-Allow-Origin"
            ],
            "MaxAgeSeconds": 3000
        }
    ]
```

**XML**

```
<?xml version="1.0" encoding="UTF-8"?>
    <CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <AllowedMethod>HEAD</AllowedMethod>
        <AllowedMethod>PUT</AllowedMethod>
        <MaxAgeSeconds>3000</MaxAgeSeconds>
        <ExposeHeader>Access-Control-Allow-Origin</ExposeHeader>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
    </CORSConfiguration>
```

To learn how to add a CORS policy to an S3 bucket, see [How do I add cross-domain resource sharing with CORS?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-cors-configuration.html) in the Amazon Simple Storage Service User Guide.
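
The JSON policy above maps directly to the `CORSConfiguration` structure that the S3 `PutBucketCors` API accepts. The following sketch builds that structure; the bucket name is a placeholder, and the boto3 call is shown only as a comment because it requires AWS credentials.

```python
# The JSON CORS policy above, expressed as the CORSConfiguration dict that
# the S3 put_bucket_cors API accepts. The bucket name is a placeholder.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["GET", "HEAD", "PUT"],
            "AllowedOrigins": ["*"],
            "ExposeHeaders": ["Access-Control-Allow-Origin"],
            "MaxAgeSeconds": 3000,
        }
    ]
}
# import boto3
# boto3.client("s3").put_bucket_cors(
#     Bucket="amzn-s3-demo-bucket", CORSConfiguration=cors_configuration
# )
```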

# Worker instructions
<a name="sms-point-cloud-worker-instructions"></a>

This topic provides an overview of the Ground Truth worker portal and the tools available to complete your 3D Point Cloud labeling task. First, select the type of task you are working on from **Topics**. 

For adjustment jobs, select the original labeling job task type that produced the labels you are adjusting. Review and adjust the labels in your task as needed. 

**Important**  
It is recommended that you complete your task using a Google Chrome or Firefox web browser.

**Topics**
+ [3D point cloud semantic segmentation](sms-point-cloud-worker-instructions-semantic-segmentation.md)
+ [3D point cloud object detection](sms-point-cloud-worker-instructions-object-detection.md)
+ [3D point cloud object tracking](sms-point-cloud-worker-instructions-object-tracking.md)

# 3D point cloud semantic segmentation
<a name="sms-point-cloud-worker-instructions-semantic-segmentation"></a>

Use this page to familiarize yourself with the user interface and the tools available to complete your 3D point cloud semantic segmentation task.

**Topics**
+ [Your Task](#sms-point-cloud-worker-instructions-ss-task)
+ [Navigate the UI](#sms-point-cloud-worker-instructions-worker-ui-ss)
+ [Icon Guide](#sms-point-cloud-worker-instructions-ss-icons)
+ [Shortcuts](#sms-point-cloud-worker-instructions-ss-hot-keys)
+ [Release, Stop and Resume, and Decline Tasks](#sms-point-cloud-worker-instructions-skip-reject-ss)
+ [Saving Your Work and Submitting](#sms-point-cloud-worker-instructions-saving-work-ss)

## Your Task
<a name="sms-point-cloud-worker-instructions-ss-task"></a>

When you work on a 3D point cloud semantic segmentation task, select a category from the **Label Categories** drop-down menu in the **Annotations** section on the right side of your worker portal. After you've selected a category, use the paint brush and polygon tools to paint each object in the 3D point cloud that the category applies to. For example, if you select the category **Car**, use these tools to paint all of the cars in the point cloud. The following video demonstrates how to use the paint brush tool to paint an object. 

If you see one or more images in your worker portal, you can paint in the images or paint in the 3D point cloud and the paint will show up in the other medium. 

You may see frame attributes under the **Labels** menu. Use these attribute prompts to enter additional information about the point cloud. 

![\[Example frame attribute prompt.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/frame-attributes.png)


**Important**  
If you see that objects have already been painted when you open the task, adjust those annotations.

The following video includes an image that can be annotated. You may not see an image in your task. 

![\[Gif showing how workers can use the 3D point cloud and 2D image together to paint objects.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/semantic_seg/ss_paint_sf.gif)


After you've painted one or more objects using a label category, you can select that category from the Label Category menu on the right to only view points painted for that category. 

![\[Gif showing how workers can move around the 3D point cloud.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/semantic_seg/ss-view-options.gif)


## Navigate the UI
<a name="sms-point-cloud-worker-instructions-worker-ui-ss"></a>

You can navigate the 3D scene using your keyboard and mouse. You can:
+ Double click on specific objects in the point cloud to zoom into them.
+ Use a mouse-scroller or trackpad to zoom in and out of the point cloud.
+ Use the keyboard arrow keys, or the Q, E, A, and D keys, to move up, down, left, and right. Use the W and S keys to zoom in and out. 

The following video demonstrates movements around the 3D point cloud and in the side-view. You can hide and re-expand all side views using the full screen icon. In this GIF, the side-views and menus have been collapsed.

![\[Gif shows how a worker can use the 3D point cloud in the point cloud view UI.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/semantic_seg/ss_nav_worker_portal.gif)


When you are in the worker UI, you see the following menus:
+ **Instructions** – Review these instructions before starting your task.
+ **Shortcuts** – Use this menu to view keyboard shortcuts that you can use to navigate the point cloud and use the annotation tools provided. 
+ **View** – Use this menu to toggle different view options on and off. For example, you can use this menu to add a ground mesh to the point cloud, and to choose the projection of the point cloud. 
+ **3D Point Cloud** – Use this menu to add additional attributes to the points in the point cloud, such as color and pixel intensity. Note that some or all of these options may not be available.
+ **Paint** – Use this menu to modify the functionality of the paint brush. 

When you open a task, the move scene icon is on, and you can move around the point cloud using your mouse and the navigation buttons in the point cloud area of the screen. To return to the original view you see when you first opened the task, choose the reset scene icon. 

After you select the paint icon, you can add paint to the point cloud and images (if included). You must select the move scene icon again to move to another area in the 3D point cloud or image. 

To collapse all panels on the right and make the 3D point cloud full screen, select the full screen icon. 

For the camera images and side-panels, you have the following view options:
+ **C** – View the camera angle on point cloud view.
+ **F** – View the frustum, or field of view, of the camera used to capture that image on point cloud view. 
+ **P** – View the point cloud overlaid on the image. 

## Icon Guide
<a name="sms-point-cloud-worker-instructions-ss-icons"></a>

Use this table to learn about the icons available in your worker task portal. 


| Icon | Name | Description | 
| --- | --- | --- | 
|  ![\[The Brush icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/brush.png)  |  brush  |  Choose this icon to turn on the brush tool. To use this tool, select it and then move your mouse over the objects that you want to paint. Everything you paint is associated with the category you chose.  | 
|  ![\[The Polygon icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/polygon.png)  |  polygon  |  Choose this icon to use the polygon paint tool. Use this tool to draw polygons around objects that you want to paint. Everything you enclose in a polygon is associated with the category you have chosen.  | 
|  ![\[The Reset scene icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/fit_scene.png)  |  reset scene  | Choose this icon to reset the view of the point cloud, side panels, and if applicable, all images to their original position when the task was first opened.  | 
|  ![\[The Move scene icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/move_scene.png)  |  move scene  |  Choose this icon to move the scene. By default, this icon will be selected when you first start a task.   | 
|  ![\[The Full screen icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/fullscreen.png)  |  full screen   |  Choose this icon to make the 3D point cloud visualization full screen, and to collapse all side panels.  | 
|  ![\[The Ruler icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/Ruler_icon.png)  |  ruler  |  Use this icon to measure distances, in meters, in the point cloud. You may want to use this tool if your instructions ask you to annotate all objects within a given distance from the center of the cuboid or the object used to capture data. When you select this icon, you can place the starting point (first marker) anywhere in the point cloud by selecting it with your mouse. The tool automatically uses interpolation to place the marker on the closest point within a threshold distance of the location you select; otherwise, the marker is placed on the ground. If you place a starting point by mistake, you can use the Escape key to revert the marker placement. After you place the first marker, you see a dotted line and a dynamic label that indicates the distance you have moved away from the first marker. Click somewhere else on the point cloud to place a second marker. When you place the second marker, the dotted line becomes solid and the distance is set. After you set a distance, you can edit it by selecting either marker. You can delete a ruler by selecting anywhere on the ruler and pressing the Delete key on your keyboard.  | 

## Shortcuts
<a name="sms-point-cloud-worker-instructions-ss-hot-keys"></a>

The shortcuts listed in the **Shortcuts** menu can help you navigate the 3D point cloud and use the paint tool. 

Before you start your task, it is recommended that you review the **Shortcuts** menu and become acquainted with these commands. 

## Release, Stop and Resume, and Decline Tasks
<a name="sms-point-cloud-worker-instructions-skip-reject-ss"></a>

When you open the labeling task, three buttons on the top right allow you to decline the task (**Decline task**), release it (**Release task**), and stop and resume it at a later time (**Stop and resume later**). The following list describes what happens when you select one of these options:
+ **Decline task**: You should only decline a task if something is wrong with the task, such as an issue with the 3D point cloud, images or the UI. If you decline a task, you will not be able to return to the task.
+ **Release task**: If you release a task, you lose all work done on that task. When the task is released, other workers on your team can pick it up. If enough workers pick up the task, you may not be able to return to it. When you select this button and then select **Confirm**, you are returned to the worker portal. If the task is still available, its status will be **Available**. If other workers pick it up, it will disappear from your portal. 
+ **Stop and resume later**: You can use the **Stop and resume later** button to stop working and return to the task at a later time. You should use the **Save** button to save your work before you select **Stop and resume later**. When you select this button and then select **Confirm**, you are returned to the worker portal, and the task status is **Stopped**. You can select the same task to resume work on it. 

  Be aware that the person who creates your labeling tasks specifies a time limit within which all tasks must be completed. If you do not return to and complete this task within that time limit, it will expire and your work will not be submitted. Contact your administrator for more information. 

## Saving Your Work and Submitting
<a name="sms-point-cloud-worker-instructions-saving-work-ss"></a>

You should periodically save your work. Ground Truth automatically saves your work every 15 minutes. 

When you open a task, you must complete your work on it before pressing **Submit**. 

# 3D point cloud object detection
<a name="sms-point-cloud-worker-instructions-object-detection"></a>

Use this page to familiarize yourself with the user interface and tools available to complete your 3D point cloud object detection task.

**Topics**
+ [Your Task](#sms-point-cloud-worker-instructions-od-task)
+ [Navigate the UI](#sms-point-cloud-worker-instructions-worker-ui-od)
+ [Icon Guide](#sms-point-cloud-worker-instructions-od-icons)
+ [Shortcuts](#sms-point-cloud-worker-instructions-od-hot-keys)
+ [Release, Stop and Resume, and Decline Tasks](#sms-point-cloud-worker-instructions-skip-reject-od)
+ [Saving Your Work and Submitting](#sms-point-cloud-worker-instructions-saving-work-od)

## Your Task
<a name="sms-point-cloud-worker-instructions-od-task"></a>

When you work on a 3D point cloud object detection task, you need to select a category from the **Label Categories** menu within the **Annotations** menu on the right side of your worker portal. After you've chosen a category, use the add cuboid and fit cuboid tools to fit a cuboid around objects in the 3D point cloud that this category applies to. After you place a cuboid, you can modify its dimensions, location, and orientation directly in the point cloud, and in the three panels shown on the right. 

If you see one or more images in your worker portal, you can also modify cuboids in the images or in the 3D point cloud and the edits will show up in the other medium. 

If you see cuboids have already been added to the 3D point cloud when you open your task, adjust those cuboids and add additional cuboids as needed. 

To edit a cuboid, including moving, re-orienting, and changing cuboid dimensions, you must use shortcut keys. You can see a full list of shortcut keys in the **Shortcuts** menu in your UI. The following are important key-combinations that you should become familiar with before starting your labeling task. 


****  

| Mac Command | Windows Command | Action | 
| --- | --- | --- | 
|  Cmd + Drag  |  Ctrl + Drag  |  Modify the dimensions of the cuboid.  | 
|  Option + Drag  |  Alt + Drag  |  Move the cuboid.  | 
|  Shift + Drag  |  Shift + Drag  |  Rotate the cuboid.  | 
|  Option + O  |  Alt + O  |  Fit the cuboid tightly around the points it has been drawn around. Before using this option, make sure the cuboid fully surrounds the object of interest.  | 
|  Option + G  |  Alt + G  |  Set the cuboid to the ground.  | 

Individual labels may have one or more label attributes. If a label has a label attribute associated with it, it will appear when you select the downward pointing arrow next to the label from the **Label Id** menu. Fill in required values for all label attributes. 

You may see frame attributes under the **Labels** menu. Use these attribute prompts to enter additional information about each frame. 

![\[Example frame attribute prompt.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/frame-attributes.png)


## Navigate the UI
<a name="sms-point-cloud-worker-instructions-worker-ui-od"></a>

You can navigate in the 3D scene using your keyboard and mouse. You can:
+ Double click on specific objects in the point cloud to zoom into them. 
+ Use the [ and ] keys on your keyboard to zoom into and move from one label to the next. If no label is selected, pressing [ or ] zooms into the first label in the **Label Id** list. 
+ Use a mouse-scroller or trackpad to zoom in and out of the point cloud.
+ Use the keyboard arrow keys, or the Q, E, A, and D keys, to move up, down, left, and right. Use the W and S keys to zoom in and out. 

Once you place a cuboid in the 3D scene, a side view appears with three projected views: top, side, and back. These views show points in and around the placed cuboid and help you refine the cuboid boundaries in that area. You can zoom in and out of each of these views using your mouse. 

The following video demonstrates movements around the 3D point cloud and in the side-view. 

![\[Gif showing movements around the 3D point cloud and the side-view.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_detection/navigate_od_worker_ui.gif)


When you are in the worker UI, you see the following menus:
+ **Instructions** – Review these instructions before starting your task.
+ **Shortcuts** – Use this menu to view keyboard shortcuts that you can use to navigate the point cloud and use the annotation tools provided. 
+ **Label** – Use this menu to modify a cuboid. First, select a cuboid, and then choose an option from this menu. This menu includes assistive labeling tools like setting a cuboid to the ground and automatically fitting the cuboid to the object's boundaries. 
+ **View** – Use this menu to toggle different view options on and off. For example, you can use this menu to add a ground mesh to the point cloud, and to choose the projection of the point cloud. 
+ **3D Point Cloud** – Use this menu to add additional attributes to the points in the point cloud, such as color, and pixel intensity. Note that these options may not be available.

When you open a task, the move scene icon is on, and you can move around the point cloud using your mouse and the navigation buttons in the point cloud area of the screen. To return to the original view you see when you first opened the task, choose the reset scene icon. Resetting the view will not modify your annotations. 

After you select the add cuboid icon, you can add cuboids to the 3D point cloud visualization. Once you've added a cuboid, you can adjust it in the three views (top, side, and front) and in the images (if included). 

![\[Gif showing how a worker can annotate a 3D point cloud in the Ground Truth worker portal.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_detection/ot_basic_tools.gif)


You must choose the move scene icon again to move to another area in the 3D point cloud or image. 

To collapse all panels on the right and make the 3D point cloud full-screen, choose the full screen icon. 

If camera images are included, you may have the following view options:
+ **C** – View the camera angle on point cloud view.
+ **F** – View the frustum, or field of view, of the camera used to capture that image on point cloud view. 
+ **P** – View the point cloud overlaid on the image.
+ **B** – View cuboids in the image. 

The following video demonstrates how to use these view options. The **F** option is used to view the field of view of the camera (the gray area), the **C** option shows the direction the camera is facing and the angle of the camera (blue lines), and the **B** option is used to view the cuboid. 

![\[Gif showing how to use various view options.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/view-options-side.gif)


## Icon Guide
<a name="sms-point-cloud-worker-instructions-od-icons"></a>

Use this table to learn about the icons you see in your worker task portal. 


| Icon | Name | Description | 
| --- | --- | --- | 
|  ![\[The Add cuboid icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/add_cuobid.png)  |  add cuboid  |  Choose this icon to add a cuboid. Each cuboid you add is associated with the category you chose.   | 
|  ![\[The Edit cuboid icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/edit_cuboid.png)  |  edit cuboid  |  Choose this icon to edit a cuboid. After you have added a cuboid, you can edit its dimensions, location, and orientation. After a cuboid is added, it automatically switches to edit cuboid mode.   | 
|  ![\[The Ruler icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/Ruler_icon.png)  |  ruler  |  Use this icon to measure distances, in meters, in the point cloud. You may want to use this tool if your instructions ask you to annotate all objects within a given distance from the center of the cuboid or the object used to capture data. When you select this icon, you can place the starting point (first marker) anywhere in the point cloud by selecting it with your mouse. The tool automatically places the marker on the closest point within a threshold distance of the location you select; otherwise, the marker is placed on the ground. If you place a starting point by mistake, you can use the Escape key to revert marker placement. After you place the first marker, you see a dotted line and a dynamic label that indicates the distance you have moved away from the first marker. Click somewhere else on the point cloud to place a second marker. When you place the second marker, the dotted line becomes solid, and the distance is set. After you set a distance, you can edit it by selecting either marker. You can delete a ruler by selecting anywhere on the ruler and using the Delete key on your keyboard.  | 
|  ![\[The Reset scene icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/fit_scene.png)  |  reset scene  |  Choose this icon to reset the view of the point cloud, side panels, and if applicable, all images to their original position when the task was first opened.   | 
|  ![\[The Move scene icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/move_scene.png)  |  move scene  |  Choose this icon to move the scene. By default, this icon is chosen when you first start a task.   | 
|  ![\[The Full screen icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/fullscreen.png)  |  full screen   |  Choose this icon to make the 3D point cloud visualization full screen, and to collapse all side panels.  | 
|  ![\[The Show labels icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/label-icons/show.png)  | show labels |  Show labels in the 3D point cloud visualization, and if applicable, in images.   | 
|  ![\[The Hide labels icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/label-icons/hide.png)  | hide labels |  Hide labels in the 3D point cloud visualization, and if applicable, in images.   | 
|  ![\[The Delete labels icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/label-icons/delete.png)  | delete labels |  Delete a label.   | 

## Shortcuts
<a name="sms-point-cloud-worker-instructions-od-hot-keys"></a>

The shortcuts listed in the **Shortcuts** menu can help you navigate the 3D point cloud and use tools to add and edit cuboids. 

Before you start your task, it is recommended that you review the **Shortcuts** menu and become acquainted with these commands. You need to use some of the 3D cuboid controls to edit your cuboid. 

## Release, Stop and Resume, and Decline Tasks
<a name="sms-point-cloud-worker-instructions-skip-reject-od"></a>

When you open the labeling task, three buttons on the top right allow you to decline the task (**Decline task**), release it (**Release task**), and stop and resume it at a later time (**Stop and resume later**). The following list describes what happens when you select one of these options:
+ **Decline task**: You should only decline a task if something is wrong with the task, such as an issue with the 3D point cloud, images or the UI. If you decline a task, you will not be able to return to the task.
+ **Release task**: If you release a task, you lose all work done on that task. When the task is released, other workers on your team can pick it up. If enough workers pick up the task, you may not be able to return to it. When you select this button and then select **Confirm**, you are returned to the worker portal. If the task is still available, its status will be **Available**. If other workers pick it up, it will disappear from your portal. 
+ **Stop and resume later**: You can use the **Stop and resume later** button to stop working and return to the task at a later time. You should use the **Save** button to save your work before you select **Stop and resume later**. When you select this button and then select **Confirm**, you are returned to the worker portal, and the task status is **Stopped**. You can select the same task to resume work on it. 

  Be aware that the person who creates your labeling tasks specifies a time limit within which all tasks must be completed. If you do not return to and complete this task within that time limit, it will expire and your work will not be submitted. Contact your administrator for more information. 

## Saving Your Work and Submitting
<a name="sms-point-cloud-worker-instructions-saving-work-od"></a>

You should periodically save your work. Ground Truth automatically saves your work every 15 minutes. 

When you open a task, you must complete your work on it before pressing **Submit**.

# 3D point cloud object tracking
<a name="sms-point-cloud-worker-instructions-object-tracking"></a>

Use this page to familiarize yourself with the user interface and tools available to complete your 3D point cloud object tracking task.

**Topics**
+ [Your Task](#sms-point-cloud-worker-instructions-ot-task)
+ [Navigate the UI](#sms-point-cloud-worker-instructions-worker-ui-ot)
+ [Bulk Edit Label Category and Frame Attributes](#sms-point-cloud-worker-instructions-ot-bulk-edit)
+ [Icon Guide](#sms-point-cloud-worker-instructions-ot-icons)
+ [Shortcuts](#sms-point-cloud-worker-instructions-ot-hot-keys)
+ [Release, Stop and Resume, and Decline Tasks](#sms-point-cloud-worker-instructions-skip-reject-ot)
+ [Saving Your Work and Submitting](#sms-point-cloud-worker-instructions-saving-work-ot)

## Your Task
<a name="sms-point-cloud-worker-instructions-ot-task"></a>

When you work on a 3D point cloud object tracking task, you need to select a category from the **Label Categories** menu within the **Annotations** menu on the right side of your worker portal. After you've selected a category, use the add cuboid and fit cuboid tools to fit a cuboid around objects in the 3D point cloud that this category applies to. After you place a cuboid, you can modify its location, dimensions, and orientation directly in the point cloud, and in the three panels shown on the right. If you see one or more images in your worker portal, you can also modify cuboids in the images or in the 3D point cloud, and the edits will show up in the other medium. 

**Important**  
If you see cuboids have already been added to the 3D point cloud frames when you open your task, adjust those cuboids and add additional cuboids as needed. 

To edit a cuboid, including moving, re-orienting, and changing cuboid dimensions, you must use shortcut keys. You can see a full list of shortcut keys in the **Shortcuts** menu in your UI. The following are important key-combinations that you should become familiar with before starting your labeling task. 


****  

| Mac Command | Windows Command | Action | 
| --- | --- | --- | 
|  Cmd + Drag  |  Ctrl + Drag  |  Modify the dimensions of the cuboid.  | 
|  Option + Drag  |  Alt + Drag  |  Move the cuboid.  | 
|  Shift + Drag  |  Shift + Drag  |  Rotate the cuboid.  | 
|  Option + O  |  Alt + O  |  Fit the cuboid tightly around the points it has been drawn around. Before using this option, make sure the cuboid fully surrounds the object of interest.  | 
|  Option + G  |  Alt + G  |  Set the cuboid to the ground.  | 

When you open your task, two frames will be loaded. If your task includes more than two frames, you need to use the navigation bar in the lower-left corner, or the load frames icon to load additional frames. You should annotate and adjust labels in all frames before submitting. 

After you fit a cuboid tightly around the boundaries of an object, navigate to another frame using the navigation bar in the lower-left corner of the UI. If that same object has moved to a new location, add another cuboid and fit it tightly around the boundaries of the object. Each time you manually add a cuboid, you see the frame sequence bar in the lower-left corner of the screen turn red where that frame is located temporally in the sequence.

Your UI automatically infers the location of that object in all other frames after you've placed a cuboid. This is called *interpolation*. You can see the movement of that object, and the inferred and manually created cuboids using the arrows. Adjust inferred cuboids as needed. The following video demonstrates how to navigate between frames. The following video shows how, if you add a cuboid in one frame, and then adjust it in another, your UI will automatically infer the location of the cuboid in all of the frames in-between.

![\[Gif showing how the location of a cuboid is inferred in in-between frames.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_tracking/label-interpolation.gif)


**Tip**  
You can turn off the automatic cuboid interpolation across frames using the 3D Point Cloud menu item. Select **3D Point Cloud** from the top-menu, and then select **Interpolate Cuboids Across Frames**. This will uncheck this option and stop cuboid interpolation. You can reselect this item to turn cuboid interpolation back on.   
Turning cuboid interpolation off will not impact cuboids that have already been interpolated across frames. 
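
Conceptually, interpolation of this kind can be sketched as a linear blend of a cuboid's position, dimensions, and heading between two manually placed anchor frames. The following is a minimal illustration under that assumption; the `Cuboid` fields and the `interpolate` helper are hypothetical names, not the actual Ground Truth implementation.

```python
# Illustrative sketch of cuboid interpolation between two anchor frames.
# All names and fields are assumptions for the sake of the example.
from dataclasses import dataclass

@dataclass
class Cuboid:
    x: float; y: float; z: float               # center position
    length: float; width: float; height: float # dimensions
    yaw: float                                 # heading, in radians

def lerp(a, b, t):
    """Linearly blend between a and b, with t in [0, 1]."""
    return a + (b - a) * t

def interpolate(anchor_a, frame_a, anchor_b, frame_b, frame):
    """Infer the cuboid at `frame`, given anchors at frame_a < frame < frame_b."""
    t = (frame - frame_a) / (frame_b - frame_a)
    fields = ("x", "y", "z", "length", "width", "height", "yaw")
    return Cuboid(*(lerp(getattr(anchor_a, f), getattr(anchor_b, f), t) for f in fields))

# A car placed at x=0 in frame 0 and x=10 in frame 5 is inferred at x=4 in frame 2.
a = Cuboid(0.0, 0.0, 0.0, 4.5, 1.8, 1.5, 0.0)
b = Cuboid(10.0, 0.0, 0.0, 4.5, 1.8, 1.5, 0.0)
print(interpolate(a, 0, b, 5, 2).x)  # → 4.0
```

This also illustrates why deleting a manually placed cuboid causes inferred cuboids to shift: removing an anchor changes which pair of anchors the in-between frames are blended from.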

Individual labels may have one or more label attributes. If a label has a label attribute associated with it, it will appear when you select the downward pointing arrow next to the label from the **Label Id** menu. Fill in required values for all label attributes. 

You may see frame attributes under the **Label Id** menu. These attributes will appear on each frame in your task. Use these attribute prompts to enter additional information about each frame. 

![\[Example frame attribute prompt.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/frame-attributes.png)


## Navigate the UI
<a name="sms-point-cloud-worker-instructions-worker-ui-ot"></a>

You can navigate in the 3D scene using your keyboard and mouse. You can:
+ Double click on specific objects in the point cloud to zoom into them.
+ Use the [ and ] keys on your keyboard to zoom into and move from one label to the next. If no label is selected, pressing [ or ] zooms into the first label in the **Label Id** list. 
+ Use a mouse-scroller or trackpad to zoom in and out of the point cloud.
+ Use the keyboard arrow keys, or the Q, E, A, and D keys, to move up, down, left, and right. Use the W and S keys to zoom in and out. 

Once you place a cuboid in the 3D scene, a side view appears with three projected views: top, side, and back. These views show points in and around the placed cuboid and help you refine the cuboid boundaries in that area. You can zoom in and out of each of these views using your mouse. 

The following video demonstrates movements around the 3D point cloud and in the side-view. 

![\[Gif shows how a worker can use the 3D or 2D view to adjust a cuboid.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_tracking/view-options-worker-ui.gif)


When you are in the worker UI, you see the following menus:
+ **Instructions** – Review these instructions before starting your task.
+ **Shortcuts** – Use this menu to view keyboard shortcuts that you can use to navigate the point cloud and use the annotation tools provided. 
+ **Label** – Use this menu to modify a cuboid. First, select a cuboid, and then choose an option from this menu. This menu includes assistive labeling tools like setting a cuboid to the ground and automatically fitting the cuboid to the object's boundaries. 
+ **View** – Use this menu to toggle different view options on and off. For example, you can use this menu to add a ground mesh to the point cloud, and to choose the projection of the point cloud.
+ **3D Point Cloud** – Use this menu to add additional attributes to the points in the point cloud, such as color, and pixel intensity. Note that these options may not be available.

When you open a task, the move scene icon is on, and you can move around the point cloud using your mouse and the navigation buttons in the point cloud area of the screen. To return to the original view you see when you first opened the task, choose the reset scene icon. 

After you select the add cuboid icon, you can add cuboids to the point cloud and images (if included). You must select the move scene icon again to move to another area in the 3D point cloud or image. 

To collapse all panels on the right and make the 3D point cloud full-screen, choose the full screen icon. 

If camera images are included, you may have the following view options:
+ **C** – View the camera angle on point cloud view.
+ **F** – View the frustum, or field of view, of the camera used to capture that image on point cloud view. 
+ **P** – View the point cloud overlaid on the image.
+ **B** – View cuboids in the image. 

The following video demonstrates how to use these view options. The **F** option is used to view the field of view of the camera (the gray area), the **C** option shows the direction the camera is facing and the angle of the camera (blue lines), and the **B** option is used to view the cuboid. 

![\[Gif showing how to use various view options.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/view-options-side.gif)


### Delete Cuboids
<a name="sms-point-cloud-instructions-ot-delete"></a>

You can select a cuboid or label ID and:
+ Delete an individual cuboid in the current frame you are viewing.
+ Delete all cuboids with that label ID before or after the frame you are viewing.
+ Delete all cuboids with that label ID in all frames. 

A common use case for cuboid deletion is when the object leaves the scene.

You can use one or more of these options to delete both manually placed and interpolated cuboids with the same label ID.
+ To delete all cuboids before or after the frame you are currently on, select the cuboid, select the **Label** menu item at the top of the UI, and then select **Delete in previous frames** or **Delete in next frames**. Use the **Shortcuts** menu to see the shortcut keys you can use for these options.
+ To delete a label in all frames, select **Delete in all frames** from the **Labels** menu, or use the shortcut **Shift + Delete** on your keyboard.
+ To delete an individual cuboid from a single frame, select the cuboid and either select the trashcan icon (![\[Trash can icon representing deletion or removal functionality.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/label-icons/delete.png)) next to that label ID in the **Label ID** sidebar on the right or use the Delete key on your keyboard to delete that cuboid.

If you have manually placed more than one cuboid with the same label in different frames, when you delete one of the manually placed cuboids, all interpolated cuboids adjust. This adjustment happens because the UI uses manually placed cuboids as anchor points when calculating the location of interpolated cuboids. When you remove one of these anchor points, the UI must recalculate the position of the interpolated cuboids.

If you delete a cuboid from a frame, but later decide that you want to get it back, you can use the **Duplicate to previous frames** or **Duplicate to next frames** options in the **Label** menu to copy the cuboid into all the previous or all of the following frames, respectively.

## Bulk Edit Label Category and Frame Attributes
<a name="sms-point-cloud-worker-instructions-ot-bulk-edit"></a>

You can bulk edit label attributes and frame attributes. 

When you bulk edit an attribute, you specify one or more ranges of frames that you want to apply the edit to. The attribute you select is edited in all frames in that range, including the start and end frames you specify. When you bulk edit label attributes, the range you specify *must* contain the label that the label attribute is attached to. If you specify frames that do not contain this label, you will receive an error.

To bulk edit an attribute you *must* specify the desired value for the attribute first. For example, if you want to change an attribute from *Yes* to *No*, you must select *No*, and then perform the bulk edit. 

You can also specify a new value for an attribute that has not been filled in and then use the bulk edit feature to fill in that value in multiple frames. To do this, select the desired value for the attribute and complete the following procedure. 

**To bulk edit a label or attribute:**

1. Use your mouse to right click the attribute you want to bulk edit.

1. Specify the range of frames you want to apply the bulk edit to using a dash (`-`) in the text box. For example, if you want to apply the edit to frames one through ten, enter `1-10`. If you want to apply the edit to frames two to five, eight to ten, and twenty, enter `2-5,8-10,20`.

1. Select **Confirm**.

If you get an error message, verify that you entered a valid range and that the label associated with the label attribute you are editing (if applicable) exists in all frames specified.
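
The frame-range format described in the procedure above (comma-separated entries, with dashes for inclusive ranges) can be sketched as a small parser. This is an illustrative sketch only; the function name is hypothetical, and how the actual UI handles invalid input is an assumption.

```python
# Illustrative parser for a bulk-edit range string such as "2-5,8-10,20".
# Dashed entries are inclusive ranges; bare entries are single frames.
def parse_frame_ranges(spec: str) -> list[int]:
    frames = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            frames.update(range(int(start), int(end) + 1))  # inclusive of both ends
        else:
            frames.add(int(part))
    return sorted(frames)

print(parse_frame_ranges("2-5,8-10,20"))  # → [2, 3, 4, 5, 8, 9, 10, 20]
```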

You can quickly add a label to all previous or subsequent frames using the **Duplicate to previous frames** and **Duplicate to next frames** options in the **Label** menu at the top of your screen. 

## Icon Guide
<a name="sms-point-cloud-worker-instructions-ot-icons"></a>

Use this table to learn about the icons you see in your worker task portal. 


| Icon | Name | Description | 
| --- | --- | --- | 
|  ![\[The Add cuboid icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/add_cuobid.png)  |  add cuboid  |  Choose this icon to add a cuboid. Each cuboid you add is associated with the category you chose.   | 
|  ![\[The Edit cuboid icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/edit_cuboid.png)  |  edit cuboid  |  Choose this icon to edit a cuboid. After you add a cuboid, you can edit its dimensions, location, and orientation. After a cuboid is added, it automatically switches to edit cuboid mode.   | 
|  ![\[The Ruler icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/Ruler_icon.png)  |  ruler  |  Use this icon to measure distances, in meters, in the point cloud. You may want to use this tool if your instructions ask you to annotate all objects within a given distance from the center of the cuboid or the object used to capture data. When you select this icon, you can place the starting point (first marker) anywhere in the point cloud by selecting it with your mouse. The tool automatically places the marker on the closest point within a threshold distance of the location you select; otherwise, the marker is placed on the ground. If you place a starting point by mistake, you can use the Escape key to revert marker placement. After you place the first marker, you see a dotted line and a dynamic label that indicates the distance you have moved away from the first marker. Click somewhere else on the point cloud to place a second marker. When you place the second marker, the dotted line becomes solid, and the distance is set. After you set a distance, you can edit it by selecting either marker. You can delete a ruler by selecting anywhere on the ruler and using the Delete key on your keyboard.  | 
|  ![\[The Reset scene icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/fit_scene.png)  |  reset scene  | Choose this icon to reset the view of the point cloud, side panels, and if applicable, all images to their original position when the task was first opened.  | 
|  ![\[The Move scene icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/move_scene.png)  |  move scene  |  Choose this icon to move the scene. By default, this icon is chosen when you first start a task.   | 
|  ![\[The Full screen icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/fullscreen.png)  |  full screen   |  Choose this icon to make the 3D point cloud visualization full screen and to collapse all side panels.  | 
|  ![\[The Load frames icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/load_screen.png)  |  load frames  |  Choose this icon to load additional frames.   | 
|  ![\[The Hide labels icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/label-icons/hide.png)  | hide labels |  Hide labels in the 3D point cloud visualization, and if applicable, in images.   | 
|  ![\[The Show labels icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/label-icons/show.png)  | show labels |  Show labels in the 3D point cloud visualization, and if applicable, in images.   | 
|  ![\[The Delete labels icon.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/icons/label-icons/delete.png)  | delete labels |  Delete a label. This option can only be used to delete labels you have manually created or adjusted.   | 

## Shortcuts
<a name="sms-point-cloud-worker-instructions-ot-hot-keys"></a>

The shortcuts listed in the **Shortcuts** menu can help you navigate the 3D point cloud and use tools to add and edit cuboids. 

Before you start your task, it is recommended that you review the **Shortcuts** menu and become acquainted with these commands. You need to use some of the 3D cuboid controls to edit your cuboid. 

## Release, Stop and Resume, and Decline Tasks
<a name="sms-point-cloud-worker-instructions-skip-reject-ot"></a>

When you open the labeling task, three buttons on the top right allow you to decline the task (**Decline task**), release it (**Release task**), and stop and resume it at a later time (**Stop and resume later**). The following list describes what happens when you select one of these options:
+ **Decline task**: You should only decline a task if something is wrong with the task, such as an issue with the 3D point clouds, images or the UI. If you decline a task, you will not be able to return to the task.
+ **Release task**: Use this option to release a task and allow others to work on it. When you release a task, you lose all work done on that task, and other workers on your team can pick it up. If enough workers pick up the task, you may not be able to return to it. When you select this button and then select **Confirm**, you are returned to the worker portal. If the task is still available, its status will be **Available**. If other workers pick it up, it will disappear from your portal. 
+ **Stop and resume later**: You can use the **Stop and resume later** button to stop working and return to the task at a later time. You should use the **Save** button to save your work before you select **Stop and resume later**. When you select this button and then select **Confirm**, you are returned to the worker portal, and the task status is **Stopped**. You can select the same task to resume work on it.

  Be aware that the person who creates your labeling tasks specifies a time limit by which all tasks must be completed. If you do not return to and complete this task within that time limit, it will expire and your work will not be submitted. Contact your administrator for more information. 

## Saving Your Work and Submitting
<a name="sms-point-cloud-worker-instructions-saving-work-ot"></a>

You should periodically save your work. Ground Truth will automatically save your work every 15 minutes. 

When you open a task, you must complete your work on it before pressing **Submit**. 

# Label verification and adjustment
<a name="sms-verification-data"></a>

When the labels on a dataset need to be validated, Amazon SageMaker Ground Truth provides functionality to have workers verify that labels are correct or to adjust previous labels. These types of jobs fall into two distinct categories:
+ *Label verification* — Workers indicate whether the existing labels are correct, or rate their quality, and can add comments to explain their reasoning. Workers will not be able to modify or adjust labels. 

  If you create a 3D point cloud or video frame label adjustment or verification job, you can choose to make label category attributes (not supported for 3D point cloud semantic segmentation) and frame attributes editable by workers. 
+ *Label adjustment* — Workers adjust prior annotations and, if applicable, label category and frame attributes to correct them. 

The following Ground Truth [built-in task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html) support adjustment and verification labeling jobs:
+ Bounding box
+ Semantic segmentation 
+ 3D point cloud object detection, 3D point cloud object tracking, and 3D point cloud semantic segmentation
+ All video frame object detection and video frame object tracking task types — bounding box, polyline, polygon and keypoint

**Tip**  
For 3D point cloud and video frame labeling verification jobs, it is recommended that you add new label category attributes or frame attributes to the labeling job. Workers can use these attributes to verify individual labels or the entire frame. To learn more about label category and frame attributes, see [Worker user interface (UI)](sms-point-cloud-general-information.md#sms-point-cloud-worker-task-ui) for 3D point cloud and [Worker user interface (UI)](sms-video-overview.md#sms-video-worker-task-ui) for video frame. 

You can start label verification and adjustment jobs using the SageMaker AI console or the API. 

## Cautions and considerations
<a name="sms-data-verify-cautions"></a>

To get expected behavior when creating a label verification or adjustment job, carefully verify your input data. 
+ If you are using image data, verify that your manifest file contains hexadecimal RGB color information. 
+ To save money on processing costs, filter your data to ensure you are not including unwanted objects in your labeling job input manifest.
+ Add required Amazon S3 permissions to ensure your input data is processed correctly. 

When you create an adjustment or verification labeling job using the Ground Truth API, you *must* use a different `LabelAttributeName` than the original labeling job.

### Color information requirements for semantic segmentation jobs
<a name="sms-data-verify-color-info"></a>

To properly reproduce color information in verification or adjustment tasks, the tool requires hexadecimal RGB color information in the manifest (for example, `#FFFFFF` for white). When you set up a Semantic Segmentation verification or adjustment job, the tool examines the manifest to determine whether this information is present. If it can't find it, Amazon SageMaker Ground Truth displays an error message and ends job setup.

In prior iterations of the Semantic Segmentation tool, category color information wasn't output in hexadecimal RGB format to the output manifest. That feature was introduced to the output manifest at the same time the verification and adjustment workflows were introduced. Therefore, older output manifests aren't compatible with this new workflow.
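If you're unsure whether an older manifest is compatible, you can scan its label metadata before chaining it. The following sketch assumes the Semantic Segmentation metadata stores its category colors in an `internal-color-map` object whose entries carry a `hex-color` field, which is the format current Ground Truth output manifests use:

```
import re

HEX_RGB = re.compile(r"^#[0-9A-Fa-f]{6}$")

def has_hex_colors(label_metadata):
    # Return True only if every category in the color map
    # includes a well-formed hexadecimal RGB value.
    color_map = label_metadata.get("internal-color-map", {})
    return bool(color_map) and all(
        HEX_RGB.match(entry.get("hex-color", ""))
        for entry in color_map.values()
    )
```

A manifest that fails this check was likely produced before hexadecimal color output was introduced and must be regenerated before you can use it in a verification or adjustment job.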

### Filter your data before starting the job
<a name="sms-data-verify-filter"></a>

Amazon SageMaker Ground Truth processes all objects in your input manifest. If you have a partially labeled dataset, you might want to create a custom manifest using an [Amazon S3 Select query](https://docs.aws.amazon.com/AmazonS3/latest/dev/selecting-content-from-objects.html) on your input manifest. Unlabeled objects fail individually, but they don't cause the job to fail, and they might incur processing costs. Filtering out objects you don't want verified reduces your costs.
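As a local alternative to S3 Select, a short script can apply the same filter. This sketch assumes the manifest is in JSON Lines format and that `label-attr` stands in for your labeling job's label attribute name:

```
import json

def filter_labeled(manifest_lines, label_attribute="label-attr"):
    # Keep only the data objects that already carry the label
    # attribute, so the new job skips unlabeled objects.
    kept = []
    for line in manifest_lines:
        obj = json.loads(line)
        if label_attribute in obj:
            kept.append(line)
    return kept
```

Write the kept lines to a new manifest file in Amazon S3 and use that file as the input manifest for the verification or adjustment job.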

If you create a verification job using the console, you can use the filtering tools provided there. If you create jobs using the API, make filtering your data part of your workflow where needed.

**Topics**
+ [Cautions and considerations](#sms-data-verify-cautions)
+ [Requirements to create verification and adjustment labeling jobs](sms-data-verify-adjust-prereq.md)
+ [Create a label verification job (console)](sms-data-verify-start-console.md)
+ [Create a label adjustment job (console)](sms-data-adjust-start-console.md)
+ [Start a label verification or adjustment job (API)](sms-data-verify-start-api.md)
+ [Label verification and adjustment data in the output manifest](sms-data-verify-manifest.md)

# Requirements to create verification and adjustment labeling jobs
<a name="sms-data-verify-adjust-prereq"></a>

To create a label verification or adjustment job, you must satisfy the following criteria. 
+ For non-streaming labeling jobs: The input manifest file you use must contain the label attribute name (`LabelAttributeName`) of the labels that you want adjusted. When you chain a successfully completed labeling job, the output manifest file is used as the input manifest file for the new, chained job. To learn more about the format of the output manifest file Ground Truth produces for each task type, see [Labeling job output data](sms-data-output.md).

  For streaming labeling jobs: The Amazon SNS message you send to the Amazon SNS input topic of the adjustment or verification labeling job must contain the label attribute name of the labels you want adjusted or verified. To see an example of how you can create an adjustment or verification labeling job with streaming labeling jobs, see this [Jupyter Notebook example](https://github.com/aws/amazon-sagemaker-examples/blob/master/ground_truth_labeling_jobs/ground_truth_streaming_labeling_jobs/ground_truth_create_chained_streaming_labeling_job.ipynb) in GitHub.
+ The task type of the verification or adjustment labeling job must be the same as the task type of the original job unless you are using the [Image Label Verification](sms-label-verification.md) task type to verify bounding box or semantic segmentation image labels. See the next bullet point for more details about the video frame task type requirements.
+ For video frame annotation verification and adjustment jobs, you must use the same annotation task type used to create the annotations from the previous labeling job. For example, if you create a video frame object detection job to have workers draw bounding boxes around objects, and then you create a video object detection adjustment job, you must specify *bounding boxes* as the annotation task type. To learn more about video frame annotation task types, see [Task types](sms-video-overview.md#sms-video-frame-tools).
+ The task type you select for the adjustment or verification labeling job must support an audit workflow. The following Ground Truth [built-in task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html) support adjustment and verification labeling jobs: bounding box, semantic segmentation, 3D point cloud object detection, 3D point cloud object tracking, and 3D point cloud semantic segmentation, and all video frame object detection and video frame object tracking task types — bounding box, polyline, polygon and keypoint.
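For reference, one line of a completed bounding box job's output manifest, which you could chain as the input for an adjustment job, might look similar to the following sketch (the bucket, values, and the label attribute name `bounding-box` are illustrative):

```
{
  "source-ref": "s3://DOC-EXAMPLE-BUCKET/images/0001.jpg",
  "bounding-box": {
    "image_size": [{"width": 500, "height": 400, "depth": 3}],
    "annotations": [
      {"class_id": 0, "left": 111, "top": 134, "width": 61, "height": 128}
    ]
  },
  "bounding-box-metadata": {
    "objects": [{"confidence": 0.92}],
    "class-map": {"0": "dog"},
    "type": "groundtruth/object-detection",
    "human-annotated": "yes",
    "creation-date": "2018-10-18T22:18:13.527256",
    "job-name": "labeling-job/bounding-box"
  }
}
```

The adjustment or verification job must then use a `LabelAttributeName` other than `bounding-box`.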

# Create a label verification job (console)
<a name="sms-data-verify-start-console"></a>

Use one of the following sections to create a label verification job for your task type. Bounding box and semantic segmentation labeling jobs are created by choosing the **Label verification** task type in the console. To create a verification job for 3D point cloud and video frame task types, you must choose the same task type as the original labeling job and choose to display existing labels. 

## Create an image label verification job (console)
<a name="sms-data-verify-start-console-bb-ss"></a>

Use the following procedure to create a bounding box or semantic segmentation verification job using the console. This procedure assumes that you have already created a bounding box or semantic segmentation labeling job and its status is Complete. This is the labeling job that produces the labels you want verified.

**To create an image label verification job:**

1. Open the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) and choose **Labeling jobs**.

1. Start a new labeling job by [chaining](sms-reusing-data.md) a prior job or start from scratch, specifying an input manifest that contains labeled data objects.

1. In the **Task type** pane, select **Label verification**.

1. Choose **Next**.

1. In the **Workers** section, choose the type of workforce you would like to use. For more details about your workforce options see [Workforces](sms-workforce-management.md).

1. (Optional) After you've selected your workforce, specify the **Task timeout** and **Task expiration time**.

1. In the **Existing-labels display options** pane, the system shows the available label attribute names in your manifest. Choose the label attribute name that identifies the labels that you want workers to verify. Ground Truth tries to detect and populate these values by analyzing the manifest, but you might need to set the correct value. 

1. Use the instructions areas of the tool designer to provide context about what the previous labelers were asked to do and what the current verifiers need to check.

   You can add new labels for workers to choose from when verifying labels. For example, you can ask workers to verify the image quality, and provide the labels *Clear* and *Blurry*. Workers will also have the option to add a comment to explain their selection. 

1. Choose **See preview** to check that the tool is displaying the prior labels correctly and presents the label verification task clearly.

1. Select **Create**. This will create and start your labeling job.

## Create a point cloud or video frame label verification job (console)
<a name="sms-data-verify-start-console-frame"></a>

Use the following procedure to create a 3D point cloud or video frame verification job using the console. This procedure assumes that you have already created a labeling job using the task type that produces the types of labels you want verified and that its status is Complete.

**To create a point cloud or video frame label verification job:**

1. Open the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) and choose **Labeling jobs**.

1. Start a new labeling job by [chaining](sms-reusing-data.md) a prior job or start from scratch, specifying an input manifest that contains labeled data objects.

1. In the **Task type** pane, select the same task type as the labeling job that you chained. For example, if the original labeling job was a video frame object detection keypoint labeling job, select that task type. 

1. Choose **Next**.

1. In the **Workers** section, choose the type of workforce you would like to use. For more details about your workforce options see [Workforces](sms-workforce-management.md).

1. (Optional) After you've selected your workforce, specify the **Task timeout** and **Task expiration time**.

1. Toggle on the switch next to **Display existing labels**.

1. Select **Verification**.

1. For **Label attribute name**, choose the name from your manifest that corresponds to the labels that you want to display for verification. You will only see label attribute names for labels that match the task type you selected on the previous screen. Ground Truth tries to detect and populate these values by analyzing the manifest, but you might need to set the correct value.

1. Use the instructions areas of the tool designer to provide context about what the previous labelers were asked to do and what the current verifiers need to check. 

   You cannot modify or add new labels. You can remove, modify, and add new label category attributes or frame attributes. It is recommended that you add new label category attributes or frame attributes to the labeling job. Workers can use these attributes to verify individual labels or the entire frame. 

   By default, preexisting label category attributes and frame attributes will not be editable by workers. If you want to make a label category or frame attribute editable, select the **Allow workers to edit this attribute** check box for that attribute.

   To learn more about label category and frame attributes, see [Worker user interface (UI)](sms-point-cloud-general-information.md#sms-point-cloud-worker-task-ui) for 3D point cloud and [Worker user interface (UI)](sms-video-overview.md#sms-video-worker-task-ui) for video frame. 

1. Choose **See preview** to check that the tool is displaying the prior labels correctly and presents the label verification task clearly.

1. Select **Create**. This will create and start your labeling job.

# Create a label adjustment job (console)
<a name="sms-data-adjust-start-console"></a>

Use one of the following sections to create a label adjustment job for your task type.

**Topics**
+ [Create an image label adjustment job (console)](#sms-data-adjust-start-console-bb-ss)
+ [Create a point cloud or video frame label adjustment job (console)](#sms-data-adjust-start-console-frame)

## Create an image label adjustment job (console)
<a name="sms-data-adjust-start-console-bb-ss"></a>

Use the following procedure to create a bounding box or semantic segmentation adjustment labeling job using the console. This procedure assumes that you have already created a bounding box or semantic segmentation labeling job and its status is Complete. This is the labeling job that produces the labels you want adjusted.

**To create an image label adjustment job (console)**

1. Open the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) and choose **Labeling jobs**.

1. Start a new labeling job by [chaining](sms-reusing-data.md) a prior job or start from scratch, specifying an input manifest that contains labeled data objects.

1. Choose the same task type as the original labeling job.

1. Choose **Next**.

1. In the **Workers** section, choose the type of workforce you would like to use. For more details about your workforce options see [Workforces](sms-workforce-management.md).

1. (Optional) After you've selected your workforce, specify the **Task timeout** and **Task expiration time**.

1. Expand **Existing-labels display options** by selecting the arrow next to the title.

1. Check the box next to **I want to display existing labels from the dataset for this job**.

1. For **Label attribute name**, choose the name from your manifest that corresponds to the labels that you want to display for adjustment. You will only see label attribute names for labels that match the task type you selected on the previous screen. Ground Truth tries to detect and populate these values by analyzing the manifest, but you might need to set the correct value.

1. Use the instructions areas of the tool designer to provide context about what the previous labelers were tasked with doing and what the current verifiers need to check and adjust.

1. Choose **See preview** to check that the tool shows the prior labels correctly and presents the task clearly.

1. Select **Create**. This will create and start your labeling job.

## Create a point cloud or video frame label adjustment job (console)
<a name="sms-data-adjust-start-console-frame"></a>

Use the following procedure to create a 3D point cloud or video frame adjustment job using the console. This procedure assumes that you have already created a labeling job using the task type that produces the types of labels you want adjusted and that its status is Complete.

**To create a 3D point cloud or video frame label adjustment job (console)**

1. Open the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) and choose **Labeling jobs**.

1. Start a new labeling job by [chaining](sms-reusing-data.md) a prior job or start from scratch, specifying an input manifest that contains labeled data objects.

1. Choose the same task type as the original labeling job.

1. Toggle on the switch next to **Display existing labels**.

1. Select **Adjustment**.

1. For **Label attribute name**, choose the name from your manifest that corresponds to the labels that you want to display for adjustment. You will only see label attribute names for labels that match the task type you selected on the previous screen. Ground Truth tries to detect and populate these values by analyzing the manifest, but you might need to set the correct value.

1. Use the instructions areas of the tool designer to provide context about what the previous labelers were asked to do and what the current adjusters need to check. 

   You cannot remove or modify existing labels but you can add new labels. You can remove, modify and add new label category attributes or frame attributes.

   By default, preexisting label category attributes and frame attributes will be editable by workers. If you want to make a label category or frame attribute uneditable, deselect the **Allow workers to edit this attribute** check box for that attribute.

   To learn more about label category and frame attributes, see [Worker user interface (UI)](sms-point-cloud-general-information.md#sms-point-cloud-worker-task-ui) for 3D point cloud and [Worker user interface (UI)](sms-video-overview.md#sms-video-worker-task-ui) for video frame. 

1. Choose **See preview** to check that the tool shows the prior labels correctly and presents the task clearly.

1. Select **Create**. This will create and start your labeling job.

# Start a label verification or adjustment job (API)
<a name="sms-data-verify-start-api"></a>

Start a label verification or adjustment job by chaining a successfully completed job or starting a new job from scratch using the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. The procedure is almost the same as setting up a new labeling job with `CreateLabelingJob`, with a few modifications. Use the following sections to learn what modifications are required to chain a labeling job to create an adjustment or verification labeling job. 

When you create an adjustment or verification labeling job using the Ground Truth API, you *must* use a different `LabelAttributeName` than the original labeling job. The original labeling job is the job used to create the labels you want adjusted or verified. 
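The naming rule can be sketched as a small helper that derives a distinct label attribute name for the chained job. The `-adjusted` suffix here is only a convention for this example, not something the API requires:

```
def adjustment_request_fields(original_attribute, job_name):
    # The chained job must not reuse the original job's
    # LabelAttributeName, so derive a new, distinct name.
    new_attribute = original_attribute + "-adjusted"
    return {
        "LabelingJobName": job_name,
        "LabelAttributeName": new_attribute,
    }
```

You would merge these fields into the rest of your `CreateLabelingJob` request.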

**Important**  
The label category configuration file you identify for an adjustment or verification job in [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelCategoryConfigS3Uri](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelCategoryConfigS3Uri) of `CreateLabelingJob` must contain the same labels used in the original labeling job. You can add new labels. For 3D point cloud and video frame jobs, you can add new label category and frame attributes to the label category configuration file.

## Bounding Box and Semantic Segmentation
<a name="sms-data-verify-start-api-image"></a>

To create a bounding box or semantic segmentation label verification or adjustment job, use the following guidelines to specify API attributes for the `CreateLabelingJob` operation. 
+ Use the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelAttributeName) parameter to specify the output label name that you want to use for verified or adjusted labels. You must use a different `LabelAttributeName` than the one used for the original labeling job.
+ If you are chaining the job, the labels from the previous labeling job to be adjusted or verified will be specified in the custom UI template. To learn how to create a custom template, see [Create Custom Worker Task Templates](a2i-custom-templates.md).

  Identify the location of the UI template in the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html) parameter. SageMaker AI provides widgets that you can use in your custom template to display old labels. Use the `initial-value` attribute in one of the following crowd elements to extract the labels that need verification or adjustment and include them in your task template:
  + [crowd-semantic-segmentation](sms-ui-template-crowd-semantic-segmentation.md)—Use this crowd element in your custom UI task template to specify semantic segmentation labels that need to be verified or adjusted.
  + [crowd-bounding-box](sms-ui-template-crowd-bounding-box.md)—Use this crowd element in your custom UI task template to specify bounding box labels that need to be verified or adjusted.
+ The [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelCategoryConfigS3Uri](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelCategoryConfigS3Uri) parameter must contain the same label categories as the previous labeling job.
+ Use the bounding box or semantic segmentation adjustment or verification lambda ARNs for [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) and [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn):
  + For bounding box, the adjustment labeling job lambda function ARNs end with `AdjustmentBoundingBox` and the verification lambda function ARNs end with `VerificationBoundingBox`.
  + For semantic segmentation, the adjustment labeling job lambda function ARNs end with `AdjustmentSemanticSegmentation` and the verification lambda function ARNs end with `VerificationSemanticSegmentation`.
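For illustration, a custom adjustment template might surface the prior boxes through `initial-value` as follows. This is a sketch: `taskObject`, `labels`, and `previousAnnotations` are placeholder names that your pre-annotation Lambda would supply under `task.input`, and each item in `previousAnnotations` needs `label`, `left`, `top`, `width`, and `height` keys:

```
<crowd-form>
  <crowd-bounding-box
    name="adjustedBoxes"
    src="{{ task.input.taskObject | grant_read_access }}"
    header="Adjust the boxes so that each one tightly encloses its object"
    labels="{{ task.input.labels | to_json }}"
    initial-value="{{ task.input.previousAnnotations | to_json }}"
  >
    <full-instructions header="Instructions">
      Adjust each box, or confirm that it is already correct.
    </full-instructions>
  </crowd-bounding-box>
</crowd-form>
```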

## 3D point cloud and video frame
<a name="sms-data-verify-start-api-frame"></a>
+ Use the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelAttributeName) parameter to specify the output label name that you want to use for verified or adjusted labels. You must use a different `LabelAttributeName` than the one used for the original labeling job. 
+ You must use the human task UI Amazon Resource Name (ARN) (`HumanTaskUiArn`) used for the original labeling job. To see supported ARNs, see [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html#sagemaker-Type-UiConfig-HumanTaskUiArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html#sagemaker-Type-UiConfig-HumanTaskUiArn).
+ In the label category configuration file, use the `auditLabelAttributeName` parameter to specify the label attribute name ([https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelAttributeName)) of the previous labeling job whose labels you want adjusted or verified.
+ You specify whether your labeling job is a *verification* or *adjustment* labeling job using the `editsAllowed` parameter in your label category configuration file identified by the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelCategoryConfigS3Uri](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#SageMaker-CreateLabelingJob-request-LabelCategoryConfigS3Uri) parameter. 
  + For *verification* labeling jobs, you must use the `editsAllowed` parameter to specify that labels cannot be modified: set `editsAllowed` to `"none"` in each entry in `labels`. Optionally, you can specify whether or not label category attributes and frame attributes can be adjusted by workers. 
  + Optionally, for *adjustment* labeling jobs, you can use the `editsAllowed` parameter to specify labels, label category attributes, and frame attributes that can or cannot be modified by workers. If you do not use this parameter, all labels, label category attributes, and frame attributes will be adjustable.

  To learn more about the `editsAllowed` parameter and configuring your label category configuration file, see [Label category configuration file schema](sms-label-cat-config-attributes.md#sms-label-cat-config-attributes-schema). 
+ Use the 3D point cloud or video frame adjustment lambda ARNs for [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn) and [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html#sagemaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn) for both adjustment and verification labeling jobs:
  + For 3D point clouds, the adjustment and verification labeling job lambda function ARNs end with `Adjustment3DPointCloudSemanticSegmentation`, `Adjustment3DPointCloudObjectTracking`, and `Adjustment3DPointCloudObjectDetection` for 3D point cloud semantic segmentation, object tracking, and object detection, respectively. 
  + For video frames, the adjustment and verification labeling job lambda function ARNs end with `AdjustmentVideoObjectDetection` and `AdjustmentVideoObjectTracking` for video frame object detection and object tracking respectively. 
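For example, a label category configuration file for a 3D point cloud or video frame *verification* job might look like the following sketch. The labels, frame attribute, and the `previous-labels` name are placeholders; every label sets `editsAllowed` to `"none"`, while a new frame attribute that workers fill in remains editable:

```
{
  "document-version": "2020-03-01",
  "auditLabelAttributeName": "previous-labels",
  "labels": [
    { "label": "Car", "editsAllowed": "none" },
    { "label": "Pedestrian", "editsAllowed": "none" }
  ],
  "frameAttributes": [
    {
      "name": "label-quality",
      "type": "string",
      "enum": ["Good", "Needs adjustment"],
      "description": "Rate the quality of the labels in this frame.",
      "editsAllowed": "any"
    }
  ]
}
```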

Ground Truth stores the output data from a label verification or adjustment job in the S3 bucket that you specified in the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobOutputConfig.html#SageMaker-Type-LabelingJobOutputConfig-S3OutputPath](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobOutputConfig.html#SageMaker-Type-LabelingJobOutputConfig-S3OutputPath) parameter of the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. For more information about the output data from a label verification or adjustment labeling job, see [Label verification and adjustment data in the output manifest](sms-data-verify-manifest.md).

# Label verification and adjustment data in the output manifest
<a name="sms-data-verify-manifest"></a>

Amazon SageMaker Ground Truth writes label verification data to the output manifest within the metadata for the label. It adds two properties to the metadata:
+ A `type` property, with a value of `groundtruth/label-verification`.
+ A `worker-feedback` property, with an array of `comment` values. This property is added when the worker enters comments. If there are no comments, the field doesn't appear.

The following example output manifest shows how label verification data appears:

```
{
  "source-ref":"S3 bucket location", 
  "verify-bounding-box":"1",    
  "verify-bounding-box-metadata":  
  {
    "class-name": "bad", 
    "confidence": 0.93, 
    "type": "groundtruth/label-verification", 
    "job-name": "verify-bounding-boxes",
    "human-annotated": "yes",
    "creation-date": "2018-10-18T22:18:13.527256",
    "worker-feedback": [
      {"comment": "The bounding box on the bird is too wide on the right side."},
      {"comment": "The bird on the upper right is not labeled."}
    ]
  }
}
```

The worker output of adjustment tasks resembles the worker output of the original task, except that it contains the adjusted values and an `adjustment-status` property with the value of `adjusted` or `unadjusted` to indicate whether an adjustment was made.
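For example, the metadata for an adjusted bounding box label might carry the flag as follows (the attribute name `adjusted-bounding-box` and the values are illustrative):

```
"adjusted-bounding-box-metadata": {
  "type": "groundtruth/object-detection",
  "job-name": "adjust-bounding-boxes",
  "adjustment-status": "adjusted",
  "human-annotated": "yes",
  "creation-date": "2018-11-20T22:18:13.527256"
}
```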

For more examples of the output of different tasks, see [Labeling job output data](sms-data-output.md).

# Custom labeling workflows
<a name="sms-custom-templates"></a>

These topics help you set up a Ground Truth labeling job that uses a custom labeling template. A custom labeling template allows you to create a custom worker portal UI that workers will use to label data. Templates can be created using HTML, CSS, JavaScript, [Liquid template language](https://shopify.github.io/liquid/), and [Crowd HTML Elements](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-ui-template-reference.html).

## Overview
<a name="sms-custom-templates-overview"></a>

If this is your first time creating a custom labeling workflow in Ground Truth, the following list is a high-level summary of the steps required.

1. *Set up your workforce* – To create a custom labeling workflow, you need a workforce. This topic describes how to configure one.

1. *Creating a custom template* – To create a custom template, you must map the data from your input manifest file correctly to the variables in your template.

1. *Using optional processing Lambda functions* – Use pre-annotation and post-annotation Lambda functions to control how data from your input manifest is added to your worker template and how worker annotations are recorded in your job's output file.

This topic also has three end-to-end demos to help you better understand how to use custom labeling templates.

**Note**  
The examples in the links below all include pre-annotation and post-annotation Lambda functions. These Lambda functions are optional.
+ [Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md)
+ [Demo Template: Labeling Intents with `crowd-classifier`](sms-custom-templates-step2-demo2.md)
+ [Build a custom data labeling workflow with Amazon SageMaker Ground Truth](https://aws.amazon.com/blogs/machine-learning/build-a-custom-data-labeling-workflow-with-amazon-sagemaker-ground-truth/)

**Topics**
+ [Overview](#sms-custom-templates-overview)
+ [Set up your workforce](sms-custom-templates-step1.md)
+ [Creating a custom worker task template](sms-custom-templates-step2.md)
+ [Adding automation with Liquid](sms-custom-templates-step2-automate.md)
+ [Processing data in a custom labeling workflow with AWS Lambda](sms-custom-templates-step3.md)
+ [Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md)
+ [Demo Template: Labeling Intents with `crowd-classifier`](sms-custom-templates-step2-demo2.md)
+ [Create a custom workflow using the API](sms-custom-templates-step4.md)

# Set up your workforce
<a name="sms-custom-templates-step1"></a>

In this step you use the console to establish which worker type to use and make the necessary sub-selections for the worker type. It assumes you have already completed the steps up to this point in the [Getting started: Create a bounding box labeling job with Ground Truth](sms-getting-started.md) section and have chosen the **Custom labeling task** as the **Task type**.

**To configure your workforce**

1. First, choose an option from the **Worker types**. There are three types currently available:
   + **Public** uses an on-demand workforce of independent contractors, powered by Amazon Mechanical Turk. They are paid on a per-task basis.
   + **Private** uses your employees or contractors for handling data that needs to stay within your organization.
   + **Vendor** uses third-party vendors that specialize in providing data labeling services, available through the AWS Marketplace.

1. If you choose the **Public** option, you are asked to set the **number of workers per dataset object**. Having more than one worker perform the same task on the same object can help increase the accuracy of your results. The default is three. You can raise or lower that depending on the accuracy you need.

   You are also asked to set a **price per task** by using a drop-down menu. The menu recommends price points based on how long it will take to complete the task.

   The recommended method to determine this is to first run a short test of your task with a **private** workforce. The test provides a realistic estimate of how long the task takes to complete. You can then select the range your estimate falls within on the **Price per task** menu. If your average time is more than 5 minutes, consider breaking your task into smaller units.
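The two public-workforce settings above multiply: total cost is roughly the number of dataset objects times the workers per object times the price per task. A rough Python estimator (the default figures are hypothetical examples, not quoted Mechanical Turk rates):

```python
def estimate_labeling_cost(num_objects, workers_per_object=3, price_per_task=0.036):
    """Rough public-workforce budget estimate.

    Each object is labeled by several workers and each worker task is
    billed separately, so the total scales with all three inputs.
    The default price is an illustrative placeholder.
    """
    return num_objects * workers_per_object * price_per_task
```

For example, 1,000 objects at 3 workers each and an illustrative $0.036 per task comes to about $108, which makes it easy to see why lowering the workers-per-object count is the quickest way to cut cost.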

## Next
<a name="templates-step1-next"></a>

[Creating a custom worker task template](sms-custom-templates-step2.md)

# Creating a custom worker task template
<a name="sms-custom-templates-step2"></a>

To create a custom labeling job, you need to update the worker task template, map the input data from your manifest file to the variables used in the template, and map the output data to Amazon S3. To learn more about advanced features that use Liquid automation, see [Adding automation with Liquid](sms-custom-templates-step2-automate.md).

The following sections describe each of the required steps.

## Worker task template
<a name="sms-custom-templates-step2-template"></a>

A *worker task template* is a file used by Ground Truth to customize the worker user interface (UI). You can create a worker task template using HTML, CSS, JavaScript, [Liquid template language](https://shopify.github.io/liquid/), and [Crowd HTML Elements](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-ui-template-reference.html). Liquid is used to automate the template. Crowd HTML Elements are used to include common annotation tools and provide the logic to submit to Ground Truth.

Use the following topics to learn how you can create a worker task template. You can see a repository of example Ground Truth worker task templates on [GitHub](https://github.com/aws-samples/amazon-sagemaker-ground-truth-task-uis).

### Using the base worker task template in the SageMaker AI console
<a name="sms-custom-templates-step2-base"></a>

You can use a template editor in the Ground Truth console to start creating a template. This editor includes a number of pre-designed base templates. It supports autofill for HTML and Crowd HTML Element code.

**To access the Ground Truth custom template editor:**

1. Follow the instructions in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md).

1. Then select **Custom** for the labeling job **Task type**.

1. Choose **Next**, and then you can access the template editor and base templates in the **Custom labeling task setup** section. 

1. (Optional) Select a base template from the drop-down menu under **Templates**. If you prefer to create a template from scratch, choose **Custom** from the drop down-menu for a minimal template skeleton.

Use the following section to learn how to visualize a template developed in the console locally.

#### Visualizing your worker task templates locally
<a name="sms-custom-template-step2-UI-local"></a>

You must use the console to test how your template processes incoming data. To test the look and feel of your template's HTML and custom elements you can use your browser.

**Note**  
Variables will not be parsed. You may need to replace them with sample content while viewing your content locally.

The following example code snippet loads the necessary code to render the custom HTML elements. Use this if you want to develop your template's look and feel in your preferred editor rather than in the console.

**Example**  

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
```
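Since the browser doesn't parse Liquid, one way to preview a template locally is to substitute sample content for the `{{ task.input.* }}` variables before opening the file. The following Python sketch does a simple text substitution; the variable names and sample values are whatever your own template uses, and any filters after the variable name are dropped, so this is a look-and-feel preview only:

```python
import re

def render_preview(template, sample_values):
    """Replace {{ task.input.<name> }} variables with sample content so a
    worker task template can be opened directly in a browser.

    Filters such as "| to_json" are discarded; this does not implement
    Liquid, it only substitutes placeholder text for local viewing.
    """
    def substitute(match):
        name = match.group(1)
        return str(sample_values.get(name, ""))
    return re.sub(r"\{\{\s*task\.input\.(\w+)[^}]*\}\}", substitute, template)
```

Write the result to an `.html` file alongside the `crowd-html-elements.js` script tag shown above, and open it in your browser.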

### Creating a simple HTML task sample
<a name="sms-custom-templates-step2-sample"></a>

Now that you have the base worker task template, you can use this topic to create a simple HTML-based task template.

The following is an example entry from an input manifest file.

```
{
  "source": "This train is really late.",
  "labels": [ "angry" , "sad", "happy" , "inconclusive" ],
  "header": "What emotion is the speaker feeling?"
}
```

In the HTML task template, you need to map the variables from the input manifest file to the template. The variables from the example input manifest are mapped using the syntax **task.input.source**, **task.input.labels**, and **task.input.header**.

The following is a simple example HTML worker task template for tweet-analysis. All tasks begin and end with the `<crowd-form> </crowd-form>` elements. Like standard HTML `<form>` elements, all of your form code should go between them. Ground Truth generates the workers' tasks directly from the context specified in the template, unless you implement a pre-annotation Lambda. The `taskInput` object returned by Ground Truth or [Pre-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-prelambda) is the `task.input` object in your templates.

For a simple tweet-analysis task, use the `<crowd-classifier>` element. It requires the following attributes:
+ *name* - The name of your output variable. Worker annotations are saved to this variable name in your output manifest.
+ *categories* - A JSON-formatted array of the possible answers.
+ *header* - A title for the annotation tool.

The `<crowd-classifier>` element requires at least the three following child elements.
+ `<classification-target>` - The text the worker will classify based on the options specified in the `categories` attribute above.
+ `<full-instructions>` - Instructions that are available from the "View full instructions" link in the tool. This can be left blank, but it is recommended that you give good instructions to get better results.
+ `<short-instructions>` - A brief description of the task that appears in the tool's sidebar. This can be left blank, but it is recommended that you give good instructions to get better results.

A simple version of this tool would look like the following. The variable `{{ task.input.source }}` specifies the source data from your input manifest file. The variable `{{ task.input.labels | to_json }}` is an example of a variable filter that turns the array into a JSON representation. The `categories` attribute must be JSON.

**Example of using `crowd-classifier` with the sample input manifest json**  

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-classifier
    name="tweetFeeling"
    categories='{{ task.input.labels | to_json }}'
    header='{{ task.input.header }}'
  >
     <classification-target>
       {{ task.input.source }}
     </classification-target>

    <full-instructions header="Sentiment Analysis Instructions">
      Try to determine the sentiment the author
      of the tweet is trying to express.
      If none seem to match, choose "cannot determine."
    </full-instructions>

    <short-instructions>
      Pick the term that best describes the sentiment of the tweet.
    </short-instructions>

  </crowd-classifier>
</crowd-form>
```

You can copy and paste the code into the editor in the Ground Truth labeling job creation workflow to preview the tool, or try out a [demo of this code on CodePen.](https://codepen.io/MTGT/full/OqBvJw)


## Input data, external assets and your task template
<a name="sms-custom-templates-step2-template-input"></a>

The following sections describe the use of external assets, input data format requirements, and when to consider using pre-annotation Lambda functions.

### Input data format requirements
<a name="sms-custom-template-input-manifest"></a>

When you create an input manifest file to use in your custom Ground Truth labeling job, you must store the data in Amazon S3. The input manifest file must be saved in the same AWS Region in which your custom Ground Truth labeling job runs. It can be stored in any Amazon S3 bucket that is accessible to the IAM service role that you use to run your custom labeling job in Ground Truth.

Input manifest files must use the newline-delimited JSON (JSON Lines) format. Each line is delimited by a standard line break, `\n` or `\r\n`. Each line must also be a valid JSON object. 

Furthermore, each JSON object in the manifest file must contain one of the following keys: `source-ref` or `source`. The value of the keys are interpreted as follows:
+ `source-ref` – The source of the object is the Amazon S3 object specified in the value. Use this value when the object is a binary object, such as an image.
+ `source` – The source of the object is the value. Use this value when the object is a text value.

To learn more about formatting your input manifest files, see [Input manifest files](sms-input-data-input-manifest.md).
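The format rules above can be checked before you upload a manifest: every line must parse as a standalone JSON object that carries either `source` or `source-ref`. A small Python validation sketch:

```python
import json

def validate_manifest_lines(lines):
    """Check candidate manifest lines against Ground Truth's input rules:
    each line must be valid JSON and must contain a source or source-ref
    key. Returns a list of (line_number, problem) tuples; empty means OK.
    """
    errors = []
    for i, line in enumerate(lines, start=1):
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            errors.append((i, "not valid JSON"))
            continue
        if not isinstance(obj, dict) or not ({"source", "source-ref"} & obj.keys()):
            errors.append((i, "missing source or source-ref"))
    return errors
```

Running this over a file's lines before uploading to Amazon S3 catches the two most common manifest mistakes: malformed JSON and a missing identifying key.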

### Pre-annotation Lambda function
<a name="sms-custom-template-input-lambda"></a>

You can optionally specify a *pre-annotation Lambda* function to manage how data from your input manifest file is handled prior to labeling. If you have specified the `isHumanAnnotationRequired` key-value pair, you must use a pre-annotation Lambda function. When Ground Truth sends the pre-annotation Lambda function a JSON-formatted request, it uses the following schemas.

**Example data object identified with the `source-ref` key-value pair**  

```
{
  "version": "2018-10-16",
  "labelingJobArn": "arn:aws:sagemaker:us-west-2:555555555555:labeling-job/my-labeling-job",
  "dataObject" : {
    "source-ref": "s3://input-data-bucket/data-object-file-name"
  }
}
```

**Example data object identified with the `source` key-value pair**  

```
{
  "version": "2018-10-16",
  "labelingJobArn": "arn:aws:sagemaker:us-west-2:555555555555:labeling-job/my-labeling-job",
  "dataObject" : {
    "source": "Sue purchased 10 shares of the stock on April 10th, 2020"
  }
}
```

The following is an example of the expected response from the Lambda function when `isHumanAnnotationRequired` is set.

```
{
  "taskInput": {
    "source": "This train is really late.",
    "labels": [ "angry" , "sad" , "happy" , "inconclusive" ],
    "header": "What emotion is the speaker feeling?"
  },
  "isHumanAnnotationRequired": false
}
```

### Using External Assets
<a name="sms-custom-template-step2-UI-external"></a>

Amazon SageMaker Ground Truth custom templates allow external scripts and style sheets to be embedded. For example, the following code block demonstrates how you would add a style sheet located at `https://www.example.com/my-enhancement-styles.css` to your template.

**Example**  

```
<script src="https://www.example.com/my-enhancement-script.js"></script>
<link rel="stylesheet" type="text/css" href="https://www.example.com/my-enhancement-styles.css">
```

If you encounter errors, ensure that your originating server is sending the correct MIME type and encoding headers with the assets.

For example, the MIME and encoding types for remote scripts are: `application/javascript;CHARSET=UTF-8`.

The MIME and encoding type for remote stylesheets are: `text/css;CHARSET=UTF-8`.
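A quick way to check the header requirement is to compare the `Content-Type` value your server returns against the expected MIME type and charset. The following Python sketch validates a header string against the pairs listed above (how you fetch the header, e.g. with `curl -I` or an HTTP client, is up to you):

```python
# Expected MIME types per asset kind, taken from the pairs listed above.
EXPECTED_TYPES = {
    "script": "application/javascript",
    "stylesheet": "text/css",
}

def asset_header_ok(content_type, kind):
    """Return True if a Content-Type header carries the MIME type and
    UTF-8 charset expected for a remote script or stylesheet."""
    mime, _, charset = content_type.partition(";")
    return (mime.strip().lower() == EXPECTED_TYPES[kind]
            and charset.strip().lower() == "charset=utf-8")
```

The comparison is case-insensitive, since header values such as `CHARSET=UTF-8` and `charset=utf-8` are equivalent.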

## Output data and your task template
<a name="sms-custom-templates-step2-template-output"></a>

The following sections describe the output data from a custom labeling job, and when to consider using a post-annotation Lambda function.

### Output data
<a name="sms-custom-templates-data"></a>

When your custom labeling job is finished, the data is saved in the Amazon S3 bucket specified when the labeling job was created. The data is saved in an `output.manifest` file.

**Note**  
*labelAttributeName* is a placeholder variable. In your output it is either the name of your labeling job, or the label attribute name you specify when you create the labeling job.
+ `source` or `source-ref` – Either the string or an S3 URI workers were asked to label. 
+ `labelAttributeName` – A dictionary containing consolidated label content from the [post-annotation Lambda function](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-postlambda). If a post-annotation Lambda function is not specified, this dictionary will be empty.
+ `labelAttributeName-metadata` – Metadata from your custom labeling job added by Ground Truth. 
+ `worker-response-ref` – The S3 URI of the bucket where the data is saved. If a post-annotation Lambda function is specified, this key-value pair is not present.

In this example, the JSON object is formatted for readability; in the actual output file, the JSON object is on a single line.

```
{
  "source" : "This train is really late.",
  "labelAttributeName" : {},
  "labelAttributeName-metadata": { # These key-value pairs are added by Ground Truth
    "job_name": "test-labeling-job",
    "type": "groundTruth/custom",
    "human-annotated": "yes",
    "creation_date": "2021-03-08T23:06:49.111000",
    "worker-response-ref": "s3://amzn-s3-demo-bucket/test-labeling-job/annotations/worker-response/iteration-1/0/2021-03-08_23:06:49.json"
  }
}
```
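The `worker-response-ref` URI inside the metadata points at the raw worker answers in Amazon S3. A small Python sketch that collects those URIs from output manifest lines (the label attribute name is an assumption you supply, per the note above):

```python
import json

def worker_response_uris(manifest_lines, label_attribute_name):
    """Pull worker-response-ref S3 URIs out of output-manifest records
    so the raw worker answers can be fetched later, for example with an
    S3 client of your choice."""
    uris = []
    for line in manifest_lines:
        record = json.loads(line)
        metadata = record.get(label_attribute_name + "-metadata", {})
        ref = metadata.get("worker-response-ref")
        if ref:
            uris.append(ref)
    return uris
```

Records without the reference (for example, from jobs that used a post-annotation Lambda) are skipped rather than treated as errors.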

### Using a post annotation Lambda to consolidate the results from your workers
<a name="sms-custom-templates-consolidation"></a>

By default, Ground Truth saves worker responses unprocessed in Amazon S3. To have more fine-grained control over how responses are handled, you can specify a *post-annotation Lambda function*. For example, a post-annotation Lambda function could be used to consolidate annotations when multiple workers have labeled the same data object. To learn more about creating post-annotation Lambda functions, see [Post-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-postlambda).

If you want to use a post-annotation Lambda function, it must be specified as part of the [https://docs.aws.amazon.com//sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html](https://docs.aws.amazon.com//sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html) in a `CreateLabelingJob` request.

To learn more about how annotation consolidation works, see [Annotation consolidation](sms-annotation-consolidation.md).

# Adding automation with Liquid
<a name="sms-custom-templates-step2-automate"></a>

Our custom template system uses [Liquid](https://shopify.github.io/liquid/) for automation. It is an open source inline markup language. In Liquid, the text between single curly braces and percent symbols is an instruction or *tag* that performs an operation like control flow or iteration. Text between double curly braces is a variable or *object* that outputs its value.

The most common use of Liquid will be to parse the data coming from your input manifest file, and pull out the relevant variables to create the task. Ground Truth automatically generates the tasks unless a pre-annotation Lambda is specified. The `taskInput` object returned by Ground Truth or your [Pre-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-prelambda) is the `task.input` object in your templates.

The properties in your input manifest are passed into your template as the `event.dataObject`.

**Example manifest data object**  

```
{
  "source": "This is a sample text for classification",
  "labels": [ "angry" , "sad" , "happy" , "inconclusive" ],
  "header": "What emotion is the speaker feeling?"
}
```

**Example sample HTML using variables**  

```
<crowd-classifier 
  name='tweetFeeling'
  categories='{{ task.input.labels | to_json }}'
  header='{{ task.input.header }}' >
<classification-target>
  {{ task.input.source }}
</classification-target>
```

Note the addition of ` | to_json` to the `labels` property above. That is a filter that turns the input manifest array into a JSON representation of the array. Variable filters are explained in the next section.

The following list includes two types of Liquid tags that you may find useful for automating template input data processing. The links go to the Liquid documentation.
+ [Control flow](https://shopify.github.io/liquid/tags/control-flow/): Includes programming logic operators like `if/else`, `unless`, and `case/when`.
+ [Iteration](https://shopify.github.io/liquid/tags/iteration/): Enables you to run blocks of code repeatedly using statements like for loops. 

  For an example of an HTML template that uses Liquid elements to create a for loop, see [translation-review-and-correction.liquid.html](https://github.com/aws-samples/amazon-sagemaker-ground-truth-task-uis/blob/8ae02533ea5a91087561b1daecd0bc22a37ca393/text/translation-review-and-correction.liquid.html) in GitHub. 

For more information and documentation, visit the [Liquid homepage](https://shopify.github.io/liquid/).

## Variable filters
<a name="sms-custom-templates-step2-automate-filters"></a>

In addition to the standard [Liquid filters](https://shopify.github.io/liquid/filters/abs/) and actions, Ground Truth offers a few additional filters. Filters are applied by placing a pipe (`|`) character after the variable name, then specifying a filter name. Filters can be chained in the form of:

**Example**  

```
{{ <content> | <filter> | <filter> }}
```

### Autoescape and explicit escape
<a name="sms-custom-templates-step2-automate-filters-autoescape"></a>

By default, inputs will be HTML escaped to prevent confusion between your variable text and HTML. You can explicitly add the `escape` filter to make it more obvious to someone reading the source of your template that the escaping is being done.

### escape_once
<a name="sms-custom-templates-step2-automate-escapeonce"></a>

`escape_once` ensures that if you've already escaped your code, it doesn't get re-escaped on top of that. For example, so that `&amp;` doesn't become `&amp;amp;`.

### skip_autoescape
<a name="sms-custom-templates-step2-automate-skipautoescape"></a>

`skip_autoescape` is useful when your content is meant to be used as HTML. For example, you might have a few paragraphs of text and some images in the full instructions for a bounding box.

**Use `skip_autoescape` sparingly**  
The best practice in templates is to avoid passing in functional code or markup with `skip_autoescape` unless you are absolutely sure you have strict control over what's being passed. If you're passing user input, you could be opening your workers up to a Cross Site Scripting attack.

### to_json
<a name="sms-custom-templates-step2-automate-tojson"></a>

`to_json` will encode what you feed it to JSON (JavaScript Object Notation). If you feed it an object, it will serialize it.

### grant_read_access
<a name="sms-custom-templates-step2-automate-grantreadaccess"></a>

`grant_read_access` takes an S3 URI and encodes it into an HTTPS URL with a short-lived access token for that resource. This makes it possible to display to workers the photo, audio, or video objects stored in S3 buckets that are not otherwise publicly accessible.

### s3_presign
<a name="sms-custom-templates-step2-automate-s3"></a>

 The `s3_presign` filter works the same way as the `grant_read_access` filter. `s3_presign` takes an Amazon S3 URI and encodes it into an HTTPS URL with a short-lived access token for that resource. This makes it possible to display to workers the photo, audio, or video objects stored in S3 buckets that are not otherwise publicly accessible.
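Both filters take an `s3://` URI as their input. If you ever need the same bucket/key split outside a template, for example to presign the object yourself with an SDK, a small Python parser is enough (this is a helper sketch, not part of Ground Truth):

```python
def parse_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key), the two values a
    typical SDK presigning call needs. Raises ValueError for non-S3 URIs."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError("not an S3 URI: " + uri)
    bucket, _, key = uri[len(prefix):].partition("/")
    return bucket, key
```

For example, `parse_s3_uri("s3://amzn-s3-demo-bucket/myphoto.png")` yields the bucket `amzn-s3-demo-bucket` and the key `myphoto.png`.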

**Example of the variable filters**  
Input  

```
auto-escape: {{ "Have you read 'James & the Giant Peach'?" }}
explicit escape: {{ "Have you read 'James & the Giant Peach'?" | escape }}
explicit escape_once: {{ "Have you read 'James &amp; the Giant Peach'?" | escape_once }}
skip_autoescape: {{ "Have you read 'James & the Giant Peach'?" | skip_autoescape }}
to_json: {{ jsObject | to_json }}                
grant_read_access: {{ "s3://amzn-s3-demo-bucket/myphoto.png" | grant_read_access }}
s3_presign: {{ "s3://amzn-s3-demo-bucket/myphoto.png" | s3_presign }}
```

**Example**  
Output  

```
auto-escape: Have you read &#39;James &amp; the Giant Peach&#39;?
explicit escape: Have you read &#39;James &amp; the Giant Peach&#39;?
explicit escape_once: Have you read &#39;James &amp; the Giant Peach&#39;?
skip_autoescape: Have you read 'James & the Giant Peach'?
to_json: { "point_number": 8, "coords": [ 59, 76 ] }
grant_read_access: https://s3.amazonaws.com/amzn-s3-demo-bucket/myphoto.png?<access token and other params>
s3_presign: https://s3.amazonaws.com/amzn-s3-demo-bucket/myphoto.png?<access token and other params>
```

**Example of an automated classification template.**  
To automate the simple text classification sample, replace the tweet text with a variable.  
The text classification template with automation added follows.  

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-classifier 
    name="tweetFeeling"
    categories="['positive', 'negative', 'neutral', 'cannot determine']"
    header="Which term best describes this tweet?" 
  >
    <classification-target>
       {{ task.input.source }}
    </classification-target>

    <full-instructions header="Analyzing a sentiment">
      Try to determine the feeling the author 
      of the tweet is trying to express. 
      If none seem to match, choose "other."
    </full-instructions>

    <short-instructions>
      Pick the term best describing the sentiment 
      of the tweet. 
    </short-instructions>

  </crowd-classifier>
</crowd-form>
```
The tweet text in the prior sample is now replaced with an object. The `task.input` object uses `source` (or another name you specify in your pre-annotation Lambda) as the property name for the text, and it is inserted directly in the HTML by virtue of being between double curly braces.

# Processing data in a custom labeling workflow with AWS Lambda
<a name="sms-custom-templates-step3"></a>

In this topic, you can learn how to deploy optional [AWS Lambda](https://aws.amazon.com/lambda/) functions when creating a custom labeling workflow. You can specify two types of Lambda functions to use with your custom labeling workflow.
+ *Pre-annotation Lambda*: This function pre-processes each data object sent to your labeling job prior to sending it to workers.
+ *Post-annotation Lambda*: This function processes the results once workers submit a task. If you specify multiple workers per data object, this function may include logic to consolidate annotations.

If you are a new user of Lambda and Ground Truth, we recommend that you use the pages in this section as follows:

1. First, review [Using pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-requirements.md).

1. Then, use the page [Add required permissions to use AWS Lambda with Ground Truth](sms-custom-templates-step3-lambda-permissions.md) to learn about security and permission requirements to use your pre-annotation and post-annotation Lambda functions in a Ground Truth custom labeling job.

1. Next, you need to visit the Lambda console or use Lambda's APIs to create your functions. Use the section [Create Lambda functions using Ground Truth templates](sms-custom-templates-step3-lambda-create.md) to learn how to create Lambda functions.

1. To learn how to test your Lambda functions, see [Test pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-test.md).

1. After you create pre-processing and post-processing Lambda functions, select them from the **Lambda functions** section that comes after the code editor for your custom HTML in the Ground Truth console. To learn how to use these functions in a `CreateLabelingJob` API request, see [Create a Labeling Job (API)](sms-create-labeling-job-api.md).

For a custom labeling workflow tutorial that includes example pre-annotation and post-annotation Lambda functions, see [Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md).

**Topics**
+ [Using pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-requirements.md)
+ [Add required permissions to use AWS Lambda with Ground Truth](sms-custom-templates-step3-lambda-permissions.md)
+ [Create Lambda functions using Ground Truth templates](sms-custom-templates-step3-lambda-create.md)
+ [Test pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-test.md)

# Using pre-annotation and post-annotation Lambda functions
<a name="sms-custom-templates-step3-lambda-requirements"></a>

Use these topics to learn about the syntax of the requests sent to pre-annotation and post-annotation Lambda functions, and the required response syntax that Ground Truth uses in custom labeling workflows.

**Topics**
+ [Pre-annotation Lambda](#sms-custom-templates-step3-prelambda)
+ [Post-annotation Lambda](#sms-custom-templates-step3-postlambda)

## Pre-annotation Lambda
<a name="sms-custom-templates-step3-prelambda"></a>

Before a labeling task is sent to the worker, an optional pre-annotation Lambda function can be invoked.

Ground Truth sends your Lambda function a JSON formatted request to provide details about the labeling job and the data object.

The following are two example JSON-formatted request schemas.

------
#### [ Data object identified with "source-ref" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": <labelingJobArn>,
    "dataObject" : {
        "source-ref": <s3Uri>
    }
}
```

------
#### [ Data object identified with "source" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": <labelingJobArn>,
    "dataObject" : {
        "source": <string>
    }
}
```

------

 The following list describes each parameter in the pre-annotation request schema.
+ `version` (string): This is a version number used internally by Ground Truth.
+ `labelingJobArn` (string): This is the Amazon Resource Name, or ARN, of your labeling job. This ARN can be used to reference the labeling job when using Ground Truth API operations such as `DescribeLabelingJob`.
+ `dataObject` (JSON object): This key contains a single JSON line, either from your input manifest file or sent from Amazon SNS. The JSON line objects in your manifest can be up to 100 kilobytes in size and contain a variety of data. For a basic image annotation job, the `dataObject` JSON may contain just a `source-ref` key identifying the image to be annotated. If the data object (for example, a line of text) is included directly in the input manifest file, the data object is identified with `source`. If you create a verification or adjustment job, this line may contain label data and metadata from the previous labeling job.

The following tabbed examples show pre-annotation requests with concrete values. Each parameter in these example requests is explained in the list above.

------
#### [ Data object identified with "source-ref" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": "arn:aws:sagemaker:us-west-2:111122223333:labeling-job/<labeling_job_name>",
    "dataObject" : {
        "source-ref": "s3://input-data-bucket/data-object-file-name"
    }
}
```

------
#### [ Data object identified with "source" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": "arn:aws:sagemaker:<aws_region>:111122223333:labeling-job/<labeling_job_name>",
    "dataObject" : {
        "source": "Sue purchased 10 shares of the stock on April 10th, 2020"
    }
}
```

------

In return, Ground Truth requires a response formatted like the following:

**Example of expected return data**  

```
{
    "taskInput": <json object>,
    "isHumanAnnotationRequired": <boolean> # Optional
}
```

In the previous example, the `<json object>` needs to contain *all* the data your custom worker task template needs. If you're doing a bounding box task where the instructions stay the same all the time, it may just be the HTTP(S) or Amazon S3 resource for your image file. If it's a sentiment analysis task and different objects may have different choices, it is the object reference as a string and the choices as an array of strings.

**Implications of `isHumanAnnotationRequired`**  
This value is optional because it defaults to `true`. The primary use case for explicitly setting it is when you want to exclude this data object from being labeled by human workers. 

If you have a mix of objects in your manifest, with some requiring human annotation and some not needing it, you can include a `isHumanAnnotationRequired` value in each data object. You can add logic to your pre-annotation Lambda to dynamically determine if an object requires annotation, and set this boolean value accordingly.
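The per-object decision described above can be sketched as a pre-annotation Lambda. In this sketch, `needs-review` is a hypothetical flag you would add to each line of your own input manifest; objects without the flag default to human annotation:

```python
def lambda_handler(event, context):
    """Pre-annotation Lambda sketch that decides per data object whether
    human annotation is required, based on a hypothetical "needs-review"
    flag carried in the input manifest line."""
    data_object = event["dataObject"]
    return {
        # Pass the manifest line through for the worker task template.
        "taskInput": data_object,
        # Objects without the flag default to being labeled by humans.
        "isHumanAnnotationRequired": bool(data_object.get("needs-review", True)),
    }
```

Objects that return `false` here are excluded from the tasks sent to workers, while every object still appears in the job's output.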

### Examples of pre-annotation Lambda functions
<a name="sms-custom-templates-step3-prelambda-example"></a>

The following basic pre-annotation Lambda function accesses the JSON object in `dataObject` from the initial request, and returns it in the `taskInput` parameter.

```
import json

def lambda_handler(event, context):
    return {
        "taskInput":  event['dataObject']
    }
```

Assuming the input manifest file uses `"source-ref"` to identify data objects, the worker task template used in the same labeling job as this pre-annotation Lambda must include a Liquid element like the following to ingest `dataObject`:

```
{{ task.input.source-ref | grant_read_access }}
```

If the input manifest file used `source` to identify the data object, the work task template can ingest `dataObject` with the following:

```
{{ task.input.source }}
```

The following pre-annotation Lambda example includes logic to identify the key used in `dataObject`, and to point to that data object using `taskObject` in the Lambda's return statement.

```
import json

def lambda_handler(event, context):

    # Event received
    print("Received event: " + json.dumps(event, indent=2))

    # Get source if specified
    source = event['dataObject']['source'] if "source" in event['dataObject'] else None

    # Get source-ref if specified
    source_ref = event['dataObject']['source-ref'] if "source-ref" in event['dataObject'] else None

    # if source field present, take that otherwise take source-ref
    task_object = source if source is not None else source_ref

    # Build response object
    output = {
        "taskInput": {
            "taskObject": task_object
        },
        "isHumanAnnotationRequired": "true"
    }

    print(output)
    # If neither source nor source-ref is specified, mark the annotation failed
    if task_object is None:
        print("Failed to pre-process {} !".format(event["labelingJobArn"]))
        output["isHumanAnnotationRequired"] = "false"

    return output
```

## Post-annotation Lambda
<a name="sms-custom-templates-step3-postlambda"></a>

When all workers have annotated the data object or when [`TaskAvailabilityLifetimeInSeconds`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanLoopConfig.html#SageMaker-Type-HumanLoopConfig-TaskAvailabilityLifetimeInSeconds) has been reached, whichever comes first, Ground Truth sends those annotations to your post-annotation Lambda. This Lambda is generally used for [Annotation consolidation](sms-annotation-consolidation.md).

**Note**  
To see an example of a post-consolidation Lambda function, see [`annotation_consolidation_lambda.py`](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe/blob/master/aws_sagemaker_ground_truth_sample_lambda/annotation_consolidation_lambda.py) in the [aws-sagemaker-ground-truth-recipe](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe) GitHub repository.

The following code block contains the post-annotation request schema. Each parameter is described in the following bulleted list.

```
{
    "version": "2018-10-16",
    "labelingJobArn": <string>,
    "labelCategories": [<string>],
    "labelAttributeName": <string>,
    "roleArn" : <string>,
    "payload": {
        "s3Uri": <string>
    }
 }
```
+ `version` (string): A version number used internally by Ground Truth.
+ `labelingJobArn` (string): The Amazon Resource Name, or ARN, of your labeling job. This ARN can be used to reference the labeling job when using Ground Truth API operations such as `DescribeLabelingJob`.
+ `labelCategories` (list of strings): Includes the label categories and other attributes you either specified in the console, or that you include in the label category configuration file.
+ `labelAttributeName` (string): Either the name of your labeling job, or the label attribute name you specify when you create the labeling job.
+ `roleArn` (string): The Amazon Resource Name (ARN) of the IAM execution role you specify when you create the labeling job. 
+ `payload` (JSON object): A JSON object that includes an `s3Uri` key, which identifies the location of the annotation data for that data object in Amazon S3. The second code block below shows an example of this annotation file.

The following code block contains an example of a post-annotation request. Each parameter in this example request is explained below the code block.

**Example of a post-annotation Lambda request**  

```
{
    "version": "2018-10-16",
    "labelingJobArn": "arn:aws:sagemaker:us-west-2:111122223333:labeling-job/labeling-job-name",
    "labelCategories": ["Ex Category1","Ex Category2", "Ex Category3"],
    "labelAttributeName": "labeling-job-attribute-name",
    "roleArn" : "arn:aws:iam::111122223333:role/role-name",
    "payload": {
        "s3Uri": "s3://amzn-s3-demo-bucket/annotations.json"
    }
 }
```

**Note**  
If no worker works on the data object and `TaskAvailabilityLifetimeInSeconds` has been reached, the data object is marked as failed and is not included in the post-annotation Lambda invocation.

The following code block contains the payload schema. This is the file that is indicated by the `s3Uri` parameter in the post-annotation Lambda request `payload` JSON object. For example, if the previous code block is the post-annotation Lambda request, the following annotation file is located at `s3://amzn-s3-demo-bucket/annotations.json`.

Each parameter is described in the following bulleted list.

**Example of an annotation file**  

```
[
    {
        "datasetObjectId": <string>,
        "dataObject": {
            "s3Uri": <string>,
            "content": <string>
        },
        "annotations": [{
            "workerId": <string>,
            "annotationData": {
                "content": <string>,
                "s3Uri": <string>
            }
       }]
    }
]
```
+ `datasetObjectId` (string): A unique ID that Ground Truth assigns to each data object you send to the labeling job.
+ `dataObject` (JSON object): The data object that was labeled. If the data object is included in the input manifest file and identified using the `source` key (for example, a string), `dataObject` includes a `content` key, which identifies the data object. Otherwise, the location of the data object (for example, a link or S3 URI) is identified with `s3Uri`.
+ `annotations` (list of JSON objects): This list contains a single JSON object for each annotation submitted by workers for that `dataObject`. A single JSON object contains a unique `workerId` that can be used to identify the worker that submitted that annotation. The `annotationData` key contains one of the following:
  + `content` (string): Contains the annotation data. 
  + `s3Uri` (string): Contains an S3 URI that identifies the location of the annotation data.
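
As a sketch of how these fields nest, the following Python snippet (illustrative; the helper name is our own) extracts each worker's annotation. Note that inline annotation data in `content` is itself a JSON-encoded string, so it must be parsed a second time.

```python
import json

def extract_annotations(annotation_file_json):
    """Return (datasetObjectId, workerId, annotation) tuples from a payload file."""
    results = []
    for data_object in json.loads(annotation_file_json):
        for annotation in data_object["annotations"]:
            data = annotation["annotationData"]
            # Inline annotation data ("content") is itself a JSON-encoded string.
            # An "s3Uri" entry would instead point to a file you must download
            # first; here we just pass the URI through.
            parsed = json.loads(data["content"]) if "content" in data else data["s3Uri"]
            results.append((data_object["datasetObjectId"], annotation["workerId"], parsed))
    return results
```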

The following table contains examples of the content that you may find in the payload for different types of annotation.

------
#### [ Named Entity Recognition Payload ]

```
[
    {
      "datasetObjectId": "1",
      "dataObject": {
        "content": "Sift 3 cups of flour into the bowl."
      },
      "annotations": [
        {
          "workerId": "private.us-west-2.ef7294f850a3d9d1",
          "annotationData": {
            "content": "{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":4,\"label\":\"verb\",\"startOffset\":0},{\"endOffset\":6,\"label\":\"number\",\"startOffset\":5},{\"endOffset\":20,\"label\":\"object\",\"startOffset\":15},{\"endOffset\":34,\"label\":\"object\",\"startOffset\":30}]}}"
          }
        }
      ]
    }
]
```

------
#### [ Semantic Segmentation Payload ]

```
[
    {
      "datasetObjectId": "2",
      "dataObject": {
        "s3Uri": "s3://amzn-s3-demo-bucket/gt-input-data/images/bird3.jpg"
      },
      "annotations": [
        {
          "workerId": "private.us-west-2.ab1234c5678a919d0",
          "annotationData": {
            "content": "{\"crowd-semantic-segmentation\":{\"inputImageProperties\":{\"height\":2000,\"width\":3020},\"labelMappings\":{\"Bird\":{\"color\":\"#2ca02c\"}},\"labeledImage\":{\"pngImageData\":\"iVBOR...\"}}}"
          }
        }
      ]
    }
  ]
```

------
#### [ Bounding Box Payload ]

```
[
    {
      "datasetObjectId": "0",
      "dataObject": {
        "s3Uri": "s3://amzn-s3-demo-bucket/gt-input-data/images/bird1.jpg"
      },
      "annotations": [
        {
          "workerId": "private.us-west-2.ab1234c5678a919d0",
          "annotationData": {
            "content": "{\"boundingBox\":{\"boundingBoxes\":[{\"height\":2052,\"label\":\"Bird\",\"left\":583,\"top\":302,\"width\":1375}],\"inputImageProperties\":{\"height\":2497,\"width\":3745}}}"
          }
        }
      ]
    }
 ]
```

------

Your post-annotation Lambda function may contain logic similar to the following to loop through and access all annotations contained in the request. For a full example, see [`annotation_consolidation_lambda.py`](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe/blob/master/aws_sagemaker_ground_truth_sample_lambda/annotation_consolidation_lambda.py) in the [aws-sagemaker-ground-truth-recipe](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe) GitHub repository. In this GitHub example, you must add your own annotation consolidation logic. 

```
for i in range(len(annotations)):
    worker_id = annotations[i]["workerId"]
    annotation_content = annotations[i]['annotationData'].get('content')
    annotation_s3_uri = annotations[i]['annotationData'].get('s3Uri')
    annotation = annotation_content if annotation_s3_uri is None else s3_client.get_object_from_s3(
        annotation_s3_uri)
    annotation_from_single_worker = json.loads(annotation)

    print("{} Received Annotations from worker [{}] is [{}]"
            .format(log_prefix, worker_id, annotation_from_single_worker))
```

**Tip**  
When you run consolidation algorithms on the data, you can use an AWS database service to store results, or you can pass the processed results back to Ground Truth. The data you return to Ground Truth is stored in consolidated annotation manifests in the S3 bucket specified for output during the configuration of the labeling job.

In return, Ground Truth requires a response formatted like the following:

**Example of expected return data**  

```
[
   {        
        "datasetObjectId": <string>,
        "consolidatedAnnotation": {
            "content": {
                "<labelattributename>": {
                    # ... label content
                }
            }
        }
    },
   {        
        "datasetObjectId": <string>,
        "consolidatedAnnotation": {
            "content": {
                "<labelattributename>": {
                    # ... label content
                }
            }
        }
    }
    .
    .
    .
]
```
At this point, all the data you're sending to your S3 bucket, other than the `datasetObjectId`, is in the `content` object.
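
Putting this together, a post-annotation handler might build the consolidated list as in the following sketch. The strategy shown (keeping the first worker's annotation for each data object) is deliberately naive and only for illustration; real consolidation logic, and fetching the payload file from `s3Uri`, are omitted.

```python
import json

def consolidate(annotations_payload, label_attribute_name):
    """Build the return structure Ground Truth expects from a parsed payload file.

    Deliberately naive for illustration: the first worker's annotation for
    each data object is kept as the consolidated label. A real job would
    apply its own consolidation logic here.
    """
    consolidated = []
    for data_object in annotations_payload:
        first_annotation = data_object["annotations"][0]
        # Inline annotation data is a JSON-encoded string; parse it.
        label = json.loads(first_annotation["annotationData"]["content"])
        consolidated.append({
            "datasetObjectId": data_object["datasetObjectId"],
            "consolidatedAnnotation": {
                "content": {
                    label_attribute_name: label
                }
            }
        })
    return consolidated
```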

When you return annotations in `content`, this results in an entry in your job's output manifest like the following:

**Example of label format in output manifest**  

```
{  "source-ref"/"source" : "<s3uri or content>", 
   "<labelAttributeName>": {
        # ... label content from you
    },   
   "<labelAttributeName>-metadata": { # This will be added by Ground Truth
        "job_name": <labelingJobName>,
        "type": "groundTruth/custom",
        "human-annotated": "yes", 
        "creation_date": <date> # Timestamp of when received from Post-labeling Lambda
    }
}
```

Because of the potentially complex nature of a custom template and the data it collects, Ground Truth does not offer further processing of the data.

# Add required permissions to use AWS Lambda with Ground Truth
<a name="sms-custom-templates-step3-lambda-permissions"></a>

You may need to configure some or all of the following to create and use AWS Lambda with Ground Truth. 
+ You need to grant an IAM role or user (collectively, an IAM entity) permission to create the pre-annotation and post-annotation Lambda functions using AWS Lambda, and to choose them when creating the labeling job. 
+ The IAM execution role specified when the labeling job is configured needs permission to invoke the pre-annotation and post-annotation Lambda functions. 
+ The post-annotation Lambda functions may need permission to access Amazon S3.

Use the following sections to learn how to create the IAM entities and grant permissions described above.

**Topics**
+ [Grant Permission to Create and Select an AWS Lambda Function](#sms-custom-templates-step3-postlambda-create-perms)
+ [Grant IAM Execution Role Permission to Invoke AWS Lambda Functions](#sms-custom-templates-step3-postlambda-execution-role-perms)
+ [Grant Post-Annotation Lambda Permissions to Access Annotation](#sms-custom-templates-step3-postlambda-perms)

## Grant Permission to Create and Select an AWS Lambda Function
<a name="sms-custom-templates-step3-postlambda-create-perms"></a>

If you do not require granular permissions to develop pre-annotation and post-annotation Lambda functions, you can attach the AWS managed policy `AWSLambda_FullAccess` to a user or role. This policy grants broad permissions to use all Lambda features, as well as permission to perform actions in other AWS services with which Lambda interacts.

To create a more granular policy for security-sensitive use cases, refer to [Identity-based IAM policies for Lambda](https://docs.aws.amazon.com/lambda/latest/dg/access-control-identity-based.html) in the AWS Lambda Developer Guide to learn how to create an IAM policy that fits your use case. 

**Policies to Use the Lambda Console**

If you want to grant an IAM entity permission to use the Lambda console, see [Using the Lambda console](https://docs.aws.amazon.com/lambda/latest/dg/security_iam_id-based-policy-examples.html#security_iam_id-based-policy-examples-console) in the AWS Lambda Developer Guide.

Additionally, if you want the user to be able to access and deploy the Ground Truth starter pre-annotation and post-annotation functions using the AWS Serverless Application Repository in the Lambda console, you must specify the *`<aws-region>`* where you want to deploy the functions (this should be the same AWS Region used to create the labeling job), and add the following policy to the IAM role.

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "serverlessrepo:ListApplicationVersions",
                "serverlessrepo:GetApplication",
                "serverlessrepo:CreateCloudFormationTemplate"
            ],
            "Resource": "arn:aws:serverlessrepo:us-east-1:838997950401:applications/aws-sagemaker-ground-truth-recipe"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "serverlessrepo:SearchApplications",
            "Resource": "*"
        }
    ]
}
```

------

**Policies to See Lambda Functions in the Ground Truth Console**

To grant an IAM entity permission to view Lambda functions in the Ground Truth console when the user is creating a custom labeling job, the entity must have the permissions described in [Grant IAM Permission to Use the Amazon SageMaker Ground Truth Console](sms-security-permission-console-access.md), including the permissions described in the section [Custom Labeling Workflow Permissions](sms-security-permission-console-access.md#sms-security-permissions-custom-workflow).

## Grant IAM Execution Role Permission to Invoke AWS Lambda Functions
<a name="sms-custom-templates-step3-postlambda-execution-role-perms"></a>

If you add the IAM managed policy [AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution) to the IAM execution role used to create the labeling job, this role has permission to list and invoke Lambda functions with one of the following strings in the function name: `GtRecipe`, `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`. 

If the pre-annotation or post-annotation Lambda function names do not include one of the terms in the preceding paragraph, or if you require more granular permission than those in the `AmazonSageMakerGroundTruthExecution` managed policy, you can add a policy similar to the following to give the execution role permission to invoke pre-annotation and post-annotation functions.

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": [
                "arn:aws:lambda:us-east-1:111122223333:function:<pre-annotation-lambda-name>",
                "arn:aws:lambda:us-east-1:111122223333:function:<post-annotation-lambda-name>"
            ]
        }
    ]
}
```

------

## Grant Post-Annotation Lambda Permissions to Access Annotation
<a name="sms-custom-templates-step3-postlambda-perms"></a>

As described in [Post-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-postlambda), the post-annotation Lambda request includes the location of the annotation data in Amazon S3. This location is identified by the `s3Uri` string in the `payload` object. To process the annotations as they come in, even for a simple pass-through function, you need to assign the necessary permissions to the post-annotation [Lambda execution role](https://docs.aws.amazon.com/lambda/latest/dg/lambda-intro-execution-role.html) to read files from Amazon S3.

There are many ways that you can configure your Lambda to access annotation data in Amazon S3. Two common ways are:
+ Allow the Lambda execution role to assume the SageMaker AI execution role identified in `roleArn` in the post-annotation Lambda request. This SageMaker AI execution role is the one used to create the labeling job, and has access to the Amazon S3 output bucket where the annotation data is stored.
+ Grant the Lambda execution role permission to access the Amazon S3 output bucket directly.

Use the following sections to learn how to configure these options. 

**Grant Lambda Permission to Assume SageMaker AI Execution Role**

To allow a Lambda function to assume a SageMaker AI execution role, you must attach a policy to the Lambda function's execution role, and modify the trust relationship of the SageMaker AI execution role to allow Lambda to assume it.

1. [Attach the following IAM policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to your Lambda function's execution role to assume the SageMaker AI execution role identified in `Resource`. Replace `222222222222` with an [AWS account ID](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html). Replace `sm-execution-role` with the name of the assumed role.

------
#### [ JSON ]

****  

   ```
   {
       "Version":"2012-10-17",		 	 	 
       "Statement": {
           "Effect": "Allow",
           "Action": "sts:AssumeRole",
           "Resource": "arn:aws:iam::222222222222:role/sm-execution-role"
       }
   }
   ```

------

1. [Modify the trust policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/roles-managingrole-editing-console.html#roles-managingrole_edit-trust-policy) of the SageMaker AI execution role to include the following `Statement`. Replace `222222222222` with an [AWS account ID](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html). Replace `my-lambda-execution-role` with the name of your Lambda function's execution role.

------
#### [ JSON ]

****  

   ```
   {
       "Version":"2012-10-17",		 	 	 
       "Statement": [
           {
               "Effect": "Allow",
               "Principal": {
                   "AWS": "arn:aws:iam::222222222222:role/my-lambda-execution-role"
               },
               "Action": "sts:AssumeRole"
           }
       ]
   }
   ```

------
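
With both policies in place, the post-annotation Lambda can assume the SageMaker AI execution role at runtime and read the payload file with the temporary credentials. The following is a sketch; the function names and session name are our own, and error handling is omitted.

```python
def split_s3_uri(s3_uri):
    """Split "s3://bucket/key/path" into (bucket, key)."""
    bucket, _, key = s3_uri.replace("s3://", "", 1).partition("/")
    return bucket, key

def read_payload_with_assumed_role(role_arn, s3_uri):
    """Assume the SageMaker AI execution role, then fetch the payload file from S3."""
    import boto3  # AWS SDK for Python; available in the Lambda runtime

    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName="gt-post-annotation",  # hypothetical session name
    )["Credentials"]
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    bucket, key = split_s3_uri(s3_uri)
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
```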

**Grant Lambda Execution Role Permission to Access S3**

You can add a policy similar to the following to the post-annotation Lambda function execution role to give it S3 read permissions. Replace *amzn-s3-demo-bucket* with the name of the output bucket you specify when you create a labeling job.

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
        }
    ]
}
```

------

To add S3 read permissions to a Lambda execution role in the Lambda console, use the following procedure. 

**Add S3 read permissions to post-annotation Lambda:**

1. Open the [**Functions** page](https://console.aws.amazon.com/lambda/home#/functions) in the Lambda console.

1. Choose the name of the post-annotation function.

1. Choose **Configuration** and then choose **Permissions**.

1. Select the **Role name**. The summary page for that role opens in the IAM console in a new tab. 

1. Select **Attach policies**.

1. Do one of the following:
   + Search for and select **`AmazonS3ReadOnlyAccess`** to give the function permission to read all buckets and objects in the account. 
   + If you require more granular permissions, select **Create policy** and use the policy example in the preceding section to create a policy. Note that you must navigate back to the execution role summary page after you create the policy.

1. If you used the `AmazonS3ReadOnlyAccess` managed policy, select **Attach policy**. 

   If you created a new policy, navigate back to the Lambda execution role summary page and attach the policy you just created.

# Create Lambda functions using Ground Truth templates
<a name="sms-custom-templates-step3-lambda-create"></a>

You can create a Lambda function using the Lambda console, the AWS CLI, or an AWS SDK in a supported programming language of your choice. Use the AWS Lambda Developer Guide to learn more about each of these options:
+ To learn how to create a Lambda function using the console, see [Create a Lambda function with the console](https://docs.aws.amazon.com/lambda/latest/dg/getting-started-create-function.html).
+ To learn how to create a Lambda function using the AWS CLI, see [Using AWS Lambda with the AWS Command Line Interface](https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-awscli.html).
+ Select the relevant section in the table of contents to learn more about working with Lambda in the language of your choice. For example, select [Working with Python](https://docs.aws.amazon.com/lambda/latest/dg/lambda-python.html) to learn more about using Lambda with the AWS SDK for Python (Boto3).

Ground Truth provides pre-annotation and post-annotation templates through an AWS Serverless Application Repository (SAR) *recipe*. Use the following procedure to select the Ground Truth recipe in the Lambda console.

**Use the Ground Truth SAR recipe to create pre-annotation and post-annotation Lambda functions:**

1. Open the [**Functions** page](https://console.aws.amazon.com/lambda/home#/functions) on the Lambda console.

1. Select **Create function**.

1. Select **Browse serverless app repository**.

1. In the search text box, enter **aws-sagemaker-ground-truth-recipe** and select that app.

1. Select **Deploy**. The app may take a couple of minutes to deploy. 

   Once the app deploys, two functions appear in the **Functions** section of the Lambda console: `serverlessrepo-aws-sagema-GtRecipePreHumanTaskFunc-<id>` and `serverlessrepo-aws-sagema-GtRecipeAnnotationConsol-<id>`. 

1. Select one of these functions and add your custom logic in the **Code** section.

1. When you are finished making changes, select **Deploy** to deploy them.

# Test pre-annotation and post-annotation Lambda functions
<a name="sms-custom-templates-step3-lambda-test"></a>

You can test your pre-annotation and post-annotation Lambda functions in the Lambda console. If you are a new user of Lambda, you can learn how to test, or *invoke*, your Lambda functions in the console by following the [Create a Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/getting-started-create-function.html#gettingstarted-zip-function) tutorial in the AWS Lambda Developer Guide. You can use the sections on this page to learn how to test the Ground Truth pre-annotation and post-annotation templates provided through the AWS Serverless Application Repository (SAR). 

**Topics**
+ [Prerequisites](#sms-custom-templates-step3-lambda-test-pre)
+ [Test the Pre-annotation Lambda Function](#sms-custom-templates-step3-lambda-test-pre-annotation)
+ [Test the Post-Annotation Lambda Function](#sms-custom-templates-step3-lambda-test-post-annotation)

## Prerequisites
<a name="sms-custom-templates-step3-lambda-test-pre"></a>

You must do the following to use the tests described on this page.
+ You need access to the Lambda console, and you need permission to create and invoke Lambda functions. To learn how to set up these permissions, see [Grant Permission to Create and Select an AWS Lambda Function](sms-custom-templates-step3-lambda-permissions.md#sms-custom-templates-step3-postlambda-create-perms).
+ If you have not deployed the Ground Truth SAR recipe, use the procedure in [Create Lambda functions using Ground Truth templates](sms-custom-templates-step3-lambda-create.md) to do so.
+ To test the post-annotation Lambda function, you must have a data file in Amazon S3 with sample annotation data. For a simple test, you can copy and paste the following code into a file and save it as `sample-annotations.json` and [upload this file to Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html). Note the S3 URI of this file—you need this information to configure the post-annotation Lambda test.

  ```
  [{"datasetObjectId":"0","dataObject":{"content":"To train a machine learning model, you need a large, high-quality, labeled dataset. Ground Truth helps you build high-quality training datasets for your machine learning models."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":8,\"label\":\"verb\",\"startOffset\":3},{\"endOffset\":27,\"label\":\"adjective\",\"startOffset\":11},{\"endOffset\":33,\"label\":\"object\",\"startOffset\":28},{\"endOffset\":51,\"label\":\"adjective\",\"startOffset\":46},{\"endOffset\":65,\"label\":\"adjective\",\"startOffset\":53},{\"endOffset\":74,\"label\":\"adjective\",\"startOffset\":67},{\"endOffset\":82,\"label\":\"adjective\",\"startOffset\":75},{\"endOffset\":102,\"label\":\"verb\",\"startOffset\":97},{\"endOffset\":112,\"label\":\"verb\",\"startOffset\":107},{\"endOffset\":125,\"label\":\"adjective\",\"startOffset\":113},{\"endOffset\":134,\"label\":\"adjective\",\"startOffset\":126},{\"endOffset\":143,\"label\":\"object\",\"startOffset\":135},{\"endOffset\":169,\"label\":\"adjective\",\"startOffset\":153},{\"endOffset\":176,\"label\":\"object\",\"startOffset\":170}]}}"}}]},{"datasetObjectId":"1","dataObject":{"content":"Sift 3 cups of flour into the bowl."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":4,\"label\":\"verb\",\"startOffset\":0},{\"endOffset\":6,\"label\":\"number\",\"startOffset\":5},{\"endOffset\":20,\"label\":\"object\",\"startOffset\":15},{\"endOffset\":34,\"label\":\"object\",\"startOffset\":30}]}}"}}]},{"datasetObjectId":"2","dataObject":{"content":"Jen purchased 10 shares of the stock on January 1st, 2020."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":3,\"label\":\"person\",\"startOffset\":0},{\"endOffset\":13,\"label\":\"verb\",\"startOffset\":4},{\"endOffset\":16,\"label\":\"number\",\"startOffset\":14},{\"endOffset\":57,\"label\":\"date\",\"startOffset\":40}]}}"}}]},{"datasetObjectId":"3","dataObject":{"content":"The narrative was interesting, however the character development was weak."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":29,\"label\":\"adjective\",\"startOffset\":18},{\"endOffset\":73,\"label\":\"adjective\",\"startOffset\":69}]}}"}}]}]
  ```
+ You must use the directions in [Grant Post-Annotation Lambda Permissions to Access Annotation](sms-custom-templates-step3-lambda-permissions.md#sms-custom-templates-step3-postlambda-perms) to give your post-annotation Lambda function's execution role permission to assume the SageMaker AI execution role you use to create the labeling job. The post-annotation Lambda function uses the SageMaker AI execution role to access the annotation data file, `sample-annotations.json`, in S3.



## Test the Pre-annotation Lambda Function
<a name="sms-custom-templates-step3-lambda-test-pre-annotation"></a>

Use the following procedure to test the pre-annotation Lambda function created when you deployed the Ground Truth AWS Serverless Application Repository (SAR) recipe. 

**Test the Ground Truth SAR recipe pre-annotation Lambda function**

1. Open the [**Functions** page](https://console.aws.amazon.com/lambda/home#/functions) in the Lambda console.

1. Select the pre-annotation function that was deployed from the Ground Truth SAR recipe. The name of this function is similar to `serverlessrepo-aws-sagema-GtRecipePreHumanTaskFunc-<id>`.

1. In the **Code source** section, select the arrow next to **Test**.

1. Select **Configure test event**.

1. Keep the **Create new test event** option selected.

1. Under **Event template**, select **SageMaker Ground Truth PreHumanTask**. 

1. Give your test an **Event name**.

1. Select **Create**.

1. Select the arrow next to **Test** again. The test you created should be selected, indicated by a dot next to the event name. If it is not selected, select it. 

1. Select **Test** to run the test. 

After you run the test, you can see the **Execution results**. In the **Function logs**, you should see a response similar to the following:

```
START RequestId: cd117d38-8365-4e1a-bffb-0dcd631a878f Version: $LATEST
Received event: {
  "version": "2018-10-16",
  "labelingJobArn": "arn:aws:sagemaker:us-east-2:123456789012:labeling-job/example-job",
  "dataObject": {
    "source-ref": "s3://sagemakerexample/object_to_annotate.jpg"
  }
}
{'taskInput': {'taskObject': 's3://sagemakerexample/object_to_annotate.jpg'}, 'isHumanAnnotationRequired': 'true'}
END RequestId: cd117d38-8365-4e1a-bffb-0dcd631a878f
REPORT RequestId: cd117d38-8365-4e1a-bffb-0dcd631a878f	Duration: 0.42 ms	Billed Duration: 1 ms	Memory Size: 128 MB	Max Memory Used: 43 MB
```

In this response, we can see the Lambda function's output matches the required pre-annotation response syntax:

```
{'taskInput': {'taskObject': 's3://sagemakerexample/object_to_annotate.jpg'}, 'isHumanAnnotationRequired': 'true'}
```

## Test the Post-Annotation Lambda Function
<a name="sms-custom-templates-step3-lambda-test-post-annotation"></a>

Use the following procedure to test the post-annotation Lambda function created when you deployed the Ground Truth AWS Serverless Application Repository (SAR) recipe. 

**Test the Ground Truth SAR recipe post-annotation Lambda**

1. Open the [**Functions** page](https://console.aws.amazon.com/lambda/home#/functions) in the Lambda console.

1. Select the post-annotation function that was deployed from the Ground Truth SAR recipe. The name of this function is similar to `serverlessrepo-aws-sagema-GtRecipeAnnotationConsol-<id>`.

1. In the **Code source** section, select the arrow next to **Test**.

1. Select **Configure test event**.

1. Keep the **Create new test event** option selected.

1. Under **Event template**, select **SageMaker Ground Truth AnnotationConsolidation**.

1. Give your test an **Event name**.

1. Modify the template code provided as follows:
   + Replace the Amazon Resource Name (ARN) in `roleArn` with the ARN of the SageMaker AI execution role you used to create the labeling job.
   + Replace the S3 URI in `s3Uri` with the URI of the `sample-annotations.json` file you added to Amazon S3.

   After you make these modifications, your test should look similar to the following:

   ```
   {
     "version": "2018-10-16",
     "labelingJobArn": "arn:aws:sagemaker:us-east-2:123456789012:labeling-job/example-job",
     "labelAttributeName": "example-attribute",
     "roleArn": "arn:aws:iam::222222222222:role/sm-execution-role",
     "payload": {
       "s3Uri": "s3://your-bucket/sample-annotations.json"
     }
   }
   ```

1. Select **Create**.

1. Select the arrow next to **Test** again. The test that you created should be selected, as indicated by a dot next to the event name. If it is not selected, select it. 

1. Select **Test** to run the test. 

After you run the test, you should see a `-- Consolidated Output --` section in the **Function Logs**, which contains a list of all annotations included in `sample-annotations.json`.

# Demo template: Annotation of images with `crowd-bounding-box`
<a name="sms-custom-templates-step2-demo1"></a>

When you choose to use a custom template as your task type in the Amazon SageMaker Ground Truth console, you reach the **Custom labeling task panel**. There you can choose from multiple base templates. The templates represent some of the most common tasks and provide a sample to work from as you create your customized labeling task's template. If you are not using the console, or as an additional resource, see [Amazon SageMaker AI Ground Truth Sample Task UIs ](https://github.com/aws-samples/amazon-sagemaker-ground-truth-task-uis) for a repository of demo templates for a variety of labeling job task types.

This demonstration works with the **BoundingBox** template. The demonstration also works with the AWS Lambda functions needed for processing your data before and after the task. In the GitHub repository above, templates that work with AWS Lambda functions contain `{{ task.input.<property name> }}`.

**Topics**
+ [Starter Bounding Box custom template](#sms-custom-templates-step2-demo1-base-template)
+ [Your own Bounding Box custom template](#sms-custom-templates-step2-demo1-your-own-template)
+ [Your manifest file](#sms-custom-templates-step2-demo1-manifest)
+ [Your pre-annotation Lambda function](#sms-custom-templates-step2-demo1-pre-annotation)
+ [Your post-annotation Lambda function](#sms-custom-templates-step2-demo1-post-annotation)
+ [The output of your labeling job](#sms-custom-templates-step2-demo1-job-output)

## Starter Bounding Box custom template
<a name="sms-custom-templates-step2-demo1-base-template"></a>

This is the starter bounding box template that is provided.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

<crowd-form>
  <crowd-bounding-box
    name="boundingBox"
    src="{{ task.input.taskObject | grant_read_access }}"
    header="{{ task.input.header }}"
    labels="{{ task.input.labels | to_json | escape }}"
  >

    <!-- The <full-instructions> tag is where you will define the full instructions of your task. -->
    <full-instructions header="Bounding Box Instructions" >
      <p>Use the bounding box tool to draw boxes around the requested target of interest:</p>
      <ol>
        <li>Draw a rectangle using your mouse over each instance of the target.</li>
        <li>Make sure the box does not cut into the target, leave a 2 - 3 pixel margin</li>
        <li>
          When targets are overlapping, draw a box around each object,
          include all contiguous parts of the target in the box.
          Do not include parts that are completely overlapped by another object.
        </li>
        <li>
          Do not include parts of the target that cannot be seen,
          even though you think you can interpolate the whole shape of the target.
        </li>
        <li>Avoid shadows, they're not considered as a part of the target.</li>
        <li>If the target goes off the screen, label up to the edge of the image.</li>
      </ol>
    </full-instructions>

    <!-- The <short-instructions> tag allows you to specify instructions that are displayed in the left hand side of the task interface.
    It is a best practice to provide good and bad examples in this section for quick reference. -->
    <short-instructions>
      Use the bounding box tool to draw boxes around the requested target of interest.
    </short-instructions>
  </crowd-bounding-box>
</crowd-form>
```

The custom templates use the [Liquid template language](https://shopify.github.io/liquid/), and each of the items between double curly braces is a variable. The pre-annotation AWS Lambda function should provide an object named `taskInput` and that object's properties can be accessed as `{{ task.input.<property name> }}` in your template.

## Your own Bounding Box custom template
<a name="sms-custom-templates-step2-demo1-your-own-template"></a>

As an example, assume you have a large collection of animal photos in which you know the kind of animal in an image from a prior image-classification job. Now you want to have a bounding box drawn around it.

In the starter sample, there are three variables: `taskObject`, `header`, and `labels`.

Each of these is used in a different part of the bounding box task.
+ `taskObject` is an HTTP(S) URL or S3 URI for the photo to be annotated. The added `| grant_read_access` is a filter that will convert an S3 URI to an HTTPS URL with short-lived access to that resource. If you're using an HTTP(S) URL, it's not needed.
+ `header` is the text above the photo to be labeled, something like "Draw a box around the bird in the photo."
+ `labels` is an array, represented as `['item1', 'item2', ...]`. These are labels that can be assigned by the worker to the different boxes they draw. You can have one or many.

Each of the variable names comes from the JSON object in the response from your pre-annotation Lambda. The names above are merely suggestions; use whatever variable names make sense to you and will promote code readability among your team.
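
As an illustration, a pre-annotation Lambda that supplies all three variables could be sketched like the following. The header text and label list are hypothetical values, and the manifest is assumed to provide a `source-ref` URI in each data object:

```python
def lambda_handler(event, context):
    data_object = event["dataObject"]  # one line from your manifest
    return {
        "taskInput": {
            "taskObject": data_object["source-ref"],
            "header": "Draw a box around the bird in the photo.",
            "labels": ["bird", "not bird"],
        }
    }
```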

**Only use variables when necessary**  
If a field will not change, you can remove that variable from the template and replace it with that text; otherwise, you have to repeat that text as a value in each object in your manifest or code it into your pre-annotation Lambda function.

**Example : Final Customized Bounding Box Template**  
To keep things simple, this template will have one variable, one label, and very basic instructions. Assuming your manifest has an "animal" property in each data object, that value can be re-used in two parts of the template.  

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-bounding-box
    name="boundingBox"
    labels="[ '{{ task.input.animal }}' ]"
    src="{{ task.input.source-ref | grant_read_access }}"
    header="Draw a box around the {{ task.input.animal }}."
  >
    <full-instructions header="Bounding Box Instructions" >
      <p>Draw a bounding box around the {{ task.input.animal }} in the image. If 
      there is more than one {{ task.input.animal }} per image, draw a bounding 
      box around the largest one.</p>
      <p>The box should be tight around the {{ task.input.animal }} with 
      no more than a couple of pixels of buffer around the 
      edges.</p>
      <p>If the image does not contain a {{ task.input.animal }}, check the <strong>
      Nothing to label</strong> box.</p>
    </full-instructions>
    <short-instructions>
      <p>Draw a bounding box around the {{ task.input.animal }} in each image. If 
      there is more than one {{ task.input.animal }} per image, draw a bounding 
      box around the largest one.</p>
    </short-instructions>
  </crowd-bounding-box>
</crowd-form>
```
Note the re-use of `{{ task.input.animal }}` throughout the template. If your manifest had all of the animal names beginning with a capital letter, you could use `{{ task.input.animal | downcase }}`, one of Liquid's built-in filters, in sentences where the value needs to appear lowercase.

## Your manifest file
<a name="sms-custom-templates-step2-demo1-manifest"></a>

Your manifest file should provide the variable values you're using in your template. You can do some transformation of your manifest data in your pre-annotation Lambda, but if you don't need to, you maintain a lower risk of errors and your Lambda will run faster. Here's a sample manifest file for the template.

```
{"source-ref": "<S3 image URI>", "animal": "horse"}
{"source-ref": "<S3 image URI>", "animal" : "bird"}
{"source-ref": "<S3 image URI>", "animal" : "dog"}
{"source-ref": "<S3 image URI>", "animal" : "cat"}
```
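
If you assemble the manifest programmatically, remember that the file is JSON Lines: one complete JSON object per line, with no enclosing array. A sketch with placeholder S3 keys:

```python
import json

records = [
    {"source-ref": "s3://example-bucket/images/0001.jpg", "animal": "horse"},
    {"source-ref": "s3://example-bucket/images/0002.jpg", "animal": "bird"},
]

with open("dataset.manifest", "w") as manifest:
    for record in records:
        # Each data object occupies exactly one line.
        manifest.write(json.dumps(record) + "\n")
```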

## Your pre-annotation Lambda function
<a name="sms-custom-templates-step2-demo1-pre-annotation"></a>

As part of the job set-up, provide the ARN of an AWS Lambda function that can be called to process your manifest entries and pass them to the template engine.

**Naming your Lambda function**  
The best practice in naming your function is to use one of the following four strings as part of the function name: `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`. This applies to both your pre-annotation and post-annotation functions.

When you're using the console, if you have AWS Lambda functions that are owned by your account, a drop-down list of functions meeting the naming requirements is provided so you can choose one.

In this very basic example, you're just passing through the information from the manifest without doing any additional processing on it. This sample pre-annotation function is written for Python 3.7.

```
import json

def lambda_handler(event, context):
    return {
        "taskInput": event['dataObject']
    }
```

The JSON object from your manifest will be provided as a child of the `event` object. The properties inside the `taskInput` object will be available as variables to your template, so simply setting the value of `taskInput` to `event['dataObject']` will pass all the values from your manifest object to your template without having to copy them individually. If you wish to send more values to the template, you can add them to the `taskInput` object.
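
For example, to expose a value computed at runtime alongside the manifest properties, add it to the `taskInput` object before returning. The `taskIssuedAt` property here is hypothetical:

```python
import datetime

def lambda_handler(event, context):
    task_input = dict(event["dataObject"])  # copy every manifest property
    # The added value becomes {{ task.input.taskIssuedAt }} in the template.
    task_input["taskIssuedAt"] = datetime.datetime.utcnow().isoformat()
    return {"taskInput": task_input}
```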

## Your post-annotation Lambda function
<a name="sms-custom-templates-step2-demo1-post-annotation"></a>

As part of the job set-up, provide the ARN of an AWS Lambda function that can be called to process the form data when a worker completes a task. This can be as simple or complex as you want. If you want to do answer consolidation and scoring as it comes in, you can apply the scoring and/or consolidation algorithms of your choice. If you want to store the raw data for offline processing, that is an option.

**Provide permissions to your post-annotation Lambda**  
The annotation data will be in a file designated by the `s3Uri` string in the `payload` object. To process the annotations as they come in, even for a simple pass through function, you need to assign `S3ReadOnly` access to your Lambda so it can read the annotation files.  
In the Console page for creating your Lambda, scroll to the **Execution role** panel. Select **Create a new role from one or more templates**. Give the role a name. From the **Policy templates** drop-down, choose **Amazon S3 object read-only permissions**. Save the Lambda and the role will be saved and selected.

The following sample is written for Python 3.

```
import json
import boto3
from urllib.parse import urlparse

def lambda_handler(event, context):
    consolidated_labels = []

    parsed_url = urlparse(event['payload']['s3Uri'])
    s3 = boto3.client('s3')
    textFile = s3.get_object(Bucket=parsed_url.netloc, Key=parsed_url.path[1:])
    filecont = textFile['Body'].read()
    annotations = json.loads(filecont)
    
    for dataset in annotations:
        for annotation in dataset['annotations']:
            new_annotation = json.loads(annotation['annotationData']['content'])
            label = {
                'datasetObjectId': dataset['datasetObjectId'],
                'consolidatedAnnotation' : {
                'content': {
                    event['labelAttributeName']: {
                        'workerId': annotation['workerId'],
                        'boxesInfo': new_annotation,
                        'imageSource': dataset['dataObject']
                        }
                    }
                }
            }
            consolidated_labels.append(label)
    
    return consolidated_labels
```

The post-annotation Lambda will often receive batches of task results in the event object. That batch will be the `payload` object the Lambda should iterate through. What you send back will be an object meeting the [API contract](sms-custom-templates-step3.md).

## The output of your labeling job
<a name="sms-custom-templates-step2-demo1-job-output"></a>

You'll find the output of the job in a folder named after your labeling job in the target S3 bucket you specified. It will be in a subfolder named `manifests`.

For a bounding box task, the output you find in the output manifest will look similar to the example below. The example has been cleaned up for printing. The actual output will be a single line per record.

**Example : JSON in your output manifest**  

```
{
  "source-ref":"<URL>",
  "<label attribute name>":
    {
       "workerId":"<URL>",
       "imageSource":"<image URL>",
       "boxesInfo":"{\"boundingBox\":{\"boundingBoxes\":[{\"height\":878, \"label\":\"bird\", \"left\":208, \"top\":6, \"width\":809}], \"inputImageProperties\":{\"height\":924, \"width\":1280}}}"},
  "<label attribute name>-metadata":
    {
      "type":"groundTruth/custom",
      "job_name":"<Labeling job name>",
      "human-annotated":"yes"
    },
  "animal" : "bird"
}
```
Note how the additional `animal` attribute from your original manifest is passed to the output manifest on the same level as the `source-ref` and labeling data. Any properties from your input manifest, whether they were used in your template or not, will be passed to the output manifest.
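
Because `boxesInfo` holds a JSON-encoded string rather than a nested object, reading the annotations takes two parses. A sketch using an abbreviated record (the `my-label` attribute name is illustrative):

```python
import json

# One output manifest line, shortened for readability.
record = json.loads(
    '{"source-ref": "s3://bucket/img.jpg", '
    '"my-label": {"workerId": "w-1", '
    '"boxesInfo": "{\\"boundingBox\\": {\\"boundingBoxes\\": '
    '[{\\"height\\": 878, \\"label\\": \\"bird\\"}]}}"}, '
    '"animal": "bird"}'
)

# The first parse gave the record; the nested string needs a second parse.
boxes_info = json.loads(record["my-label"]["boxesInfo"])
boxes = boxes_info["boundingBox"]["boundingBoxes"]
```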

# Demo Template: Labeling Intents with `crowd-classifier`
<a name="sms-custom-templates-step2-demo2"></a>

If you choose a custom template, you'll reach the **Custom labeling task panel**. There you can select from multiple starter templates that represent some of the more common tasks. The templates provide a starting point to work from in building your customized labeling task's template.

In this demonstration, you work with the **Intent Detection** template, which uses the `crowd-classifier` element, and the AWS Lambda functions needed for processing your data before and after the task.

**Topics**
+ [Starter Intent Detection custom template](#sms-custom-templates-step2-demo2-base-template)
+ [Your Intent Detection custom template](#sms-custom-templates-step2-demo2-your-template)
+ [Your pre-annotation Lambda function](#sms-custom-templates-step2-demo2-pre-lambda)
+ [Your post-annotation Lambda function](#sms-custom-templates-step2-demo2-post-lambda)
+ [Your labeling job output](#sms-custom-templates-step2-demo2-job-output)

## Starter Intent Detection custom template
<a name="sms-custom-templates-step2-demo2-base-template"></a>

This is the intent detection template that is provided as a starting point.

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

<crowd-form>
  <crowd-classifier
    name="intent"
    categories="{{ task.input.labels | to_json | escape }}"
    header="Pick the most relevant intention expressed by the below text"
  >
    <classification-target>
      {{ task.input.utterance }}
    </classification-target>
    
    <full-instructions header="Intent Detection Instructions">
        <p>Select the most relevant intention expressed by the text.</p>
        <div>
           <p><strong>Example: </strong>I would like to return a pair of shoes</p>
           <p><strong>Intent: </strong>Return</p>
        </div>
    </full-instructions>

    <short-instructions>
      Pick the most relevant intention expressed by the text
    </short-instructions>
  </crowd-classifier>
</crowd-form>
```

The custom templates use the [Liquid template language](https://shopify.github.io/liquid/), and each of the items between double curly braces is a variable. The pre-annotation AWS Lambda function should provide an object named `taskInput` and that object's properties can be accessed as `{{ task.input.<property name> }}` in your template.

## Your Intent Detection custom template
<a name="sms-custom-templates-step2-demo2-your-template"></a>

In the starter template, there are two variables: the `task.input.labels` property in the `crowd-classifier` element opening tag and the `task.input.utterance` in the `classification-target` region's content.

Unless you need to offer different sets of labels with different utterances, avoiding a variable and just using text saves processing time and creates less possibility of error. The template used in this demonstration removes that variable, but variables and filters like `to_json` are explained in more detail in the [`crowd-bounding-box` demonstration](sms-custom-templates-step2-demo1.md) article.

### Styling Your Elements
<a name="sms-custom-templates-step2-demo2-instructions"></a>

Two parts of these custom elements that sometimes get overlooked are the `<full-instructions>` and `<short-instructions>` regions. Good instructions generate good results.

In the elements that include these regions, the `<short-instructions>` appear automatically in the "Instructions" pane on the left of the worker's screen. The `<full-instructions>` are linked from the "View full instructions" link near the top of that pane. Clicking the link opens a modal pane with more detailed instructions.

Not only can you use HTML, CSS, and JavaScript in these sections; you are encouraged to do so if you believe you can provide a strong set of instructions and examples that will help workers complete your tasks with greater speed and accuracy. 

**Example: Try out a sample with JSFiddle**  
Try out an [example `<crowd-classifier>` task](https://jsfiddle.net/MTGT_Fiddle_Manager/bjc0y1vd/35/). The example is rendered by JSFiddle, so all the template variables are replaced with hard-coded values. Click the "View full instructions" link to see a set of examples with extended CSS styling. You can fork the project to experiment with your own changes to the CSS, add sample images, or add extended JavaScript functionality.

**Example : Final Customized Intent Detection Template**  
This uses the [example `<crowd-classifier>` task](https://jsfiddle.net/MTGT_Fiddle_Manager/bjc0y1vd/35/), but with a variable for the `<classification-target>`. If you are trying to keep a consistent CSS design among a series of different labeling jobs, you can include an external stylesheet using a `<link rel...>` element the same way you'd do in any other HTML document.  

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

<crowd-form>
  <crowd-classifier
    name="intent"
    categories="['buy', 'eat', 'watch', 'browse', 'leave']"
    header="Pick the most relevant intent expressed by the text below"
  >
    <classification-target>
      {{ task.input.source }}
    </classification-target>
    
    <full-instructions header="Emotion Classification Instructions">
      <p>In the statements and questions provided in this exercise, what category of action is the speaker interested in doing?</p>
          <table>
            <tr>
              <th>Example Utterance</th>
              <th>Good Choice</th>
            </tr>
            <tr>
              <td>When is the Seahawks game on?</td>
              <td>
                eat<br>
                <greenbg>watch</greenbg>
                <botchoice>browse</botchoice>
              </td>
            </tr>
            <tr>
              <th>Example Utterance</th>
              <th>Bad Choice</th>
            </tr>
            <tr>
              <td>When is the Seahawks game on?</td>
              <td>
                buy<br>
                <greenbg>eat</greenbg>
                <botchoice>watch</botchoice>
              </td>
            </tr>
          </table>
    </full-instructions>

    <short-instructions>
      What is the speaker expressing they would like to do next?
    </short-instructions>  
  </crowd-classifier>
</crowd-form>
<style>
  greenbg {
    background: #feee23;
    display: block;
  }

  table {
    *border-collapse: collapse; /* IE7 and lower */
    border-spacing: 0; 
  }

  th, tfoot, .fakehead {
    background-color: #8888ee;
    color: #f3f3f3;
    font-weight: 700;
  }

  th, td, tfoot {
      border: 1px solid blue;
  }

  th:first-child {
    border-radius: 6px 0 0 0;
  }

  th:last-child {
    border-radius: 0 6px 0 0;
  }

  th:only-child{
    border-radius: 6px 6px 0 0;
  }

  tfoot:first-child {
    border-radius: 0 0 6px 0;
  }

  tfoot:last-child {
    border-radius: 0 0 0 6px;
  }

  tfoot:only-child{
    border-radius: 6px 6px;
  }

  td {
    padding-left: 15px ;
    padding-right: 15px ;
  }

  botchoice {
    display: block;
    height: 17px;
    width: 490px;
    overflow: hidden;
    position: relative;
    background: #fff;
    padding-bottom: 20px;
  }

  botchoice:after {
    position: absolute;
    bottom: 0;
    left: 0;  
    height: 100%;
    width: 100%;
    content: "";
    background: linear-gradient(to top,
       rgba(255,255,255, 1) 55%, 
       rgba(255,255,255, 0) 100%
    );
    pointer-events: none; /* so the text is still selectable */
  }
</style>
```

**Example : Your manifest file**  
If you are preparing your manifest file manually for a text-classification task like this, have your data formatted in the following manner.  

```
{"source": "Roses are red"}
{"source": "Violets are Blue"}
{"source": "Ground Truth is the best"}
{"source": "And so are you"}
```

This differs from the manifest file used for the "[Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md)" demonstration in that `source-ref` was used as the property name instead of `source`. The use of `source-ref` designates S3 URIs for images or other files that must be converted to HTTPS URLs for the worker. Otherwise, use `source`, as with the text strings above.
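
One way to guard against mixing the two conventions when you build a manifest is to check that each line carries exactly one of the two properties. A small validation sketch (the function name is our own):

```python
import json

def validate_manifest_line(line):
    record = json.loads(line)
    if len({"source", "source-ref"} & set(record)) != 1:
        raise ValueError("each data object needs exactly one of 'source' or 'source-ref'")
    return record

validate_manifest_line('{"source": "Roses are red"}')            # inline text
validate_manifest_line('{"source-ref": "s3://bucket/img.jpg"}')  # S3 URI
```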

## Your pre-annotation Lambda function
<a name="sms-custom-templates-step2-demo2-pre-lambda"></a>

As part of the job set-up, provide the ARN of an AWS Lambda function that can be called to process your manifest entries and pass them to the template engine. 

This Lambda function is required to have one of the following four strings as part of the function name: `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`.

This applies to both your pre-annotation and post-annotation Lambdas.

When you're using the console, if you have Lambda functions that are owned by your account, a drop-down list of functions meeting the naming requirements is provided so you can choose one.

In this very basic sample, where you have only one variable, it's primarily a pass-through function. Here's a sample pre-labeling Lambda using Python 3.7.

```
import json

def lambda_handler(event, context):
    return {
        "taskInput":  event['dataObject']
    }
```

The `dataObject` property of the `event` contains the properties from a data object in your manifest.

In this demonstration, which is a simple pass through, you just pass that straight through as the `taskInput` value. If you add properties with those values to the `event['dataObject']` object, they will be available to your HTML template as Liquid variables with the format `{{ task.input.<property name> }}`.

## Your post-annotation Lambda function
<a name="sms-custom-templates-step2-demo2-post-lambda"></a>

As part of the job set-up, provide the ARN of an AWS Lambda function that can be called to process the form data when a worker completes a task. This can be as simple or complex as you want. If you want to do answer consolidation and scoring as data comes in, you can apply the scoring or consolidation algorithms of your choice. If you want to store the raw data for offline processing, that is an option.

**Set permissions for your post-annotation Lambda function**  
The annotation data will be in a file designated by the `s3Uri` string in the `payload` object. To process the annotations as they come in, even for a simple pass through function, you need to assign `S3ReadOnly` access to your Lambda so it can read the annotation files.  
In the Console page for creating your Lambda, scroll to the **Execution role** panel. Select **Create a new role from one or more templates**. Give the role a name. From the **Policy templates** drop-down, choose **Amazon S3 object read-only permissions**. Save the Lambda and the role will be saved and selected.

The following sample is for Python 3.7.

```
import json
import boto3
from urllib.parse import urlparse

def lambda_handler(event, context):
    consolidated_labels = []

    parsed_url = urlparse(event['payload']['s3Uri'])
    s3 = boto3.client('s3')
    textFile = s3.get_object(Bucket=parsed_url.netloc, Key=parsed_url.path[1:])
    filecont = textFile['Body'].read()
    annotations = json.loads(filecont)
    
    for dataset in annotations:
        for annotation in dataset['annotations']:
            new_annotation = json.loads(annotation['annotationData']['content'])
            label = {
                'datasetObjectId': dataset['datasetObjectId'],
                'consolidatedAnnotation' : {
                'content': {
                    event['labelAttributeName']: {
                        'workerId': annotation['workerId'],
                        'result': new_annotation,
                        'labeledContent': dataset['dataObject']
                        }
                    }
                }
            }
            consolidated_labels.append(label)

    return consolidated_labels
```

## Your labeling job output
<a name="sms-custom-templates-step2-demo2-job-output"></a>

The post-annotation Lambda will often receive batches of task results in the event object. That batch will be the `payload` object the Lambda should iterate through.

You'll find the output of the job in a folder named after your labeling job in the target S3 bucket you specified. It will be in a subfolder named `manifests`.

For an intent detection task, the output in the output manifest will look similar to the example below. The example has been cleaned up and spaced out to be easier for humans to read. The actual output will be more compressed for machine reading.

**Example : JSON in your output manifest**  

```
[
  {
    "datasetObjectId":"<Number representing item's place in the manifest>",
     "consolidatedAnnotation":
     {
       "content":
       {
         "<name of labeling job>":
         {     
           "workerId":"private.us-east-1.XXXXXXXXXXXXXXXXXXXXXX",
           "result":
           {
             "intent":
             {
                 "label":"<label chosen by worker>"
             }
           },
           "labeledContent":
           {
             "content":"<text content that was labeled>"
           }
         }
       }
     }
   },
  {
    "datasetObjectId":"<Number representing item's place in the manifest>",
     "consolidatedAnnotation":
     {
       "content":
       {
         "<name of labeling job>":
         {     
           "workerId":"private.us-east-1.6UDLPKQZHYWJQSCA4MBJBB7FWE",
           "result":
           {
             "intent":
             {
                 "label": "<label chosen by worker>"
             }
           },
           "labeledContent":
           {
             "content": "<text content that was labeled>"
           }
         }
       }
     }
   },
     ...
     ...
     ...
]
```
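
Once you have consolidated output like this, tallying the workers' choices is a matter of walking the array down to `result.intent.label`. A sketch with abbreviated placeholder data shaped like the example above (the `my-job` key stands in for your labeling job name):

```python
import json
from collections import Counter

consolidated = json.loads("""
[
  {"datasetObjectId": "0",
   "consolidatedAnnotation": {"content": {"my-job": {
     "workerId": "private.us-east-1.AAAA",
     "result": {"intent": {"label": "watch"}},
     "labeledContent": {"content": "When is the Seahawks game on?"}}}}},
  {"datasetObjectId": "1",
   "consolidatedAnnotation": {"content": {"my-job": {
     "workerId": "private.us-east-1.BBBB",
     "result": {"intent": {"label": "buy"}},
     "labeledContent": {"content": "I need new shoes"}}}}}
]
""")

labels = Counter(
    item["consolidatedAnnotation"]["content"]["my-job"]["result"]["intent"]["label"]
    for item in consolidated
)
```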

This should help you create and use your own custom template.

# Create a custom workflow using the API
<a name="sms-custom-templates-step4"></a>

When you have created your custom UI template (Step 2) and processing Lambda functions (Step 3), place the template in an Amazon S3 bucket with a file name in the format `<FileName>.liquid.html`. Use the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) action to configure your task. Use the S3 location of your custom template ([Creating a custom worker task template](sms-custom-templates-step2.md)) as the value for the `UiTemplateS3Uri` field in the [UiConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html) object within the [HumanTaskConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html) object.

For the AWS Lambda tasks described in [Processing data in a custom labeling workflow with AWS Lambda](sms-custom-templates-step3.md), the post-annotation function's ARN is used as the value for the `AnnotationConsolidationLambdaArn` field, and the pre-annotation function's ARN as the value for `PreHumanTaskLambdaArn`. 
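
Putting these pieces together, a `CreateLabelingJob` request for a custom workflow might be sketched as follows. Every name, bucket, and ARN below is a placeholder to replace with your own resources:

```python
# Request parameters for CreateLabelingJob; all values are placeholders.
request = {
    "LabelingJobName": "my-custom-labeling-job",
    "LabelAttributeName": "animal-bounding-box",
    "RoleArn": "arn:aws:iam::111122223333:role/sm-execution-role",
    "InputConfig": {
        "DataSource": {
            "S3DataSource": {"ManifestS3Uri": "s3://example-bucket/dataset.manifest"}
        }
    },
    "OutputConfig": {"S3OutputPath": "s3://example-bucket/output/"},
    "HumanTaskConfig": {
        "WorkteamArn": "arn:aws:sagemaker:us-east-2:111122223333:workteam/private-crowd/my-team",
        "UiConfig": {"UiTemplateS3Uri": "s3://example-bucket/templates/my-template.liquid.html"},
        "PreHumanTaskLambdaArn": "arn:aws:lambda:us-east-2:111122223333:function:sagemaker-pre-annotation",
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": "arn:aws:lambda:us-east-2:111122223333:function:sagemaker-post-annotation"
        },
        "TaskTitle": "Draw a box around the animal",
        "TaskDescription": "Draw one bounding box around the animal in each image",
        "NumberOfHumanWorkersPerDataObject": 1,
        "TaskTimeLimitInSeconds": 600,
    },
}

# To submit the job:
# import boto3
# boto3.client("sagemaker").create_labeling_job(**request)
```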

# Create a Labeling Job
<a name="sms-create-labeling-job"></a>

You can create a labeling job in the Amazon SageMaker AI console or by using an AWS SDK in your preferred language to run `CreateLabelingJob`. After a labeling job has been created, you can track worker metrics (for private workforces) and your labeling job status using [CloudWatch](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-monitor-cloud-watch.html).

Before you create a labeling job, we recommend that you review the following pages, as applicable:
+ You can specify your input data using automated data setup in the console, or with an input manifest file in either the console or the `CreateLabelingJob` API. For automated data setup, see [Automate data setup for labeling jobs](sms-console-create-manifest-file.md). To learn how to create an input manifest file, see [Input manifest files](sms-input-data-input-manifest.md).
+ Review labeling job input data quotas: [Input Data Quotas](input-data-limits.md).

After you have chosen your task type, use the topics on this page to learn how to create a labeling job.

If you are a new Ground Truth user, we recommend that you start by walking through the demo in [Getting started: Create a bounding box labeling job with Ground Truth](sms-getting-started.md).

**Important**  
Ground Truth requires all S3 buckets that contain labeling job input image data to have a CORS policy attached. To learn more, see [CORS Requirement for Input Image Data](sms-cors-update.md).

**Topics**
+ [Built-in Task Types](sms-task-types.md)
+ [Create instruction pages](sms-creating-instruction-pages.md)
+ [Create a Labeling Job (Console)](sms-create-labeling-job-console.md)
+ [Create a Labeling Job (API)](sms-create-labeling-job-api.md)
+ [Create a streaming labeling job](sms-streaming-create-job.md)
+ [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md)

# Built-in Task Types
<a name="sms-task-types"></a>

Amazon SageMaker Ground Truth has several built-in task types. Ground Truth provides a worker task template for each built-in task type. Additionally, some built-in task types support [Automate data labeling](sms-automated-labeling.md). The following topics describe each built-in task type and demonstrate the worker task templates that Ground Truth provides in the console. To learn how to create a labeling job in the console using one of these task types, select the task type page.


| Label Images | Label Text | Label Videos and Video Frames | Label 3D Point Clouds | 
| --- | --- | --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html)  | 

**Note**  
Each of the video frame and 3D point cloud task types has an *adjustment* task type that you use to verify and adjust labels from a previous labeling job. Select a video frame or 3D point cloud task type page above to learn how to adjust labels created using that task type. 

# Create instruction pages
<a name="sms-creating-instruction-pages"></a>

Create custom instructions for labeling jobs to improve your workers' accuracy in completing their tasks. You can modify the default instructions that are provided in the console or you can create your own. The instructions are shown to workers on the page where they complete their labeling tasks.

There are two kinds of instructions:
+ *Short instructions*—instructions that are shown on the same webpage where the worker completes their task. These instructions should provide an easy reference to show the worker the correct way to label an object.
+ *Full instructions*—instructions that are shown in a dialog box that overlays the page where the worker completes their task. We recommend that you provide detailed instructions for completing the task, with multiple examples showing edge cases and other difficult situations for labeling objects.

Create instructions in the console when you are creating your labeling job. Start with the existing instructions for the task and use the editor to modify them to suit your labeling job.

**Note**  
After you create your labeling job, it starts automatically and you cannot modify your worker instructions. If you need to change your worker instructions, stop the labeling job that you created, clone it, and modify your worker instructions before creating the new job.   
You can clone a labeling job in the console by selecting the labeling job and then selecting **Clone** in the **Actions** menu.   
To clone a labeling job using the Amazon SageMaker API or your preferred Amazon SageMaker SDK, make a new request to the `CreateLabelingJob` operation with the same specifications as your original job after modifying your worker instructions. 
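With the AWS SDK for Python (Boto3), one way to sketch that clone is to copy fields from the existing job's description into a new request, swapping in a new name and a worker task template that contains your updated instructions. The field handling below is illustrative and assumes a job that uses a worker task template; adjust it for the options your job uses.

```python
def clone_job_request(existing, new_name, new_template_uri):
    """Build a CreateLabelingJob request from a DescribeLabelingJob
    response, with a new name and an updated worker task template."""
    request = {
        "LabelingJobName": new_name,
        "LabelAttributeName": existing["LabelAttributeName"],
        "InputConfig": existing["InputConfig"],
        "OutputConfig": existing["OutputConfig"],
        "RoleArn": existing["RoleArn"],
        "HumanTaskConfig": dict(existing["HumanTaskConfig"]),
    }
    # Point the UI at the template that holds the revised instructions.
    request["HumanTaskConfig"]["UiConfig"] = {"UiTemplateS3Uri": new_template_uri}
    return request

# Usage sketch (requires AWS credentials; names are placeholders):
# import boto3
# client = boto3.client("sagemaker")
# existing = client.describe_labeling_job(LabelingJobName="my-job")
# client.create_labeling_job(**clone_job_request(
#     existing, "my-job-v2",
#     "s3://example-bucket/templates/updated-instructions.liquid.html"))
```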

For 3D point cloud and video frame labeling jobs, you can add worker instructions to your label category configuration file. You can use a single string to create instructions or you can add HTML markup to customize the appearance of your instructions and add images. Make sure that any images you include in your instructions are publicly available, or if your instructions are in Amazon S3, that your workers have read access so that they can view them. For more information about the label category configuration file, see [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md).

## Short Instructions
<a name="sms-creating-quick-instructions"></a>

Short instructions appear on the same web page that workers use to label your data object. For example, the following is the editing page for a bounding box task. The short instructions panel is on the left.

![\[\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms-instructions-10.png)


Keep in mind that a worker will only spend seconds looking at the short instructions. Workers must be able to scan and understand your information quickly. In all cases it should take less time to understand the instructions than it takes to complete the task. Keep these points in mind:
+ Your instructions should be clear and simple.
+ Pictures are better than words. Create a simple illustration of your task that your workers can immediately understand.
+ If you must use words, use short, concise examples.
+ Your short instructions are more important than your full instructions.

The Amazon SageMaker Ground Truth console provides an editor so that you can create your short instructions. Replace the placeholder text and images with instructions for your task. Preview the worker's task page by choosing **Preview**. The preview opens in a new window; be sure to turn off pop-up blocking so that the window is displayed.

## Full Instructions
<a name="sms-creating-full-instructions"></a>

You can provide additional instructions for your workers in a dialog box that overlays the page where workers label your data objects. Use full instructions to explain more complex tasks and to show workers the proper way to label edge cases or other difficult objects.

You can create full instructions using an editor in the Ground Truth console. As with short instructions, keep the following in mind:
+ Workers will want detailed instructions the first few times that they complete your task. Any information that they *must* have should be in the short instructions.
+ Pictures are more important than words.
+ Text should be concise.
+ Full instructions should supplement the short instructions. Don't repeat information that appears in the short instructions.

The Ground Truth console provides an editor so that you can create your full instructions. Replace the placeholder text and images with instructions for your task. Preview the full instruction page by choosing **Preview**. The preview opens in a new window; be sure to turn off pop-up blocking so that the window is displayed.

## Add example images to your instructions
<a name="sms-using-s3-images"></a>

Images provide useful examples for your workers. To add a publicly accessible image to your instructions:
+ Place the cursor where the image should go in the instructions editor.
+ Click the image icon in the editor toolbar.
+ Enter the URL of your image.

If your instruction image in Amazon S3 is not publicly accessible:
+ As the image URL, enter: `{{ 'https://s3.amazonaws.com/your-bucket-name/image-file-name' | grant_read_access }}`.
+ This renders the image URL with a short-lived, one-time access code appended so the worker's browser can display it. A broken image icon is displayed in the instructions editor, but previewing the tool displays the image in the rendered preview.
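For example, a hypothetical image element in an instructions template might look like the following; the bucket and file names are placeholders.

```html
<!-- The grant_read_access filter appends a short-lived access code -->
<img src="{{ 'https://s3.amazonaws.com/example-bucket/bbox-example.png' | grant_read_access }}">
```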

# Create a Labeling Job (Console)
<a name="sms-create-labeling-job-console"></a>

You can use the Amazon SageMaker AI console to create a labeling job for all of the Ground Truth built-in task types and custom labeling workflows. For built-in task types, we recommend that you use this page alongside the [page for your task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html). Each task type page includes specific details on creating a labeling job using that task type. 

You need to provide the following to create a labeling job in the SageMaker AI console: 
+ An input manifest file in Amazon S3. You can place your input dataset in Amazon S3 and automatically generate a manifest file using the Ground Truth console (not supported for 3D point cloud labeling jobs). 

  Alternatively, you can manually create an input manifest file. To learn how, see [Input data](sms-data-input.md).
+ An Amazon S3 bucket to store your output data.
+ An IAM role with permission to access your resources in Amazon S3 and with a SageMaker AI execution policy attached. For a general solution, you can attach the managed policy, AmazonSageMakerFullAccess, to an IAM role and include `sagemaker` in your bucket name. 

  For more granular policies, see [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md). 

  3D point cloud task types have additional security considerations. [Learn more](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-general-information.html#sms-security-permission-3d-point-cloud). 
+ A work team. You create a work team from a workforce made up of Amazon Mechanical Turk workers, vendors, or your own private workers. To learn more, see [Workforces](sms-workforce-management.md).

  You cannot use the Mechanical Turk workforce for 3D point cloud or video frame labeling jobs. 
+ If you are using a custom labeling workflow, you must save a worker task template in Amazon S3 and provide an Amazon S3 URI for that template. For more information, see [Creating a custom worker task template](sms-custom-templates-step2.md).
+ (Optional) An AWS KMS key ARN if you want SageMaker AI to encrypt the output of your labeling job using your own AWS KMS encryption key instead of the default Amazon S3 service key.
+ (Optional) Existing labels for the dataset you use for your labeling job. Use this option if you want workers to adjust, or approve and reject labels.
+ If you want to create an adjustment or verification labeling job, you must have an output manifest file in Amazon S3 that contains the labels you want adjusted or verified. This option is only supported for bounding box and semantic segmentation image labeling jobs, and for 3D point cloud and video frame labeling jobs. We recommend that you use the instructions on [Label verification and adjustment](sms-verification-data.md) to create a verification or adjustment labeling job. 

**Important**  
Your work team, input manifest file, output bucket, and other resources in Amazon S3 must be in the same AWS Region you use to create your labeling job. 

When you create a labeling job using the SageMaker AI console, you add worker instructions and labels to the worker UI that Ground Truth provides. You can preview and interact with the worker UI while creating your labeling job in the console. You can also see a preview of the worker UI on your [built-in task type page](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html).

**To create a labeling job (console)**

1. Sign in to the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/). 

1. In the left navigation pane, choose **Labeling jobs**. 

1. On the **Labeling jobs** page, choose **Create labeling job**.

1. For **Job name**, enter a name for your labeling job.

1. (Optional) If you want to identify your labels with a key, select **I want to specify a label attribute name different from the labeling job name**. If you do not select this option, the labeling job name you specified in the previous step will be used to identify your labels in your output manifest file. 

1. Choose a data setup to create a connection between your input dataset and Ground Truth. 
   + For **Automated data setup**:
     + Follow the instructions in [Automate data setup for labeling jobs](sms-console-create-manifest-file.md) for image, text, and video clip labeling jobs.
     + Follow the instructions in [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md) for video frame labeling jobs. 
   + For **Manual data setup**:
     + For **Input dataset location**, provide the location in Amazon S3 in which your input manifest file is located. For example, if your input manifest file, manifest.json, is located in **example-bucket**, enter **s3://example-bucket/manifest.json**.
     + For **Output dataset location**, provide the location in Amazon S3 where you want Ground Truth to store the output data from your labeling job. 

1. For **IAM Role**, choose an existing IAM role or create an IAM role with permission to access your resources in Amazon S3, to write to the output Amazon S3 bucket specified above, and with a SageMaker AI execution policy attached. 

1. (Optional) For **Additional configuration**, you can specify how much of your dataset you want workers to label, and if you want SageMaker AI to encrypt the output data for your labeling job using an AWS KMS encryption key. To encrypt your output data, you must have the required AWS KMS permissions attached to the IAM role you provided in the previous step. For more details, see [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md). 

1. In the **Task type** section, under **Task category**, use the dropdown list to select your task category. 

1. In **Task selection**, choose your task type. 

1. (Optional) Provide tags for your labeling job to make it easier to find in the console later. 

1. Choose **Next**. 

1. In the **Workers** section, choose the type of workforce you would like to use. For more details about your workforce options see [Workforces](sms-workforce-management.md).

1. (Optional) After you've selected your workforce, specify the **Task timeout**. This is the maximum amount of time a worker has to work on a task.

   For 3D point cloud annotation tasks, the default task timeout is 3 days. The default timeout for text and image classification and label verification labeling jobs is 5 minutes. The default timeout for all other labeling jobs is 60 minutes.

1. (Optional) For bounding box, semantic segmentation, video frame, and 3D point cloud task types, you can select **Display existing labels** if you want to display labels for your input data set for workers to verify or adjust.

   For bounding box and semantic segmentation labeling jobs, this will create an adjustment labeling job.

   For 3D point cloud and video frame labeling jobs:
   + Select **Adjustment** to create an adjustment labeling job. When you select this option, you can add new labels but you cannot remove or edit existing labels from the previous job. Optionally, you can choose label category attributes and frame attributes that you want workers to edit. To make an attribute editable, select the check box **Allow workers to edit this attribute** for that attribute.

     Optionally, you can add new label category and frame attributes. 
   + Select **Verification** to create a verification labeling job. When you select this option, you cannot add, modify, or remove existing labels from the previous job. Optionally, you can choose label category attributes and frame attributes that you want workers to edit. To make an attribute editable, select the check box **Allow workers to edit this attribute** for that attribute. 

     We recommend that you add new label category attributes to the labels that you want workers to verify, or add one or more frame attributes to have workers provide information about the entire frame.

    For more information, see [Label verification and adjustment](sms-verification-data.md).

1. Configure your workers' UI:
   + If you are using a [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html), specify worker instructions and labels. 
     + For image classification and text classification (single and multi-label) you must specify at least two label categories. For all other built-in task types, you must specify at least one label category. 
     + (Optional) If you are creating a 3D point cloud or video frame labeling job, you can specify label category attributes (not supported for 3D point cloud semantic segmentation) and frame attributes. Label category attributes can be assigned to one or more labels. Frame attributes will appear on each point cloud or video frame workers label. To learn more, see [Worker user interface (UI)](sms-point-cloud-general-information.md#sms-point-cloud-worker-task-ui) for 3D point cloud and [Worker user interface (UI)](sms-video-overview.md#sms-video-worker-task-ui) for video frame. 
     + (Optional) Add **Additional instructions** to help your worker complete your task.
   + If you are creating a custom labeling workflow, you must:
     + Enter a [custom template](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates-step2.html) in the code box. Custom templates can be created using a combination of HTML, the Liquid templating language, and our pre-built web components. Optionally, you can choose a base template from the drop-down menu to get started. 
     + Specify pre-annotation and post-annotation Lambda functions. To learn how to create these functions, see [Processing data in a custom labeling workflow with AWS Lambda](sms-custom-templates-step3.md).

1. (Optional) You can select **See preview** to preview your worker instructions and labels, and to interact with the worker UI. Make sure your browser's pop-up blocker is disabled before generating the preview.

1. Choose **Create**.

After you've successfully created your labeling job, you are redirected to the **Labeling jobs** page. The status of the labeling job you just created is **In progress**. This status progressively updates as workers complete your tasks. When all tasks are successfully completed, the status changes to **Completed**.

If an issue occurs while creating the labeling job, its status changes to **Failed**.

To view more details about the job, choose the labeling job name. 
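You can also track the status programmatically. The following minimal polling sketch uses the `DescribeLabelingJob` operation; the job name is a placeholder.

```python
import time

# A labeling job stops changing once it reaches one of these statuses.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}

def is_terminal(status):
    """Return True when a labeling job has finished running."""
    return status in TERMINAL_STATUSES

# Usage sketch (requires AWS credentials; "my-job" is a placeholder):
# import boto3
# client = boto3.client("sagemaker")
# while True:
#     status = client.describe_labeling_job(
#         LabelingJobName="my-job")["LabelingJobStatus"]
#     if is_terminal(status):
#         break
#     time.sleep(60)  # poll once a minute
```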

## Next Steps
<a name="sms-create-labeling-job-console-next-steps"></a>

After your labeling job status changes to **Completed**, you can view your output data in the Amazon S3 bucket that you specified while creating that labeling job. For details about the format of your output data, see [Labeling job output data](sms-data-output.md).

# Create a Labeling Job (API)
<a name="sms-create-labeling-job-api"></a>

To create a labeling job using the Amazon SageMaker API, you use the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. For specific instructions on creating a labeling job for a built-in task type, see that [task type page](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html). To learn how to create a streaming labeling job, which is a labeling job that runs perpetually, see [Create a streaming labeling job](sms-streaming-create-job.md).

To use the `CreateLabelingJob` operation, you need the following:
+ A worker task template (`UiTemplateS3Uri`) or human task UI ARN ([`HumanTaskUiArn`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html#sagemaker-Type-UiConfig-HumanTaskUiArn)) in Amazon S3. 
  + For 3D point cloud jobs, video object detection and tracking jobs, and NER jobs, use the ARN listed in `HumanTaskUiArn` for your task type.
  + If you are using a built-in task type other than 3D point cloud tasks, you can add your worker instructions to one of the pre-built templates and save the template (using a .html or .liquid extension) in your S3 bucket. Find the pre-built templates on your [task type page](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html).
  + If you are using a custom labeling workflow, you can create a custom template and save the template in your S3 bucket. To learn how to build a custom worker template, see [Creating a custom worker task template](sms-custom-templates-step2.md). For custom HTML elements that you can use to customize your template, see [Crowd HTML Elements Reference](sms-ui-template-reference.md). For a repository of demo templates for a variety of labeling tasks, see [Amazon SageMaker Ground Truth Sample Task UIs ](https://github.com/aws-samples/amazon-sagemaker-ground-truth-task-uis).
+ An input manifest file that specifies your input data in Amazon S3. Specify the location of your input manifest file in `ManifestS3Uri`. For information about creating an input manifest, see [Input data](sms-data-input.md). If you create a streaming labeling job, this is optional. To learn how to create a streaming labeling job, see [Create a streaming labeling job](sms-streaming-create-job.md).
+ An Amazon S3 bucket to store your output data. You specify this bucket, and optionally, a prefix in `S3OutputPath`.
+ A label category configuration file. Each label category name must be unique. Specify the location of this file in Amazon S3 using the `LabelCategoryConfigS3Uri` parameter. The format and label categories for this file depend on the task type you use:
  + For image classification and text classification (single and multi-label) you must specify at least two label categories. For all other task types, the minimum number of label categories required is one. 
  + For named entity recognition tasks, you must provide worker instructions in this file. See [Provide Worker Instructions in a Label Category Configuration File](sms-named-entity-recg.md#worker-instructions-ner) for details and an example.
  + For 3D point cloud and video frame task type, use the format in [Labeling category configuration file with label category and frame attributes reference](sms-label-cat-config-attributes.md).
  + For all other built-in task types and custom tasks, your label category configuration file must be a JSON file in the following format. Identify the labels you want to use by replacing `label_1`, `label_2`, ..., `label_n` with your label categories. 

    ```
    {
        "document-version": "2018-11-28",
        "labels": [
            {"label": "label_1"},
            {"label": "label_2"},
            ...
            {"label": "label_n"}
        ]
    }
    ```
+ An AWS Identity and Access Management (IAM) role with the [AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution) managed IAM policy attached and with permissions to access your S3 buckets. Specify this role in `RoleArn`. To learn more about this policy, see [Use IAM Managed Policies with Ground Truth](sms-security-permissions-get-started.md). If you require more granular permissions, see [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md).

  If your input or output bucket name does not contain `sagemaker`, you can attach a policy similar to the following to the role that is passed to the `CreateLabelingJob` operation.

------
#### [ JSON ]


  ```
  {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Action": [
                  "s3:GetObject"
              ],
              "Resource": [
                  "arn:aws:s3:::my_input_bucket/*"
              ]
          },
          {
              "Effect": "Allow",
              "Action": [
                  "s3:PutObject"
              ],
              "Resource": [
                  "arn:aws:s3:::my_output_bucket/*"
              ]
          }
      ]
  }
  ```

------
+ A pre-annotation and post-annotation (or annotation-consolidation) AWS Lambda function Amazon Resource Name (ARN) to process your input and output data. 
  + Lambda functions are predefined in each AWS Region for built-in task types. To find the pre-annotation Lambda ARN for your Region, see [PreHumanTaskLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_HumanTaskConfig.html#SageMaker-Type-HumanTaskConfig-PreHumanTaskLambdaArn). To find the annotation-consolidation Lambda ARN for your Region, see [AnnotationConsolidationLambdaArn](https://docs.aws.amazon.com/sagemaker/latest/dg/API_AnnotationConsolidationConfig.html#SageMaker-Type-AnnotationConsolidationConfig-AnnotationConsolidationLambdaArn). 
  + For custom labeling workflows, you must provide a custom pre- and post-annotation Lambda ARN. To learn how to create these Lambda functions, see [Processing data in a custom labeling workflow with AWS Lambda](sms-custom-templates-step3.md).
+ A work team ARN that you specify in `WorkteamArn`. You receive a work team ARN when you subscribe to a vendor workforce or create a private workteam. If you are creating a labeling job for a video frame or point cloud task type, you cannot use the Amazon Mechanical Turk workforce. For all other task types, to use the Mechanical Turk workforce, use the following ARN. Replace *`region`* with the AWS Region you are using to create the labeling job.

  `arn:aws:sagemaker:region:394669845002:workteam/public-crowd/default`

  If you use the [Amazon Mechanical Turk workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management-public.html), use the `ContentClassifiers` parameter in `DataAttributes` of `InputConfig` to declare that your content is free of personally identifiable information and adult content. 

  Ground Truth *requires* that your input data is free of personally identifiable information (PII) if you use the Mechanical Turk workforce. If you use Mechanical Turk and do not specify that your input data is free of PII using the `FreeOfPersonallyIdentifiableInformation` flag, your labeling job will fail. Use the `FreeOfAdultContent` flag to declare that your input data is free of adult content. SageMaker AI may restrict the Amazon Mechanical Turk workers that can view your task if it contains adult content. 

  To learn more about work teams and workforces, see [Workforces](sms-workforce-management.md). 
+ If you use the Mechanical Turk workforce, you must specify the price you'll pay workers for performing a single task in `PublicWorkforceTaskPrice`.
+ To configure the task, you must provide a task description and title using `TaskDescription` and `TaskTitle` respectively. Optionally, you can provide time limits that control how long the workers have to work on an individual task (`TaskTimeLimitInSeconds`) and how long tasks remain in the worker portal, available to workers (`TaskAvailabilityLifetimeInSeconds`).
+ (Optional) For [some task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-annotation-consolidation.html), you can have multiple workers label a single data object by inputting a number greater than one for the `NumberOfHumanWorkersPerDataObject` parameter. For more information about annotation consolidation, see [Annotation consolidation](sms-annotation-consolidation.md).
+ (Optional) To create an automated data labeling job, specify one of the ARNs listed in [LabelingJobAlgorithmSpecificationArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobAlgorithmsConfig.html) in `LabelingJobAlgorithmsConfig`. This ARN identifies the algorithm used in the automated data labeling job. The task type associated with this ARN must match the task type of the `PreHumanTaskLambdaArn` and `AnnotationConsolidationLambdaArn` you specify. Automated data labeling is supported for the following task types: image classification, bounding box, semantic segmentation, and text classification. The minimum number of objects allowed for automated data labeling is 1,250, and we strongly suggest providing a minimum of 5,000 objects. To learn more about automated data labeling jobs, see [Automate data labeling](sms-automated-labeling.md).
+ (Optional) You can provide [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#API_CreateLabelingJob_RequestSyntax](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#API_CreateLabelingJob_RequestSyntax) that cause the labeling job to stop if one of the conditions is met. You can use stopping conditions to control the cost of the labeling job.
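As an illustration of the requirements above, the following sketch generates a label category configuration file in the standard JSON format shown earlier; the category names and file name are placeholders.

```python
import json

# Build the label category configuration for a job with three labels.
label_categories = ["cat", "dog", "bird"]
config = {
    "document-version": "2018-11-28",
    "labels": [{"label": name} for name in label_categories],
}

with open("label-categories.json", "w") as f:
    json.dump(config, f, indent=4)

# Upload the file to S3 and pass its URI in LabelCategoryConfigS3Uri,
# for example (requires AWS credentials; bucket name is a placeholder):
# import boto3
# boto3.client("s3").upload_file(
#     "label-categories.json", "example-bucket", "path/label-categories.json")
```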

## Examples
<a name="sms-create-labeling-job-api-examples"></a>

The following code examples demonstrate how to create a labeling job using `CreateLabelingJob`. You can also see these example notebooks on GitHub in the [SageMaker AI Examples repository](https://github.com/aws/amazon-sagemaker-examples/tree/master/ground_truth_labeling_jobs).

------
#### [ AWS SDK for Python (Boto3) ]

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create a labeling job for a built-in task type in the US East (N. Virginia) Region using a private workforce. Replace all *red italicized text* with your labeling job resources and specifications. 

```
import boto3

# Create a SageMaker client in the Region of your labeling job resources
client = boto3.client('sagemaker', region_name="us-east-1")

response = client.create_labeling_job(
    LabelingJobName="example-labeling-job",
    LabelAttributeName="label",
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': "s3://bucket/path/manifest-with-input-data.json"
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                "FreeOfPersonallyIdentifiableInformation",
                "FreeOfAdultContent"
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': "s3://bucket/path/file-to-store-output-data",
        'KmsKeyId': "string"
    },
    RoleArn="arn:aws:iam::*:role/*",
    LabelCategoryConfigS3Uri="s3://bucket/path/label-categories.json",
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': "arn:aws:sagemaker:region:*:workteam/private-crowd/*",
        'UiConfig': {
            'UiTemplateS3Uri': "s3://bucket/path/custom-worker-task-template.html"
        },
        'PreHumanTaskLambdaArn': "arn:aws:lambda:us-east-1:432418664414:function:PRE-tasktype",
        'TaskKeywords': [
            "Images",
            "Classification",
            "Multi-label"
        ],
        'TaskTitle': "Multi-label image classification task",
        'TaskDescription': "Select all labels that apply to the images shown",
        'NumberOfHumanWorkersPerDataObject': 1,
        'TaskTimeLimitInSeconds': 3600,
        'TaskAvailabilityLifetimeInSeconds': 21600,
        'MaxConcurrentTaskCount': 1000,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': "arn:aws:lambda:us-east-1:432418664414:function:ACS-"
        },
    },
    Tags=[
        {
            'Key': "string",
            'Value': "string"
        },
    ]
)
```

------
#### [ AWS CLI ]

The following is an example of an AWS CLI request to create a labeling job for a built-in task type in the US East (N. Virginia) Region using the [Amazon Mechanical Turk workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management-public.html). For more information, see [create-labeling-job](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-labeling-job.html) in the *[AWS CLI Command Reference](https://docs.aws.amazon.com/cli/latest/reference/)*. Replace all *red italicized text* with your labeling job resources and specifications. 

```
$ aws --region us-east-1 sagemaker create-labeling-job \
--labeling-job-name "example-labeling-job" \
--label-attribute-name "label" \
--role-arn "arn:aws:iam::account-id:role/role-name" \
--input-config '{
        "DataAttributes": {
            "ContentClassifiers": [
                "FreeOfPersonallyIdentifiableInformation",
                "FreeOfAdultContent"
            ]
        },
        "DataSource": {
            "S3DataSource": {
                "ManifestS3Uri": "s3://bucket/path/manifest-with-input-data.json"
            }
        }
    }' \
--output-config '{
        "KmsKeyId": "",
        "S3OutputPath": "s3://bucket/path/file-to-store-output-data"
    }' \
--human-task-config '{
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": "arn:aws:lambda:us-east-1:432418664414:function:ACS-"
        },
        "TaskAvailabilityLifetimeInSeconds": 21600,
        "TaskTimeLimitInSeconds": 3600,
        "NumberOfHumanWorkersPerDataObject": 1,
        "PreHumanTaskLambdaArn":  "arn:aws:lambda:us-east-1:432418664414:function:PRE-tasktype",
        "WorkteamArn": "arn:aws:sagemaker:us-east-1:394669845002:workteam/public-crowd/default",
        "PublicWorkforceTaskPrice": {
            "AmountInUsd": {
                "Dollars": 0,
                "TenthFractionsOfACent": 6,
                "Cents": 3
            }
        },
        "TaskDescription": "Select all labels that apply to the images shown",
        "MaxConcurrentTaskCount": 1000,
        "TaskTitle": "Multi-label image classification task",
        "TaskKeywords": [
            "Images",
            "Classification",
            "Multi-label"
        ],
        "UiConfig": {
            "UiTemplateS3Uri": "s3://bucket/path/custom-worker-task-template.html"
        }
    }'
```

------

For more information about this operation, see [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateLabelingJob.html). For information about how to use other language-specific SDKs, see [See Also](https://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateLabelingJob.html#API_CreateLabelingJob_SeeAlso) in the `CreateLabelingJob` topic. 

# Create a streaming labeling job
<a name="sms-streaming-create-job"></a>

Streaming labeling jobs enable you to send individual data objects in real time to a perpetually running labeling job. To create a streaming labeling job, specify the Amazon SNS *input topic* ARN, `SnsTopicArn`, in the `InputConfig` parameter when making a [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) request. Optionally, you can also create an Amazon SNS *output topic* and specify it in `OutputConfig` if you want to receive label data in real time.

**Important**  
If you are a new user of Ground Truth streaming labeling jobs, it is recommended that you review [Ground Truth streaming labeling jobs](sms-streaming-labeling-job.md) before creating a streaming labeling job. Ground Truth streaming labeling jobs are only supported through the SageMaker API.

Use the following sections to create the resources that you need to create a streaming labeling job:
+ Learn how to create SNS topics with the permissions required for Ground Truth streaming labeling jobs by following the steps in [Use Amazon SNS Topics for Data Labeling](sms-create-sns-input-topic.md). Your SNS topics must be created in the same AWS Region as your labeling job. 
+ See [Subscribe an Endpoint to Your Amazon SNS Output Topic](sms-create-sns-input-topic.md#sms-streaming-subscribe-output-topic) to learn how to set up an endpoint to receive labeling task output data at a specified endpoint each time a labeling task is completed.
+ To learn how to configure your Amazon S3 bucket to send notifications to your Amazon SNS input topic, see [Creating Amazon S3 bucket event notifications based on the Amazon SNS topic defined in your labeling job](sms-streaming-s3-setup.md).
+ Optionally, add data objects that you want to have labeled as soon as the labeling job starts to your input manifest. For more information, see [Create a Manifest File (Optional)](sms-streaming-manifest.md).
+ There are other resources required to create a labeling job, such as an IAM role, an Amazon S3 bucket, a worker task template, and label categories. These are described in the Ground Truth documentation on creating a labeling job. For more information, see [Create a Labeling Job](sms-create-labeling-job.md). 
**Important**  
When you create a labeling job, you must provide an IAM execution role. Attach the AWS managed policy **AmazonSageMakerGroundTruthExecution** to this role to ensure that it has the permissions required to run your labeling job. 

When you submit a request to create a streaming labeling job, the state of your labeling job is `Initializing`. Once the labeling job is active, the state changes to `InProgress`. Do not send new data objects to your labeling job or attempt to stop your labeling job while it is in the `Initializing` state. Once the state changes to `InProgress`, you can start sending new data objects using Amazon SNS and the Amazon S3 configuration. 
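
As a sketch of this sequence (assuming `boto3`; the job name, topic ARN, and Amazon S3 URI below are placeholders), you might wait for the job to leave the `Initializing` state before publishing a new data object to the input topic:

```python
import json

def is_ready(status):
    # A streaming labeling job accepts new data objects only once it is InProgress.
    return status == "InProgress"

def data_object_message(s3_uri):
    # Message body for a new data object; "source-ref" points to a file in Amazon S3.
    return json.dumps({"source-ref": s3_uri})

def send_when_ready(job_name, input_topic_arn, s3_uri, poll_seconds=30):
    # Hypothetical helper: poll DescribeLabelingJob, then publish to the input topic.
    import time
    import boto3
    sagemaker = boto3.client("sagemaker")
    while True:
        status = sagemaker.describe_labeling_job(
            LabelingJobName=job_name)["LabelingJobStatus"]
        if is_ready(status):
            break
        if status != "Initializing":
            raise RuntimeError(f"Unexpected labeling job status: {status}")
        time.sleep(poll_seconds)
    boto3.client("sns").publish(
        TopicArn=input_topic_arn, Message=data_object_message(s3_uri))
```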

**Topics**
+ [Use Amazon SNS Topics for Data Labeling](sms-create-sns-input-topic.md)
+ [Creating Amazon S3 bucket event notifications based on the Amazon SNS topic defined in your labeling job](sms-streaming-s3-setup.md)
+ [Create a Manifest File (Optional)](sms-streaming-manifest.md)
+ [Create a Streaming Labeling Job with the SageMaker API](sms-streaming-create-labeling-job-api.md)
+ [Stop a Streaming Labeling Job](sms-streaming-stop-labeling-job.md)

# Use Amazon SNS Topics for Data Labeling
<a name="sms-create-sns-input-topic"></a>

You need to create an Amazon SNS input topic to create a streaming labeling job. Optionally, you can also provide an Amazon SNS output topic.

When you create an Amazon SNS topic to use in your streaming labeling job, note down the topic Amazon Resource Name (ARN). You use the input topic ARN as the value of `SnsTopicArn` in `InputConfig`, and the output topic ARN as the value of `SnsTopicArn` in `OutputConfig`, when you create a labeling job.

## Create an Input Topic
<a name="sms-streaming-input-topic"></a>

Your input topic is used to send new data objects to Ground Truth. To create an input topic, follow the instructions in [Creating an Amazon SNS topic](https://docs.aws.amazon.com/sns/latest/dg/sns-create-topic.html) in the Amazon Simple Notification Service Developer Guide.

Note down your input topic ARN and use it as input for the `CreateLabelingJob` parameter `SnsTopicArn` in `InputConfig`. 
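
For example (a sketch assuming `boto3`; the topic name and manifest URI are hypothetical), you might create the input topic and build the `InputConfig` value around its ARN:

```python
def create_input_topic(name="gt-streaming-input"):
    # create_topic is idempotent: it returns the existing topic's ARN
    # if a topic with this name already exists.
    import boto3
    return boto3.client("sns").create_topic(Name=name)["TopicArn"]

def input_config(topic_arn, manifest_uri=None):
    # Build the InputConfig value for CreateLabelingJob around the input topic ARN.
    data_source = {"SnsDataSource": {"SnsTopicArn": topic_arn}}
    if manifest_uri:
        # Optional: objects in this manifest are labeled as soon as the job starts.
        data_source["S3DataSource"] = {"ManifestS3Uri": manifest_uri}
    return {"DataSource": data_source}
```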

## Create an Output Topic
<a name="sms-streaming-output-topic"></a>

If you provide an output topic, it is used to send notifications when a data object is labeled. When you create a topic, you have the option to add an encryption key. Use this option to add an AWS Key Management Service customer managed key to your topic to encrypt the output data of your labeling job before it is published to your output topic.

To create an output topic, follow the instructions in [Creating an Amazon SNS topic](https://docs.aws.amazon.com/sns/latest/dg/sns-create-topic.html) in the Amazon Simple Notification Service Developer Guide.

If you add encryption, you must attach additional permissions to the topic. For more information, see [Add Encryption to Your Output Topic (Optional)](#sms-streaming-encryption).

**Important**  
To add a customer managed key to your output topic while creating a topic in the console, do not use the **(Default) alias/aws/sns** option. Select a customer managed key that you created. 

Note down your output topic ARN and use it in your `CreateLabelingJob` request as the value of `SnsTopicArn` in `OutputConfig`. 

### Add Encryption to Your Output Topic (Optional)
<a name="sms-streaming-encryption"></a>

To encrypt messages published to your output topic, you need to provide an AWS KMS customer managed key to your topic. Modify the following policy and add it to your customer managed key to give Ground Truth permission to encrypt output data before publishing it to your output topic.

Replace *111122223333* with the ID of the account that you are using to create your topic. To learn how to find your AWS account ID, see [Finding Your AWS Account ID](https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html#FindingYourAWSId). 

------
#### [ JSON ]

****  

```
{
    "Id": "key-console-policy",
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:root"
            },
            "Action": "kms:*",
            "Resource": "*"
        },
        {
            "Sid": "Allow access for Key Administrators",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/Admin"
            },
            "Action": [
                "kms:Create*",
                "kms:Describe*",
                "kms:Enable*",
                "kms:List*",
                "kms:Put*",
                "kms:Update*",
                "kms:Revoke*",
                "kms:Disable*",
                "kms:Get*",
                "kms:Delete*",
                "kms:TagResource",
                "kms:UntagResource",
                "kms:ScheduleKeyDeletion",
                "kms:CancelKeyDeletion"
            ],
            "Resource": "*"
        }
    ]
}
```

------

Additionally, you must modify and add the following policy to the execution role that you use to create your labeling job (the input value for `RoleArn`). 

Replace *111122223333* with the ID of the account that you are using to create your topic. Replace *us-east-1* with the AWS Region you are using to create your labeling job. Replace *your_key_id* with your customer managed key ID.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "sid1",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKey"
            ],
            "Resource": "arn:aws:kms:us-east-1:111122223333:key/your_key_id"
        }
    ]
}
```

------

For more information on creating and securing keys, see [Creating Keys](https://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html) and [Using Key Policies](https://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html) in the AWS Key Management Service Developer Guide.

## Subscribe an Endpoint to Your Amazon SNS Output Topic
<a name="sms-streaming-subscribe-output-topic"></a>

When a worker completes a labeling job task from a Ground Truth streaming labeling job, Ground Truth uses your output topic to publish output data to one or more endpoints that you specify. To receive notifications when a worker finishes a labeling task, you must subscribe an endpoint to your Amazon SNS output topic.

To learn how to add endpoints to your output topic, see [Subscribing to an Amazon SNS topic](https://docs.aws.amazon.com/sns/latest/dg/sns-create-subscribe-endpoint-to-topic.html) in the *Amazon Simple Notification Service Developer Guide*.

To learn more about the output data format that is published to these endpoints, see [Labeling job output data](sms-data-output.md). 
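
As a sketch (assuming `boto3`; the topic ARN and endpoint values are placeholders, and only two common protocols are shown), subscribing an endpoint might look like:

```python
def protocol_for(endpoint):
    # Pick the SNS subscription protocol from the endpoint's form
    # (a subset of the protocols that Amazon SNS supports).
    if endpoint.startswith("arn:aws:sqs:"):
        return "sqs"
    if endpoint.startswith("https://"):
        return "https"
    raise ValueError(f"Unsupported endpoint: {endpoint}")

def subscribe_endpoint(output_topic_arn, endpoint):
    # Hypothetical helper: subscribe the endpoint so it receives
    # output data each time a labeling task is completed.
    import boto3
    return boto3.client("sns").subscribe(
        TopicArn=output_topic_arn,
        Protocol=protocol_for(endpoint),
        Endpoint=endpoint,
        ReturnSubscriptionArn=True,
    )["SubscriptionArn"]
```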

**Important**  
If you do not subscribe an endpoint to your Amazon SNS output topic, you will not receive notifications when new data objects are labeled. 

# Creating Amazon S3 bucket event notifications based on the Amazon SNS topic defined in your labeling job
<a name="sms-streaming-s3-setup"></a>

You can enable Amazon S3 event notifications for your bucket using the Amazon S3 console, the API, a language-specific AWS SDK, or the AWS Command Line Interface. Events must use the same Amazon SNS input topic ARN, `SnsTopicArn`, that is specified in the `InputConfig` parameter of your `CreateLabelingJob` request.

**Your event notification bucket and your output data bucket should not be the same Amazon S3 bucket**  
When you create event notifications, do not use the same Amazon S3 location that you specified as `S3OutputPath` in the `OutputConfig` parameter. Linking the two may result in unwanted data objects being sent to Ground Truth for labeling.

You control the types of events that you want to send to your Amazon SNS topic. Ground Truth creates a labeling task when you send [object creation events](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/enable-event-notifications.html#enable-event-notifications-types).

The event structure sent to your Amazon SNS input topic must be a JSON message formatted using the same structure found in [Event message structure](https://docs.aws.amazon.com/AmazonS3/latest/dev/notification-content-structure.html).

To see examples of how you can set up an event notification for your Amazon S3 bucket using the Amazon S3 console, AWS SDK for .NET, and AWS SDK for Java, follow this walkthrough, [Walkthrough: Configure a bucket for notifications (SNS topic or SQS queue)](https://docs.aws.amazon.com/AmazonS3/latest/dev/ways-to-add-notification-config-to-bucket.html) in the *Amazon Simple Storage Service User Guide*.
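
The same setup can be sketched with `boto3` (bucket name, topic ARN, and the `input/` prefix are placeholders; the prefix filter is optional):

```python
def object_created_notification(input_topic_arn, prefix="input/"):
    # s3:ObjectCreated:* events under the given key prefix are
    # published to the Amazon SNS input topic.
    return {
        "TopicConfigurations": [
            {
                "TopicArn": input_topic_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }

def enable_notifications(bucket, input_topic_arn):
    # Apply the notification configuration to the bucket. The topic's access
    # policy must already allow this bucket to publish to it.
    import boto3
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=object_created_notification(input_topic_arn),
    )
```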

Amazon EventBridge notifications are not natively supported. To use EventBridge based notifications, you must update the output format to match the JSON format described in [Event message structure](https://docs.aws.amazon.com/AmazonS3/latest/dev/notification-content-structure.html).

# Create a Manifest File (Optional)
<a name="sms-streaming-manifest"></a>

When you create a streaming labeling job, you have the one-time option to add objects (such as images or text) to an input manifest file that you specify in `ManifestS3Uri` of `CreateLabelingJob`. When the streaming labeling job starts, these objects are sent to workers, or added to the Amazon SQS queue if the total number of objects exceeds `MaxConcurrentTaskCount`. As workers complete labeling tasks, the results are periodically added to the Amazon S3 path that you specify when creating the labeling job. Output data is also sent to any endpoint that you subscribe to your output topic. 

If you want to provide initial objects to be labeled, create a manifest file that identifies these objects and place it in Amazon S3. Specify the S3 URI of this manifest file in `ManifestS3Uri` within `InputConfig`.
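
For example, a minimal image manifest (the bucket and file names are placeholders) contains one JSON object per line, each with a `source-ref` key pointing to a data object in Amazon S3:

```
{"source-ref": "s3://bucket/path/image1.jpg"}
{"source-ref": "s3://bucket/path/image2.jpg"}
```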

To learn how to format your manifest file, see [Input data](sms-data-input.md). To use the SageMaker AI console to automatically generate a manifest file (not supported for 3D point cloud task types), see [Automate data setup for labeling jobs](sms-console-create-manifest-file.md).

# Create a Streaming Labeling Job with the SageMaker API
<a name="sms-streaming-create-labeling-job-api"></a>

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) that you can use to start a streaming labeling job for a built-in task type in the US East (N. Virginia) Region. For more details about each parameter below, see [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). To learn how you can create a labeling job using this API and associated language-specific SDKs, see [Create a Labeling Job (API)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job-api.html).

In this example, note the following parameters:
+ `SnsDataSource` – This parameter appears in `InputConfig` and `OutputConfig` and is used to identify your input and output Amazon SNS topics respectively. To create a streaming labeling job, you are required to provide an Amazon SNS input topic. Optionally, you can also provide an Amazon SNS output topic.
+ `S3DataSource` – This parameter is optional. Use this parameter if you want to include an input manifest file of data objects that you want labeled as soon as the labeling job starts.
+ [StoppingConditions](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-StoppingConditions) – This parameter is ignored when you create a streaming labeling job. To learn more about stopping a streaming labeling job, see [Stop a Streaming Labeling Job](sms-streaming-stop-labeling-job.md).
+ Streaming labeling jobs do not support automated data labeling. Do not include the `LabelingJobAlgorithmsConfig` parameter.

```
response = client.create_labeling_job(
    LabelingJobName='example-labeling-job',
    LabelAttributeName='label',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 's3://bucket/path/manifest-with-input-data.json'
            },
            'SnsDataSource': {
                'SnsTopicArn': 'arn:aws:sns:us-east-1:123456789012:your-sns-input-topic'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
                'FreeOfAdultContent',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 's3://bucket/path/file-to-store-output-data',
        'KmsKeyId': 'string',
        'SnsTopicArn': 'arn:aws:sns:us-east-1:123456789012:your-sns-output-topic'
    },
    RoleArn='arn:aws:iam::*:role/*',
    LabelCategoryConfigS3Uri='s3://bucket/path/label-categories.json',
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:us-east-1:*:workteam/private-crowd/*',
        'UiConfig': {
            'UiTemplateS3Uri': 's3://bucket/path/custom-worker-task-template.html'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-tasktype',
        'TaskKeywords': [
            'Example key word',
        ],
        'TaskTitle': 'Multi-label image classification task',
        'TaskDescription': 'Select all labels that apply to the images shown',
        'NumberOfHumanWorkersPerDataObject': 123,
        'TaskTimeLimitInSeconds': 123,
        'TaskAvailabilityLifetimeInSeconds': 123,
        'MaxConcurrentTaskCount': 123,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-tasktype'
        }
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
```

# Stop a Streaming Labeling Job
<a name="sms-streaming-stop-labeling-job"></a>

You can manually stop your streaming labeling job using the operation [StopLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_StopLabelingJob.html). 

If your labeling job remains idle for over 10 days, it is automatically stopped by Ground Truth. In this context, a labeling job is considered *idle* if no objects are sent to the Amazon SNS input topic and no objects remain in your Amazon SQS queue waiting to be labeled. For example, if no data objects are fed to the Amazon SNS input topic and all the objects fed to the labeling job are already labeled, Ground Truth starts a timer. After the timer starts, if no items are received within a 10-day period, the labeling job is stopped. 

When a labeling job is stopped, its status is `STOPPING` while Ground Truth cleans up labeling job resources and unsubscribes your Amazon SNS topic from your Amazon SQS queue. The Amazon SQS queue is *not* deleted by Ground Truth because it may contain unprocessed data objects. To avoid incurring additional Amazon SQS charges, manually delete the queue. To learn more, see [Amazon SQS pricing](https://aws.amazon.com/sqs/pricing/).
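
As a sketch (assuming `boto3`; the job name and queue URL are placeholders), stopping the job and cleaning up the leftover queue might look like:

```python
# Terminal LabelingJobStatus values; a job in one of these states
# no longer accepts new data objects.
TERMINAL_STATES = {"Completed", "Failed", "Stopped"}

def is_terminal(status):
    return status in TERMINAL_STATES

def stop_streaming_job(job_name, queue_url=None):
    # Hypothetical helper: stop the job and, optionally, delete the SQS queue.
    import boto3
    boto3.client("sagemaker").stop_labeling_job(LabelingJobName=job_name)
    if queue_url:
        # Ground Truth does not delete the queue; removing it avoids
        # further Amazon SQS charges.
        boto3.client("sqs").delete_queue(QueueUrl=queue_url)
```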

# Labeling category configuration file with label category and frame attributes reference
<a name="sms-label-cat-config-attributes"></a>

When you create a 3D point cloud or video frame labeling job using the Amazon SageMaker API operation `CreateLabelingJob`, you use a label category configuration file to specify your labels and worker instructions. Optionally, you can also provide the following in your label category configuration file:
+ You can provide *label category attributes* for video frame and 3D point cloud object tracking and object detection task types. Workers can use one or more attributes to give more information about an object. For example, you may want to use the attribute *occluded* to have workers identify when an object is partially obstructed. You can either specify a label category attribute for a single label using the `categoryAttributes` parameter, or for all labels using the `categoryGlobalAttributes` parameter. 
+ You can provide *frame attributes* for video frame and 3D point cloud object tracking and object detection task types using `frameAttributes`. When you create a frame attribute, it appears on each frame or point cloud in the worker task. In video frame labeling jobs, these are attributes that workers assign to an entire video frame. For 3D point cloud labeling jobs, these attributes are applied to a single point cloud. Use frame attributes to have workers provide more information about the scene in a specific frame or point cloud.
+ For video frame labeling jobs, you use the label category configuration file to specify the task type (bounding box, polyline, polygon, or keypoint) sent to workers. 

Unless you mark an attribute as required (`isRequired`), specifying values for label category attributes and frame attributes is optional for workers.

**Important**  
You should only provide a label attribute name in `auditLabelAttributeName` if you are running an audit job to verify or adjust labels. Use this parameter to input the [LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName) used in the labeling job that generated the annotations you want your worker to adjust. When you create a labeling job in the console, if you did not specify a label attribute name, the **Name** of your job is used as the `LabelAttributeName`.

The following topics show examples of a label category configuration file for different kinds of labeling jobs. They also explain the schema and quotas of a category configuration file.

**Topics**
+ [Examples: label category configuration files for 3D point cloud labeling jobs](#sms-label-cat-config-attributes-3d-pc)
+ [Examples: label category configuration files for video frame labeling jobs](#sms-label-cat-config-attributes-vid-frame)
+ [Label category configuration file schema](#sms-label-cat-config-attributes-schema)
+ [Label and label category attribute quotas](#sms-point-cloud-label-cat-limits)

## Examples: label category configuration files for 3D point cloud labeling jobs
<a name="sms-label-cat-config-attributes-3d-pc"></a>

The following topics show examples of 3D point cloud label category configuration files for object detection, object tracking, semantic segmentation, adjustment, and verification labeling jobs.

**Topics**
+ [Example: 3D point cloud object tracking and object detection](#example-3d-point-cloud-object)
+ [Example: 3D point cloud semantic segmentation](#example-3d-point-cloud-semantic)
+ [Example: 3D point cloud adjustment](#example-3d-point-cloud-adjustment)
+ [Example: 3D point cloud verification](#example-3d-point-cloud-verification)

### Example: 3D point cloud object tracking and object detection
<a name="example-3d-point-cloud-object"></a>

The following is an example of a label category configuration file that includes label category attributes for a 3D point cloud object detection or object tracking labeling job. This example includes two frame attributes, which will be added to all point clouds submitted to the labeling job. The `Car` label will include four label category attributes: `X`, `Y`, `Z`, and the global attribute `W`.

```
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {
            "name":"count players",
            "description":"How many players do you see in the scene?",
            "type":"number"
        },
        {
            "name":"select one",
            "description":"describe the scene",
            "type":"string",
            "enum":["clear","blurry"],
            "isRequired":true 
        }
    ],
    "categoryGlobalAttributes": [
        {
            "name":"W",
            "description":"label-attributes-for-all-labels",
            "type":"string",
            "enum": ["foo", "buzz", "biz"]
        }
    ],
    "labels": [
        {
            "label": "Car",
            "categoryAttributes": [
                {
                    "name":"X",
                    "description":"enter a number",
                    "type":"number"
                },
                {
                    "name":"Y",
                    "description":"select an option",
                    "type":"string",
                    "enum":["y1", "y2"]
                },
                {
                    "name":"Z",
                    "description":"submit a free-form response",
                    "type":"string"
                }
            ]
        },
        {
            "label": "Pedestrian",
            "categoryAttributes": [...]
        }
    ],
    "instructions": {"shortInstruction":"Draw a tight Cuboid", "fullInstruction":"<html markup>"}
}
```

### Example: 3D point cloud semantic segmentation
<a name="example-3d-point-cloud-semantic"></a>

The following is an example of a label category configuration file for a 3D point cloud semantic segmentation labeling job. 

Label category attributes are not supported for 3D point cloud semantic segmentation task types. Frame attributes are supported. If you provide label category attributes for a semantic segmentation labeling job, they will be ignored.

```
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {
            "name":"count players",
            "description":"How many players do you see in the scene?",
            "type":"number"
        },
        {
            "name":"select one",
            "description":"describe the scene",
            "type":"string",
            "enum":["clear","blurry"]
        }
    ],
    "labels": [
        {
            "label": "Car"
        },
        {
            "label": "Pedestrian"
        },
        {
            "label": "Cyclist"
        }
    ],
    "instructions": {"shortInstruction":"Select the appropriate label and paint all objects in the point cloud that it applies to the same color", "fullInstruction":"<html markup>"}
}
```

### Example: 3D point cloud adjustment
<a name="example-3d-point-cloud-adjustment"></a>

The following is an example of a label category configuration file for a 3D point cloud object detection or object tracking adjustment labeling job. For 3D point cloud semantic segmentation adjustment labeling jobs, `categoryGlobalAttributes` and `categoryAttributes` are not supported. 

You must include `auditLabelAttributeName` to specify the label attribute name of the previous labeling job that you use to create the adjustment labeling job. Optionally, you can use the `editsAllowed` parameter to specify whether or not a label or frame attribute can be edited. 

```
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {
            "name":"count players",
            "description":"How many players do you see in the scene?",
            "type":"number"
        },
        {
            "name":"select one",
            "editsAllowed":"none",
            "description":"describe the scene",
            "type":"string",
            "enum":["clear","blurry"]
        }
    ],
    "categoryGlobalAttributes": [
        {
            "name":"W",
            "editsAllowed":"any",
            "description":"label-attributes-for-all-labels",
            "type":"string",
            "enum": ["foo", "buzz", "biz"]
        }
    ],
    "labels": [
        {
            "label": "Car",
            "editsAllowed":"any",
            "categoryAttributes": [
                {
                    "name":"X",
                    "description":"enter a number",
                    "type":"number"
                },
                {
                    "name":"Y",
                    "description":"select an option",
                    "type":"string",
                    "enum":["y1", "y2"],
                    "editsAllowed":"any"
                },
                {
                    "name":"Z",
                    "description":"submit a free-form response",
                    "type":"string",
                    "editsAllowed":"none"
                }
            ]
        },
        {
            "label": "Pedestrian",
            "categoryAttributes": [...]
        }
    ],
    "instructions": {"shortInstruction":"Draw a tight Cuboid", "fullInstruction":"<html markup>"},
    // include auditLabelAttributeName for label adjustment jobs
    "auditLabelAttributeName": "myPrevJobLabelAttributeName"
}
```

### Example: 3D point cloud verification
<a name="example-3d-point-cloud-verification"></a>

The following is an example of a label category configuration file you may use for a 3D point cloud object detection or object tracking verification labeling job. For a 3D point cloud semantic segmentation verification labeling job, `categoryGlobalAttributes` and `categoryAttributes` are not supported. 

You must include `auditLabelAttributeName` to specify the label attribute name of the previous labeling job that you use to create the verification labeling job. Additionally, you must use the `editsAllowed` parameter to specify that no labels can be edited. 

```
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {
            "name":"count players",
            "editsAllowed":"any", 
            "description":"How many players do you see in the scene?",
            "type":"number"
        },
        {
            "name":"select one",
            "editsAllowed":"any", 
            "description":"describe the scene",
            "type":"string",
            "enum":["clear","blurry"]
        }
    ],
    "categoryGlobalAttributes": [
        {
            "name":"W",
            "editsAllowed":"none", 
            "description":"label-attributes-for-all-labels",
            "type":"string",
            "enum": ["foo", "buzz", "biz"]
        }
    ],
    "labels": [
        {
            "label": "Car",
            "editsAllowed":"none", 
            "categoryAttributes": [
                {
                    "name":"X",
                    "description":"enter a number",
                    "type":"number",
                    "editsAllowed":"none"
                },
                {
                    "name":"Y",
                    "description":"select an option",
                    "type":"string",
                    "enum":["y1", "y2"],
                    "editsAllowed":"any"
                },
                {
                    "name":"Z",
                    "description":"submit a free-form response",
                    "type":"string",
                    "editsAllowed":"none"
                }
            ]
        },
        {
            "label": "Pedestrian",
            "editsAllowed":"none", 
            "categoryAttributes": [...]
        }
    ],
    "instructions": {"shortInstruction":"Draw a tight Cuboid", "fullInstruction":"<html markup>"},
    // include auditLabelAttributeName for label verification jobs
    "auditLabelAttributeName": "myPrevJobLabelAttributeName"
}
```

## Examples: label category configuration files for video frame labeling jobs
<a name="sms-label-cat-config-attributes-vid-frame"></a>

The annotation tools available to your workers, and the task type used, depend on the value you specify for `annotationType`. For example, if you want workers to use key points to track changes in the pose of specific objects across multiple frames, you would specify `Keypoint` for `annotationType`. If you do not specify an annotation type, `BoundingBox` is used by default. 

The following topics show examples of video frame category configuration files.

**Topics**
+ [Example: video frame keypoint](#example-video-frame-keypoint)
+ [Example: video frame adjustment](#example-video-frame-adjustment)
+ [Example: video frame verification](#example-video-frame-verification)

### Example: video frame keypoint
<a name="example-video-frame-keypoint"></a>

The following is an example of a video frame keypoint label category configuration file with label category attributes. This example includes two frame attributes, which are added to all frames submitted to the labeling job. The `Car` label includes four label category attributes: `X`, `Y`, `Z`, and the global attribute `W`. 

```
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {
            "name":"count players",
            "description":"How many players to you see in the scene?",
            "type":"number"
        },
        {
            "name":"select one",
            "description":"describe the scene",
            "type":"string",
            "enum":["clear","blurry"]
        }
    ],
    "categoryGlobalAttributes": [
        {
            "name":"W",
            "description":"label-attributes-for-all-labels",
            "type":"string",
            "enum": ["foo", "buz", "buz2"]
        }
    ],
    "labels": [
        {
            "label": "Car",
            "categoryAttributes": [
                {
                    "name":"X",
                    "description":"enter a number",
                    "type":"number",
                },
                {
                    "name":"Y",
                    "description":"select an option",
                    "type":"string",
                    "enum": ["y1", "y2"]
                },
                {
                    "name":"Z",
                    "description":"submit a free-form response",
                    "type":"string",
                }
            ]
        },
        {
            "label": "Pedestrian",
            "categoryAttributes": [...]
        }
    ],
    "annotationType":"Keypoint",
    "instructions": {"shortInstruction":"add example short instructions here", "fullInstruction":"<html markup>"}
}
```

### Example: video frame adjustment
<a name="example-video-frame-adjustment"></a>

The following is an example of a label category configuration file you may use for a video frame adjustment labeling job.

You must include `auditLabelAttributeName` to specify the label attribute name of the previous labeling job that you use to create the verification labeling job. Optionally, you can use the `editsAllowed` parameter to specify whether or not labels, label category attributes, or frame attributes can be edited. 

```
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {
            "name":"count players",
            "editsAllowed":"none", 
            "description":"How many players to you see in the scene?",
            "type":"number"
        },
        {
            "name":"select one",
            "description":"describe the scene",
            "type":"string",
            "enum":["clear","blurry"]
        }
    ],
    "categoryGlobalAttributes": [
        {
            "name":"W",
            "editsAllowed":"any", 
            "description":"label-attributes-for-all-labels",
            "type":"string",
            "enum": ["foo", "buz", "buz2"]
        }
    ],
    "labels": [
        {
            "label": "Car",
            "editsAllowed":"any", 
            "categoryAttributes": [
                {
                    "name":"X",
                    "description":"enter a number",
                    "type":"number",
                    "editsAllowed":"any"
                },
                {
                    "name":"Y",
                    "description":"select an option",
                    "type":"string",
                    "enum": ["y1", "y2"],
                    "editsAllowed":"any"
                },
                {
                    "name":"Z",
                    "description":"submit a free-form response",
                    "type":"string",
                    "editsAllowed":"none"
                }
            ]
        },
        {
            "label": "Pedestrian",
            "editsAllowed":"none", 
            "categoryAttributes": [...]
        }
    ],
    "annotationType":"Keypoint",
    "instructions": {"shortInstruction":"add example short instructions here", "fullInstruction":"<html markup>"},
    // include auditLabelAttributeName for label adjustment jobs
    "auditLabelAttributeName": "myPrevJobLabelAttributeName"
}
```

### Example: video frame verification
<a name="example-video-frame-verification"></a>

The following is an example of a label category configuration file for a video frame verification labeling job.

You must include `auditLabelAttributeName` to specify the label attribute name of the previous labeling job that you use to create the verification labeling job. Additionally, you must use the `editsAllowed` parameter to specify that no labels can be edited. 

```
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {
            "name":"count players",
            "editsAllowed":"none", 
            "description":"How many players to you see in the scene?",
            "type":"number"
        },
        {
            "name":"select one",
            "editsAllowed":"any", 
            "description":"describe the scene",
            "type":"string",
            "enum":["clear","blurry"]
        }
    ],
    "categoryGlobalAttributes": [
        {
            "name":"W",
            "editsAllowed":"none", 
            "description":"label-attributes-for-all-labels",
            "type":"string",
            "enum": ["foo", "buz", "buz2"]
        }
    ],
    "labels": [
        {
            "label": "Car",
            "editsAllowed":"none", 
            "categoryAttributes": [
                {
                    "name":"X",
                    "description":"enter a number",
                    "type":"number",
                    "editsAllowed":"any"
                },
                {
                    "name":"Y",
                    "description":"select an option",
                    "type":"string",
                    "enum": ["y1", "y2"],
                    "editsAllowed":"any"
                },
                {
                    "name":"Z",
                    "description":"submit a free-form response",
                    "type":"string",
                    "editsAllowed":"none"
                }
            ]
        },
        {
            "label": "Pedestrian",
            "editsAllowed":"none", 
            "categoryAttributes": [...]
        }
    ],
    "annotationType":"Keypoint",
    "instructions": {"shortInstruction":"add example short instructions here", "fullInstruction":"<html markup>"},
    // include auditLabelAttributeName for label verification jobs
    "auditLabelAttributeName": "myPrevJobLabelAttributeName"
}
```

## Label category configuration file schema
<a name="sms-label-cat-config-attributes-schema"></a>

The following table lists elements you can and must include in your label category configuration file.

**Note**  
The parameter `annotationType` is only supported for video frame labeling jobs. 


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
| frameAttributes |  No  |  A list of JSON objects. **Required Parameters in each JSON Object:** `name`, `type`, `description`. `minimum` and `maximum` are required if `type` is `"number"`. **Optional Parameters in each JSON Object:** `enum`, `editsAllowed`, `isRequired`  | Use this parameter to create a frame attribute that is applied to all frames or 3D point clouds in your labeling job. See the third table in this section for more information.  | 
| categoryGlobalAttributes |  No  |  A list of JSON objects. **Required Parameters in each JSON Object:** `name`, `type`. `minimum` and `maximum` are required if `type` is `"number"`. **Optional Parameters in each JSON Object:** `description`, `enum`, `editsAllowed`, `isRequired`   | Use this parameter to create label category attributes that are applied to all labels you specify in `labels`. See the third table in this section for more information.  | 
| labels |  Yes  |  A list of up to 30 JSON objects. **Required Parameters in each JSON Object:** `label` **Optional Parameters in each JSON Object:** `categoryAttributes`, `editsAllowed`  |  Use this parameter to specify your labels, or classes. Add one `label` for each class. To add a label category attribute to a label, add `categoryAttributes` to that label. Use `editsAllowed` to specify whether or not a label can be edited in an adjustment labeling job. Set `editsAllowed` to `"none"` for verification labeling jobs. See the following table for more information.  | 
| annotationType (only supported for video frame labeling jobs)  |  No   |  String **Accepted Parameters:** `BoundingBox`, `Polyline`, `Polygon`, `Keypoint` **Default:** `BoundingBox`  |  Use this to specify the task type for your video frame labeling jobs. For example, for a polygon video frame object detection task, choose `Polygon`.  If you do not specify an `annotationType` when you create a video frame labeling job, Ground Truth will use `BoundingBox` by default.   | 
| instructions |  No  | A JSON object. **Required Parameters:** `"shortInstruction"`, `"fullInstruction"` |  Use this parameter to add worker instructions to help your workers complete their tasks. For more information about worker instructions, see [Worker instructions](sms-point-cloud-general-information.md#sms-point-cloud-worker-instructions-general). Short instructions must be under 255 characters and full instructions must be under 2,048 characters.  For more information, see [Create instruction pages](sms-creating-instruction-pages.md).  | 
| auditLabelAttributeName |  Required for adjustment and verification task types  |  String  |  Enter the [LabelAttributeName](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName) used in the labeling job you want to adjust annotations of.  Only use this parameter if you are creating an adjustment job for video frame and 3D point cloud object detection, object tracking, or 3D point cloud semantic segmentation.   | 
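As a quick sanity check before creating a labeling job, a few of the required fields from the table above can be verified with a short script. This is a minimal sketch covering only some of the rules; it is not an official Ground Truth validator, and the function name is illustrative.

```python
import json

def validate_label_config(config: dict) -> list:
    """Collect basic problems in a label category configuration dict.

    Checks only a few of the schema rules described above; it is not
    an exhaustive validator.
    """
    problems = []
    if "labels" not in config or not config["labels"]:
        problems.append("'labels' is required and must list at least one label")
    for entry in config.get("labels", []):
        if "label" not in entry:
            problems.append("each entry in 'labels' requires a 'label' name")
    for attr in config.get("frameAttributes", []):
        for key in ("name", "type", "description"):
            if key not in attr:
                problems.append(f"frame attribute missing required '{key}'")
        if attr.get("type") == "number" and not {"minimum", "maximum"} <= attr.keys():
            problems.append("numeric frame attributes require 'minimum' and 'maximum'")
    return problems

config = json.loads("""
{
    "documentVersion": "2020-03-01",
    "frameAttributes": [
        {"name": "visibility", "type": "number", "description": "0-10 rating"}
    ],
    "labels": [{"label": "Car"}]
}
""")
print(validate_label_config(config))
```

Running this flags the numeric frame attribute, which is missing its required `minimum` and `maximum` values.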

### Labels object schema
<a name="sms-labels-schema"></a>

The following table describes the parameters that you can and must use to create the `labels` list. Each parameter should be included in a JSON object. 


****  

| Parameter | Required | Accepted Values | Description | 
| --- | --- | --- | --- | 
| label |  Yes  |  String  |  The name of the label category that is displayed to workers. Each label category name must be unique.  | 
| categoryAttributes |  No  |  A list of JSON objects. **Required Parameters in each JSON Object:** `name`, `type`. `minimum` and `maximum` are required if `type` is `"number"` **Optional Parameters in each JSON Object:** `description`, `enum`, `editsAllowed`, `isRequired`  | Use this parameter to add label category attributes to specific labels you specify in `labels`. To add one or more label category attributes to a label, include the `categoryAttributes` JSON object in the same `labels` JSON object as that `label`. See the following table for more information.  | 
| editsAllowed |  No  |  String **Supported Values**: `"none"`: no modifications are allowed; `"any"` (Default): all modifications are allowed.  |  Specifies whether or not a label can be edited by workers. For video frame or 3D point cloud *adjustment* labeling jobs, add this parameter to one or more JSON objects in the `labels` list to specify whether or not a worker can edit a label. For 3D point cloud and video frame *verification* labeling jobs, add this parameter with the value `"none"` to each JSON object in the `labels` list. This makes all labels uneditable.  | 

### frameAttributes and categoryGlobalAttributes schema
<a name="sms-category-attributes-schema"></a>

The following table describes the parameters that you can and must use to create frame attributes using `frameAttributes`, and label category attributes using the `categoryGlobalAttributes` and `categoryAttributes` parameters.


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
| name |  Yes  |  String  |  Use this parameter to assign a name to your label category or frame attribute. This is the attribute name that workers see. Each label category attribute name in your label category configuration file must be unique. Global label category attributes and label specific label category attributes cannot have the same name.  | 
| type |  Yes  |  String **Supported Values**: `"string"` or `"number"`  |  Use this parameter to define the label category or frame attribute type.  If you specify `"string"` for `type` and provide an `enum` value for this attribute, workers can choose from one of the choices you provide.  If you specify `"string"` for `type` and do not provide an `enum` value, workers can enter free-form text.  If you specify `"number"` for `type`, workers can enter a number between the `minimum` and `maximum` values you specify.   | 
| enum |  No  |  List of strings  |  Use this parameter to define options that workers can choose from for this label category or frame attribute. Workers can choose one value specified in `enum`. For example, if you specify `["foo", "buzz", "bar"]` for `enum`, workers can choose one of `foo`, `buzz`, or `bar`. You must specify `"string"` for `type` to use an `enum` list.  | 
| description |  `frameAttributes`: Yes `categoryAttributes` or `categoryGlobalAttributes`: No  |  String  |  Use this parameter to add a description of the label category or frame attribute. You can use this field to give workers more information about the attribute.  This field is only required for frame attributes.  | 
| minimum and maximum | Required if attribute type is "number" | Integers |  Use these parameters to specify minimum and maximum (inclusive) values workers can enter for numeric label category or frame attributes. You must specify `"number"` for `type` to use `minimum` and `maximum`.  | 
| editsAllowed |  No  |  String **Supported Values**: `"none"`: no modifications are allowed; `"any"` (Default): all modifications are allowed.  |  Specifies whether or not a label category or frame attribute can be edited by workers. For video frame or 3D point cloud *adjustment* and *verification* labeling jobs, add this parameter to label category and frame attribute JSON objects to specify whether or not a worker can edit an attribute.  | 
| isRequired |  No  |  Boolean  |  Specifies whether workers are required to annotate an attribute. Workers cannot submit the job until all required attributes are annotated.  | 

## Label and label category attribute quotas
<a name="sms-point-cloud-label-cat-limits"></a>

You can specify up to 10 label category attributes per class. This 10-attribute quota includes global label category attributes. For example, if you create four global label category attributes, and then assign three label category attributes to label `X`, that label will have 4 + 3 = 7 label category attributes in total. For all label category and label category attribute limits, refer to the following table.


****  

|  Type  |  Min  |  Max  | 
| --- | --- | --- | 
|  Labels (`Labels`)  |  1  |  30  | 
|  Label name character quota  |  1  |  16  | 
|  Label category attributes per label (sum of `categoryAttributes` and `categoryGlobalAttributes`)  |  0  |  10  | 
|  Free form text entry label category attributes per label (sum of `categoryAttributes` and `categoryGlobalAttributes`).   | 0 | 5 | 
|  Frame attributes  |  0  |  10  | 
|  Free form text entry attributes in `frameAttributes`.  | 0 | 5 | 
|  Attribute name character quota (`name`)  |  1  |  16  | 
|  Attribute description character quota (`description`)  |  0  |  128  | 
|  Attribute type characters quota (`type`)  |  1  |  16  | 
|  Allowed values in the `enum` list for a `string` attribute  | 1 | 10 | 
|  Character quota for a value in `enum` list  | 1 | 16 | 
| Maximum characters in free form text response for free form text frameAttributes | 0 | 1000 | 
| Maximum characters in free form text response for free form text categoryAttributes and categoryGlobalAttributes | 0 | 80 | 
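The per-label quota can be checked programmatically before you submit a configuration. The sketch below counts attributes the way the quota is defined above (a label's own `categoryAttributes` plus every `categoryGlobalAttributes` entry); the helper name and example values are illustrative, not part of the Ground Truth API.

```python
def attributes_per_label(config: dict) -> dict:
    """Count label category attributes per label: its own
    'categoryAttributes' plus every 'categoryGlobalAttributes' entry.
    Each total must stay at or below the quota of 10."""
    n_global = len(config.get("categoryGlobalAttributes", []))
    return {
        entry["label"]: n_global + len(entry.get("categoryAttributes", []))
        for entry in config.get("labels", [])
    }

# Four global attributes plus three attributes on label "X": 4 + 3 = 7.
config = {
    "categoryGlobalAttributes": [{"name": f"g{i}"} for i in range(4)],
    "labels": [
        {"label": "X", "categoryAttributes": [{"name": f"a{i}"} for i in range(3)]}
    ],
}
print(attributes_per_label(config))  # {'X': 7}
```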

# Use input and output data
<a name="sms-data"></a>

The input data that you provide to Amazon SageMaker Ground Truth is sent to your workers for labeling. You choose the data to send to your workers by creating a single manifest file that defines all of the data that requires labeling or by sending input data objects to an ongoing, streaming labeling job to be labeled in real time. 

The output data is the result of your labeling job. The output data file, or *augmented manifest file*, contains label data for each object you send to the labeling job and metadata about the label assigned to data objects.

When you use the image classification (single and multi-label), text classification (single and multi-label), object detection, and semantic segmentation built-in task types to create a labeling job, you can use the resulting augmented manifest file to launch a SageMaker training job. For a demonstration of how to use an augmented manifest to train an object detection machine learning model with Amazon SageMaker AI, see [object\_detection\_augmented\_manifest\_training.ipynb](https://sagemaker-examples.readthedocs.io/en/latest/ground_truth_labeling_jobs/object_detection_augmented_manifest_training/object_detection_augmented_manifest_training.html). For more information, see [Augmented Manifest Files for Training Jobs](augmented-manifest.md).

**Topics**
+ [Input data](sms-data-input.md)
+ [3D Point Cloud Input Data](sms-point-cloud-input-data.md)
+ [Video Frame Input Data](sms-video-frame-input-data-overview.md)
+ [Labeling job output data](sms-data-output.md)

# Input data
<a name="sms-data-input"></a>

The input data are the data objects that you send to your workforce to be labeled. There are two ways to send data objects to Ground Truth for labeling: 
+ Send a list of data objects that require labeling using an input manifest file.
+ Send individual data objects in real time to a perpetually running, streaming labeling job. 

If you have a dataset that needs to be labeled one time, and you do not require an ongoing labeling job, create a standard labeling job using an input manifest file. 

If you want to regularly send new data objects to your labeling job after it has started, create a streaming labeling job. When you create a streaming labeling job, you can optionally use an input manifest file to specify a group of data that you want labeled immediately when the job starts. You can continuously send new data objects to a streaming labeling job as long as it is active. 

**Note**  
Streaming labeling jobs are only supported through the SageMaker API. You cannot create a streaming labeling job using the SageMaker AI console.

The following task types have special input data requirements and options:
+ For [3D point cloud](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud.html) labeling job input data requirements, see [3D Point Cloud Input Data](sms-point-cloud-input-data.md). 
+ For [video frame](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-video-task-types.html) labeling job input data requirements, see [Video Frame Input Data](sms-video-frame-input-data-overview.md).

**Topics**
+ [Input manifest files](sms-input-data-input-manifest.md)
+ [Automate data setup for labeling jobs](sms-console-create-manifest-file.md)
+ [Supported data formats](sms-supported-data-formats.md)
+ [Ground Truth streaming labeling jobs](sms-streaming-labeling-job.md)
+ [Input Data Quotas](input-data-limits.md)
+ [Select Data for Labeling](sms-data-filtering.md)

# Input manifest files
<a name="sms-input-data-input-manifest"></a>

Each line in an input manifest file is an entry containing an object, or a reference to an object, to label. An entry can also contain labels from previous jobs and for some task types, additional information. 

Input data and the manifest file must be stored in Amazon Simple Storage Service (Amazon S3). Each has specific storage and access requirements, as follows:
+ The Amazon S3 bucket that contains the input data must be in the same AWS Region in which you are running Amazon SageMaker Ground Truth. You must give Amazon SageMaker AI access to the data stored in the Amazon S3 bucket so that it can read it. For more information about Amazon S3 buckets, see [Working with Amazon S3 buckets](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html). 
+ The manifest file must be in the same AWS Region as the data files, but it doesn't need to be in the same location as the data files. It can be stored in any Amazon S3 bucket that is accessible to the AWS Identity and Access Management (IAM) role that you assigned to Ground Truth when you created the labeling job.

**Note**  
3D point cloud and video frame [task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html) have different input manifest requirements and attributes.   
For [3D point cloud task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud.html), refer to [Input Manifest Files for 3D Point Cloud Labeling Jobs](sms-point-cloud-input-manifest.md).  
For [video frame task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-video-task-types.html), refer to [Create a Video Frame Input Manifest File](sms-video-manual-data-setup.md#sms-video-create-manifest).

The manifest is a UTF-8 encoded file in which each line is a complete and valid JSON object. Each line is delimited by a standard line break, `\n` or `\r\n`. Because each line must be a valid JSON object, you can't have unescaped line break characters. For more information about the data format, see [JSON Lines](http://jsonlines.org/).

Each JSON object in the manifest file can be no larger than 100,000 characters. No single attribute within an object can be larger than 20,000 characters. Attribute names can't begin with `$` (dollar sign).

Each JSON object in the manifest file must contain one of the following keys: `source-ref` or `source`. The value of the keys are interpreted as follows:
+ `source-ref` – The source of the object is the Amazon S3 object specified in the value. Use this value when the object is a binary object, such as an image.
+ `source` – The source of the object is the value. Use this value when the object is a text value.
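The size limits and key requirements above can be checked line by line before you upload a manifest. The following is a rough sketch under the stated limits, not a complete Ground Truth validator; the function name is illustrative.

```python
import json

def check_manifest_line(line: str) -> list:
    """Return a list of problems found in one input manifest line."""
    problems = []
    if len(line) > 100_000:
        problems.append("line exceeds 100,000 characters")
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return problems + ["line is not valid JSON"]
    if not ({"source", "source-ref"} & obj.keys()):
        problems.append("line needs a 'source' or 'source-ref' key")
    for key, value in obj.items():
        if key.startswith("$"):
            problems.append(f"attribute name '{key}' begins with '$'")
        if len(json.dumps(value)) > 20_000:
            problems.append(f"attribute '{key}' exceeds 20,000 characters")
    return problems

print(check_manifest_line('{"source-ref": "s3://amzn-s3-demo-bucket1/image.png"}'))  # []
```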



The following is an example of a manifest file for files stored in an Amazon S3 bucket:

```
{"source-ref": "S3 bucket location 1"}
{"source-ref": "S3 bucket location 2"}
   ...
{"source-ref": "S3 bucket location n"}
```

Use the `source-ref` key for image files for bounding box, image classification (single and multi-label), semantic segmentation, and video clips for video classification labeling jobs. 3D point cloud and video frame labeling jobs also use the `source-ref` key but these labeling jobs require additional information in the input manifest file. For more information see [3D Point Cloud Input Data](sms-point-cloud-input-data.md) and [Video Frame Input Data](sms-video-frame-input-data-overview.md).

The following is an example of a manifest file with the input data stored in the manifest:

```
{"source": "Lorem ipsum dolor sit amet"}
{"source": "consectetur adipiscing elit"}
   ...
{"source": "mollit anim id est laborum"}
```

Use the `source` key for single and multi-label text classification and named entity recognition labeling jobs. 

You can include other key-value pairs in the manifest file. These pairs are passed to the output file unchanged. This is useful when you want to pass information between your applications. For more information, see [Labeling job output data](sms-data-output.md).
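For example, a manifest that carries an extra metadata pair through to the output could be written like this. The bucket, object keys, and the `capture-date` attribute are placeholders, not names Ground Truth requires.

```python
import json

# Each entry has the required "source-ref" key plus an extra pair
# ("capture-date") that Ground Truth passes through to the output
# manifest unchanged.
entries = [
    {"source-ref": "s3://amzn-s3-demo-bucket1/images/0001.png", "capture-date": "2024-05-01"},
    {"source-ref": "s3://amzn-s3-demo-bucket1/images/0002.png", "capture-date": "2024-05-02"},
]

# One JSON object per line (JSON Lines format).
manifest = "\n".join(json.dumps(entry) for entry in entries)
print(manifest)
```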

# Automate data setup for labeling jobs
<a name="sms-console-create-manifest-file"></a>

You can use the automated data setup to create manifest files for your labeling jobs in the Ground Truth console using images, videos, video frames, text (.txt) files, and comma-separated value (.csv) files stored in Amazon S3. When you use automated data setup, you specify an Amazon S3 location where your input data is stored and the input data type, and Ground Truth looks for the files that match that type in the location you specify.

**Note**  
Ground Truth does not use an AWS KMS key to access your input data or write the input manifest file in the Amazon S3 location that you specify. The user or role that creates the labeling job must have permissions to access your input data objects in Amazon S3.

Before using the following procedure, ensure that your input images or files are correctly formatted:
+ Image files – Image files must comply with the size and resolution limits listed in the tables found in [Input File Size Quota](input-data-limits.md#input-file-size-limit). 
+ Text files – Text data can be stored in one or more .txt files. Each item that you want labeled must be separated by a standard line break. 
+ CSV files – Text data can be stored in one or more .csv files. Each item that you want labeled must be in a separate row.
+ Videos – Video files can be any of the following formats: .mp4, .ogg, and .webm. If you want to extract video frames from your video files for object detection or object tracking, see [Provide Video Files](sms-point-cloud-video-input-data.md#sms-point-cloud-video-frame-extraction).
+ Video frames – Video frames are images extracted from a video. All images extracted from a single video are referred to as a *sequence of video frames*. Each sequence of video frames must have a unique prefix key in Amazon S3. See [Provide Video Frames](sms-point-cloud-video-input-data.md#sms-video-provide-frames). For this data type, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md).

**Important**  
For video frame object detection and video frame object tracking labeling jobs, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md) to learn how to use the automated data setup. 

Use these instructions to automatically set up your input dataset connection with Ground Truth.

**Automatically connect your data in Amazon S3 with Ground Truth**

1. Navigate to the **Create labeling job** page in the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/). 

   This link puts you in the US East (N. Virginia) (us-east-1) AWS Region. If your input data is in an Amazon S3 bucket in another Region, switch to that Region. To change your AWS Region, on the [navigation bar](https://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/getting-started.html#select-region), choose the name of the currently displayed Region.

1. Select **Create labeling job**.

1. Enter a **Job name**. 

1. In the section **Input data setup**, select **Automated data setup**.

1. Enter an Amazon S3 URI for **S3 location for input datasets**. 

1. Specify your **S3 location for output datasets**. This is where your output data is stored. 

1. Choose your **Data type** using the dropdown list.

1. Use the dropdown menu under **IAM Role** to select an execution role. If you select **Create a new role**, specify the Amazon S3 buckets that you want to grant this role permission to access. This role must have permission to access the S3 buckets that you specified in steps 5 and 6.

1. Select **Complete data setup**.

This creates an input manifest file in the Amazon S3 location for input datasets that you specified in step 5. If you are creating a labeling job using the SageMaker API, the AWS CLI, or an AWS SDK, use the Amazon S3 URI of this input manifest file as the value for the `ManifestS3Uri` parameter. 
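For reference, the manifest URI is supplied inside the `InputConfig` of a `CreateLabelingJob` request. The sketch below shows only that portion; the bucket and manifest file names are placeholders, and a real request also needs `HumanTaskConfig`, `LabelAttributeName`, and other parameters.

```python
# Input-configuration portion of a CreateLabelingJob request. The
# manifest URI points at the file produced by the automated data setup.
input_config = {
    "DataSource": {
        "S3DataSource": {
            "ManifestS3Uri": "s3://amzn-s3-demo-bucket1/dataset-20240501T120000.manifest"
        }
    }
}

# With boto3 this dict would be passed as InputConfig, for example:
#   import boto3
#   sagemaker = boto3.client("sagemaker")
#   sagemaker.create_labeling_job(..., InputConfig=input_config, ...)
print(input_config["DataSource"]["S3DataSource"]["ManifestS3Uri"])
```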

The following GIF demonstrates how to use the automated data setup for image data. This example creates a file, `dataset-YYMMDDTHHmmSS.manifest`, in the Amazon S3 bucket `example-groundtruth-images`, where `YYMMDDTHHmmSS` indicates the year (`YY`), month (`MM`), day (`DD`), and time in hours (`HH`), minutes (`mm`), and seconds (`ss`) at which the input manifest file was created. 

![\[GIF showing how to use the automated data setup for image data.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/automated-data-setup.gif)


# Supported data formats
<a name="sms-supported-data-formats"></a>

When you manually create an input manifest file for a [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html), your input data must be in one of the following supported file formats for the respective input data type. To learn about automated data setup, see [Automate data setup for labeling jobs](sms-console-create-manifest-file.md).

**Tip**  
When you use the automated data setup, additional data formats can be used to generate an input manifest file for video frame and text based task types.


****  

| Task Types | Input Data Type | Supported Formats | Example Input Manifest Line | 
| --- | --- | --- | --- | 
|  Bounding Box, Semantic Segmentation, Image Classification (Single Label and Multi-label), Verify and Adjust Labels  |  Image  |  .jpg, .jpeg, .png  |  <pre>{"source-ref": "s3://amzn-s3-demo-bucket1/example-image.png"}</pre>  | 
|  Named Entity Recognition, Text Classification (Single and Multi-Label)  | Text | Raw text |  <pre>{"source": "Lorem ipsum dolor sit amet"}</pre>  | 
|  Video Classification  | Video clips | .mp4, .ogg, and .webm |  <pre>{"source-ref": "s3://amzn-s3-demo-bucket1/example-video.mp4"}</pre>  | 
| Video Frame Object Detection, Video Frame Object Tracking (bounding boxes, polylines, polygons or keypoint) | Video frames and video frame sequence files (for Object Tracking) |  **Video frames**: .jpg, .jpeg, .png **Sequence files**: .json  | Refer to [Create a Video Frame Input Manifest File](sms-video-manual-data-setup.md#sms-video-create-manifest). | 
|  3D Point Cloud Semantic Segmentation, 3D Point Cloud Object Detection, 3D Point Cloud Object Tracking  | Point clouds and point cloud sequence files (for Object Tracking) |  **Point clouds**: Binary pack format and ASCII. For more information see [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md). **Sequence files**: .json  | Refer to [Input Manifest Files for 3D Point Cloud Labeling Jobs](sms-point-cloud-input-manifest.md). | 

# Ground Truth streaming labeling jobs
<a name="sms-streaming-labeling-job"></a>

If you want to perpetually send new data objects to Amazon SageMaker Ground Truth to be labeled, use a streaming labeling job. Streaming labeling jobs allow you to:
+ Send new dataset objects to workers in real time using a perpetually running labeling job. Workers continuously receive new data objects to label as long as the labeling job is active and new objects are being sent to it.
+ Gain visibility into the number of objects that have been queued and are waiting to be labeled. Use this information to control the flow of data objects sent to your labeling job.
+ Receive label data for individual data objects in real time as workers finish labeling them. 

Ground Truth streaming labeling jobs remain active until they are manually stopped or have been idle for more than 10 days. You can intermittently send new data objects to workers while the labeling job is active.

If you are a new user of Ground Truth streaming labeling jobs, it is recommended that you review [How it works](#sms-streaming-how-it-works). 

Use [Create a streaming labeling job](sms-streaming-create-job.md) to learn how to create a streaming labeling job.

**Note**  
Ground Truth streaming labeling jobs are only supported through the SageMaker API.

## How it works
<a name="sms-streaming-how-it-works"></a>

When you create a Ground Truth streaming labeling job, the job remains active until it is manually stopped, remains idle for more than 10 days, or is unable to access input data sources. You can intermittently send new data objects to workers while it is active. A worker continues to receive new data objects in real time as long as the total number of tasks currently available to the worker is less than the value of [`MaxConcurrentTaskCount`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-MaxConcurrentTaskCount). Otherwise, the data object is sent to a queue that Ground Truth creates on your behalf in [Amazon Simple Queue Service](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/welcome.html) (Amazon SQS) for later processing. These tasks are sent to workers as soon as the total number of tasks currently available to a worker falls below `MaxConcurrentTaskCount`. If a data object is not sent to a worker within 14 days, it expires. You can view the number of tasks pending in the queue and adjust the number of objects you send to the labeling job. For example, you might decrease the speed at which you send objects to the labeling job if the backlog of pending objects rises above a threshold. 
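The backlog-based flow control described above can be sketched as a pure decision helper. The threshold and batch size here are application-side choices you make, not Ground Truth parameters:

```python
def objects_to_send(pending_tasks, backlog_threshold, batch_size):
    """Decide how many new data objects to publish this cycle.

    Sends a full batch while the queued backlog is below the threshold
    and nothing once it reaches it. Both knobs are application-specific
    choices, not Ground Truth API parameters.
    """
    if pending_tasks >= backlog_threshold:
        return 0
    return min(batch_size, backlog_threshold - pending_tasks)
```

In practice, you would read the pending-task count from the labeling job's Amazon SQS queue before each send cycle.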

**Topics**
+ [How it works](#sms-streaming-how-it-works)
+ [Send data to a streaming labeling job](sms-streaming-how-it-works-send-data.md)
+ [Manage labeling requests with an Amazon SQS queue](sms-streaming-how-it-works-sqs.md)
+ [Receive output data from a streaming labeling job](sms-streaming-how-it-works-output-data.md)
+ [Duplicate message handling](sms-streaming-impotency.md)

# Send data to a streaming labeling job
<a name="sms-streaming-how-it-works-send-data"></a>

You can optionally submit input data to a streaming labeling job one time when you create the labeling job using an input manifest file. Once the labeling job has started and the state is `InProgress`, you can submit new data objects to your labeling job in real time using your Amazon SNS input topic and Amazon S3 event notifications. 

***Submit Data Objects When You Start the Labeling Job (One Time):***
+ **Use an Input Manifest File** – You can optionally specify an input manifest file Amazon S3 URI in `ManifestS3Uri` when you create the streaming labeling job. Ground Truth sends each data object in the manifest file to workers for labeling as soon as the labeling job starts. To learn more, see [Create a Manifest File (Optional)](sms-streaming-manifest.md).

  After you submit a request to create the streaming labeling job, its status will be `Initializing`. Once the labeling job is active, the state changes to `InProgress` and you can start using the real-time options to submit additional data objects for labeling. 

***Submit Data Objects in Real Time:***
+ **Send data objects using Amazon SNS messages** – You can send Ground Truth new data objects to label by sending an Amazon SNS message. You will send this message to an Amazon SNS input topic that you create and specify when you create your streaming labeling job. For more information, see [Send data objects using Amazon SNS](#sms-streaming-how-it-works-sns).
+ **Send data objects by placing them in an Amazon S3 bucket** – Each time you add a new data object to an Amazon S3 bucket, you can prompt Ground Truth to process that object for labeling. To do this, you add an event notification to the bucket so that it notifies your Amazon SNS input topic each time a new object is added to (or *created in*) that bucket. For more information, see [Send data objects using Amazon S3](#sms-streaming-how-it-works-s3). This option is not available for text-based labeling jobs such as text classification and named entity recognition. 
**Important**  
If you use the Amazon S3 configuration, do not use the same Amazon S3 location for your input data configuration and your output data. You specify the S3 prefix for your output data when you create a labeling job.

## Send data objects using Amazon SNS
<a name="sms-streaming-how-it-works-sns"></a>

You can send data objects to your streaming labeling job using Amazon Simple Notification Service (Amazon SNS). Amazon SNS is a web service that coordinates and manages the delivery of messages to and from *endpoints* (for example, an email address or AWS Lambda function). An Amazon SNS *topic* acts as a communication channel between two or more endpoints. You use Amazon SNS to send, or *publish*, new data objects to the topic specified in the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) parameter `SnsTopicArn` in `InputConfig`. The format of these messages is the same as a single line from an [input manifest file](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-input.html). 

For example, you may send a piece of text to an active text classification labeling job by publishing it to your input topic. The message that you publish may look similar to the following:

```
{"source": "Lorem ipsum dolor sit amet"}
```

To send a new image object to an image classification labeling job, your message may look similar to the following:

```
{"source-ref": "s3://amzn-s3-demo-bucket/example-image.jpg"}
```
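A minimal sketch of constructing such a message body (the helper name is hypothetical; it simply mirrors one line of an input manifest file, using `source-ref` for S3 URIs and `source` for inline text):

```python
import json

def sns_message_for(data_object):
    """Build the message body for one data object: S3 URIs use the
    "source-ref" key; inline raw text uses "source"."""
    key = "source-ref" if data_object.startswith("s3://") else "source"
    return json.dumps({key: data_object})
```

You would then publish the resulting string to your Amazon SNS input topic, for example with the boto3 Amazon SNS `publish` operation.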

**Note**  
You can also include custom deduplication IDs and deduplication keys in your Amazon SNS messages. To learn more, see [Duplicate message handling](sms-streaming-impotency.md).

When Ground Truth creates your streaming labeling job, it subscribes to your Amazon SNS input topic. 

## Send data objects using Amazon S3
<a name="sms-streaming-how-it-works-s3"></a>

You can send one or more new data objects to a streaming labeling job by placing them in an Amazon S3 bucket that is configured with an Amazon SNS event notification. You can set up an event to notify your Amazon SNS input topic anytime a new object is created in your bucket. You must specify this same Amazon SNS input topic in the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) parameter `SnsTopicArn` in `InputConfig`.

Anytime you configure an Amazon S3 bucket to send notifications to Amazon SNS, Amazon S3 publishes a test event, `"s3:TestEvent"`, to ensure that the topic exists and that the owner of the specified Amazon S3 bucket has permission to publish to the topic. We recommend that you set up your Amazon S3 connection with Amazon SNS before starting a streaming labeling job. If you do not, this test event may register as a data object and be sent to Ground Truth for labeling. 

**Important**  
If you use the Amazon S3 configuration, do not use the same Amazon S3 location for your input data configuration and your output data. You specify the S3 prefix for your output data when you create a labeling job.  
For image-based labeling jobs, Ground Truth requires all S3 buckets to have a CORS policy attached. To learn more, see [CORS Requirement for Input Image Data](sms-cors-update.md).

Once you have configured your Amazon S3 bucket and created your labeling job, you can add objects to your bucket and Ground Truth either sends that object to workers or places it on your Amazon SQS queue. 

To learn more, see [Create Amazon S3 bucket event notifications for the Amazon SNS input topic defined in your labeling job](sms-streaming-s3-setup.md).
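As a sketch, the bucket's notification configuration points object-created events at your Amazon SNS input topic. The topic ARN and bucket name below are placeholders, and the boto3 call is shown commented out because it requires AWS credentials and an existing bucket:

```python
# Illustrative notification configuration; the topic ARN and bucket name
# are placeholders, not values from this guide.
notification_config = {
    "TopicConfigurations": [
        {
            "TopicArn": "arn:aws:sns:us-east-1:111122223333:MyInputTopic",
            "Events": ["s3:ObjectCreated:*"],
        }
    ]
}

# Applying the configuration (requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="amzn-s3-demo-bucket",
#     NotificationConfiguration=notification_config,
# )
```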

**Important**  
This option is not available for text-based labeling jobs such as text classification and named entity recognition.

# Manage labeling requests with an Amazon SQS queue
<a name="sms-streaming-how-it-works-sqs"></a>

When Ground Truth creates your streaming labeling job, it creates an Amazon SQS queue in the AWS account used to create the labeling job. The queue name is `GroundTruth-labeling_job_name` where `labeling_job_name` is the name of your labeling job, in lowercase letters. When you send data objects to your labeling job, Ground Truth either sends the data objects directly to workers or places the task in your queue to be processed at a later time. If a data object is not sent to a worker after 14 days, it expires and is removed from the queue. You can set up an alarm in Amazon SQS to detect when objects expire and use this mechanism to control the volume of objects you send to your labeling job.
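Because the queue name is derived mechanically from the labeling job name, you can compute it when you need to look the queue up. A sketch, assuming only the naming rule stated above:

```python
def ground_truth_queue_name(labeling_job_name):
    """Derive the Amazon SQS queue name Ground Truth creates for a
    streaming labeling job: "GroundTruth-" plus the job name in lowercase."""
    return "GroundTruth-" + labeling_job_name.lower()
```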

**Important**  
Modifying, deleting, or sending objects directly to the Amazon SQS queue associated with your streaming labeling job may lead to job failures. 

# Receive output data from a streaming labeling job
<a name="sms-streaming-how-it-works-output-data"></a>

Your Amazon S3 output bucket is periodically updated with new output data from your streaming labeling job. Optionally, you can specify an Amazon SNS output topic. Each time a worker submits a labeled object, a notification with the output data is sent to that topic. You can subscribe an endpoint to your SNS output topic to receive notifications or trigger events when you receive output data from a labeling task. Use an Amazon SNS output topic if you want to chain in real time to another streaming job and receive an Amazon SNS notification each time a data object is submitted by a worker.

To learn more, see [Subscribe an Endpoint to Your Amazon SNS Output Topic](sms-create-sns-input-topic.md#sms-streaming-subscribe-output-topic).

# Duplicate message handling
<a name="sms-streaming-impotency"></a>

For data objects sent in real time, Ground Truth guarantees idempotency by ensuring each unique object is only sent for labeling once, even if the input message referring to that object is received multiple times (duplicate messages). To do this, each data object sent to a streaming labeling job is assigned a *deduplication ID*, which is identified with a *deduplication key*. If you send your requests to label data objects directly through your Amazon SNS input topic using Amazon SNS messages, you can optionally choose a custom deduplication key and deduplication IDs for your objects. For more information, see [Specify a deduplication key and ID in an Amazon SNS message](sms-streaming-impotency-create.md).

If you do not provide your own deduplication key, or if you use the Amazon S3 configuration to send data objects to your labeling job, Ground Truth uses one of the following for the deduplication ID:
+ For messages sent directly to your Amazon SNS input topic, Ground Truth uses the SNS message ID. 
+ For messages that come from an Amazon S3 configuration, Ground Truth creates a deduplication ID by combining the Amazon S3 URI of the object with the [sequencer token](https://docs.aws.amazon.com/AmazonS3/latest/dev/notification-content-structure.html) in the message.

# Specify a deduplication key and ID in an Amazon SNS message
<a name="sms-streaming-impotency-create"></a>

When you send a data object to your streaming labeling job using an Amazon SNS message, you have the option to specify your deduplication key and deduplication ID in one of the following ways. In all of these scenarios, identify your deduplication key with `dataset-objectid-attribute-name`.

**Bring Your Own Deduplication Key and ID**

Create your own deduplication key and deduplication ID by configuring your Amazon SNS message as follows. Replace `byo-key` with your key and `UniqueId` with the deduplication ID for that data object.

```
{
    "source-ref":"s3://amzn-s3-demo-bucket/prefix/object1", 
    "dataset-objectid-attribute-name":"byo-key",
    "byo-key":"UniqueId" 
}
```

Your deduplication key can be up to 140 characters. Supported patterns include: `"^[$a-zA-Z0-9](-*[a-zA-Z0-9])*"`.

Your deduplication ID can be up to 1,024 characters. Supported patterns include: `^(https|s3)://([^/]+)/?(.*)$`.
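A small validator for the limits and patterns above might look like the following. Note that the guide says supported patterns *include* these, so this check may be stricter than the service itself:

```python
import re

# Patterns quoted from the limits above.
KEY_PATTERN = re.compile(r"[$a-zA-Z0-9](-*[a-zA-Z0-9])*")
ID_PATTERN = re.compile(r"(https|s3)://([^/]+)/?(.*)")

def validate_dedup(key, dedup_id):
    """Check a custom deduplication key and ID against the documented
    length limits (140 and 1,024 characters) and patterns."""
    return (len(key) <= 140
            and KEY_PATTERN.fullmatch(key) is not None
            and len(dedup_id) <= 1024
            and ID_PATTERN.fullmatch(dedup_id) is not None)
```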

**Use an Existing Key for your Deduplication Key**

You can use an existing key in your message as the deduplication key. When you do this, the value associated with that key is used for the deduplication ID. 

For example, you can use the `source-ref` key as your deduplication key by formatting your message as follows: 

```
{
    "source-ref":"s3://amzn-s3-demo-bucket/prefix/object1",
    "dataset-objectid-attribute-name":"source-ref" 
}
```

In this example, Ground Truth uses `"s3://amzn-s3-demo-bucket/prefix/object1"` as the deduplication ID.

# Find deduplication key and ID in your output data
<a name="sms-streaming-impotency-output"></a>

You can see the deduplication key and ID in your output data. The deduplication key is identified by `dataset-objectid-attribute-name`. When you use your own custom deduplication key, your output contains something similar to the following:

```
"dataset-objectid-attribute-name": "byo-key",
"byo-key": "UniqueId",
```

When you do not specify a key, you can find the deduplication ID that Ground Truth assigned to your data object as follows. The `$label-attribute-name-object-id` parameter identifies your deduplication ID. 

```
{
    "source-ref":"s3://bucket/prefix/object1", 
    "dataset-objectid-attribute-name":"$label-attribute-name-object-id",
    "label-attribute-name":0,
    "label-attribute-name-metadata": {...},
    "$label-attribute-name-object-id":"<service-generated-key>"
}
```

For `<service-generated-key>`, if the data object came through an Amazon S3 configuration, Ground Truth adds a unique value used by the service and emits a new field keyed by `$sequencer` that shows the Amazon S3 sequencer used. If the object was sent directly to Amazon SNS, Ground Truth uses the SNS message ID.

**Note**  
Do not use the `$` character in your label attribute name. 
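Because the output record names its own deduplication key, retrieving the ID is a two-step lookup. A sketch (the record below is an abbreviated, hypothetical output line):

```python
def dedup_id_from_output(record):
    """The output record names its deduplication key under
    "dataset-objectid-attribute-name"; the ID is that key's value."""
    return record[record["dataset-objectid-attribute-name"]]

# Abbreviated, hypothetical output line:
record = {
    "source-ref": "s3://bucket/prefix/object1",
    "dataset-objectid-attribute-name": "$label-attribute-name-object-id",
    "$label-attribute-name-object-id": "service-generated-key",
}
```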

# Input Data Quotas
<a name="input-data-limits"></a>

Input datasets used in semantic segmentation labeling jobs have a quota of 20,000 items. For all other labeling job types, the dataset size quota is 100,000 items. To request a quota increase for labeling jobs other than semantic segmentation jobs, see the procedures in [AWS Service Quotas](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html).

Input image data for active and non-active learning labeling jobs must not exceed size and resolution quotas. *Active learning* refers to labeling jobs that use [automated data labeling](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html). *Non-active learning* refers to labeling jobs that don't use automated data labeling.

Additional quotas apply for label categories for all task types, and for input data and labeling category attributes for 3D point cloud and video frame task types. 

## Input File Size Quota
<a name="input-file-size-limit"></a>

Input files can't exceed the following size quotas for both active and non-active learning labeling jobs. There is no input file size quota for videos used in [video classification](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-video-classification.html) labeling jobs.


| Labeling Job Task Type | Input File Size Quota | 
| --- | --- | 
| Image classification | 40 MB | 
| Bounding box (Object detection) | 40 MB | 
| Semantic segmentation | 40 MB | 
| Bounding box (Object detection) label adjustment | 40 MB | 
| Semantic segmentation label adjustment | 40 MB | 
| Bounding box (Object detection) label verification | 40 MB | 
| Semantic segmentation label verification | 40 MB | 

## Input Image Resolution Quotas
<a name="non-active-learning-input-data-limits"></a>

Image file resolution refers to the number of pixels in an image, and determines the amount of detail an image holds. Image resolution quotas differ depending on the labeling job type and the SageMaker AI built-in algorithm used. The following table lists the resolution quotas for images used in active and non-active learning labeling jobs.


| Labeling Job Task Type | Resolution Quota - Non-Active Learning | Resolution Quota - Active Learning | 
| --- | --- | --- | 
| Image classification | 100 million pixels | 3840 x 2160 pixels (4K) | 
| Bounding box (Object detection) | 100 million pixels | 3840 x 2160 pixels (4K) | 
| Semantic segmentation | 100 million pixels | 1920 x 1080 pixels (1080p) | 
| Object detection label adjustment | 100 million pixels | 3840 x 2160 pixels (4K) | 
| Semantic segmentation label adjustment | 100 million pixels | 1920 x 1080 pixels (1080p) | 
| Object detection label verification | 100 million pixels | Not available | 
| Semantic segmentation label verification | 100 million pixels | Not available | 
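A sketch of checking an image against these quotas. The table does not state whether the active-learning limits apply per dimension or to total pixel count; this helper assumes per-dimension limits:

```python
def within_resolution_quota(width, height, active_learning, semantic_segmentation=False):
    """Check an image's resolution against the quotas in the table above.

    Non-active learning: at most 100 million pixels total.
    Active learning: 3840x2160 (4K) for most task types, 1920x1080
    (1080p) for semantic segmentation; per-dimension limits are an
    assumption made by this sketch.
    """
    if not active_learning:
        return width * height <= 100_000_000
    max_w, max_h = (1920, 1080) if semantic_segmentation else (3840, 2160)
    return width <= max_w and height <= max_h
```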

## Label Category Quotas
<a name="sms-label-quotas"></a>

Each labeling job task type has a quota for the number of label categories you can specify. Workers select label categories to create annotations. For example, you may specify label categories *car*, *pedestrian*, and *biker* when creating a bounding box labeling job and workers will select the *car* category before drawing bounding boxes around cars.

**Important**  
Label category names cannot exceed 256 characters.   
All label categories must be unique. You cannot specify duplicate label categories. 

The following label category limits apply to labeling jobs. Quotas for label categories depend on whether you use the SageMaker API operation `CreateLabelingJob` or the console to create a labeling job.



| Labeling Job Task Type | Label Category Quota - API | Label Category Quota - Console | 
| --- | --- | --- | 
| Image classification (Multi-label) | 50 | 50 | 
| Image classification (Single label) | Unlimited | 30 | 
| Bounding box (Object detection) | 50 | 50 | 
| Label verification | Unlimited | 30 | 
| Semantic segmentation (with active learning) | 20 | 10 | 
| Semantic segmentation (without active learning) | Unlimited | 10 | 
| Named entity recognition | Unlimited | 30 | 
| Text classification (Multi-label) | 50 | 50 | 
| Text classification (Single label) | Unlimited | 30 | 
| Video classification | 30 | 30 | 
| Video frame object detection | 30 | 30 | 
| Video frame object tracking | 30 | 30 | 
| 3D point cloud object detection | 30 | 30 | 
| 3D point cloud object tracking | 30 | 30 | 
| 3D point cloud semantic segmentation | 30 | 30 | 
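The uniqueness, length, and count rules above can be checked before calling `CreateLabelingJob`. A minimal sketch (pass the quota for your task type from the table):

```python
def validate_label_categories(categories, quota):
    """Check label categories against the rules above: each name at most
    256 characters, all names unique, and the count within the quota."""
    return (len(categories) <= quota
            and len(set(categories)) == len(categories)
            and all(0 < len(c) <= 256 for c in categories))
```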

## Generative AI Labeling Job Quotas
<a name="gen-ai-labeling-job-quotas"></a>

The following quotas apply for question-answer pairs that you provide in the labeling application.


| Quota Type | Data Quota | 
| --- | --- | 
| Question-answer pairs | Minimum is one pair. Maximum is 20 pairs. | 
| Word count of a question | Minimum is one word. Maximum is 200 words. | 
| Word count of an answer | Minimum is one word. Maximum is 200 words. | 

## 3D Point Cloud and Video Frame Labeling Job Quotas
<a name="sms-input-data-quotas-other"></a>

The following quotas apply for 3D point cloud and video frame labeling job input data.



| Labeling Job Task Type | Input Data Quota | 
| --- | --- | 
| Video frame object detection  |  2,000 video frames (images) per sequence  | 
| Video frame object detection  |  10 video frame sequences per manifest file | 
| Video frame object tracking |  2,000 video frames (images) per sequence  | 
| Video frame object tracking |  10 video frame sequences per manifest file | 
| 3D point cloud object detection |  100,000 point cloud frames per labeling job | 
| 3D point cloud object tracking |  100,000 point cloud frame sequences per labeling job | 
| 3D point cloud object tracking |  500 point cloud frames in each sequence file | 

When you create a video frame or 3D point cloud labeling job, you can add one or more *label category attributes* to each label category that you specify to have workers provide more information about an annotation.

Each label category attribute has a single label category attribute `name`, and a list of one or more options (values) to choose from. To learn more, see [Worker user interface (UI)](sms-point-cloud-general-information.md#sms-point-cloud-worker-task-ui) for 3D point cloud labeling jobs and [Worker user interface (UI)](sms-video-overview.md#sms-video-worker-task-ui) for video frame labeling jobs. 

The following quotas apply to the number of label category attribute names and values you can specify for labeling jobs.



| Labeling Job Task Type | Label Category Attribute (name) Quota | Label Category Attribute Values Quota | 
| --- | --- | --- | 
| Video frame object detection  | 10 | 10 | 
| Video frame object tracking | 10 | 10 | 
| 3D point cloud object detection | 10 | 10 | 
| 3D point cloud object tracking | 10 | 10 | 
| 3D point cloud semantic segmentation | 10 | 10 | 

# Select Data for Labeling
<a name="sms-data-filtering"></a>

You can use the Amazon SageMaker AI console to select a portion of your dataset for labeling. The data must be stored in an Amazon S3 bucket. You have three options:
+ Use the full dataset.
+ Choose a randomly selected sample of the dataset.
+ Specify a subset of the dataset using a query.

The following options are available in the **Labeling jobs** section of the [SageMaker AI console](https://console.aws.amazon.com/sagemaker/groundtruth) after selecting **Create labeling job**. To learn how to create a labeling job in the console, see [Getting started: Create a bounding box labeling job with Ground Truth](sms-getting-started.md). To configure the dataset that you use for labeling, in the **Job overview** section, choose **Additional configuration**.

## Use the Full Dataset
<a name="sms-full-dataset"></a>

When you choose to use the **Full dataset**, you must provide a manifest file for your data objects. You can provide the path of the Amazon S3 bucket that contains the manifest file or use the SageMaker AI console to create the file. To learn how to create a manifest file using the console, see [Automate data setup for labeling jobs](sms-console-create-manifest-file.md). 

## Choose a Random Sample
<a name="sms-random-dataset"></a>

When you want to label a random subset of your data, select **Random sample**. The dataset is stored in the Amazon S3 bucket specified in the **Input dataset location** field. 

After you have specified the percentage of data objects that you want to include in the sample, choose **Create subset**. SageMaker AI randomly picks the data objects for your labeling job. After the objects are selected, choose **Use this subset**. 

SageMaker AI creates a manifest file for the selected data objects. It also modifies the value in the **Input dataset location** field to point to the new manifest file.

## Specify a Subset
<a name="sms-select-dataset"></a>

**Amazon S3 Select**  
Amazon S3 Select is no longer available to new customers. Existing customers of Amazon S3 Select can continue to use the feature as usual. To learn more, see [How to optimize querying your data in Amazon S3](https://aws.amazon.com/blogs/storage/how-to-optimize-querying-your-data-in-amazon-s3/).

You can specify a subset of your data objects using an Amazon S3 `SELECT` query on the object file names. 

The `SELECT` statement of the SQL query is defined for you. You provide the `WHERE` clause to specify which data objects should be returned.

For more information about the Amazon S3 `SELECT` statement, see [Selecting Content from Objects](https://docs.aws.amazon.com/AmazonS3/latest/dev/selecting-content-from-objects.html).

Choose **Create subset** to start the selection, and then choose **Use this subset** to use the selected data. 

SageMaker AI creates a manifest file for the selected data objects. It also updates the value in the **Input dataset location** field to point to the new manifest file.

# 3D Point Cloud Input Data
<a name="sms-point-cloud-input-data"></a>

To create a 3D point cloud labeling job, you must create an input manifest file. Use this topic to learn the formatting requirements of the input manifest file for each task type. To learn about the raw input data formats Ground Truth accepts for 3D point cloud labeling jobs, see the section [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md).

Use your [labeling job task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-task-types.html) to choose a topic from [Input Manifest Files for 3D Point Cloud Labeling Jobs](sms-point-cloud-input-manifest.md) to learn about the formatting requirements for each line of your input manifest file.

**Topics**
+ [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md)
+ [Input Manifest Files for 3D Point Cloud Labeling Jobs](sms-point-cloud-input-manifest.md)
+ [Understand Coordinate Systems and Sensor Fusion](sms-point-cloud-sensor-fusion-details.md)

# Accepted Raw 3D Data Formats
<a name="sms-point-cloud-raw-data-types"></a>

Ground Truth uses your 3D point cloud data to render the 3D scenes that workers annotate. This section describes the raw data formats that are accepted for point cloud data and sensor fusion data for a point cloud frame. To learn how to create an input manifest file to connect your raw input data files with Ground Truth, see [Input Manifest Files for 3D Point Cloud Labeling Jobs](sms-point-cloud-input-manifest.md).

For each frame, Ground Truth supports Compact Binary Pack Format (.bin) and ASCII (.txt) files. These files contain information about the location (`x`, `y`, and `z` coordinates) of all points that make up that frame, and, optionally, information about the pixel color of each point for colored point clouds. When you create a 3D point cloud labeling job input manifest file, you can specify the format of your raw data in the `format` parameter. 

The following table lists elements that Ground Truth supports in point cloud frame files to describe individual points. 



| Symbol | Value | 
| --- | --- | 
|  `x`  |  The x coordinate of the point.  | 
|  `y`  |  The y coordinate of the point.  | 
|  `z`  |  The z coordinate of the point.  | 
|  `i`  |  The intensity of the point.  | 
|  `r`  |  The red color channel component. An 8-bit value (0-255).  | 
|  `g`  |  The green color channel component. An 8-bit value (0-255).  | 
|  `b`  |  The blue color channel component. An 8-bit value (0-255).  | 

Ground Truth assumes the following about your input data:
+ All of the positional coordinates (x, y, z) are in meters. 
+ All pose headings (qx, qy, qz, qw) are expressed as spatial [quaternions](https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation).

## Compact Binary Pack Format
<a name="sms-point-cloud-raw-data-cbpf-format"></a>

The Compact Binary Pack Format represents a point cloud as an ordered stream of points. Each point in the stream is an ordered binary pack of 4-byte float values in some variant of the form `xyzirgb`. The `x`, `y`, and `z` elements are required, and additional information about each point can be included in a variety of ways using `i`, `r`, `g`, and `b`. 

To use a binary file to input point cloud frame data to a Ground Truth 3D point cloud labeling job, enter `binary/<format>` in the `format` parameter for your input manifest file, replacing `<format>` with the order of elements in each binary pack. For example, you may enter one of the following for the `format` parameter. 
+ `binary/xyzi` – When you use this format, your point element stream would be in the following order: `x1y1z1i1x2y2z2i2...`
+ `binary/xyzrgb` – When you use this format, your point element stream would be in the following order: `x1y1z1r1g1b1x2y2z2r2g2b2...`
+ `binary/xyzirgb` – When you use this format, your point element stream would be in the following order: `x1y1z1i1r1g1b1x2y2z2i2r2g2b2...`

When you use a binary file for your point cloud frame data, if you do not enter a value for `format`, the default pack format `binary/xyzi` is used. 
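A sketch of writing and reading this pack format with Python's `struct` module. The guide does not state the byte order, so little-endian is an assumption here:

```python
import struct

def pack_points(points, fmt="xyzi"):
    """Serialize points as a Compact Binary Pack stream: for each point,
    its elements (in the order given by fmt) as consecutive 4-byte floats.
    Little-endian byte order is an assumption of this sketch."""
    out = bytearray()
    for point in points:
        out += struct.pack("<%df" % len(fmt), *(point[e] for e in fmt))
    return bytes(out)

def unpack_points(data, fmt="xyzi"):
    """Split the stream back into per-point dictionaries keyed by fmt."""
    n = len(fmt)
    values = struct.unpack("<%df" % (len(data) // 4), data)
    return [dict(zip(fmt, values[i:i + n])) for i in range(0, len(values), n)]
```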

## ASCII Format
<a name="sms-point-cloud-raw-data-ascii-format"></a>

The ASCII format uses a text file to represent a point cloud. Each line in the file represents a single point and contains whitespace-separated values, each an ASCII representation of a 4-byte float. The `x`, `y`, and `z` elements are required for each point, and additional information about that point can be included in a variety of ways using `i`, `r`, `g`, and `b`.

To use a text file to input point cloud frame data to a Ground Truth 3D point cloud labeling job, enter `text/<format>` in the `format` parameter for your input manifest file, replacing `<format>` with the order of point elements on each line. 

For example, if you enter `text/xyzi` for `format`, your text file for each point cloud frame should look similar to the following: 

```
x1 y1 z1 i1
x2 y2 z2 i2
...
...
```

If you enter `text/xyzrgb`, your text file should look similar to the following: 

```
x1 y1 z1 r1 g1 b1
x2 y2 z2 r2 g2 b2
...
...
```

When you use a text file for your point cloud frame data, if you do not enter a value for `format`, the default format `text/xyzi` is used. 
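Reading the ASCII format back is a straightforward line parse. A sketch, assuming well-formed input:

```python
def parse_ascii_frame(text, fmt="xyzi"):
    """Parse an ASCII point cloud frame: one point per non-empty line,
    whitespace-separated float values in the element order given by fmt."""
    return [dict(zip(fmt, (float(v) for v in line.split())))
            for line in text.strip().splitlines() if line.strip()]
```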

## Point Cloud Resolution Limits
<a name="sms-point-cloud-resolution"></a>

Ground Truth does not have a resolution limit for 3D point cloud frames. However, we recommend that you limit each point cloud frame to 500K points for optimal performance. When Ground Truth renders the 3D point cloud visualization, it must be viewable on your workers' computers, which depends on workers' computer hardware. Point cloud frames that are larger than 1 million points may not render on standard machines, or may take too long to load. 

# Input Manifest Files for 3D Point Cloud Labeling Jobs
<a name="sms-point-cloud-input-manifest"></a>

When you create a labeling job, you provide an input manifest file where each line of the manifest describes a unit of task to be completed by annotators. The format of your input manifest file depends on your task type. 
+ If you are creating a 3D point cloud **object detection** or **semantic segmentation** labeling job, each line in your input manifest file contains information about a single 3D point cloud frame. This is called a *point cloud frame input manifest*. To learn more, see [Create a Point Cloud Frame Input Manifest File](sms-point-cloud-single-frame-input-data.md). 
+ If you are creating a 3D point cloud **object tracking** labeling job, each line of your input manifest file contains a sequence of 3D point cloud frames and associated data. This is called a *point cloud sequence input manifest*. To learn more, see [Create a Point Cloud Sequence Input Manifest](sms-point-cloud-multi-frame-input-data.md). 

# Create a Point Cloud Frame Input Manifest File
<a name="sms-point-cloud-single-frame-input-data"></a>

The manifest is a UTF-8 encoded file in which each line is a complete and valid JSON object. Each line is delimited by a standard line break, `\n` or `\r\n`. Because each line must be a valid JSON object, you can't have unescaped line break characters. In the single-frame input manifest file, each line in the manifest contains data for a single point cloud frame. The point cloud frame data can be stored in either binary or ASCII format (see [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md)). This is the manifest file formatting required for 3D point cloud object detection and semantic segmentation. Optionally, you can also provide camera sensor fusion data for each point cloud frame. 

Ground Truth supports point cloud and video camera sensor fusion in the [world coordinate system](sms-point-cloud-sensor-fusion-details.md#sms-point-cloud-world-coordinate-system) for all modalities. If you can obtain your 3D sensor extrinsic (like a LiDAR extrinsic), we recommend that you transform 3D point cloud frames into the world coordinate system using the extrinsic. For more information, see [Sensor Fusion](sms-point-cloud-sensor-fusion-details.md#sms-point-cloud-sensor-fusion). 

However, if you cannot obtain a point cloud in the world coordinate system, you can provide coordinates in the original coordinate system that the data was captured in. If you are providing camera data for sensor fusion, we recommend that you provide the LiDAR sensor and camera poses in the world coordinate system. 

To create a single-frame input manifest file, identify the location of each point cloud frame that you want workers to label using the `source-ref` key. Additionally, you must use the `source-ref-metadata` key to identify the format of your dataset, a timestamp for that frame, and, optionally, sensor fusion data and video camera images.

The following example demonstrates the syntax used for an input manifest file for a single-frame point cloud labeling job. The example includes two point cloud frames. For details about each parameter, see the table following this example. 

**Important**  
Each line in your input manifest file must be in [JSON Lines](http://jsonlines.org/) format. The following code block shows an input manifest file with two JSON objects. Each JSON object is used to point to and provide details about a single point cloud frame. The JSON objects have been expanded for readability, but you must minimize each JSON object to fit on a single line when creating an input manifest file. An example is provided under this code block.

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/examplefolder/frame1.bin",
    "source-ref-metadata":{
        "format": "binary/xyzi",
        "unix-timestamp": 1566861644.759115,
        "ego-vehicle-pose":{
            "position": {
                "x": -2.7161461413869947,
                "y": 116.25822288149078,
                "z": 1.8348751887989483
            },
            "heading": {
                "qx": -0.02111296123795955,
                "qy": -0.006495469416730261,
                "qz": -0.008024565904865688,
                "qw": 0.9997181192298087
            }
        },
        "prefix": "s3://amzn-s3-demo-bucket/lidar_singleframe_dataset/someprefix/",
        "images": [
        {
            "image-path": "images/frame300.bin_camera0.jpg",
            "unix-timestamp": 1566861644.759115,
            "fx": 847.7962624528487,
            "fy": 850.0340893791985,
            "cx": 576.2129134707038,
            "cy": 317.2423573573745,
            "k1": 0,
            "k2": 0,
            "k3": 0,
            "k4": 0,
            "p1": 0,
            "p2": 0,
            "skew": 0,
            "position": {
                "x": -2.2722515189268138,
                "y": 116.86003310568965,
                "z": 1.454614668542299
            },
            "heading": {
                "qx": 0.7594754093069037,
                "qy": 0.02181790885672969,
                "qz": -0.02461725233103356,
                "qw": -0.6496916273040025
            },
            "camera-model": "pinhole"
        }]
    }
}
{
    "source-ref": "s3://amzn-s3-demo-bucket/examplefolder/frame2.bin",
    "source-ref-metadata":{
        "format": "binary/xyzi",
        "unix-timestamp": 1566861632.759133,
        "ego-vehicle-pose":{
            "position": {
                "x": -2.7161461413869947,
                "y": 116.25822288149078,
                "z": 1.8348751887989483
            },
            "heading": {
                "qx": -0.02111296123795955,
                "qy": -0.006495469416730261,
                "qz": -0.008024565904865688,
                "qw": 0.9997181192298087
            }
        },
        "prefix": "s3://amzn-s3-demo-bucket/lidar_singleframe_dataset/someprefix/",
        "images": [
        {
            "image-path": "images/frame300.bin_camera0.jpg",
            "unix-timestamp": 1566861644.759115,
            "fx": 847.7962624528487,
            "fy": 850.0340893791985,
            "cx": 576.2129134707038,
            "cy": 317.2423573573745,
            "k1": 0,
            "k2": 0,
            "k3": 0,
            "k4": 0,
            "p1": 0,
            "p2": 0,
            "skew": 0,
            "position": {
                "x": -2.2722515189268138,
                "y": 116.86003310568965,
                "z": 1.454614668542299
            },
            "heading": {
                "qx": 0.7594754093069037,
                "qy": 0.02181790885672969,
                "qz": -0.02461725233103356,
                "qw": -0.6496916273040025
            },
            "camera-model": "pinhole"
        }]
    }
}
```

When you create an input manifest file, you must collapse your JSON objects to fit on a single line. For example, the code block above would appear as follows in an input manifest file:

```
{"source-ref":"s3://amzn-s3-demo-bucket/examplefolder/frame1.bin","source-ref-metadata":{"format":"binary/xyzi","unix-timestamp":1566861644.759115,"ego-vehicle-pose":{"position":{"x":-2.7161461413869947,"y":116.25822288149078,"z":1.8348751887989483},"heading":{"qx":-0.02111296123795955,"qy":-0.006495469416730261,"qz":-0.008024565904865688,"qw":0.9997181192298087}},"prefix":"s3://amzn-s3-demo-bucket/lidar_singleframe_dataset/someprefix/","images":[{"image-path":"images/frame300.bin_camera0.jpg","unix-timestamp":1566861644.759115,"fx":847.7962624528487,"fy":850.0340893791985,"cx":576.2129134707038,"cy":317.2423573573745,"k1":0,"k2":0,"k3":0,"k4":0,"p1":0,"p2":0,"skew":0,"position":{"x":-2.2722515189268138,"y":116.86003310568965,"z":1.454614668542299},"heading":{"qx":0.7594754093069037,"qy":0.02181790885672969,"qz":-0.02461725233103356,"qw":-0.6496916273040025},"camera-model":"pinhole"}]}}
{"source-ref":"s3://amzn-s3-demo-bucket/examplefolder/frame2.bin","source-ref-metadata":{"format":"binary/xyzi","unix-timestamp":1566861632.759133,"ego-vehicle-pose":{"position":{"x":-2.7161461413869947,"y":116.25822288149078,"z":1.8348751887989483},"heading":{"qx":-0.02111296123795955,"qy":-0.006495469416730261,"qz":-0.008024565904865688,"qw":0.9997181192298087}},"prefix":"s3://amzn-s3-demo-bucket/lidar_singleframe_dataset/someprefix/","images":[{"image-path":"images/frame300.bin_camera0.jpg","unix-timestamp":1566861644.759115,"fx":847.7962624528487,"fy":850.0340893791985,"cx":576.2129134707038,"cy":317.2423573573745,"k1":0,"k2":0,"k3":0,"k4":0,"p1":0,"p2":0,"skew":0,"position":{"x":-2.2722515189268138,"y":116.86003310568965,"z":1.454614668542299},"heading":{"qx":0.7594754093069037,"qy":0.02181790885672969,"qz":-0.02461725233103356,"qw":-0.6496916273040025},"camera-model":"pinhole"}]}}
```
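If you generate your manifest programmatically, the standard library's `json` module produces the required single-line form for you. The following is a minimal sketch (the S3 paths, filename, and metadata values are illustrative, not part of any official tooling):

```python
import json

# Hypothetical frame entries; replace the S3 paths and metadata values
# with your own. Each dict becomes one line of the manifest.
frames = [
    {
        "source-ref": "s3://amzn-s3-demo-bucket/examplefolder/frame1.bin",
        "source-ref-metadata": {
            "format": "binary/xyzi",
            "unix-timestamp": 1566861644.759115,
        },
    },
    {
        "source-ref": "s3://amzn-s3-demo-bucket/examplefolder/frame2.bin",
        "source-ref-metadata": {
            "format": "binary/xyzi",
            "unix-timestamp": 1566861632.759133,
        },
    },
]

# json.dumps with compact separators emits each object on a single line
# with no extra whitespace, which is what JSON Lines requires.
manifest_lines = [json.dumps(frame, separators=(",", ":")) for frame in frames]

with open("manifest.jsonl", "w") as f:
    f.write("\n".join(manifest_lines) + "\n")
```

Because `json.dumps` never emits unescaped line breaks inside an object, each manifest line is guaranteed to be a single, valid JSON object.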

The following table shows the parameters you can include in your input manifest file:


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `source-ref`  |  Yes  |  String **Accepted string value format**:  `s3://<bucket-name>/<folder-name>/point-cloud-frame-file`  |  The Amazon S3 location of a single point cloud frame.  | 
|  `source-ref-metadata`  |  Yes  |  JSON object **Accepted parameters**:  `format`, `unix-timestamp`, `ego-vehicle-pose`, `position`, `prefix`, `images`  |  Use this parameter to include additional information about the point cloud in `source-ref`, and to provide camera data for sensor fusion.   | 
|  `format`  |  No  |  String **Accepted string values**: `"binary/xyz"`, `"binary/xyzi"`, `"binary/xyzrgb"`, `"binary/xyzirgb"`, `"text/xyz"`, `"text/xyzi"`, `"text/xyzrgb"`, `"text/xyzirgb"` **Default Values**:  When the file identified in `source-ref` has a .bin extension, `binary/xyzi` When the file identified in `source-ref` has a .txt extension, `text/xyzi`  |  Use this parameter to specify the format of your point cloud data. For more information, see [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md).  | 
|  `unix-timestamp`  |  Yes  |  Number A unix timestamp.   |  The unix timestamp is the number of seconds since January 1st, 1970 until the UTC time that the data was collected by a sensor.   | 
|  `ego-vehicle-pose`  |  No  |  JSON object  |  The pose of the device used to collect the point cloud data. For more information about this parameter, see [Include Vehicle Pose Information in Your Input Manifest](#sms-point-cloud-single-frame-ego-vehicle-input).  | 
|  `prefix`  |  No  |  String **Accepted string value format**:  `s3://<bucket-name>/<folder-name>/`  |  The location in Amazon S3 where your metadata, such as camera images, is stored for this frame.  The prefix must end with a forward slash: `/`.  | 
|  `images`  |  No  |  List  |  A list of parameters describing color camera images used for sensor fusion. You can include up to 8 images in this list. For more information about the parameters required for each image, see [Include Camera Data in Your Input Manifest](#sms-point-cloud-single-frame-image-input).   | 
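The constraints in this table can be sanity-checked before you start a labeling job. The following is an illustrative, unofficial check (Ground Truth does not provide this helper); it verifies a few of the rules above on a single manifest line:

```python
import json

# Unofficial sanity check for one manifest line, based on the
# parameter table above. Not a substitute for service-side validation.
def check_manifest_line(line):
    entry = json.loads(line)
    # source-ref is required and must be an Amazon S3 location.
    assert entry["source-ref"].startswith("s3://"), "source-ref must be an S3 URI"
    metadata = entry["source-ref-metadata"]
    # unix-timestamp is required for each frame.
    assert "unix-timestamp" in metadata, "unix-timestamp is required"
    # If present, prefix must end with a forward slash.
    prefix = metadata.get("prefix")
    if prefix is not None:
        assert prefix.endswith("/"), "prefix must end with a forward slash"
    # The images list can contain up to 8 entries.
    assert len(metadata.get("images", [])) <= 8, "at most 8 images per frame"
    return entry

line = '{"source-ref":"s3://amzn-s3-demo-bucket/examplefolder/frame1.bin","source-ref-metadata":{"format":"binary/xyzi","unix-timestamp":1566861644.759115}}'
entry = check_manifest_line(line)
```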

## Include Vehicle Pose Information in Your Input Manifest
<a name="sms-point-cloud-single-frame-ego-vehicle-input"></a>

Use the ego-vehicle location to provide information about the location of the vehicle used to capture point cloud data. Ground Truth uses this information to compute the LiDAR extrinsic matrix. 

Ground Truth uses extrinsic matrices to project labels to and from the 3D scene and 2D images. For more information, see [Sensor Fusion](sms-point-cloud-sensor-fusion-details.md#sms-point-cloud-sensor-fusion).

The following table provides more information about the `position` and orientation (`heading`) parameters that are required when you provide ego-vehicle information. 


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `position`  |  Yes  |  JSON object **Required Parameters**: `x`, `y`, and `z`. Enter numbers for these parameters.   |  The translation vector of the ego vehicle in the world coordinate system.   | 
|  `heading`  |  Yes  |  JSON Object **Required Parameters**: `qx`, `qy`, `qz`, and `qw`. Enter numbers for these parameters.   |  The orientation of the frame of reference of the device or sensor mounted on the vehicle sensing the surrounding, measured in [quaternions](https://en.wikipedia.org/wiki/Quaternion) (`qx`, `qy`, `qz`, `qw`), in the world coordinate system.  | 
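A unit quaternion (`qx`, `qy`, `qz`, `qw`) encodes the same orientation as a 3x3 rotation matrix. The following sketch uses the standard quaternion-to-rotation-matrix formula, with the `heading` values from the first example frame in this topic; it is for illustration only and is not part of Ground Truth:

```python
import numpy as np

def quaternion_to_rotation_matrix(qx, qy, qz, qw):
    """Convert a unit quaternion (qx, qy, qz, qw) to a 3x3 rotation matrix."""
    return np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ])

# Heading values from the example ego-vehicle pose earlier in this topic.
R = quaternion_to_rotation_matrix(
    -0.02111296123795955, -0.006495469416730261,
    -0.008024565904865688, 0.9997181192298087)
```

The resulting matrix is (approximately) orthonormal, which is what makes it usable as the rotation part of an extrinsic matrix.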

## Include Camera Data in Your Input Manifest
<a name="sms-point-cloud-single-frame-image-input"></a>

If you want to include video camera data with a frame, use the following parameters to provide information about each image. The **Required** column below applies when the `images` parameter is included in the input manifest file under `source-ref-metadata`. You are not required to include images in your input manifest file. 

If you include camera images, you must include information about the camera `position` and `heading` used to capture the images in the world coordinate system.

If your images are distorted, Ground Truth can automatically undistort them using information you provide about the image in your input manifest file, including distortion coefficients (`k1`, `k2`, `k3`, `k4`, `p1`, `p2`), the camera model, and the camera intrinsic matrix. The intrinsic matrix is made up of the focal length (`fx`, `fy`) and the principal point (`cx`, `cy`). See [Intrinsic Matrix](sms-point-cloud-sensor-fusion-details.md#sms-point-cloud-intrinsic) to learn how Ground Truth uses the camera intrinsic matrix. If distortion coefficients are not included, Ground Truth does not undistort an image. 


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `image-path`  |  Yes  |  String **Example of format**:  `<folder-name>/<imagefile.png>`  |  The relative location, in Amazon S3, of your image file. This relative path will be appended to the path you specify in `prefix`.   | 
|  `unix-timestamp`  |  Yes  |  Number  |  The unix timestamp is the number of seconds since January 1st, 1970 until the UTC time that the data was collected by a camera.   | 
|  `camera-model`  |  No  |  String: **Accepted Values**: `"pinhole"`, `"fisheye"` **Default**: `"pinhole"`  |  The model of the camera used to capture the image. This information is used to undistort camera images.   | 
|  `fx, fy`  |  Yes  |  Numbers  |  The focal length of the camera, in the x (`fx`) and y (`fy`) directions.  | 
|  `cx, cy`  |  Yes  | Numbers |  The x (`cx`) and y (`cy`) coordinates of the principal point.   | 
|  `k1, k2, k3, k4`  |  No  |  Number  |  Radial distortion coefficients. Supported for both **fisheye** and **pinhole** camera models.   | 
|  `p1, p2`  |  No  |  Number  |  Tangential distortion coefficients. Supported for **pinhole** camera models.  | 
|  `skew`  |  No  |  Number  |  A parameter to measure the skew of an image.   | 
|  `position`  |  Yes  |  JSON object **Required Parameters**: `x`, `y`, and `z`. Enter numbers for these parameters.   | The location or origin of the frame of reference of the camera mounted on the vehicle capturing images. | 
|  `heading`  |  Yes  |  JSON Object **Required Parameters**: `qx`, `qy`, `qz`, and `qw`. Enter numbers for these parameters.   |  The orientation of the frame of reference of the camera mounted on the vehicle capturing images, measured using [quaternions](https://en.wikipedia.org/wiki/Quaternion), (`qx`, `qy`, `qz`, `qw`), in the world coordinate system.   | 

## Point Cloud Frame Limits
<a name="sms-point-cloud-single-frame-limits"></a>

You can include up to 100,000 point cloud frames in your input manifest file. 3D point cloud labeling jobs have longer pre-processing times than other Ground Truth task types. For more information, see [Job pre-processing time](sms-point-cloud-general-information.md#sms-point-cloud-job-creation-time).

# Create a Point Cloud Sequence Input Manifest
<a name="sms-point-cloud-multi-frame-input-data"></a>

The manifest is a UTF-8 encoded file in which each line is a complete and valid JSON object. Each line is delimited by a standard line break, \n or \r\n. Because each line must be a valid JSON object, you can't have unescaped line break characters. In the point cloud sequence input manifest file, each line in the manifest contains a sequence of point cloud frames. The point cloud data for each frame in the sequence can either be stored in binary or ASCII format. For more information, see [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md). This is the manifest file formatting required for 3D point cloud object tracking. Optionally, you can also provide point attribute and camera sensor fusion data for each point cloud frame. When you create a sequence input manifest file, you must provide LiDAR and video camera sensor fusion data in a [world coordinate system](sms-point-cloud-sensor-fusion-details.md#sms-point-cloud-world-coordinate-system). 

The following example demonstrates the syntax used for an input manifest file when each line in the manifest is a sequence file. Each line in your input manifest file must be in [JSON Lines](http://jsonlines.org/) format.

```
{"source-ref": "s3://amzn-s3-demo-bucket/example-folder/seq1.json"}
{"source-ref": "s3://amzn-s3-demo-bucket/example-folder/seq2.json"}
```

The data for each sequence of point cloud frames needs to be stored in a JSON data object. The following is an example of the format you use for a sequence file. Information about each frame is included as a JSON object and is listed in the `frames` list. This is an example of a sequence file with two point cloud frame files, `frame300.bin` and `frame303.bin`. The *...* indicates where you should include information for additional frames. Add a JSON object for each frame in the sequence.

The following code block includes a JSON object for a single sequence file. The JSON object has been expanded for readability.

```
{
  "seq-no": 1,
  "prefix": "s3://amzn-s3-demo-bucket/example_lidar_sequence_dataset/seq1/",
  "number-of-frames": 100,
  "frames":[
    {
        "frame-no": 300, 
        "unix-timestamp": 1566861644.759115, 
        "frame": "example_lidar_frames/frame300.bin", 
        "format": "binary/xyzi", 
        "ego-vehicle-pose":{
            "position": {
                "x": -2.7161461413869947,
                "y": 116.25822288149078,
                "z": 1.8348751887989483
            },
            "heading": {
                "qx": -0.02111296123795955,
                "qy": -0.006495469416730261,
                "qz": -0.008024565904865688,
                "qw": 0.9997181192298087
            }
        }, 
        "images": [
        {
            "image-path": "example_images/frame300.bin_camera0.jpg",
            "unix-timestamp": 1566861644.759115,
            "fx": 847.7962624528487,
            "fy": 850.0340893791985,
            "cx": 576.2129134707038,
            "cy": 317.2423573573745,
            "k1": 0,
            "k2": 0,
            "k3": 0,
            "k4": 0,
            "p1": 0,
            "p2": 0,
            "skew": 0,
            "position": {
                "x": -2.2722515189268138,
                "y": 116.86003310568965,
                "z": 1.454614668542299
            },
            "heading": {
                "qx": 0.7594754093069037,
                "qy": 0.02181790885672969,
                "qz": -0.02461725233103356,
                "qw": -0.6496916273040025
            },
            "camera-model": "pinhole"
        }]
    },
    {
        "frame-no": 303, 
        "unix-timestamp": 1566861644.759115, 
        "frame": "example_lidar_frames/frame303.bin", 
        "format": "text/xyzi", 
        "ego-vehicle-pose":{...}, 
        "images":[{...}]
    },
     ...
  ]
}
```

The following table provides details about the top-level parameters of a sequence file. For detailed information about the parameters required for individual frames in the sequence file, see [Parameters for Individual Point Cloud Frames](#sms-point-cloud-multi-frame-input-single-frame).


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `seq-no`  |  Yes  |  Integer  |  The ordered number of the sequence.   | 
|  `prefix`  |  Yes  |  String **Accepted Values**: `s3://<bucket-name>/<prefix>/`  |  The Amazon S3 location where the sequence files are located.  The prefix must end with a forward slash: `/`.  | 
|  `number-of-frames`  |  Yes  |  Integer  |  The total number of frames included in the sequence file. This number must match the total number of frames listed in the `frames` parameter in the next row.  | 
|  `frames`  |  Yes  |  List of JSON objects  |  A list of frame data. The length of the list must equal `number-of-frames`. In the worker UI, frames in a sequence appear in the same order as the frames in this array.  For details about the format of each frame, see [Parameters for Individual Point Cloud Frames](#sms-point-cloud-multi-frame-input-single-frame).   | 
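A sequence file is plain JSON, so you can assemble it with the standard library. The following sketch uses a hypothetical helper (not part of any official tooling) and illustrative paths and timestamps; note how `number-of-frames` is derived from the length of the `frames` list so the two always agree:

```python
import json

# Hypothetical helper that assembles a sequence file for frames stored
# under a common S3 prefix. Paths and timestamps are illustrative.
def build_sequence_file(seq_no, prefix, frame_entries):
    """frame_entries is a list of dicts, each with at least
    'unix-timestamp', 'frame', and 'format' keys."""
    return {
        "seq-no": seq_no,
        "prefix": prefix,
        # number-of-frames must match the length of the frames list.
        "number-of-frames": len(frame_entries),
        "frames": frame_entries,
    }

sequence = build_sequence_file(
    seq_no=1,
    prefix="s3://amzn-s3-demo-bucket/example_lidar_sequence_dataset/seq1/",
    frame_entries=[
        {"frame-no": 300, "unix-timestamp": 1566861644.759115,
         "frame": "example_lidar_frames/frame300.bin", "format": "binary/xyzi"},
        {"frame-no": 303, "unix-timestamp": 1566861645.259115,
         "frame": "example_lidar_frames/frame303.bin", "format": "binary/xyzi"},
    ],
)

with open("seq1.json", "w") as f:
    json.dump(sequence, f)
```

Keeping the timestamps strictly increasing matters because, as described below, Ground Truth uses them for cuboid interpolation.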

## Parameters for Individual Point Cloud Frames
<a name="sms-point-cloud-multi-frame-input-single-frame"></a>

The following table shows the parameters you can include in your input manifest file.


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `frame-no`  |  No  |  Integer  |  A frame number. This is an optional identifier specified by the customer to identify the frame within a sequence. It is not used by Ground Truth.  | 
|  `unix-timestamp`  |  Yes  |  Number  |  The unix timestamp is the number of seconds since January 1st, 1970 until the UTC time that the data was collected by a sensor.  The timestamp for each frame must be different and timestamps must be sequential because they are used for cuboid interpolation. Ideally, this should be the real timestamp when the data was collected. If this is not available, you must use an incremental sequence of timestamps, where the first frame in your sequence file corresponds to the first timestamp in the sequence.  | 
|  `frame`  |  Yes  |  String **Example of format** `<folder-name>/<frame-file.bin>`  |  The relative location, in Amazon S3, of your point cloud frame file. This relative path will be appended to the path you specify in `prefix`.  | 
|  `format`  |  No  |  String **Accepted string values**: `"binary/xyz"`, `"binary/xyzi"`, `"binary/xyzrgb"`, `"binary/xyzirgb"`, `"text/xyz"`, `"text/xyzi"`, `"text/xyzrgb"`, `"text/xyzirgb"` **Default Values**:  When the file identified in `frame` has a .bin extension, `binary/xyzi` When the file identified in `frame` has a .txt extension, `text/xyzi`  |  Use this parameter to specify the format of your point cloud data. For more information, see [Accepted Raw 3D Data Formats](sms-point-cloud-raw-data-types.md).  | 
|  `ego-vehicle-pose`  |  No  |  JSON object  |  The pose of the device used to collect the point cloud data. For more information about this parameter, see [Include Vehicle Pose Information in Your Input Manifest](#sms-point-cloud-multi-frame-ego-vehicle-input).  | 
|  `prefix`  |  No  |  String **Accepted string value format**:  `s3://<bucket-name>/<folder-name>/`  |  The location in Amazon S3 where your metadata, such as camera images, is stored for this frame.  The prefix must end with a forward slash: `/`.  | 
|  `images`  |  No  |  List  |  A list of parameters describing color camera images used for sensor fusion. You can include up to 8 images in this list. For more information about the parameters required for each image, see [Include Camera Data in Your Input Manifest](#sms-point-cloud-multi-frame-image-input).   | 

## Include Vehicle Pose Information in Your Input Manifest
<a name="sms-point-cloud-multi-frame-ego-vehicle-input"></a>

Use the ego-vehicle location to provide information about the pose of the vehicle used to capture point cloud data. Ground Truth uses this information to compute LiDAR extrinsic matrices. 

Ground Truth uses extrinsic matrices to project labels to and from the 3D scene and 2D images. For more information, see [Sensor Fusion](sms-point-cloud-sensor-fusion-details.md#sms-point-cloud-sensor-fusion).

The following table provides more information about the `position` and orientation (`heading`) parameters that are required when you provide ego-vehicle information. 


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `position`  |  Yes  |  JSON object **Required Parameters**: `x`, `y`, and `z`. Enter numbers for these parameters.   |  The translation vector of the ego vehicle in the world coordinate system.   | 
|  `heading`  |  Yes  |  JSON Object **Required Parameters**: `qx`, `qy`, `qz`, and `qw`. Enter numbers for these parameters.   |  The orientation of the frame of reference of the device or sensor mounted on the vehicle sensing the surrounding, measured in [quaternions](https://en.wikipedia.org/wiki/Quaternion) (`qx`, `qy`, `qz`, `qw`), in the world coordinate system.  | 

## Include Camera Data in Your Input Manifest
<a name="sms-point-cloud-multi-frame-image-input"></a>

If you want to include color camera data with a frame, use the following parameters to provide information about each image. The **Required** column in the following table applies when the `images` parameter is included in the input manifest file. You are not required to include images in your input manifest file. 

If you include camera images, you must include information about the `position` and orientation (`heading`) of the camera used to capture the images. 

If your images are distorted, Ground Truth can automatically undistort them using information you provide about the image in your input manifest file, including distortion coefficients (`k1`, `k2`, `k3`, `k4`, `p1`, `p2`), the camera model, the focal length (`fx`, `fy`), and the principal point (`cx`, `cy`). To learn more about these coefficients and undistorting images, see [Camera calibration With OpenCV](https://docs.opencv.org/2.4.13.7/doc/tutorials/calib3d/camera_calibration/camera_calibration.html). If distortion coefficients are not included, Ground Truth does not undistort an image. 


****  

|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `image-path`  |  Yes  |  String **Example of format**:  `<folder-name>/<imagefile.png>`  |  The relative location, in Amazon S3, of your image file. This relative path will be appended to the path you specify in `prefix`.   | 
|  `unix-timestamp`  |  Yes  |  Number  |  The timestamp of the image.   | 
|  `camera-model`  |  No  |  String: **Accepted Values**: `"pinhole"`, `"fisheye"` **Default**: `"pinhole"`  |  The model of the camera used to capture the image. This information is used to undistort camera images.   | 
|  `fx, fy`  |  Yes  |  Numbers  |  The focal length of the camera, in the x (`fx`) and y (`fy`) directions.  | 
|  `cx, cy`  |  Yes  | Numbers |  The x (`cx`) and y (`cy`) coordinates of the principal point.   | 
|  `k1, k2, k3, k4`  |  No  |  Number  |  Radial distortion coefficients. Supported for both **fisheye** and **pinhole** camera models.   | 
|  `p1, p2`  |  No  |  Number  |  Tangential distortion coefficients. Supported for **pinhole** camera models.  | 
|  `skew`  |  No  |  Number  |  A parameter to measure any known skew in the image.  | 
|  `position`  |  Yes  |  JSON object **Required Parameters**: `x`, `y`, and `z`. Enter numbers for these parameters.   |  The location or origin of the frame of reference of the camera mounted on the vehicle capturing images.  | 
|  `heading`  |  Yes  |  JSON Object **Required Parameters**: `qx`, `qy`, `qz`, and `qw`. Enter numbers for these parameters.   |  The orientation of the frame of reference of the camera mounted on the vehicle capturing images, measured using [quaternions](https://en.wikipedia.org/wiki/Quaternion), (`qx`, `qy`, `qz`, `qw`).   | 

## Sequence File and Point Cloud Frame Limits
<a name="sms-point-cloud-multi-frame-limits"></a>

You can include up to 100,000 point cloud frame sequences in your input manifest file. You can include up to 500 point cloud frames in each sequence file. 

Keep in mind that 3D point cloud labeling jobs have longer pre-processing times than other Ground Truth task types. For more information, see [Job pre-processing time](sms-point-cloud-general-information.md#sms-point-cloud-job-creation-time).

# Understand Coordinate Systems and Sensor Fusion
<a name="sms-point-cloud-sensor-fusion-details"></a>

Point cloud data is always located in a coordinate system. This coordinate system may be local to the vehicle or the device sensing the surroundings, or it may be a world coordinate system. When you use Ground Truth 3D point cloud labeling jobs, all the annotations are generated using the coordinate system of your input data. For some labeling job task types and features, you must provide data in a world coordinate system. 

In this topic, you'll learn the following:
+ When you *are required to* provide input data in a world coordinate system or global frame of reference.
+ What a world coordinate is and how you can convert point cloud data to a world coordinate system. 
+ How you can use your sensor and camera extrinsic matrices to provide pose data when using sensor fusion. 

## Coordinate System Requirements for Labeling Jobs
<a name="sms-point-cloud-sensor-fusion-coordinate-requirements"></a>

If your point cloud data was collected in a local coordinate system, you can use an extrinsic matrix of the sensor used to collect the data to convert it to a world coordinate system or a global frame of reference. If you cannot obtain an extrinsic for your point cloud data and, as a result, cannot obtain point clouds in a world coordinate system, you can provide point cloud data in a local coordinate system for 3D point cloud object detection and semantic segmentation task types. 

For object tracking, you must provide point cloud data in a world coordinate system. This is because when you are tracking objects across multiple frames, the ego vehicle itself is moving in the world and so all of the frames need a point of reference. 

If you include camera data for sensor fusion, it is recommended that you provide camera poses in the same world coordinate system as the 3D sensor (such as a LiDAR sensor). 

## Using Point Cloud Data in a World Coordinate System
<a name="sms-point-cloud-world-coordinate-system"></a>

This section explains what a world coordinate system (WCS), also referred to as a *global frame of reference*, is and explains how you can provide point cloud data in a world coordinate system.

### What is a World Coordinate System?
<a name="sms-point-cloud-what-is-wcs"></a>

A WCS or global frame of reference is a fixed universal coordinate system in which vehicle and sensor coordinate systems are placed. For example, if multiple point cloud frames are located in different coordinate systems because they were collected from two sensors, a WCS can be used to translate all of the coordinates in these point cloud frames into a single coordinate system, where all frames have the same origin, (0,0,0). This transformation is done by translating the origin of each frame to the origin of the WCS using a translation vector, and rotating the three axes (typically x, y, and z) to the right orientation using a rotation matrix. This rigid body transformation is called a *homogeneous transformation*.

A world coordinate system is important in global path planning, localization, mapping, and driving scenario simulations. Ground Truth uses the right-handed Cartesian world coordinate system such as the one defined in [ISO 8855](https://www.iso.org/standard/51180.html), where the x axis points forward in the direction of the car's movement, the y axis points left, and the z axis points up from the ground. 

The global frame of reference depends on the data. Some datasets use the LiDAR position in the first frame as the origin. In this scenario, all the frames use the first frame as a reference and device heading and position will be near the origin in the first frame. For example, KITTI datasets have the first frame as a reference for world coordinates. Other datasets use a device position that is different from the origin.

Note that this is not the GPS/IMU coordinate system, which is typically rotated by 90 degrees along the z-axis. If your point cloud data is in a GPS/IMU coordinate system (such as OxTS in the open source AV KITTI dataset), then you need to transform the origin to a world coordinate system (typically the vehicle's reference coordinate system). You apply this transformation by multiplying your data with transformation matrices (the rotation matrix and translation vector). This will transform the data from its original coordinate system to a global reference coordinate system. Learn more about this transformation in the next section. 

### Convert 3D Point Cloud Data to a WCS
<a name="sms-point-cloud-coordinate-system-general"></a>

Ground Truth assumes that your point cloud data has already been transformed into a reference coordinate system of your choice. For example, you can choose the reference coordinate system of the sensor (such as LiDAR) as your global reference coordinate system. You can also take point clouds from various sensors and transform them from the sensor's view to the vehicle's reference coordinate system view. You use a sensor's extrinsic matrix, made up of a rotation matrix and translation vector, to convert your point cloud data to a WCS or global frame of reference. 

Together, the translation vector and rotation matrix make up an *extrinsic matrix*, which can be used to convert data from a local coordinate system to a WCS. For example, your LiDAR extrinsic matrix may be composed as follows, where `R` is the rotation matrix and `T` is the translation vector:

```
LiDAR_extrinsic = [R T;0 0 0 1]
```
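As an illustration, the following sketch shows how such a 4x4 extrinsic matrix might be assembled and applied with NumPy. The rotation and translation values here are placeholders, not calibration data from a real sensor.

```python
import numpy as np

# Placeholder rotation matrix (identity) and translation vector.
R = np.eye(3)
T = np.array([1.0, 2.0, 0.5])

# Compose the 4x4 extrinsic matrix [R T; 0 0 0 1].
lidar_extrinsic = np.eye(4)
lidar_extrinsic[:3, :3] = R
lidar_extrinsic[:3, 3] = T

# Apply it to a point given in homogeneous coordinates.
point = np.array([0.0, 0.0, 0.0, 1.0])
world_point = lidar_extrinsic @ point  # -> [1.0, 2.0, 0.5, 1.0]
```

Because the bottom row is `[0 0 0 1]`, multiplying a homogeneous point by this matrix rotates it and then adds the translation in a single operation.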

For example, the autonomous driving KITTI dataset includes a rotation matrix and translation vector for the LiDAR extrinsic transformation matrix for each frame. The [pykitti](https://github.com/utiasSTARS/pykitti) python module can be used for loading the KITTI data. In the dataset, `dataset.oxts[i].T_w_imu` gives the LiDAR extrinsic transform for the `i`th frame, which can be multiplied with points in that frame to convert them to a world frame - `np.matmul(lidar_transform_matrix, points)`. Multiplying a point in the LiDAR frame by the LiDAR extrinsic matrix transforms it into world coordinates. Multiplying a point in the world frame by the camera inverse extrinsic matrix gives the point coordinates in the camera's frame of reference.

The following code example demonstrates how you can convert point cloud frames from the KITTI dataset into a WCS. 

```
import pykitti
import numpy as np

basedir = '/Users/nameofuser/kitti-data'
date = '2011_09_26'
drive = '0079'

# The 'frames' argument is optional - default: None, which loads the whole dataset.
# Calibration, timestamps, and IMU data are read automatically. 
# Camera and velodyne data are available via properties that create generators
# when accessed, or through getter methods that provide random access.
data = pykitti.raw(basedir, date, drive, frames=range(0, 50, 5))

# i is frame number
i = 0

# lidar extrinsic for the ith frame
lidar_extrinsic_matrix = data.oxts[i].T_w_imu

# velodyne raw point cloud in lidar scanners own coordinate system
points = data.get_velo(i)

# transform points from lidar to global frame using lidar_extrinsic_matrix
def generate_transformed_pcd_from_point_cloud(points, lidar_extrinsic_matrix):
    tps = []
    for point in points:
        transformed_point = np.matmul(lidar_extrinsic_matrix, np.array([point[0], point[1], point[2], 1], dtype=np.float32).reshape(4,1)).tolist()
        if len(point) > 3 and point[3] is not None:
            # Keep the intensity value, if present, as the fourth element.
            tps.append([transformed_point[0][0], transformed_point[1][0], transformed_point[2][0], point[3]])
        else:
            tps.append([transformed_point[0][0], transformed_point[1][0], transformed_point[2][0]])

    return tps
    
# transform the point cloud into the world coordinate system
transformed_pcl = generate_transformed_pcd_from_point_cloud(points, lidar_extrinsic_matrix)
```

## Sensor Fusion
<a name="sms-point-cloud-sensor-fusion"></a>

Ground Truth supports sensor fusion of point cloud data with up to 8 video camera inputs. This feature allows human labelers to view the 3D point cloud frame side-by-side with the synchronized video frame. In addition to providing more visual context for labeling, sensor fusion allows workers to adjust annotations in the 3D scene and in 2D images, and the adjustments are projected into the other view. The following video demonstrates a 3D point cloud labeling job with LiDAR and camera sensor fusion. 

![\[Gif showing a 3D point cloud labeling job with LiDAR and camera sensor fusion.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/gifs/object_tracking/sensor-fusion.gif)


For best results, when using sensor fusion, your point cloud should be in a WCS. Ground Truth uses your sensor (such as LiDAR), camera, and ego vehicle pose information to compute extrinsic and intrinsic matrices for sensor fusion. 

### Extrinsic Matrix
<a name="sms-point-cloud-extrinsics"></a>

Ground Truth uses sensor (such as LiDAR) extrinsic and camera extrinsic and intrinsic matrices to project objects to and from the point cloud data's frame of reference to the camera's frame of reference. 

For example, in order to project a label from the 3D point cloud to the camera image plane, Ground Truth transforms 3D points from the LiDAR's own coordinate system to the camera's coordinate system. It typically does this by first transforming 3D points from the LiDAR's own coordinate system to a world coordinate system (or a global reference frame) using the LiDAR extrinsic matrix. Ground Truth then uses the camera inverse extrinsic (which converts points from a global frame of reference to the camera's frame of reference) to transform the 3D points obtained in the previous step into the camera image plane. If your 3D data is already transformed into a world coordinate system, then the first transformation has no impact on label translation, and label translation depends only on the camera inverse extrinsic. A view matrix is used to visualize projected labels. To learn more about these transformations and the view matrix, see [Ground Truth Sensor Fusion Transformations](#sms-point-cloud-extrinsic-intrinsic-explanation).
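As an illustrative sketch (not the Ground Truth implementation), the two-step transformation might look like the following in NumPy. Here `to_camera_frame` is a hypothetical helper, and both extrinsic matrices are assumed to map their local frames into the world frame.

```python
import numpy as np

def to_camera_frame(points_lidar, lidar_extrinsic, camera_extrinsic):
    """Transform Nx3 LiDAR points into the camera's frame of reference.

    points_lidar: Nx3 array in the LiDAR's own coordinate system.
    lidar_extrinsic: 4x4 matrix (LiDAR frame -> world frame).
    camera_extrinsic: 4x4 matrix (camera frame -> world frame).
    """
    n = points_lidar.shape[0]
    homogeneous = np.hstack([points_lidar, np.ones((n, 1))])  # Nx4

    # Step 1: LiDAR frame -> world frame.
    world = (lidar_extrinsic @ homogeneous.T).T

    # Step 2: world frame -> camera frame, using the camera inverse extrinsic.
    camera_inverse_extrinsic = np.linalg.inv(camera_extrinsic)
    camera = (camera_inverse_extrinsic @ world.T).T
    return camera[:, :3]
```

If the points are already in world coordinates, step 1 can be skipped by passing an identity matrix as `lidar_extrinsic`, which mirrors the behavior described above.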

 Ground Truth computes these extrinsic matrices by using LiDAR and camera *pose data* that you provide: `heading` (in quaternions: `qx`, `qy`, `qz`, and `qw`) and `position` (`x`, `y`, `z`). For the vehicle, the heading and position are typically described in the vehicle's reference frame in a world coordinate system and are called an *ego vehicle pose*. For each camera extrinsic, you can add pose information for that camera. For more information, see [Pose](#sms-point-cloud-pose).

### Intrinsic Matrix
<a name="sms-point-cloud-intrinsic"></a>

Ground Truth uses the camera extrinsic and intrinsic matrices to compute view matrices to transform labels to and from the 3D scene and camera images. Ground Truth computes the camera intrinsic matrix using the camera focal length (`fx`, `fy`) and optical center coordinates (`cx`, `cy`) that you provide. For more information, see [Intrinsic and Distortion](#sms-point-cloud-camera-intrinsic-distortion).
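For example, the 3x3 intrinsic matrix built from these four parameters can be applied as follows. This is a minimal sketch; the focal length and optical center values are hypothetical.

```python
import numpy as np

# Hypothetical calibration values for illustration.
fx, fy = 1000.0, 1000.0   # focal lengths in pixels
cx, cy = 960.0, 540.0     # optical center (principal point) in pixels

# The 3x3 camera intrinsic matrix.
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

# Projecting a point already expressed in the camera's frame of reference:
point_camera = np.array([0.5, 0.25, 2.0])  # x, y, z in meters
u, v, w = K @ point_camera
pixel = (u / w, v / w)  # -> (1210.0, 665.0)
```

Dividing by `w` (the point's depth) performs the perspective divide that maps the 3D point onto the image plane.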

### Image Distortion
<a name="sms-point-cloud-image-distortion"></a>

Image distortion can occur for a variety of reasons. For example, images may be distorted due to barrel or fish-eye effects. Ground Truth uses intrinsic parameters along with distortion coefficients to undistort images you provide when creating 3D point cloud labeling jobs. If a camera image has already been undistorted, all distortion coefficients should be set to 0.

For more information about the transformations Ground Truth performs to undistort images, see [Camera Calibrations: Extrinsic, Intrinsic and Distortion](#sms-point-cloud-extrinsic-camera-explanation).

### Ego Vehicle
<a name="sms-point-cloud-ego-vehicle"></a>

To collect data for autonomous driving applications, the measurements used to generate point cloud data are taken from sensors mounted on a vehicle, or the *ego vehicle*. To project label adjustments to and from the 3D scene and 2D images, Ground Truth needs your ego vehicle pose in a world coordinate system. The ego vehicle pose consists of position coordinates and an orientation quaternion. 

 Ground Truth uses your ego vehicle pose to compute rotation and transformation matrices. Rotations in three dimensions can be represented by a sequence of three rotations around a sequence of axes. In theory, any three axes spanning 3D Euclidean space are enough. In practice, the axes of rotation are chosen to be the basis vectors. The three rotations are expected to be in a global frame of reference (extrinsic). Ground Truth does not support a body-centered frame of reference (intrinsic), which is attached to, and moves with, the object under rotation. To track objects, Ground Truth needs to measure from a global reference where all vehicles are moving. When using Ground Truth 3D point cloud labeling jobs, z specifies the axis of rotation (extrinsic rotation) and yaw Euler angles are in radians (rotation angle).
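For a pure z-axis (yaw) rotation, the corresponding heading quaternion can be computed directly. The following is a minimal sketch using only the standard library; the yaw value is a placeholder.

```python
import math

yaw = math.pi / 2  # 90-degree rotation about the global z axis, in radians

# For a rotation purely about the z axis, the heading quaternion reduces to:
qx, qy = 0.0, 0.0
qz = math.sin(yaw / 2)
qw = math.cos(yaw / 2)
```

A yaw of 0 gives the identity quaternion `(0, 0, 0, 1)`, which is the value this guide recommends when heading data is unavailable.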

### Pose
<a name="sms-point-cloud-pose"></a>

Ground Truth uses pose information for 3D visualizations and sensor fusion. Pose information you input through your manifest file is used to compute extrinsic matrices. If you already have an extrinsic matrix, you can use it to extract sensor and camera pose data. 

For example, in the autonomous driving KITTI dataset, the [pykitti](https://github.com/utiasSTARS/pykitti) python module can be used for loading the KITTI data. In the dataset, `dataset.oxts[i].T_w_imu` gives the LiDAR extrinsic transform for the `i`th frame, and it can be multiplied with the points to get them into a world frame - `matmul(lidar_transform_matrix, points)`. This transform can be converted into the position (translation vector) and heading (in quaternion) of the LiDAR for the input manifest file JSON format. The camera extrinsic transform for `cam0` in the `i`th frame can be calculated by `inv(matmul(dataset.calib.T_cam0_velo, inv(dataset.oxts[i].T_w_imu)))`, and this can be converted into heading and position for `cam0`.

```
import numpy as np

rotation = [[ 9.96714314e-01, -8.09890350e-02,  1.16333982e-03],
 [ 8.09967396e-02,  9.96661051e-01, -1.03090934e-02],
 [-3.24531964e-04,  1.03694477e-02,  9.99946183e-01]]
 
origin= [1.71104606e+00,
          5.80000039e-01,
          9.43144935e-01]

         
from scipy.spatial.transform import Rotation as R

# position is the origin
position = origin 
r = R.from_matrix(np.asarray(rotation))

# heading in WCS using scipy 
heading = r.as_quat()
print(f"position:{position}\nheading: {heading}")
```

**Position**  
In the input manifest file, `position` refers to the position of the sensor with respect to a world frame. If you are unable to put the device position in a world coordinate system, you can use LiDAR data with local coordinates. Similarly, for mounted video cameras, you can specify the position and heading in a world coordinate system. If you do not have position information for a camera, use (0, 0, 0). 

The following are the fields in the position object:

1.  `x` (float) – x coordinate of ego vehicle, sensor, or camera position in meters. 

1.  `y` (float) – y coordinate of ego vehicle, sensor, or camera position in meters. 

1.  `z` (float) – z coordinate of ego vehicle, sensor, or camera position in meters. 

The following is an example of a `position` JSON object: 

```
{
    "position": {
        "y": -152.77584902657554,
        "x": 311.21505956090624,
        "z": -10.854137529636024
      }
}
```

**Heading**  
In the input manifest file, `heading` is an object that represents the orientation of a device with respect to the world frame. Heading values should be in quaternions. A [quaternion](https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation) is a representation of orientation consistent with geodesic spherical properties. If you are unable to put the sensor heading in world coordinates, use the identity quaternion `(qx = 0, qy = 0, qz = 0, qw = 1)`. Similarly, for cameras, specify the heading in quaternions. If you are unable to obtain extrinsic camera calibration parameters, also use the identity quaternion. 

Fields in `heading` object are as follows:

1.  `qx` (float) - x component of ego vehicle, sensor, or camera orientation. 

1.  `qy` (float) - y component of ego vehicle, sensor, or camera orientation. 

1.  `qz` (float) - z component of ego vehicle, sensor, or camera orientation. 

1. `qw` (float) - w component of ego vehicle, sensor, or camera orientation. 

The following is an example of a `heading` JSON object: 

```
{
    "heading": {
        "qy": -0.7046155108831117,
        "qx": 0.034278837280808494,
        "qz": 0.7070617895701465,
        "qw": -0.04904659893885366
      }
}
```

To learn more, see [Compute Orientation Quaternions and Position](#sms-point-cloud-ego-vehicle-orientation).

## Compute Orientation Quaternions and Position
<a name="sms-point-cloud-ego-vehicle-orientation"></a>

Ground Truth requires that all orientation, or heading, data be given in quaternions. A [quaternion](https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation) is a representation of orientation consistent with geodesic spherical properties that can be used to represent rotations. Compared to [Euler angles](https://en.wikipedia.org/wiki/Euler_angles), quaternions are simpler to compose and avoid the problem of [gimbal lock](https://en.wikipedia.org/wiki/Gimbal_lock). Compared to rotation matrices, they are more compact, more numerically stable, and more efficient. 

You can compute quaternions from a rotation matrix or a transformation matrix.

If you have a rotation matrix (made up of the axis rotations) and translation vector (or origin) in a world coordinate system instead of a single 4x4 rigid transformation matrix, then you can directly use the rotation matrix and translation vector to compute quaternions. Libraries like [scipy](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.transform.Rotation.html) and [pyquaternion](http://kieranwynn.github.io/pyquaternion/#explicitly-by-rotation-or-transformation-matrix) can help. The following code block shows an example using these libraries to compute a quaternion from a rotation matrix. 

```
import numpy as np

rotation = [[ 9.96714314e-01, -8.09890350e-02,  1.16333982e-03],
 [ 8.09967396e-02,  9.96661051e-01, -1.03090934e-02],
 [-3.24531964e-04,  1.03694477e-02,  9.99946183e-01]]
 
origin = [1.71104606e+00,
          5.80000039e-01,
          9.43144935e-01]

         
from scipy.spatial.transform import Rotation as R
# position is the origin
position = origin 
r = R.from_matrix(np.asarray(rotation))
# heading in WCS using scipy 
heading = r.as_quat()
print(f"position:{position}\nheading: {heading}")
```

A UI tool like [3D Rotation Converter](https://www.andre-gaschler.com/rotationconverter/) can also be useful.

If you have a 4x4 extrinsic transformation matrix, note that the transformation matrix is in the form `[R T; 0 0 0 1]` where `R` is the rotation matrix and `T` is the origin translation vector. That means you can extract rotation matrix and translation vector from the transformation matrix as follows.

```
import numpy as np

transformation = [[ 9.96714314e-01, -8.09890350e-02,  1.16333982e-03, 1.71104606e+00],
   [ 8.09967396e-02,  9.96661051e-01, -1.03090934e-02, 5.80000039e-01],
   [-3.24531964e-04,  1.03694477e-02,  9.99946183e-01, 9.43144935e-01],
   [              0,               0,               0,              1]]

transformation = np.array(transformation)
rotation = transformation[0:3, 0:3]
translation = transformation[0:3, 3]

from scipy.spatial.transform import Rotation as R
# position is the origin translation
position = translation
r = R.from_matrix(np.asarray(rotation))
# heading in WCS using scipy 
heading = r.as_quat()
print(f"position:{position}\nheading: {heading}")
```

With your own setup, you can compute an extrinsic transformation matrix using the GPS/IMU position and orientation (latitude, longitude, altitude and roll, pitch, yaw) with respect to the LiDAR sensor on the ego vehicle. For example, you can compute pose from KITTI raw data using `pose = convertOxtsToPose(oxts)` to transform the OxTS data into local Euclidean poses, specified by 4x4 rigid transformation matrices. You can then transform this pose transformation matrix to a global reference frame using the reference frame's transformation matrix in the world coordinate system. If your orientation is given as Euler angles (yaw, pitch, roll), you can convert it to a quaternion as shown in the following function.

```
struct Quaternion
{
    double w, x, y, z;
};

Quaternion ToQuaternion(double yaw, double pitch, double roll) // yaw (Z), pitch (Y), roll (X)
{
    // Abbreviations for the various angular functions
    double cy = cos(yaw * 0.5);
    double sy = sin(yaw * 0.5);
    double cp = cos(pitch * 0.5);
    double sp = sin(pitch * 0.5);
    double cr = cos(roll * 0.5);
    double sr = sin(roll * 0.5);

    Quaternion q;
    q.w = cr * cp * cy + sr * sp * sy;
    q.x = sr * cp * cy - cr * sp * sy;
    q.y = cr * sp * cy + sr * cp * sy;
    q.z = cr * cp * sy - sr * sp * cy;

    return q;
}
```
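Because the rest of this guide uses Python, the same conversion can be sketched as a direct Python translation of the function above; `to_quaternion` is a hypothetical helper name, and the angles are intrinsic ZYX (yaw-pitch-roll) Euler angles in radians.

```python
import math

def to_quaternion(yaw, pitch, roll):
    """Convert intrinsic ZYX Euler angles (radians) to a quaternion (w, x, y, z)."""
    cy, sy = math.cos(yaw * 0.5), math.sin(yaw * 0.5)
    cp, sp = math.cos(pitch * 0.5), math.sin(pitch * 0.5)
    cr, sr = math.cos(roll * 0.5), math.sin(roll * 0.5)

    w = cr * cp * cy + sr * sp * sy
    x = sr * cp * cy - cr * sp * sy
    y = cr * sp * cy + sr * cp * sy
    z = cr * cp * sy - sr * sp * cy
    return w, x, y, z
```

For example, a pure yaw of π/2 radians yields `w = cos(π/4)`, `z = sin(π/4)`, and `x = y = 0`, matching the z-axis rotation convention described earlier.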

## Ground Truth Sensor Fusion Transformations
<a name="sms-point-cloud-extrinsic-intrinsic-explanation"></a>

The following sections go into greater detail about the Ground Truth sensor fusion transformations that are performed using the pose data you provide.

### LiDAR Extrinsic
<a name="sms-point-cloud-extrinsic-lidar-explanation"></a>

In order to project to and from a 3D LiDAR scene to a 2D camera image, Ground Truth computes rigid transformation projection matrices using the ego vehicle pose and heading. Ground Truth computes the rotation and translation of world coordinates into the 3D plane by applying a simple sequence of rotations and translations. 

Ground Truth computes the rotation matrix from the heading quaternions as follows:

![\[Equation: Ground Truth point cloud rotation metrics.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/sms-point-cloud-rotation-matrix.png)


Here, `[x, y, z, w]` corresponds to the parameters in the `heading` JSON object, `[qx, qy, qz, qw]`. Ground Truth computes the translation column vector as `T = [poseX, poseY, poseZ]`. The extrinsic matrix is then simply as follows:

```
LiDAR_extrinsic = [R T;0 0 0 1]
```
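As an illustrative sketch, this assembly can be written out with the standard quaternion-to-rotation-matrix expansion; `extrinsic_from_pose` is a hypothetical helper, not a Ground Truth API.

```python
import numpy as np

def extrinsic_from_pose(position, heading):
    """Assemble a 4x4 extrinsic matrix [R T; 0 0 0 1] from pose data.

    position: dict with keys x, y, z.
    heading: dict with unit quaternion components qx, qy, qz, qw.
    """
    qx, qy, qz, qw = heading['qx'], heading['qy'], heading['qz'], heading['qw']

    # Standard expansion of a unit quaternion into a rotation matrix.
    R = np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ])
    T = np.array([position['x'], position['y'], position['z']])

    extrinsic = np.eye(4)
    extrinsic[:3, :3] = R
    extrinsic[:3, 3] = T
    return extrinsic
```

With the identity quaternion `(0, 0, 0, 1)`, the rotation block reduces to the identity matrix and the extrinsic is a pure translation.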

### Camera Calibrations: Extrinsic, Intrinsic and Distortion
<a name="sms-point-cloud-extrinsic-camera-explanation"></a>

*Geometric camera calibration*, also referred to as *camera resectioning*, estimates the parameters of a lens and image sensor of an image or video camera. You can use these parameters to correct for lens distortion, measure the size of an object in world units, or determine the location of the camera in the scene. Camera parameters include intrinsics and distortion coefficients.

#### Camera Extrinsic
<a name="sms-point-cloud-camera-extrinsic"></a>

If the camera pose is given, then Ground Truth computes the camera extrinsic based on a rigid transformation from the 3D plane into the camera plane. The calculation is the same as the one used for the [LiDAR Extrinsic](#sms-point-cloud-extrinsic-lidar-explanation), except that Ground Truth uses camera pose (`position` and `heading`) and computes the inverse extrinsic.

```
 camera_inverse_extrinsic = inv([Rc Tc;0 0 0 1]) #where Rc and Tc are camera pose components
```

#### Intrinsic and Distortion
<a name="sms-point-cloud-camera-intrinsic-distortion"></a>

Some cameras, such as pinhole or fisheye cameras, may introduce significant distortion in photos. This distortion can be corrected using distortion coefficients and the camera focal length. To learn more, see [Camera calibration With OpenCV](https://docs.opencv.org/2.4.13.7/doc/tutorials/calib3d/camera_calibration/camera_calibration.html) in the OpenCV documentation.

There are two types of distortion Ground Truth can correct for: radial distortion and tangential distortion.

*Radial distortion* occurs when light rays bend more near the edges of a lens than they do at its optical center. The smaller the lens, the greater the distortion. The presence of radial distortion manifests in the form of the *barrel* or *fish-eye* effect, and Ground Truth uses Formula 1 to undistort it. 

**Formula 1:**

![\[Formula 1: equations for x_{corrected} and y_{corrected}, to undistort radial distortion.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/sms-point-cloud-camera-distortion-1.png)


*Tangential distortion* occurs because the lenses used to take the images are not perfectly parallel to the imaging plane. This can be corrected with Formula 2. 

**Formula 2:**

![\[Formula 2: equations for x_{corrected} and y_{corrected}, to correct for tangential distortion.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/sms-point-cloud-camera-distortion-2.png)


In the input manifest file, you can provide distortion coefficients and Ground Truth will undistort your images. All distortion coefficients are floats. 
+ `k1`, `k2`, `k3`, `k4` – Radial distortion coefficients. Supported for both fisheye and pinhole camera models.
+ `p1` ,`p2` – Tangential distortion coefficients. Supported for pinhole camera models.

If images are already undistorted, all distortion coefficients should be 0 in your input manifest. 
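For reference, the standard pinhole distortion model that these coefficients describe can be sketched as follows. This illustrates the radial (`k1`-`k3`) and tangential (`p1`, `p2`) terms applied to normalized image coordinates; it is not Ground Truth's implementation, and undistortion inverts this mapping.

```python
def distort_normalized(x, y, k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    """Apply radial (k1..k3) and tangential (p1, p2) distortion to
    normalized image coordinates (x, y)."""
    r2 = x * x + y * y

    # Radial term: grows with distance from the optical center.
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3

    # Tangential terms: account for the lens not being parallel to the sensor.
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_d, y_d
```

With all coefficients set to 0, the function returns its input unchanged, which is consistent with the guidance above for already-undistorted images.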

In order to correctly reconstruct the corrected image, Ground Truth does a unit conversion of the images based on focal lengths. If a common focal length is used with a given aspect ratio for both axes (such as 1), the formula above has a single focal length. The matrix containing these four parameters is referred to as the *camera intrinsic calibration matrix*. 

![\[The in camera intrinsic calibration matrix.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/pointcloud/sms-point-cloud-camera-intrinsic.png)


While the distortion coefficients are the same regardless of the camera resolution used, the intrinsic parameters should be scaled from the calibrated resolution to the current resolution. 

The following are float values. 
+ `fx` - focal length in x direction.
+ `fy` - focal length in y direction.
+ `cx` - x coordinate of principal point.
+ `cy` - y coordinate of principal point.

Ground Truth uses the camera extrinsic and camera intrinsic matrices to compute the view matrix, as shown in the following code block, to transform labels between the 3D scene and 2D images.

```
import numpy as np

def generate_view_matrix(intrinsic_matrix, extrinsic_matrix):
    intrinsic_matrix = np.c_[intrinsic_matrix, np.zeros(3)]
    view_matrix = np.matmul(intrinsic_matrix, extrinsic_matrix)
    view_matrix = np.insert(view_matrix, 2, np.array((0, 0, 0, 1)), 0)
    return view_matrix
```
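A world point can then be projected to pixel coordinates with the resulting view matrix. The following sketch uses hypothetical intrinsic values and an identity extrinsic (camera at the world origin); after multiplication, rows 0, 1, and 3 of the result hold the scaled pixel coordinates and the depth.

```python
import numpy as np

# Hypothetical intrinsic (3x3) and extrinsic (4x4) matrices for illustration.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
extrinsic = np.eye(4)  # camera at the world origin, no rotation

# Same construction as generate_view_matrix above.
view_matrix = np.insert(np.matmul(np.c_[K, np.zeros(3)], extrinsic), 2,
                        np.array((0, 0, 0, 1)), 0)

# Project a world point given in homogeneous coordinates.
p = view_matrix @ np.array([0.5, 0.25, 2.0, 1.0])
u, v = p[0] / p[3], p[1] / p[3]  # -> (1210.0, 665.0)
```

Dividing rows 0 and 1 by row 3 (the depth) performs the perspective divide that yields the final pixel coordinates.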

# Video Frame Input Data
<a name="sms-video-frame-input-data-overview"></a>

When you create a video frame object detection or object tracking labeling job, you can choose video files (MP4 files) or video frames for input data. All worker tasks are created using video frames, so if you choose video files, use the Ground Truth frame extraction tool to extract video frames (images) from your video files. 

For both of these options, you can use the **Automated data setup** option in the Ground Truth section of the Amazon SageMaker AI console to set up a connection between Ground Truth and your input data in Amazon S3 so that Ground Truth knows where to look for your input data when creating your labeling tasks. This creates and stores an input manifest file in your Amazon S3 input dataset location. To learn more, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md).

Alternatively, you can manually create sequence files for each sequence of video frames that you want labeled and provide the Amazon S3 location of an input manifest file that references each of these sequences files using the `source-ref` key. To learn more, see [Create a Video Frame Input Manifest File](sms-video-manual-data-setup.md#sms-video-create-manifest). 

**Topics**
+ [Choose Video Files or Video Frames for Input Data](sms-point-cloud-video-input-data.md)
+ [Input Data Setup](sms-video-data-setup.md)

# Choose Video Files or Video Frames for Input Data
<a name="sms-point-cloud-video-input-data"></a>

When you create a video frame object detection or object tracking labeling job, you can provide a sequence of video frames (images) or you can use the Amazon SageMaker AI console to have Ground Truth automatically extract video frames from your video files. Use the following sections to learn more about these options. 

## Provide Video Frames
<a name="sms-video-provide-frames"></a>

Video frames are sequences of images extracted from a video file. You can create a Ground Truth labeling job to have workers label multiple sequences of video frames. Each sequence is made up of images extracted from a single video. 

To create a labeling job using video frame sequences, you must store each sequence using a unique [key name prefix](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-keys) in Amazon S3. In the Amazon S3 console, key name prefixes are folders. So in the Amazon S3 console, each sequence of video frames must be located in its own folder in Amazon S3.

For example, if you have two sequences of video frames, you might use the key name prefixes `sequence1/` and `sequence2/` to identify your sequences. In this example, your sequences may be located in `s3://amzn-s3-demo-bucket/video-frames/sequence1/` and `s3://amzn-s3-demo-bucket/video-frames/sequence2/`.

If you are using the Ground Truth console to create an input manifest file, all of the sequence key name prefixes should be in the same location in Amazon S3. For example, in the Amazon S3 console, each sequence could be in a folder in `s3://amzn-s3-demo-bucket/video-frames/`. In this example, your first sequence of video frames (images) may be located in `s3://amzn-s3-demo-bucket/video-frames/sequence1/` and your second sequence may be located in `s3://amzn-s3-demo-bucket/video-frames/sequence2/`. 

**Important**  
Even if you only have a single sequence of video frames that you want workers to label, that sequence must have a key name prefix in Amazon S3. If you are using the Amazon S3 console, this means that your sequence is located in a folder. It cannot be located in the root of your S3 bucket. 

When creating worker tasks using sequences of video frames, Ground Truth uses one sequence per task. In each task, Ground Truth orders your video frames using [UTF-8](https://en.wikipedia.org/wiki/UTF-8) binary order. 

For example, video frames might be in the following order in Amazon S3: 

```
[0001.jpg, 0002.jpg, 0003.jpg, ..., 0011.jpg]
```

They are arranged in the same order in the worker’s task: `0001.jpg, 0002.jpg, 0003.jpg, ..., 0011.jpg`.

Frames might also be ordered using a naming convention like the following:

```
[frame1.jpg, frame2.jpg, ..., frame11.jpg]
```

In this case, `frame10.jpg` and `frame11.jpg` come before `frame2.jpg` in the worker task. Your worker sees your video frames in the following order: `frame1.jpg, frame10.jpg, frame11.jpg, frame2.jpg, ..., frame9.jpg`. 
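You can preview this ordering locally. For ASCII file names, Python's default string sort matches the UTF-8 binary order that Ground Truth uses:

```python
frames = [f"frame{i}.jpg" for i in range(1, 12)]

# Python's default string sort compares code points, which matches
# UTF-8 binary ordering for ASCII file names.
print(sorted(frames)[:4])  # -> ['frame1.jpg', 'frame10.jpg', 'frame11.jpg', 'frame2.jpg']
```

Zero-padding frame numbers, as in the first example (`0001.jpg`), keeps the binary order and the numeric order the same.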

## Provide Video Files
<a name="sms-point-cloud-video-frame-extraction"></a>

You can use the Ground Truth frame splitting feature when creating a new labeling job in the console to extract video frames from video files (MP4 files). A series of video frames extracted from a single video file is referred to as a *sequence of video frames*.

You can either have Ground Truth automatically extract all frames, up to 2,000, from the video, or you can specify a frequency for frame extraction. For example, you can have Ground Truth extract every 10th frame from your videos.

You can provide up to 50 videos when you use automated data setup to extract frames. However, your input manifest file cannot reference more than 10 video frame sequence files when you create a video frame object tracking or video frame object detection labeling job. If you use the automated data setup console tool to extract video frames from more than 10 video files, you need to modify the manifest file the tool generates, or create a new one, so that it includes 10 or fewer video frame sequence files. To learn more about these quotas, see [3D Point Cloud and Video Frame Labeling Job Quotas](input-data-limits.md#sms-input-data-quotas-other).

To use the video frame extraction tool, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md). 

When all of your video frames have been successfully extracted from your videos, you will see the following in your S3 input dataset location:
+ A key name prefix (a folder in the Amazon S3 console) named after each video. Each of these prefixes leads to:
  + A sequence of video frames extracted from the video used to name that prefix.
  + A sequence file used to identify all of the images that make up that sequence. 
+ An input manifest file with a .manifest extension. This identifies all of the sequence files that will be used to create your labeling job. 

All of the frames extracted from a single video file are used for a labeling task. If you extract video frames from multiple video files, multiple tasks are created for your labeling job, one for each sequence of video frames. 

 Ground Truth stores each sequence of video frames that it extracts in your Amazon S3 location for input datasets using a unique [key name prefix](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-keys). In the Amazon S3 console, key name prefixes are folders.

# Input Data Setup
<a name="sms-video-data-setup"></a>

When you create a video frame labeling job, you need to let Ground Truth know where to look for your input data. You can do this in one of two ways:
+ You can store your input data in Amazon S3 and have Ground Truth automatically detect the input dataset used for your labeling job. See [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md) to learn more about this option. 
+ You can create an input manifest file and sequence files and upload them to Amazon S3. See [Set up Video Frame Input Data Manually](sms-video-manual-data-setup.md) to learn more about this option. 

**Topics**
+ [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md)
+ [Set up Video Frame Input Data Manually](sms-video-manual-data-setup.md)

# Set up Automated Video Frame Input Data
<a name="sms-video-automated-data-setup"></a>

You can use the Ground Truth automated data setup to automatically detect video files in your Amazon S3 bucket and extract video frames from those files. To learn how, see [Provide Video Files](sms-point-cloud-video-input-data.md#sms-point-cloud-video-frame-extraction).

If you already have video frames in Amazon S3, you can use the automated data setup to use these video frames in your labeling job. For this option, all video frames from a single video must be stored using a unique prefix. To learn about the requirements to use this option, see [Provide Video Frames](sms-point-cloud-video-input-data.md#sms-video-provide-frames).

Select one of the following sections to learn how to set up your automatic input dataset connection with Ground Truth.

## Provide Video Files and Extract Frames
<a name="sms-video-provide-files-auto-setup-console"></a>

Use the following procedure to connect your video files with Ground Truth and automatically extract video frames from those files for video frame object detection and object tracking labeling jobs.

**Note**  
If you use the automated data setup console tool to extract video frames from more than 10 video files, you need to modify the manifest file the tool generates, or create a new one, so that it includes 10 or fewer video frame sequence files. To learn more, see [Provide Video Files](sms-point-cloud-video-input-data.md#sms-point-cloud-video-frame-extraction).

Make sure your video files are stored in an Amazon S3 bucket in the same AWS Region that you perform the automated data setup in. 

**Automatically connect your video files in Amazon S3 with Ground Truth and extract video frames:**

1. Navigate to the **Create labeling job** page in the Amazon SageMaker AI console: [https://console.aws.amazon.com/sagemaker/groundtruth](https://console.aws.amazon.com//sagemaker/groundtruth). 

   Your input and output S3 buckets must be located in the same AWS Region that you create your labeling job in. This link puts you in the North Virginia (us-east-1) AWS Region. If your input data is in an Amazon S3 bucket in another Region, switch to that Region. To change your AWS Region, on the [navigation bar](https://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/getting-started.html#select-region), choose the name of the currently displayed Region.

1. Select **Create labeling job**.

1. Enter a **Job name**. 

1. In the section **Input data setup**, select **Automated data setup**.

1. Enter an Amazon S3 URI for **S3 location for input datasets**. An S3 URI looks like the following: `s3://amzn-s3-demo-bucket/path-to-files/`. This URI should point to the Amazon S3 location where your video files are stored.

1. Specify your **S3 location for output datasets**. This is where your output data is stored. You can choose to store your output data in the **Same location as input dataset**, or choose **Specify a new location** and enter the S3 URI of the location where you want to store your output data.

1. Choose **Video Files** for your **Data type** using the dropdown list.

1. Choose **Yes, extract frames for object tracking and detection tasks**. 

1. Choose a method of **Frame extraction**.
   + When you choose **Use all frames extracted from the video to create a labeling task**, Ground Truth extracts all frames from each video in your **S3 location for input datasets**, up to 2,000 frames. If a video in your input dataset contains more than 2,000 frames, the first 2,000 are extracted and used for that labeling task. 
   + When you choose **Use every *x* frame from a video to create a labeling task**, Ground Truth extracts every *x*th frame from each video in your **S3 location for input datasets**. 

     For example, if your video is 2 seconds long, and has a [frame rate](https://en.wikipedia.org/wiki/Frame_rate) of 30 frames per second, there are 60 frames in your video. If you specify 10 here, Ground Truth extracts every 10th frame from your video. This means the 1st, 10th, 20th, 30th, 40th, 50th, and 60th frames are extracted. 

1. Choose or create an IAM execution role. Make sure that this role has permission to access your Amazon S3 locations for input and output data specified in steps 5 and 6. 

1. Select **Complete data setup**.

## Provide Video Frames
<a name="sms-video-provide-frames-auto-setup-console"></a>

Use the following procedure to connect your sequences of video frames with Ground Truth for video frame object detection and object tracking labeling jobs. 

Make sure your video frames are stored in an Amazon S3 bucket in the same AWS Region that you perform the automated data setup in. Each sequence of video frames should have a unique prefix. For example, if you have two sequences stored in `s3://amzn-s3-demo-bucket/video-frames/sequences/`, each should have a unique prefix like `sequence1` and `sequence2`, and both should be located directly under the `/sequences/` prefix. In this example, the locations of the two sequences are `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence1/` and `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence2/`.

**Automatically connect your video frames in Amazon S3 with Ground Truth:**

1. Navigate to the **Create labeling job** page in the Amazon SageMaker AI console: [https://console.aws.amazon.com/sagemaker/groundtruth](https://console.aws.amazon.com/sagemaker/groundtruth).

   Your input and output S3 buckets must be located in the same AWS Region that you create your labeling job in. This link opens the console in the US East (N. Virginia) (us-east-1) AWS Region. If your input data is in an Amazon S3 bucket in another Region, switch to that Region. To change your AWS Region, on the [navigation bar](https://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/getting-started.html#select-region), choose the name of the currently displayed Region.

1. Select **Create labeling job**.

1. Enter a **Job name**. 

1. In the section **Input data setup**, select **Automated data setup**.

1. Enter an Amazon S3 URI for **S3 location for input datasets**. 

   This should be the Amazon S3 location where your sequences are stored. For example, if you have two sequences stored in `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence1/` and `s3://amzn-s3-demo-bucket/video-frames/sequences/sequence2/`, enter `s3://amzn-s3-demo-bucket/video-frames/sequences/` here.

1. Specify your **S3 location for output datasets**. This is where your output data is stored. You can store your output data in the **Same location as input dataset**, or choose **Specify a new location** and enter the S3 URI of the location where you want to store your output data.

1. Choose **Video frames** for your **Data type** using the dropdown list. 

1. Choose or create an IAM execution role. Make sure that this role has permission to access your Amazon S3 locations for input and output data specified in steps 5 and 6. 

1. Select **Complete data setup**.

These procedures create an input manifest in the Amazon S3 location for input datasets that you specified in step 5. If you create a labeling job using the SageMaker API, the AWS CLI, or an AWS SDK, use the Amazon S3 URI of this input manifest file as the value of the `ManifestS3Uri` parameter.
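As a minimal sketch of where that URI fits, the following Python builds the request parameters for a `CreateLabelingJob` call with the AWS SDK for Python (Boto3). The job name, role ARN, and S3 paths are placeholder values, and required parameters such as `HumanTaskConfig` are omitted for brevity.

```python
# Sketch: where the generated input manifest fits in a CreateLabelingJob
# request. The job name, role ARN, and S3 paths below are placeholders.
manifest_s3_uri = "s3://amzn-s3-demo-bucket/video-frames/sequences/example.manifest"

request_params = {
    "LabelingJobName": "example-video-frame-job",                        # placeholder
    "LabelAttributeName": "example-video-frame-job-ref",                 # placeholder
    "RoleArn": "arn:aws:iam::111122223333:role/ExampleGroundTruthRole",  # placeholder
    "InputConfig": {
        "DataSource": {
            # The input manifest created by the automated data setup.
            "S3DataSource": {"ManifestS3Uri": manifest_s3_uri}
        }
    },
    "OutputConfig": {"S3OutputPath": "s3://amzn-s3-demo-bucket/output/"},
}

# With Boto3, the call would look like the following (HumanTaskConfig and
# other required parameters still need to be supplied):
# import boto3
# sagemaker_client = boto3.client("sagemaker")
# sagemaker_client.create_labeling_job(**request_params, HumanTaskConfig={...})
```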

# Set up Video Frame Input Data Manually
<a name="sms-video-manual-data-setup"></a>

Choose the manual data setup option if you have created sequence files for each of your video frame sequences, and a manifest file that lists references to those sequence files.

## Create a Video Frame Input Manifest File
<a name="sms-video-create-manifest"></a>

 Ground Truth uses the input manifest file to identify the location of your input dataset when creating labeling tasks. For video frame object detection and object tracking labeling jobs, each line in the input manifest file identifies the location of a video frame sequence file. Each sequence file identifies the images included in a single sequence of video frames.

Use this page to learn how to create a video frame sequence file and an input manifest file for video frame object tracking and object detection labeling jobs.

If you want Ground Truth to automatically generate your sequence files and input manifest file, see [Set up Automated Video Frame Input Data](sms-video-automated-data-setup.md). 

### Create a Video Frame Sequence Input Manifest
<a name="sms-video-create-input-manifest-file"></a>

In the video frame sequence input manifest file, each line in the manifest is a JSON object, with a `"source-ref"` key that references a sequence file. Each sequence file identifies the location of a sequence of video frames. This is the manifest file formatting required for all video frame labeling jobs. 

The following example demonstrates the syntax used for an input manifest file:

```
{"source-ref": "s3://amzn-s3-demo-bucket/example-folder/seq1.json"}
{"source-ref": "s3://amzn-s3-demo-bucket/example-folder/seq2.json"}
```
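As a sketch, you can generate a manifest in this JSON Lines format with a few lines of Python; the sequence file URIs mirror the example above and are illustrative.

```python
import json

# Sketch: write an input manifest in which each line points at one
# sequence file. The S3 URIs are illustrative.
sequence_files = [
    "s3://amzn-s3-demo-bucket/example-folder/seq1.json",
    "s3://amzn-s3-demo-bucket/example-folder/seq2.json",
]

# One JSON object per line, each with a "source-ref" key.
manifest_lines = [json.dumps({"source-ref": uri}) for uri in sequence_files]
manifest_text = "\n".join(manifest_lines)
```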

### Create a Video Frame Sequence File
<a name="sms-video-create-sequence-file"></a>

The data for each sequence of video frames needs to be stored in a JSON data object. The following is an example of the format you use for a sequence file. Information about each frame is included as a JSON object and is listed in the `frames` list. The following JSON has been expanded for readability. 

```
{
 "seq-no": 1,
 "prefix": "s3://amzn-s3-demo-bucket/prefix/video1/",
 "number-of-frames": 3,
 "frames":[
   {"frame-no": 1, "unix-timestamp": 1566861644, "frame": "frame0001.jpg" },
   {"frame-no": 2, "unix-timestamp": 1566861644, "frame": "frame0002.jpg" }, 
   {"frame-no": 3, "unix-timestamp": 1566861644, "frame": "frame0003.jpg" }   
 ]
}
```

The following table provides details about the parameters shown in this code example. 



|  Parameter  |  Required  |  Accepted Values  |  Description  | 
| --- | --- | --- | --- | 
|  `seq-no`  |  Yes  |  Integer  |  The ordered number of the sequence.   | 
|  `prefix`  |  Yes  |  String **Accepted Values**: `s3://<bucket-name>/<prefix>/`  |  The Amazon S3 location where the video frames in the sequence are located.  The prefix must end with a forward slash: `/`.  | 
|  `number-of-frames`  |  Yes  |  Integer  |  The total number of frames included in the sequence file. This number must match the total number of frames listed in the `frames` parameter in the next row.  | 
|  `frames`  |  Yes  |  List of JSON objects **Required**: `frame-no`, `frame` **Optional**: `unix-timestamp`  |  A list of frame data. The length of the list must equal `number-of-frames`. In the worker UI, frames in a sequence are ordered in [UTF-8](https://en.wikipedia.org/wiki/UTF-8) binary order. To learn more about this ordering, see [Provide Video Frames](sms-point-cloud-video-input-data.md#sms-video-provide-frames).  | 
|  `frame-no`  |  Yes  |  Integer  |  The frame order number. This determines the order of a frame in the sequence.   | 
|  `unix-timestamp`  |  No  |  Integer  |  The Unix timestamp of a frame: the number of seconds elapsed between January 1, 1970 (UTC) and the time when the frame was captured.   | 
|  `frame`  |  Yes  |  String  |  The name of a video frame image file.   | 

# Labeling job output data
<a name="sms-data-output"></a>

The output from a labeling job is placed in the Amazon S3 location that you specified in the console or in the call to the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. Output data appears in this location when the workers have submitted one or more tasks, or when tasks expire. Note that it may take a few minutes for output data to appear in Amazon S3 after the worker submits the task or the task expires.

Each line in the output data file is identical to the manifest file with the addition of an attribute and value for the label assigned to the input object. The attribute name for the value is defined in the console or in the call to the `CreateLabelingJob` operation. You can't use `-metadata` in the label attribute name. If you are running an image semantic segmentation, 3D point cloud semantic segmentation, or 3D point cloud object tracking job, the label attribute must end with `-ref`. For any other type of job, the attribute name can't end with `-ref`.

The output of the labeling job is the value of the key-value pair that contains the label. The label and its value overwrite any existing JSON data with that key in the input file. 

For example, the following is the output from an image classification labeling job where the input data files were stored in the Amazon S3 bucket `amzn-s3-demo-bucket` and the label attribute name was defined as *`sport`*. In this example, the JSON object is formatted for readability; in the actual output file, the JSON object is on a single line. For more information about the data format, see [JSON Lines](http://jsonlines.org/). 

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/image_example.png",
    "sport":0,
    "sport-metadata":
    {
        "class-name": "football",
        "confidence": 0.00,
        "type":"groundtruth/image-classification",
        "job-name": "identify-sport",
        "human-annotated": "yes",
        "creation-date": "2018-10-18T22:18:13.527256"
    }
}
```

The value of the label can be any valid JSON. In this case the label's value is the index of the class in the classification list. Other job types, such as bounding box, have more complex values.

Any key-value pair in the input manifest file other than the label attribute is unchanged in the output file. You can use this to pass data to your application.
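As a sketch, reading one record like the example above takes only the standard library; the attribute name `sport` and the record contents mirror the image classification example and are illustrative.

```python
import json

# Sketch: read one record from an output manifest and pull out the label
# value and its metadata. The label attribute name ("sport") matches the
# image classification example above; the record is illustrative.
line = ('{"source-ref": "s3://amzn-s3-demo-bucket/image_example.png", '
        '"sport": 0, "sport-metadata": {"class-name": "football", '
        '"confidence": 0.00, "human-annotated": "yes"}}')

record = json.loads(line)
label_attribute = "sport"
label_value = record[label_attribute]
metadata = record[label_attribute + "-metadata"]
```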

The output from a labeling job can be used as the input to another labeling job. You can use this when you are chaining together labeling jobs. For example, you can send one labeling job to determine the sport that is being played. Then you send another using the same data to determine if the sport is being played indoors or outdoors. By using the output data from the first job as the manifest for the second job, you can consolidate the results of the two jobs into one output file for easier processing by your applications. 

The output data file is written to the output location periodically while the job is in progress. These intermediate files contain one line for each line in the manifest file. If an object is labeled, the label is included. If the object hasn't been labeled, it is written to the intermediate output file identically to the manifest file.
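Because unlabeled records are copied through unchanged, the presence of the label attribute is enough to tell the two apart when processing an intermediate file. A minimal sketch, with an assumed attribute name `sport` and illustrative lines:

```python
import json

# Sketch: separate labeled from not-yet-labeled records in an intermediate
# output manifest. Unlabeled records are identical to their manifest lines,
# so the presence of the label attribute distinguishes them. The attribute
# name "sport" and the lines below are illustrative.
label_attribute = "sport"
manifest_lines = [
    '{"source-ref": "s3://amzn-s3-demo-bucket/a.png", "sport": 0}',
    '{"source-ref": "s3://amzn-s3-demo-bucket/b.png"}',
]

records = [json.loads(line) for line in manifest_lines]
labeled = [r for r in records if label_attribute in r]
unlabeled = [r for r in records if label_attribute not in r]
```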

## Output directories
<a name="sms-output-directories"></a>

Ground Truth creates several directories in your Amazon S3 output path. These directories contain the results of your labeling job and other artifacts of the job. The top-level directory for a labeling job is given the same name as your labeling job; the output directories are placed beneath it. For example, if you named your labeling job **find-people**, your output would be in the following directories:

```
s3://amzn-s3-demo-bucket/find-people/activelearning
s3://amzn-s3-demo-bucket/find-people/annotations
s3://amzn-s3-demo-bucket/find-people/inference
s3://amzn-s3-demo-bucket/find-people/manifests
s3://amzn-s3-demo-bucket/find-people/training
```

Each directory contains the following output:

### Active learning directory
<a name="sms-output-activelearning"></a>

The `activelearning` directory is only present when you are using automated data labeling. It contains the input and output validation set for automated data labeling, and the input and output folder for automatically labeled data.

### Annotations directory
<a name="sms-directories-annotations"></a>

The `annotations` directory contains all of the annotations made by the workforce. These are the responses from individual workers that have not been consolidated into a single label for the data object. 

There are three subdirectories in the `annotations` directory. 
+ The first, `worker-response`, contains the responses from individual workers. This contains a subdirectory for each iteration, which in turn contains a subdirectory for each data object in that iteration. The worker response data for each data object is stored in a timestamped JSON file that contains the answers submitted by each worker for that data object, and if you use a private workforce, metadata about those workers. To learn more about this metadata, see [Worker metadata](#sms-worker-id-private).
+ The second, `consolidated-annotation`, contains information required to consolidate the annotations in the current batch into labels for your data objects.
+ The third, `intermediate`, contains the output manifest for the current batch with any completed labels. This file is updated as the label for each data object is completed.

**Note**  
We recommend that you do not use files that are not mentioned in the documentation.

### Inference directory
<a name="sms-directories-inference"></a>

The `inference` directory is only present when you are using automated data labeling. This directory contains the input and output files for the SageMaker AI batch transform used while labeling data objects.

### Manifest directory
<a name="sms-directories-manifest"></a>

The `manifests` directory contains the output manifest from your labeling job. There is one subdirectory in the manifests directory, `output`. The `output` directory contains the output manifest file for your labeling job. The file is named `output.manifest`.

### Training directory
<a name="sms-directories-training"></a>

The `training` directory is only present when you are using automated data labeling. This directory contains the input and output files used to train the automated data labeling model.

## Confidence score
<a name="sms-output-confidence"></a>

When more than one worker annotates a single task, your label results from annotation consolidation. Ground Truth calculates a confidence score for each label. A *confidence score* is a number between 0 and 1 that indicates how confident Ground Truth is in the label. You can use the confidence score to compare labeled data objects to each other, and to identify the least or most confident labels.

You should not interpret the value of a confidence score as an absolute value, or compare confidence scores across labeling jobs. For example, if all of the confidence scores are between 0.98 and 0.998, you should only compare the data objects with each other and not rely on the high confidence scores. 

You should not compare the confidence scores of human-labeled data objects and auto-labeled data objects. The confidence scores for humans are calculated using the annotation consolidation function for the task, while the confidence scores for automated labeling are calculated using a model that incorporates object features. The two models generally have different scales and average confidence.

For a bounding box labeling job, Ground Truth calculates a confidence score per box. You can compare confidence scores within one image or across images for the same labeling type (human or auto). You can't compare confidence scores across labeling jobs.

If a single worker annotates a task (`NumberOfHumanWorkersPerDataObject` is set to `1`, or in the console, you enter 1 for **Number of workers per dataset object**), the confidence score is set to `0.00`. 
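To surface the least confident labels within a single job, you can sort records by the `confidence` value in each label's metadata. A minimal sketch, with an assumed label attribute name `sport` and illustrative records:

```python
# Sketch: rank records from one labeling job by confidence to surface the
# least confident labels for review. The attribute name "sport" and the
# records are illustrative; scores are only comparable within one job.
records = [
    {"source-ref": "s3://amzn-s3-demo-bucket/a.png",
     "sport-metadata": {"class-name": "football", "confidence": 0.93}},
    {"source-ref": "s3://amzn-s3-demo-bucket/b.png",
     "sport-metadata": {"class-name": "tennis", "confidence": 0.41}},
    {"source-ref": "s3://amzn-s3-demo-bucket/c.png",
     "sport-metadata": {"class-name": "golf", "confidence": 0.78}},
]

# Ascending sort: least confident labels first.
by_confidence = sorted(records, key=lambda r: r["sport-metadata"]["confidence"])
least_confident = by_confidence[0]
```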

## Worker metadata
<a name="sms-worker-id-private"></a>

Ground Truth provides information that you can use to track individual workers in task output data. The following data is located in the directories under `worker-response` in the [Annotations directory](#sms-directories-annotations):
+ The `acceptanceTime` is the time that the worker accepted the task. The format of this date and time stamp is `YYYY-MM-DDTHH:MM:SS.mmmZ` for the year (`YYYY`), month (`MM`), day (`DD`), hour (`HH`), minute (`MM`), second (`SS`) and millisecond (`mmm`). The date and time are separated by a **T**. 
+ The `submissionTime` is the time that the worker submitted their annotations using the **Submit** button. The format of this date and time stamp is `YYYY-MM-DDTHH:MM:SS.mmmZ` for the year (`YYYY`), month (`MM`), day (`DD`), hour (`HH`), minute (`MM`), second (`SS`) and millisecond (`mmm`). The date and time are separated by a **T**. 
+ `timeSpentInSeconds` reports the total time, in seconds, that a worker actively worked on that task. This metric does not include time when a worker paused or took a break.
+ The `workerId` is unique to each worker. 
+ If you use a [private workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-private.html), in `workerMetadata`, you see the following.
  + The `identityProviderType` is the service used to manage the private workforce. 
  + The `issuer` is the Cognito user pool or OIDC Identity Provider (IdP) issuer associated with the work team assigned to this human review task.
  + A unique `sub` identifier refers to the worker. If you create a workforce using Amazon Cognito, you can retrieve details about this worker (such as the name or user name) by using this ID with Amazon Cognito. To learn how, see [Managing and Searching for User Accounts](https://docs.aws.amazon.com/cognito/latest/developerguide/how-to-manage-user-accounts.html#manage-user-accounts-searching-user-attributes) in the [Amazon Cognito Developer Guide](https://docs.aws.amazon.com/cognito/latest/developerguide/).
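For example, you could total the active working time per worker across worker-response records. The following sketch uses illustrative records containing only the two fields it needs:

```python
# Sketch: total active working time per worker from worker-response records.
# The records below mirror the fields described above; the worker IDs and
# times are illustrative.
responses = [
    {"workerId": "a12b3cdefg4h5i67", "timeSpentInSeconds": 40.5},
    {"workerId": "a12b3cdefg4h5i67", "timeSpentInSeconds": 12.0},
    {"workerId": "z98y7xwvut6s5r43", "timeSpentInSeconds": 33.0},
]

time_per_worker = {}
for response in responses:
    worker = response["workerId"]
    time_per_worker[worker] = (
        time_per_worker.get(worker, 0.0) + response["timeSpentInSeconds"]
    )
```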

The following is an example of the output you may see if you use Amazon Cognito to create a private workforce. This is identified in the `identityProviderType`.

```
"submissionTime": "2020-12-28T18:59:58.321Z",
"acceptanceTime": "2020-12-28T18:59:15.191Z", 
"timeSpentInSeconds": 40.543,
"workerId": "a12b3cdefg4h5i67",
"workerMetadata": {
    "identityData": {
        "identityProviderType": "Cognito",
        "issuer": "https://cognito-idp.aws-region.amazonaws.com/aws-region_123456789",
        "sub": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
    }
}
```

 The following is an example of the `workerMetadata` you may see if you use your own OIDC IdP to create a private workforce:

```
"workerMetadata": {
        "identityData": {
            "identityProviderType": "Oidc",
            "issuer": "https://example-oidc-ipd.com/adfs",
            "sub": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
        }
}
```

To learn more about using private workforces, see [Private workforce](sms-workforce-private.md).

## Output metadata
<a name="sms-output-metadata"></a>

The output from each job contains metadata about the label assigned to data objects. These elements are the same for all jobs with minor variations. The following example shows the metadata elements:

```
    "confidence": 0.00,
    "type": "groundtruth/image-classification",
    "job-name": "identify-animal-species",
    "human-annotated": "yes",
    "creation-date": "2020-10-18T22:18:13.527256"
```

The elements have the following meaning:
+ `confidence` – The confidence that Ground Truth has that the label is correct. For more information, see [Confidence score](#sms-output-confidence).
+ `type` – The type of classification job. For job types, see [Built-in Task Types](sms-task-types.md). 
+ `job-name` – The name assigned to the job when it was created.
+ `human-annotated` – Whether the data object was labeled by a human or by automated data labeling. For more information, see [Automate data labeling](sms-automated-labeling.md).
+ `creation-date` – The date and time that the label was created.

## Classification job output
<a name="sms-output-class"></a>

The following are sample outputs (output manifest files) from an image classification job and a text classification job. They include the label that Ground Truth assigned to the data object, the value for the label, and metadata that describes the label.

In addition to the standard metadata elements, the metadata for a classification job includes the text value of the label's class. For more information, see [Image Classification - MXNet](image-classification.md).

Portions of the examples below, such as label attribute names and values, depend on labeling job specifications and output data. 

```
{
    "source-ref":"s3://amzn-s3-demo-bucket/example_image.jpg",
    "species":"0",
    "species-metadata":
    {
        "class-name": "dog",
        "confidence": 0.00,
        "type": "groundtruth/image-classification",
        "job-name": "identify-animal-species",
        "human-annotated": "yes",
        "creation-date": "2018-10-18T22:18:13.527256"
    }
}
```

```
{
    "source":"The food was delicious",
    "mood":"1",
    "mood-metadata":
    {
        "class-name": "positive",
        "confidence": 0.8,
        "type": "groundtruth/text-classification",
        "job-name": "label-sentiment",
        "human-annotated": "yes",
        "creation-date": "2020-10-18T22:18:13.527256"
    }
}
```

## Multi-label classification job output
<a name="sms-output-multi-label-classification"></a>

The following are example output manifest files from a multi-label image classification job and a multi-label text classification job. They include the labels that Ground Truth assigned to the data object (for example, the image or piece of text) and metadata that describes the labels the worker saw when completing the labeling task. 

The label attribute name parameter (for example, `image-label-attribute-name`) contains an array of all of the labels selected by at least one of the workers who completed this task. This array contains integer keys (for example, `[1,0,8]`) that correspond to the labels found in `class-map`. In the multi-label image classification example, `bicycle`, `person`, and `clothing` were selected by at least one of the workers who completed the labeling task for the image, `example_image.jpg`.

The `confidence-map` shows the confidence score that Ground Truth assigned to each label selected by a worker. To learn more about Ground Truth confidence scores, see [Confidence score](#sms-output-confidence).

Portions of the examples below, such as label attribute names and values, depend on labeling job specifications and output data. 

The following is an example of a multi-label image classification output manifest file. 

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/example_image.jpg",
    "image-label-attribute-name":[1,0,8],
    "image-label-attribute-name-metadata":
       {
        "job-name":"labeling-job/image-label-attribute-name",
        "class-map":
            {
                "1":"bicycle","0":"person","8":"clothing"
            },
        "human-annotated":"yes",
        "creation-date":"2020-02-27T21:36:25.000201",
        "confidence-map":
            {
                "1":0.95,"0":0.77,"8":0.2
            },
        "type":"groundtruth/image-classification-multilabel"
        }
}
```

The following is an example of a multi-label text classification output manifest file. In this example, `approving`, `sad`, and `critical` were selected by at least one of the workers who completed the labeling task for the object `exampletext.txt` found in `amzn-s3-demo-bucket`.

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/exampletext.txt",
    "text-label-attribute-name":[1,0,4],
    "text-label-attribute-name-metadata":
       {
        "job-name":"labeling-job/text-label-attribute-name",
        "class-map":
            {
                "1":"approving","0":"sad","4":"critical"
            },
        "human-annotated":"yes",
        "creation-date":"2020-02-20T21:36:25.000201",
        "confidence-map":
            {
                "1":0.95,"0":0.77,"4":0.2
            },
        "type":"groundtruth/text-classification-multilabel"
        }
}
```
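To work with multi-label output, you can resolve each integer key through `class-map` and pair it with its score in `confidence-map`. A minimal sketch, using values that mirror the image example above (note that the JSON keys in the maps are strings while the selected labels are integers):

```python
# Sketch: resolve the integer label keys in a multi-label record to class
# names and pair them with confidence scores. The attribute name and values
# mirror the image example above.
record = {
    "image-label-attribute-name": [1, 0, 8],
    "image-label-attribute-name-metadata": {
        "class-map": {"1": "bicycle", "0": "person", "8": "clothing"},
        "confidence-map": {"1": 0.95, "0": 0.77, "8": 0.2},
    },
}

metadata = record["image-label-attribute-name-metadata"]
# The map keys are JSON strings, so convert each integer key with str().
labels = [
    (metadata["class-map"][str(key)], metadata["confidence-map"][str(key)])
    for key in record["image-label-attribute-name"]
]
```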

## Bounding box job output
<a name="sms-output-box"></a>

The following is sample output (output manifest file) from a bounding box job. For this task, three bounding boxes are returned. The label value contains information about the size of the image and the locations of the bounding boxes.

The `class_id` element is the index of the box's class in the list of available classes for the task. The `class-map` metadata element contains the text of the class.

The metadata has a separate confidence score for each bounding box. The metadata also includes the `class-map` element that maps the `class_id` to the text value of the class. For more information, see [Object Detection - MXNet](object-detection.md).

Portions of the examples below, such as label attribute names and values, depend on labeling job specifications and output data. 

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/example_image.png",
    "bounding-box-attribute-name":
    {
        "image_size": [{ "width": 500, "height": 400, "depth":3}],
        "annotations":
        [
            {"class_id": 0, "left": 111, "top": 134,
                    "width": 61, "height": 128},
            {"class_id": 5, "left": 161, "top": 250,
                     "width": 30, "height": 30},
            {"class_id": 5, "left": 20, "top": 20,
                     "width": 30, "height": 30}
        ]
    },
    "bounding-box-attribute-name-metadata":
    {
        "objects":
        [
            {"confidence": 0.8},
            {"confidence": 0.9},
            {"confidence": 0.9}
        ],
        "class-map":
        {
            "0": "dog",
            "5": "bone"
        },
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2018-10-18T22:18:13.527256",
        "job-name": "identify-dogs-and-toys"
    }
 }
```
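Because the `objects` list in the metadata is parallel to the `annotations` list, you can pair each box with its class name and confidence by position. A minimal sketch, with a record trimmed to two boxes from the example above:

```python
# Sketch: pair each bounding box with its class name and per-box confidence
# score. The record is a trimmed version of the example above; the objects
# list is parallel to the annotations list, so zip() pairs them by position.
record = {
    "bounding-box-attribute-name": {
        "annotations": [
            {"class_id": 0, "left": 111, "top": 134, "width": 61, "height": 128},
            {"class_id": 5, "left": 161, "top": 250, "width": 30, "height": 30},
        ]
    },
    "bounding-box-attribute-name-metadata": {
        "objects": [{"confidence": 0.8}, {"confidence": 0.9}],
        "class-map": {"0": "dog", "5": "bone"},
    },
}

boxes = record["bounding-box-attribute-name"]["annotations"]
meta = record["bounding-box-attribute-name-metadata"]
paired = [
    (meta["class-map"][str(box["class_id"])], obj["confidence"], box)
    for box, obj in zip(boxes, meta["objects"])
]
```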

The output of a bounding box adjustment job looks like the following JSON. Note that the original JSON is kept intact, and two new attributes are added: the adjustment job's label attribute (`adjusted-bounding-box` in this example) and its metadata. 

```
{
    "source-ref": "S3 bucket location",
    "bounding-box-attribute-name":
    {
        "image_size": [{ "width": 500, "height": 400, "depth":3}],
        "annotations":
        [
            {"class_id": 0, "left": 111, "top": 134,
                    "width": 61, "height": 128},
            {"class_id": 5, "left": 161, "top": 250,
                     "width": 30, "height": 30},
            {"class_id": 5, "left": 20, "top": 20,
                     "width": 30, "height": 30}
        ]
    },
    "bounding-box-attribute-name-metadata":
    {
        "objects":
        [
            {"confidence": 0.8},
            {"confidence": 0.9},
            {"confidence": 0.9}
        ],
        "class-map":
        {
            "0": "dog",
            "5": "bone"
        },
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2018-10-18T22:18:13.527256",
        "job-name": "identify-dogs-and-toys"
    },
    "adjusted-bounding-box":
    {
        "image_size": [{ "width": 500, "height": 400, "depth":3}],
        "annotations":
        [
            {"class_id": 0, "left": 110, "top": 135,
                    "width": 61, "height": 128},
            {"class_id": 5, "left": 161, "top": 250,
                     "width": 30, "height": 30},
            {"class_id": 5, "left": 10, "top": 10,
                     "width": 30, "height": 30}
        ]
    },
    "adjusted-bounding-box-metadata":
    {
        "objects":
        [
            {"confidence": 0.8},
            {"confidence": 0.9},
            {"confidence": 0.9}
        ],
        "class-map":
        {
            "0": "dog",
            "5": "bone"
        },
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2018-11-20T22:18:13.527256",
        "job-name": "adjust-bounding-boxes-on-dogs-and-toys",
        "adjustment-status": "adjusted"
    }
}
```

In this output, the job's `type` doesn't change, but an `adjustment-status` field is added. This field has the value of `adjusted` or `unadjusted`. If multiple workers have reviewed the object and at least one adjusted the label, the status is `adjusted`.
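When consuming adjustment job output, one approach is to prefer the adjusted annotations whenever `adjustment-status` is `adjusted` and fall back to the originals otherwise. A minimal sketch, with the attribute names from the example above and a hypothetical helper `pick_annotations`:

```python
# Sketch: prefer adjusted annotations over the originals when the
# adjustment-status field says the label was adjusted. The attribute names
# mirror the example above; pick_annotations is a hypothetical helper.
def pick_annotations(record, original_attr="bounding-box-attribute-name",
                     adjusted_attr="adjusted-bounding-box"):
    """Return the adjusted annotations when present, else the originals."""
    meta = record.get(adjusted_attr + "-metadata", {})
    if meta.get("adjustment-status") == "adjusted":
        return record[adjusted_attr]["annotations"]
    return record[original_attr]["annotations"]

record = {
    "bounding-box-attribute-name": {"annotations": [{"class_id": 0, "left": 111}]},
    "adjusted-bounding-box": {"annotations": [{"class_id": 0, "left": 110}]},
    "adjusted-bounding-box-metadata": {"adjustment-status": "adjusted"},
}
```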

## Named entity recognition
<a name="sms-output-data-ner"></a>

The following is an example output manifest file from a named entity recognition (NER) labeling task. For this task, seven `entities` are returned.

In the output manifest, the JSON object, `annotations`, includes a list of the `labels` (label categories) that you provided.

Worker responses are in a list named `entities`. Each entity in this list is a JSON object that contains a `label` value that matches one in the `labels` list, an integer `startOffset` value for the labeled span's starting Unicode offset, and an integer `endOffset` value for the ending Unicode offset.

The metadata has a separate confidence score for each entity. If a single worker labeled each data object, the confidence value for each entity will be zero.

Portions of the examples below, such as label attribute names and values, depend on labeling job inputs and worker responses.

```
{
    "source": "Amazon SageMaker is a cloud machine-learning platform that was launched in November 2017. SageMaker enables developers to create, train, and deploy machine-learning (ML) models in the cloud. SageMaker also enables developers to deploy ML models on embedded systems and edge-devices",
    "ner-labeling-job-attribute-name": {
        "annotations": {
            "labels": [
                {
                    "label": "Date",
                    "shortDisplayName": "dt"
                },
                {
                    "label": "Verb",
                    "shortDisplayName": "vb"
                },
                {
                    "label": "Thing",
                    "shortDisplayName": "tng"
                },
                {
                    "label": "People",
                    "shortDisplayName": "ppl"
                }
            ],
            "entities": [
                {
                    "label": "Thing",
                    "startOffset": 22,
                    "endOffset": 53
                },
                {
                    "label": "Thing",
                    "startOffset": 269,
                    "endOffset": 281
                },
                {
                    "label": "Verb",
                    "startOffset": 63,
                    "endOffset": 71
                },
                {
                    "label": "Verb",
                    "startOffset": 228,
                    "endOffset": 234
                },
                {
                    "label": "Date",
                    "startOffset": 75,
                    "endOffset": 88
                },
                {
                    "label": "People",
                    "startOffset": 108,
                    "endOffset": 118
                },
                {
                    "label": "People",
                    "startOffset": 214,
                    "endOffset": 224
                }
            ]
        }
    },
    "ner-labeling-job-attribute-name-metadata": {
        "job-name": "labeling-job/example-ner-labeling-job",
        "type": "groundtruth/text-span",
        "creation-date": "2020-10-29T00:40:39.398470",
        "human-annotated": "yes",
        "entities": [
            {
                "confidence": 0
            },
            {
                "confidence": 0
            },
            {
                "confidence": 0
            },
            {
                "confidence": 0
            },
            {
                "confidence": 0
            },
            {
                "confidence": 0
            },
            {
                "confidence": 0
            }
        ]
    }
}
```

## Label verification job output
<a name="sms-output-bounding-box-verification"></a>

The output (output manifest file) of a bounding box verification job looks different than the output of a bounding box annotation job. That's because the workers have a different type of task. They're not labeling objects, but evaluating the accuracy of prior labeling, making a judgment, and then providing that judgment and perhaps some comments.

If human workers are verifying or adjusting prior bounding box labels, the output of a verification job would look like the following JSON. The red, italicized text in the examples below depends on labeling job specifications and output data. 

```
{
    "source-ref":"s3://amzn-s3-demo-bucket/image_example.png",
    "bounding-box-attribute-name":
    {
        "image_size": [{ "width": 500, "height": 400, "depth":3}],
        "annotations":
        [
            {"class_id": 0, "left": 111, "top": 134,
                    "width": 61, "height": 128},
            {"class_id": 5, "left": 161, "top": 250,
                     "width": 30, "height": 30},
            {"class_id": 5, "left": 20, "top": 20,
                     "width": 30, "height": 30}
        ]
    },
    "bounding-box-attribute-name-metadata":
    {
        "objects":
        [
            {"confidence": 0.8},
            {"confidence": 0.9},
            {"confidence": 0.9}
        ],
        "class-map":
        {
            "0": "dog",
            "5": "bone"
        },
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2018-10-18T22:18:13.527256",
        "job-name": "identify-dogs-and-toys"
    },
    "verify-bounding-box-attribute-name":"1",
    "verify-bounding-box-attribute-name-metadata":
    {
        "class-name": "bad",
        "confidence": 0.93,
        "type": "groundtruth/label-verification",
        "job-name": "verify-bounding-boxes",
        "human-annotated": "yes",
        "creation-date": "2018-11-20T22:18:13.527256",
        "worker-feedback": [
            {"comment": "The bounding box on the bird is too wide on the right side."},
            {"comment": "The bird on the upper right is not labeled."}
        ]
    }
}
```

Although the `type` in the original bounding box output was `groundtruth/object-detection`, the new `type` is `groundtruth/label-verification`. Also note that the `worker-feedback` array provides worker comments. If the worker doesn't provide comments, the empty fields are excluded during consolidation.
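As a quick illustration, the verification verdict and any worker comments can be pulled out of an output manifest line with standard JSON parsing. This sketch hardcodes an abbreviated record based on the example above; the attribute name `verify-bounding-box-attribute-name` depends on your labeling job configuration:

```python
import json

# One line of the output manifest, abbreviated from the example above.
# The attribute names depend on your labeling job configuration.
line = '''{
  "source-ref": "s3://amzn-s3-demo-bucket/image_example.png",
  "verify-bounding-box-attribute-name": "1",
  "verify-bounding-box-attribute-name-metadata": {
    "class-name": "bad",
    "confidence": 0.93,
    "type": "groundtruth/label-verification",
    "human-annotated": "yes",
    "worker-feedback": [
      {"comment": "The bounding box on the bird is too wide on the right side."}
    ]
  }
}'''

record = json.loads(line)
meta = record["verify-bounding-box-attribute-name-metadata"]

print(meta["class-name"])                       # the verification verdict
for item in meta.get("worker-feedback", []):    # comments are optional
    print(item["comment"])
```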

## Semantic segmentation job output
<a name="sms-output-segmentation"></a>

The following is the output manifest file from a semantic segmentation labeling job. The value of the label for this job is a reference to a PNG file in an Amazon S3 bucket.

In addition to the standard elements, the metadata for the label includes a color map that defines which color is used to label the image, the class name associated with the color, and the confidence score for each color. For more information, see [Semantic Segmentation Algorithm](semantic-segmentation.md).

The red, italicized text in the examples below depends on labeling job specifications and output data. 

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/example_city_image.png",
    "city-streets-ref": "S3 bucket location",
    "city-streets-ref-metadata": {
      "internal-color-map": {
        "0": {
           "class-name": "BACKGROUND",
           "confidence": 0.9,
           "hex-color": "#ffffff"
        },
        "1": {
           "class-name": "buildings",
           "confidence": 0.9,
           "hex-color": "#2acf59"
        },
        "2":  {
           "class-name": "road",
           "confidence": 0.9,
           "hex-color": "#f28333"
       }
     },
     "type": "groundtruth/semantic-segmentation",
     "human-annotated": "yes",
     "creation-date": "2018-10-18T22:18:13.527256",
     "job-name": "label-city-streets",
     },
     "verify-city-streets-ref":"1",
     "verify-city-streets-ref-metadata":
     {
        "class-name": "bad",
        "confidence": 0.93,
        "type": "groundtruth/label-verification",
        "job-name": "verify-city-streets",
        "human-annotated": "yes",
        "creation-date": "2018-11-20T22:18:13.527256",
        "worker-feedback": [
            {"comment": "The mask on the leftmost building is assigned the wrong side of the road."},
            {"comment": "The curb of the road is not labeled but the instructions say otherwise."}
        ]
     }
}
```

Confidence is scored on a per-image basis. Confidence scores are the same across all classes within an image. 
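To work with the `internal-color-map` programmatically, you can invert it into a lookup from hex color to class name for decoding the PNG mask. A minimal sketch, using abbreviated metadata values from the example above:

```python
import json

# internal-color-map metadata, abbreviated from the example output manifest above.
metadata = json.loads('''{
  "internal-color-map": {
    "0": {"class-name": "BACKGROUND", "confidence": 0.9, "hex-color": "#ffffff"},
    "1": {"class-name": "buildings", "confidence": 0.9, "hex-color": "#2acf59"},
    "2": {"class-name": "road", "confidence": 0.9, "hex-color": "#f28333"}
  }
}''')

# Invert the map: hex color -> class name.
color_to_class = {entry["hex-color"]: entry["class-name"]
                  for entry in metadata["internal-color-map"].values()}

print(color_to_class["#f28333"])   # road
```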

The output of a semantic segmentation adjustment job looks similar to the following JSON.

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/example_city_image.png",
    "city-streets-ref": "S3 bucket location",
    "city-streets-ref-metadata": {
      "internal-color-map": {
        "0": {
           "class-name": "BACKGROUND",
           "confidence": 0.9,
           "hex-color": "#ffffff"
        },
        "1": {
           "class-name": "buildings",
           "confidence": 0.9,
           "hex-color": "#2acf59"
        },
        "2":  {
           "class-name": "road",
           "confidence": 0.9,
           "hex-color": "#f28333"
       }
     },
     "type": "groundtruth/semantic-segmentation",
     "human-annotated": "yes",
     "creation-date": "2018-10-18T22:18:13.527256",
     "job-name": "label-city-streets",
     },
     "adjusted-city-streets-ref": "s3://amzn-s3-demo-bucket/example_city_image.png",
     "adjusted-city-streets-ref-metadata": {
      "internal-color-map": {
        "0": {
           "class-name": "BACKGROUND",
           "confidence": 0.9,
           "hex-color": "#ffffff"
        },
        "1": {
           "class-name": "buildings",
           "confidence": 0.9,
           "hex-color": "#2acf59"
        },
        "2":  {
           "class-name": "road",
           "confidence": 0.9,
           "hex-color": "#f28333"
       }
     },
     "type": "groundtruth/semantic-segmentation",
     "human-annotated": "yes",
     "creation-date": "2018-11-20T22:18:13.527256",
     "job-name": "adjust-label-city-streets",
     }
}
```

## Video frame object detection output
<a name="sms-output-video-object-detection"></a>

The following is the output manifest file from a video frame object detection labeling job. The *red, italicized text* in the examples below depends on labeling job specifications and output data.

In addition to the standard elements, the metadata includes a class map that lists each class that has at least one label in the sequence. The metadata also includes `job-name`, which is the name you assigned to the labeling job. For adjustment tasks, if one or more bounding boxes were modified, the metadata for audit workflows contains an `adjustment-status` parameter that is set to `adjusted`. 

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/example-path/input-manifest.json",
    "CarObjectDetection-ref": "s3://amzn-s3-demo-bucket/output/labeling-job-name/annotations/consolidated-annotation/output/0/SeqLabel.json",
    "CarObjectDetection-ref-metadata": {
        "class-map": {
            "0": "car",
            "1": "bus"
        },
        "job-name": "labeling-job/labeling-job-name",
        "human-annotated": "yes",
        "creation-date": "2021-09-29T05:50:35.566000",
        "type": "groundtruth/video-object-detection"
        }
}
```

Ground Truth creates one output sequence file for each sequence of video frames that was labeled. Each output sequence file contains the following: 
+ All annotations for all frames in a sequence in the `detection-annotations` list of JSON objects. 
+ For each frame that was annotated by a worker, the frame file name (`frame`), number (`frame-no`), a list of JSON objects containing annotations (`annotations`), and if applicable, `frame-attributes`. The name of this list is defined by the task type you use: `polylines`, `polygons`, `keypoints`, and for bounding boxes, `annotations`.

   

  Each JSON object contains information about a single annotation and associated label. The following table outlines the parameters you'll see for each video frame task type.   
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html)

  In addition to task type specific values, you will see the following in each JSON object:
  + Values of any `label-category-attributes` that were specified for that label. 
  + The `class-id` of the box. Use the `class-map` in the output manifest file to see which label category this ID maps to. 
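Putting these pieces together, a `SeqLabel.json` file can be walked frame by frame, resolving each `class-id` through the `class-map` from the output manifest. A minimal sketch using abbreviated values from the examples in this section:

```python
import json

# Abbreviated SeqLabel.json content for a bounding box detection job.
seq_label = json.loads('''{
  "detection-annotations": [
    {"annotations": [
       {"height": 41, "width": 53, "top": 152, "left": 339, "class-id": "1",
        "label-category-attributes": {"occluded": "no", "size": "medium"}}
     ],
     "frame-no": 0,
     "frame": "frame_0000.jpeg"}
  ]
}''')

# class-map from the output manifest maps class IDs to label categories.
class_map = {"0": "car", "1": "bus"}

for frame in seq_label["detection-annotations"]:
    for box in frame["annotations"]:
        label = class_map[box["class-id"]]
        print(frame["frame"], label, box["left"], box["top"], box["width"], box["height"])
```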

The following is an example of a `SeqLabel.json` file from a bounding box video frame object detection labeling job. This file is located under `s3://amzn-s3-demo-bucket/output-prefix/annotations/consolidated-annotation/output/annotation-number/`.

```
{
    "detection-annotations": [
        {
            "annotations": [
                {
                    "height": 41,
                    "width": 53,
                    "top": 152,
                    "left": 339,
                    "class-id": "1",
                    "label-category-attributes": {
                        "occluded": "no",
                        "size": "medium"
                    }
                },
                {
                    "height": 24,
                    "width": 37,
                    "top": 148,
                    "left": 183,
                    "class-id": "0",
                    "label-category-attributes": {
                        "occluded": "no",
                    }
                }
            ],
            "frame-no": 0,
            "frame": "frame_0000.jpeg", 
            "frame-attributes": {name: value, name: value}
        },
        {
            "annotations": [
                {
                    "height": 41,
                    "width": 53,
                    "top": 152,
                    "left": 341,
                    "class-id": "0",
                    "label-category-attributes": {}
                },
                {
                    "height": 24,
                    "width": 37,
                    "top": 141,
                    "left": 177,
                    "class-id": "0",
                    "label-category-attributes": {
                        "occluded": "no",
                    }
                }
            ],
            "frame-no": 1,
            "frame": "frame_0001.jpeg",
            "frame-attributes": {name: value, name: value}
        }
    ]
}
```

## Video frame object tracking output
<a name="sms-output-video-object-tracking"></a>

The following is the output manifest file from a video frame object tracking labeling job. The *red, italicized text* in the examples below depends on labeling job specifications and output data.

In addition to the standard elements, the metadata includes a class map that lists each class that has at least one label in the sequence of frames. The metadata also includes `job-name`, which is the name you assigned to the labeling job. For adjustment tasks, if one or more bounding boxes were modified, the metadata for audit workflows contains an `adjustment-status` parameter that is set to `adjusted`. 

```
{
    "source-ref": "s3://amzn-s3-demo-bucket/example-path/input-manifest.json",
    "CarObjectTracking-ref": "s3://amzn-s3-demo-bucket/output/labeling-job-name/annotations/consolidated-annotation/output/0/SeqLabel.json",
    "CarObjectTracking-ref-metadata": {
        "class-map": {
            "0": "car",
            "1": "bus"
        },
        "job-name": "labeling-job/labeling-job-name",
        "human-annotated": "yes",
        "creation-date": "2021-09-29T05:50:35.566000",
        "type": "groundtruth/video-object-tracking"
        }
 }
```

Ground Truth creates one output sequence file for each sequence of video frames that was labeled. Each output sequence file contains the following: 
+ All annotations for all frames in a sequence in the `tracking-annotations` list of JSON objects. 
+ For each frame that was annotated by a worker, the frame file name (`frame`), frame number (`frame-no`), a list of JSON objects containing annotations (`annotations`), and if applicable, frame attributes (`frame-attributes`). The name of this list is defined by the task type you use: `polylines`, `polygons`, `keypoints`, and for bounding boxes, `annotations`.

  Each JSON object contains information about a single annotation and associated label. The following table outlines the parameters you'll see for each video frame task type.   
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html)

  In addition to task type specific values, you will see the following in each JSON object: 
  + Values of any `label-category-attributes` that were specified for that label. 
  + The `class-id` of the box. Use the `class-map` in the output manifest file to see which label category this ID maps to. 
  + An `object-id` which identifies an instance of a label. This ID will be the same across frames if a worker identified the same instance of an object in multiple frames. For example, if a car appeared in multiple frames, all bounding boxes used to identify that car would have the same `object-id`.
  + The `object-name` which is the instance ID of that annotation. 
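Because the `object-id` is stable across frames, annotations can be grouped into per-object tracks. A minimal sketch, assuming an abbreviated `SeqLabel.json` (the shortened `object-id` values below are placeholders):

```python
import json
from collections import defaultdict

# Abbreviated SeqLabel.json content; object-id values are shortened placeholders.
seq_label = json.loads('''{
  "tracking-annotations": [
    {"frame-no": 0, "frame": "frame_0001.jpeg",
     "annotations": [{"class-id": "0", "object-id": "480dc450", "object-name": "car:1",
                      "height": 36, "width": 46, "top": 178, "left": 315}]},
    {"frame-no": 1, "frame": "frame_0002.jpeg",
     "annotations": [{"class-id": "0", "object-id": "480dc450", "object-name": "car:1",
                      "height": 28, "width": 33, "top": 150, "left": 192}]}
  ]
}''')

tracks = defaultdict(list)   # object-id -> list of (frame number, annotation)
for frame in seq_label["tracking-annotations"]:
    for box in frame["annotations"]:
        tracks[box["object-id"]].append((frame["frame-no"], box))

# car:1 was annotated in both frames, so its track has two entries.
print({object_id: len(entries) for object_id, entries in tracks.items()})   # {'480dc450': 2}
```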

The following is an example of a `SeqLabel.json` file from a bounding box video frame object tracking labeling job. This file is located under `s3://amzn-s3-demo-bucket/output-prefix/annotations/consolidated-annotation/output/annotation-number/`.

```
{
    "tracking-annotations": [
        {
            "annotations": [
                {
                    "height": 36,
                    "width": 46,
                    "top": 178,
                    "left": 315,
                    "class-id": "0",
                    "label-category-attributes": {
                        "occluded": "no"
                    },
                    "object-id": "480dc450-c0ca-11ea-961f-a9b1c5c97972",
                    "object-name": "car:1"
                }
            ],
            "frame-no": 0,
            "frame": "frame_0001.jpeg",
            "frame-attributes": {}
        },
        {
            "annotations": [
                {
                    "height": 30,
                    "width": 47,
                    "top": 163,
                    "left": 344,
                    "class-id": "1",
                    "label-category-attributes": {
                        "occluded": "no",
                        "size": "medium"
                    },
                    "object-id": "98f2b0b0-c0ca-11ea-961f-a9b1c5c97972",
                    "object-name": "bus:1"
                },
                {
                    "height": 28,
                    "width": 33,
                    "top": 150,
                    "left": 192,
                    "class-id": "0",
                    "label-category-attributes": {
                        "occluded": "partially"
                    },
                    "object-id": "480dc450-c0ca-11ea-961f-a9b1c5c97972",
                    "object-name": "car:1"
                }
            ],
            "frame-no": 1,
            "frame": "frame_0002.jpeg",
            "frame-attributes": {name: value, name: value}
        }
    ]
}
```

## 3D point cloud semantic segmentation output
<a name="sms-output-point-cloud-segmentation"></a>

The following is the output manifest file from a 3D point cloud semantic segmentation labeling job. 

In addition to the standard elements, the metadata for the label includes a color map that defines which color is used to label the image, the class name associated with the color, and the confidence score for each color. Additionally, there is an `adjustment-status` parameter in the metadata for audit workflows that is set to `adjusted` if the color mask is modified. If you added one or more `frameAttributes` to your label category configuration file, worker responses for frame attributes are in the JSON object, `dataset-object-attributes`.

The `your-label-attribute-ref` parameter contains the location of a compressed file with a .zlib extension. When you uncompress this file, it contains an array. Each index in the array corresponds to the index of an annotated point in the input point cloud. The value of the array at a given index gives the class of the point at the same index in the point cloud, based on the semantic color map found in the `color-map` parameter of the `metadata`.

You can use Python code similar to the following to decompress a .zlib file:

```
import zlib
from array import array

# read the compressed label file
with open('zlib_file_path/file.zlib', 'rb') as f:
    compressed_binary_file = f.read()

# uncompress the label file
binary_content = zlib.decompress(compressed_binary_file)

# load the labels into an array of unsigned bytes
my_int_array_data = array('B', binary_content)

print(my_int_array_data)
```

The code block above produces output similar to the following. Each element of the printed array contains the class of the point at that index in the point cloud. For example, `my_int_array_data[0] = 1` means `point[0]` in the input point cloud has class `1`. In the following output manifest file example, class `0` corresponds with `Background`, `1` with `Car`, and `2` with `Pedestrian`.

```
>> array('B', [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
```

The following is an example of a semantic segmentation 3D point cloud labeling job output manifest file. The red, italicized text in the examples below depends on labeling job specifications and output data. 

```
{
   "source-ref": "s3://amzn-s3-demo-bucket/examplefolder/frame1.bin",
   "source-ref-metadata":{
      "format": "binary/xyzi",
      "unix-timestamp": 1566861644.759115,
      "ego-vehicle-pose":{...}, 
      "prefix": "s3://amzn-s3-demo-bucket/lidar_singleframe_dataset/prefix",
      "images": [{...}] 
    },
    "lidar-ss-label-attribute-ref": "s3://amzn-s3-demo-bucket/labeling-job-name/annotations/consolidated-annotation/output/dataset-object-id/filename.zlib",
    "lidar-ss-label-attribute-ref-metadata": { 
        "color-map": {
            "0": {
                "class-name": "Background",
                "hex-color": "#ffffff",
                "confidence": 0.00
            },
            "1": {
                "class-name": "Car",
                "hex-color": "#2ca02c",
                "confidence": 0.00
            },
            "2": {
                "class-name": "Pedestrian",
                "hex-color": "#1f77b4",
                "confidence": 0.00
            },
            "3": {
                "class-name": "Tree",
                "hex-color": "#ff7f0e",
                "confidence": 0.00
            }
        },
        "type": "groundtruth/point_cloud_single_frame_semantic_segmentation", 
        "human-annotated": "yes",
        "creation-date": "2019-11-12T01:18:14.271944",
        "job-name": "labeling-job-name",
        // only present for adjustment audit workflow
        "adjustment-status": "adjusted", // "adjusted" means the label was adjusted
        "dataset-object-attributes": {name: value, name: value}
    }
}
```

## 3D point cloud object detection output
<a name="sms-output-point-cloud-object-detection"></a>

The following is sample output from a 3D point cloud object detection job. For this task type, the data about 3D cuboids is returned in the `3d-bounding-box` parameter, in a list named `annotations`. In this list, each 3D cuboid is described using the following information. 
+ Each class, or label category, that you specify in your input manifest is associated with a `class-id`. Use the `class-map` to identify the class associated with each class ID.
+ These classes are used to give each 3D cuboid an `object-name` in the format `<class>:<integer>` where `integer` is a unique number to identify that cuboid in the frame. 
+ `center-x`, `center-y`, and `center-z` are the coordinates of the center of the cuboid, in the same coordinate system as the 3D point cloud input data used in your labeling job.
+ `length`, `width`, and `height` describe the dimensions of the cuboid. 
+ `yaw` is used to describe the orientation (heading) of the cuboid in radians.
**Note**  
`yaw` is now reported in the right-handed Cartesian system. This change took effect on September 02, 2022 19:02:17 UTC. To convert a `yaw` measurement from output data produced before that date, use the following formula (all units are in radians):  

  ```
  old_yaw_in_output = pi - yaw
  ```
+ In our definition, +x is to the right, +y is forward, and +z is up from the ground plane. The rotation order is x - y - z. The `roll`, `pitch`, and `yaw` are represented in the right-handed Cartesian system. In 3D space, `roll` is about the x-axis, `pitch` is about the y-axis, and `yaw` is about the z-axis. All three are counterclockwise.
+ If you included label attributes in your input manifest file for a given class, a `label-category-attributes` parameter is included for all cuboids for which workers selected label attributes. 
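As a sketch of the yaw conversion above, the following helper rearranges the formula to recover the current-convention `yaw` from a legacy output value and normalizes the result to (-π, π]. The function name is illustrative, not part of any Ground Truth API:

```python
import math

def convert_legacy_yaw(old_yaw_in_output):
    """Recover the current right-handed yaw from a yaw value in output data
    produced before September 02, 2022: old_yaw_in_output = pi - yaw,
    so yaw = pi - old_yaw_in_output. All units are radians."""
    yaw = math.pi - old_yaw_in_output
    # normalize to the interval (-pi, pi]
    while yaw > math.pi:
        yaw -= 2 * math.pi
    while yaw <= -math.pi:
        yaw += 2 * math.pi
    return yaw

print(convert_legacy_yaw(math.pi))   # 0.0
```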

If one or more cuboids were modified, there is an `adjustment-status` parameter in the metadata for audit workflows that is set to `adjusted`. If you added one or more `frameAttributes` to your label category configuration file, worker responses for frame attributes are in the JSON object, `dataset-object-attributes`.

The *red, italicized text* in the examples below depends on labeling job specifications and output data. The ellipses (*...*) denote a continuation of that list, where additional objects with the same format as the preceding object can appear.

```
{
   "source-ref": "s3://amzn-s3-demo-bucket/examplefolder/frame1.txt",
   "source-ref-metadata":{
      "format": "text/xyzi",
      "unix-timestamp": 1566861644.759115, 
      "prefix": "s3://amzn-s3-demo-bucket/lidar_singleframe_dataset/prefix",
      "ego-vehicle-pose": {
            "heading": {
                "qx": -0.02111296123795955,
                "qy": -0.006495469416730261,
                "qz": -0.008024565904865688,
                "qw": 0.9997181192298087
            },
            "position": {
                "x": -2.7161461413869947,
                "y": 116.25822288149078,
                "z": 1.8348751887989483
            }
       },
       "images": [
            {
                "fx": 847.7962624528487,
                "fy": 850.0340893791985,
                "cx": 576.2129134707038,
                "cy": 317.2423573573745,
                "k1": 0,
                "k2": 0,
                "k3": 0,
                "k4": 0,
                "p1": 0,
                "p2": 0,
                "skew": 0,
                "unix-timestamp": 1566861644.759115,
                "image-path": "images/frame_0_camera_0.jpg", 
                "position": {
                    "x": -2.2722515189268138,
                    "y": 116.86003310568965,
                    "z": 1.454614668542299
                },
                "heading": {
                    "qx": 0.7594754093069037,
                    "qy": 0.02181790885672969,
                    "qz": -0.02461725233103356,
                    "qw": -0.6496916273040025
                },
                "camera_model": "pinhole"
            }
        ]
    },
   "3d-bounding-box": 
    {
       "annotations": [
            {
                "label-category-attributes": {
                    "Occlusion": "Partial",
                    "Type": "Sedan"
                },
                "object-name": "Car:1",
                "class-id": 0,
                "center-x": -2.616382013657516,
                "center-y": 125.04149850484193,
                "center-z": 0.311272296465834,
                "length": 2.993000265181146,
                "width": 1.8355260519692056,
                "height": 1.3233490884304047,
                "roll": 0,
                "pitch": 0,
                "yaw": 1.6479308313703527
            },
            {
                "label-category-attributes": {
                    "Occlusion": "Partial",
                    "Type": "Sedan"
                },
                "object-name": "Car:2",
                "class-id": 0,
                "center-x": -5.188984560617168,
                "center-y": 99.7954483288783,
                "center-z": 0.2226435567445657,
                "length": 4,
                "width": 2,
                "height": 2,
                "roll": 0,
                "pitch": 0,
                "yaw": 1.6243170732068055
            }
        ]
    },
    "3d-bounding-box-metadata":
    {
        "objects": [], 
        "class_map": 
        {
            "0": "Car",
        },
        "type": "groundtruth/point_cloud_object_detection",
        "human-annotated": "yes", 
        "creation-date": "2018-10-18T22:18:13.527256",
        "job-name": "identify-3d-objects",
        "adjustment-status": "adjusted",
        "dataset-object-attributes": {name: value, name: value}
    }
}
```
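As an illustration, the cuboid list can be traversed and each `class-id` resolved through `class_map` (note that this metadata key uses an underscore rather than a hyphen). The values below are abbreviated from the example above, and the volume calculation is just a demonstration of using the cuboid dimensions:

```python
import json

# Abbreviated output from the example above.
output = json.loads('''{
  "3d-bounding-box": {
    "annotations": [
      {"object-name": "Car:1", "class-id": 0,
       "center-x": -2.62, "center-y": 125.04, "center-z": 0.31,
       "length": 2.99, "width": 1.84, "height": 1.32,
       "roll": 0, "pitch": 0, "yaw": 1.65}
    ]
  },
  "3d-bounding-box-metadata": {"class_map": {"0": "Car"}}
}''')

class_map = output["3d-bounding-box-metadata"]["class_map"]
for cuboid in output["3d-bounding-box"]["annotations"]:
    label = class_map[str(cuboid["class-id"])]      # class-id is an integer here
    volume = cuboid["length"] * cuboid["width"] * cuboid["height"]
    print(label, cuboid["object-name"], round(volume, 2))
```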

## 3D point cloud object tracking output
<a name="sms-output-point-cloud-object-tracking"></a>

The following is an example of an output manifest file from a 3D point cloud object tracking labeling job. The *red, italicized text* in the examples below depends on labeling job specifications and output data. The ellipses (*...*) denote a continuation of that list, where additional objects with the same format as the preceding object can appear.

In addition to the standard elements, the metadata includes a class map that lists each class that has at least one label in the sequence. If one or more cuboids were modified, there is an `adjustment-status` parameter in the metadata for audit workflows that is set to `adjusted`. 

```
{
   "source-ref": "s3://amzn-s3-demo-bucket/myfolder/seq1.json",
    "lidar-label-attribute-ref": "s3://amzn-s3-demo-bucket/<labelingJobName>/annotations/consolidated-annotation/output/<datasetObjectId>/SeqLabel.json",
    "lidar-label-attribute-ref-metadata": { 
        "objects": 
        [
            {   
                "frame-no": 300,
                "confidence": []
            },
            {
                "frame-no": 301,
                "confidence": []
            },
            ...
        ],    
        "class-map": {"0": "Car", "1": "Person"}, 
        "type": "groundtruth/point_cloud_object_tracking", 
        "human-annotated": "yes",
        "creation-date": "2019-11-12T01:18:14.271944",
        "job-name": "identify-3d-objects",
        "adjustment-status": "adjusted" 
    }
}
```

In the above example, the cuboid data for each frame in `seq1.json` is in `SeqLabel.json` in the Amazon S3 location, `s3://amzn-s3-demo-bucket/<labelingJobName>/annotations/consolidated-annotation/output/<datasetObjectId>/SeqLabel.json`. The following is an example of this label sequence file.

For each frame in the sequence, you see the frame number (`frame-number`), frame name (`frame-name`), if applicable, frame attributes (`frame-attributes`), and a list of `annotations`. This list contains 3D cuboids that were drawn for that frame. Each annotation includes the following information: 
+ An `object-name` in the format `<class>:<integer>` where `class` identifies the label category and `integer` is a unique ID across the dataset.
+ When workers draw a cuboid, it is associated with a unique `object-id` which is associated with all cuboids that identify the same object across multiple frames.
+ Each class, or label category, that you specified in your input manifest is associated with a `class-id`. Use the `class-map` to identify the class associated with each class ID.
+ `center-x`, `center-y`, and `center-z` are the coordinates of the center of the cuboid, in the same coordinate system as the 3D point cloud input data used in your labeling job.
+ `length`, `width`, and `height` describe the dimensions of the cuboid. 
+ `yaw` is used to describe the orientation (heading) of the cuboid in radians.
**Note**  
`yaw` is now reported in the right-handed Cartesian system. This change took effect on September 02, 2022 19:02:17 UTC. To convert a `yaw` measurement from output data produced before that date, use the following formula (all units are in radians):  

  ```
  old_yaw_in_output = pi - yaw
  ```
+ In our definition, +x is to the right, +y is forward, and +z is up from the ground plane. The rotation order is x - y - z. The `roll`, `pitch`, and `yaw` are represented in the right-handed Cartesian system. In 3D space, `roll` is about the x-axis, `pitch` is about the y-axis, and `yaw` is about the z-axis. All three are counterclockwise.
+ If you included label attributes in your input manifest file for a given class, a `label-category-attributes` parameter is included for all cuboids for which workers selected label attributes. 

```
{
    "tracking-annotations": [
        {
            "frame-number": 0,
            "frame-name": "0.txt.pcd",
            "frame-attributes": {name: value, name: value},
            "annotations": [
                {
                    "label-category-attributes": {},
                    "object-name": "Car:4",
                    "class-id": 0,
                    "center-x": -2.2906369208300674,
                    "center-y": 103.73924823843463,
                    "center-z": 0.37634114027023313,
                    "length": 4,
                    "width": 2,
                    "height": 2,
                    "roll": 0,
                    "pitch": 0,
                    "yaw": 1.5827222214406014,
                    "object-id": "ae5dc770-a782-11ea-b57d-67c51a0561a1"
                },
                {
                    "label-category-attributes": {
                        "Occlusion": "Partial",
                        "Type": "Sedan"
                    },
                    "object-name": "Car:1",
                    "class-id": 0,
                    "center-x": -2.6451293634707413,
                    "center-y": 124.9534455706848,
                    "center-z": 0.5020834081743839,
                    "length": 4,
                    "width": 2,
                    "height": 2.080488827301309,
                    "roll": 0,
                    "pitch": 0,
                    "yaw": -1.5963335581398077,
                    "object-id": "06efb020-a782-11ea-b57d-67c51a0561a1"
                },
                {
                    "label-category-attributes": {
                        "Occlusion": "Partial",
                        "Type": "Sedan"
                    },
                    "object-name": "Car:2",
                    "class-id": 0,
                    "center-x": -5.205611313118477,
                    "center-y": 99.91731932137061,
                    "center-z": 0.22917217081212138,
                    "length": 3.8747142207671956,
                    "width": 1.9999999999999918,
                    "height": 2,
                    "roll": 0,
                    "pitch": 0,
                    "yaw": 1.5672228760316775,
                    "object-id": "26fad020-a782-11ea-b57d-67c51a0561a1"
                }
            ]
        },
        {
            "frame-number": 1,
            "frame-name": "1.txt.pcd",
            "frame-attributes": {},
            "annotations": [
                {
                    "label-category-attributes": {},
                    "object-name": "Car:4",
                    "class-id": 0,
                    "center-x": -2.2906369208300674,
                    "center-y": 103.73924823843463,
                    "center-z": 0.37634114027023313,
                    "length": 4,
                    "width": 2,
                    "height": 2,
                    "roll": 0,
                    "pitch": 0,
                    "yaw": 1.5827222214406014,
                    "object-id": "ae5dc770-a782-11ea-b57d-67c51a0561a1"
                },
                {
                    "label-category-attributes": {
                        "Occlusion": "Partial",
                        "Type": "Sedan"
                    },
                    "object-name": "Car:1",
                    "class-id": 0,
                    "center-x": -2.6451293634707413,
                    "center-y": 124.9534455706848,
                    "center-z": 0.5020834081743839,
                    "length": 4,
                    "width": 2,
                    "height": 2.080488827301309,
                    "roll": 0,
                    "pitch": 0,
                    "yaw": -1.5963335581398077,
                    "object-id": "06efb020-a782-11ea-b57d-67c51a0561a1"
                },
                {
                    "label-category-attributes": {
                        "Occlusion": "Partial",
                        "Type": "Sedan"
                    },
                    "object-name": "Car:2",
                    "class-id": 0,
                    "center-x": -5.221311072916759,
                    "center-y": 100.4639841045424,
                    "center-z": 0.22917217081212138,
                    "length": 3.8747142207671956,
                    "width": 1.9999999999999918,
                    "height": 2,
                    "roll": 0,
                    "pitch": 0,
                    "yaw": 1.5672228760316775,
                    "object-id": "26fad020-a782-11ea-b57d-67c51a0561a1"
                }
            ]
        }       
    ]
}
```

## 3D-2D point cloud object tracking output
<a name="sms-output-3d-2d-point-cloud-object-tracking"></a>

The following is an example of an output manifest file from a 3D-2D point cloud object tracking labeling job. The *red, italicized text* in the examples below depends on labeling job specifications and output data. The ellipses (*...*) denote a continuation of that list, where additional objects with the same format as the preceding object can appear.

In addition to the standard elements, the metadata includes a class map that lists each class that has at least one label in the sequence. If one or more cuboids were modified, there is an `adjustment-status` parameter in the metadata for audit workflows that is set to `adjusted`. 

```
{
  "source-ref": "s3://amzn-s3-demo-bucket/artifacts/gt-point-cloud-demos/sequences/seq2.json",
  "source-ref-metadata": {
    "json-paths": [
      "number-of-frames",
      "prefix",
      "frames{frame-no, frame}"
    ]
  },
  "3D2D-linking-ref": "s3://amzn-s3-demo-bucket/xyz/3D2D-linking/annotations/consolidated-annotation/output/0/SeqLabel.json",
  "3D2D-linking-ref-metadata": {
    "objects": [
      {
        "frame-no": 0,
        "confidence": []
      },
      {
        "frame-no": 1,
        "confidence": []
      },
      {
        "frame-no": 2,
        "confidence": []
      },
      {
        "frame-no": 3,
        "confidence": []
      },
      {
        "frame-no": 4,
        "confidence": []
      },
      {
        "frame-no": 5,
        "confidence": []
      },
      {
        "frame-no": 6,
        "confidence": []
      },
      {
        "frame-no": 7,
        "confidence": []
      },
      {
        "frame-no": 8,
        "confidence": []
      },
      {
        "frame-no": 9,
        "confidence": []
      }
    ],
    "class-map": {
      "0": "Car"
    },
    "type": "groundtruth/point_cloud_object_tracking",
    "human-annotated": "yes",
    "creation-date": "2023-01-19T02:55:10.206508",
    "job-name": "mcm-linking"
  },
  "3D2D-linking-chain-ref": "s3://amzn-s3-demo-bucket/xyz/3D2D-linking-chain/annotations/consolidated-annotation/output/0/SeqLabel.json",
  "3D2D-linking-chain-ref-metadata": {
    "objects": [
      {
        "frame-no": 0,
        "confidence": []
      },
      {
        "frame-no": 1,
        "confidence": []
      },
      {
        "frame-no": 2,
        "confidence": []
      },
      {
        "frame-no": 3,
        "confidence": []
      },
      {
        "frame-no": 4,
        "confidence": []
      },
      {
        "frame-no": 5,
        "confidence": []
      },
      {
        "frame-no": 6,
        "confidence": []
      },
      {
        "frame-no": 7,
        "confidence": []
      },
      {
        "frame-no": 8,
        "confidence": []
      },
      {
        "frame-no": 9,
        "confidence": []
      }
    ],
    "class-map": {
      "0": "Car"
    },
    "type": "groundtruth/point_cloud_object_tracking",
    "human-annotated": "yes",
    "creation-date": "2023-01-19T03:29:49.149935",
    "job-name": "3d2d-linking-chain"
  }
}
```

In the above example, the cuboid data for each frame in `seq2.json` is in `SeqLabel.json` in the Amazon S3 location, `s3://amzn-s3-demo-bucket/<labelingJobName>/annotations/consolidated-annotation/output/<datasetObjectId>/SeqLabel.json`. The following is an example of this label sequence file.

For each frame in the sequence, you see the `frame-number`, the `frame-name`, any applicable `frame-attributes`, and a list of `annotations`. This list contains the 3D cuboids that were drawn for that frame. Each annotation includes the following information: 
+ An `object-name` in the format `<class>:<integer>` where `class` identifies the label category and `integer` is a unique ID across the dataset.
+ When workers draw a cuboid, it is assigned a unique `object-id`. This ID is shared by all cuboids that identify the same object across multiple frames.
+ Each class, or label category, that you specified in your input manifest is associated with a `class-id`. Use the `class-map` to identify the class associated with each class ID.
+ `center-x`, `center-y`, and `center-z` are the coordinates of the center of the cuboid, in the same coordinate system as the 3D point cloud input data used in your labeling job.
+ `length`, `width`, and `height` describe the dimensions of the cuboid. 
+ `yaw` describes the orientation (heading) of the cuboid in radians.
**Note**  
`yaw` is now in the right-handed Cartesian system. This change was made on September 02, 2022 19:02:17 UTC. To convert a `yaw` measurement from output data produced before that date, use the following formula (all units are in radians):  

  ```
  old_yaw_in_output = pi - yaw
  ```
+ In this definition, +x is to the right, +y is forward, and +z is up from the ground plane. The rotation order is x - y - z. `roll`, `pitch`, and `yaw` are represented in the right-handed Cartesian system: in 3D space, `roll` is rotation about the x-axis, `pitch` is rotation about the y-axis, and `yaw` is rotation about the z-axis. All three are counterclockwise.
+ If you included label attributes in your input manifest file for a given class, a `label-category-attributes` parameter is included for all cuboids for which workers selected label attributes. 

```
{
  "lidar": {
    "tracking-annotations": [
      {
        "frame-number": 0,
        "frame-name": "0.txt.pcd",
        "annotations": [
          {
            "label-category-attributes": {
              "Type": "Sedan"
            },
            "object-name": "Car:1",
            "class-id": 0,
            "center-x": 12.172361721602815,
            "center-y": 120.23067521992364,
            "center-z": 1.590525771183712,
            "length": 4,
            "width": 2,
            "height": 2,
            "roll": 0,
            "pitch": 0,
            "yaw": 0,
            "object-id": "505b39e0-97a4-11ed-8903-dd5b8b903715"
          },
          {
            "label-category-attributes": {},
            "object-name": "Car:4",
            "class-id": 0,
            "center-x": 17.192725195301094,
            "center-y": 114.55705365827872,
            "center-z": 1.590525771183712,
            "length": 4,
            "width": 2,
            "height": 2,
            "roll": 0,
            "pitch": 0,
            "yaw": 0,
            "object-id": "1afcb670-97a9-11ed-9a84-ff627d099e16"
          }
        ],
        "frame-attributes": {}
      },
      {
        "frame-number": 1,
        "frame-name": "1.txt.pcd",
        "annotations": [
          {
            "label-category-attributes": {
              "Type": "Sedan"
            },
            "object-name": "Car:1",
            "class-id": 0,
            "center-x": -1.6841480600695489,
            "center-y": 126.20198882749516,
            "center-z": 1.590525771183712,
            "length": 4,
            "width": 2,
            "height": 2,
            "roll": 0,
            "pitch": 0,
            "yaw": 0,
            "object-id": "505b39e0-97a4-11ed-8903-dd5b8b903715"
          },
          {
            "label-category-attributes": {},
            "object-name": "Car:4",
            "class-id": 0,
            "center-x": 17.192725195301094,
            "center-y": 114.55705365827872,
            "center-z": 1.590525771183712,
            "length": 4,
            "width": 2,
            "height": 2,
            "roll": 0,
            "pitch": 0,
            "yaw": 0,
            "object-id": "1afcb670-97a9-11ed-9a84-ff627d099e16"
          }
        ],
        "frame-attributes": {}
      },
      {
        "frame-number": 2,
        "frame-name": "2.txt.pcd",
        "annotations": [
          {
            "label-category-attributes": {
              "Type": "Sedan"
            },
            "object-name": "Car:1",
            "class-id": 0,
            "center-x": -1.6841480600695489,
            "center-y": 126.20198882749516,
            "center-z": 1.590525771183712,
            "length": 4,
            "width": 2,
            "height": 2,
            "roll": 0,
            "pitch": 0,
            "yaw": 0,
            "object-id": "505b39e0-97a4-11ed-8903-dd5b8b903715"
          },
          {
            "label-category-attributes": {},
            "object-name": "Car:4",
            "class-id": 0,
            "center-x": 17.192725195301094,
            "center-y": 114.55705365827872,
            "center-z": 1.590525771183712,
            "length": 4,
            "width": 2,
            "height": 2,
            "roll": 0,
            "pitch": 0,
            "yaw": 0,
            "object-id": "1afcb670-97a9-11ed-9a84-ff627d099e16"
          }
        ],
        "frame-attributes": {}
      }
    ]
  },
  "camera-0": {
    "tracking-annotations": [
      {
        "frame-no": 0,
        "frame": "0.txt.pcd",
        "annotations": [
          {
            "label-category-attributes": {
              "Occlusion": "Partial"
            },
            "object-name": "Car:2",
            "class-id": 0,
            "width": 223,
            "height": 164,
            "top": 225,
            "left": 486,
            "object-id": "5229df60-97a4-11ed-8903-dd5b8b903715"
          }
        ],
        "frame-attributes": {}
      },
      {
        "frame-no": 1,
        "frame": "1.txt.pcd",
        "annotations": [
          {
            "label-category-attributes": {},
            "object-name": "Car:4",
            "class-id": 0,
            "width": 252,
            "height": 246,
            "top": 237,
            "left": 473,
            "object-id": "1afcb670-97a9-11ed-9a84-ff627d099e16"
          }
        ],
        "frame-attributes": {}
      }
    ]
  }
}
```

The cuboid and bounding box for an object are linked through a common `object-id`.
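The linking can be reconstructed from the output programmatically. The following sketch is an illustration, not part of any SageMaker API; it assumes only the field names shown in the example above, indexing each camera's bounding boxes by `object-id` and pairing every lidar cuboid with its matching 2D boxes:

```python
def link_cuboids_to_boxes(seq_label):
    """Pair each lidar cuboid with the camera bounding boxes that share its
    object-id. Field names follow the example output above."""
    # Index every camera bounding box by its object-id.
    boxes_by_id = {}
    for key, sensor in seq_label.items():
        if not key.startswith("camera-"):
            continue
        for frame in sensor["tracking-annotations"]:
            for ann in frame["annotations"]:
                boxes_by_id.setdefault(ann["object-id"], []).append(
                    {"camera": key, "frame-no": frame["frame-no"], "box": ann}
                )
    # Walk the lidar cuboids and attach any boxes with the same object-id.
    links = []
    for frame in seq_label["lidar"]["tracking-annotations"]:
        for cuboid in frame["annotations"]:
            links.append({
                "object-id": cuboid["object-id"],
                "frame-number": frame["frame-number"],
                "linked-boxes": boxes_by_id.get(cuboid["object-id"], []),
            })
    return links
```

Load `SeqLabel.json` with `json.load` and pass the resulting dictionary to this function.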

# Enhanced data labeling
<a name="sms-data-labeling"></a>

Amazon SageMaker Ground Truth manages sending your data objects to workers to be labeled. Labeling each data object is a *task*. Workers complete each task until the entire labeling job is complete. Ground Truth divides the total number of tasks into smaller *batches* that are sent to workers. A new batch is sent to workers when the previous one is finished.

Ground Truth provides two features that help improve the accuracy of your data labels and reduce the total cost of labeling your data:
+ *Annotation consolidation* helps to improve the accuracy of your data object labels. It combines the results of multiple workers' annotation tasks into one high-fidelity label.
+ *Automated data labeling* uses machine learning to label portions of your data automatically without having to send them to human workers.

**Topics**
+ [Control the flow of data objects sent to workers](sms-batching.md)
+ [Annotation consolidation](sms-annotation-consolidation.md)
+ [Automate data labeling](sms-automated-labeling.md)
+ [Chaining labeling jobs](sms-reusing-data.md)

# Control the flow of data objects sent to workers
<a name="sms-batching"></a>

Depending on the type of labeling job you create, Amazon SageMaker Ground Truth sends data objects to workers in batches or in a streaming fashion. You can control the flow of data objects to workers in the following ways:
+ For both types of labeling jobs, you can use `MaxConcurrentTaskCount` to control the total number of data objects available to all workers at a given point in time when the labeling job is running.
+ For streaming labeling jobs, you can control the flow of data objects to workers by monitoring and controlling the number of data objects sent to the Amazon SQS queue associated with your labeling job. 

Use the following sections to learn more about these options.

**Topics**
+ [Use MaxConcurrentTaskCount to control the flow of data objects](#sms-batching-maxconcurrenttaskcount)
+ [Use Amazon SQS to control the flow of data objects to streaming labeling jobs](#sms-batching-streaming-sqs)

## Use MaxConcurrentTaskCount to control the flow of data objects
<a name="sms-batching-maxconcurrenttaskcount"></a>

[MaxConcurrentTaskCount](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-MaxConcurrentTaskCount) defines the maximum number of data objects available at one time in the worker-portal task queue. If you use the console, this parameter is set to 1,000. If you use `CreateLabelingJob`, you can set this parameter to any integer between 1 and 5,000, inclusive.

Use the following example to better understand how the number of entries in your manifest file, the `NumberOfHumanWorkersPerDataObject`, and the `MaxConcurrentTaskCount` define what tasks workers see in their task queue in the worker-portal UI.

1. You have an input manifest file with 600 entries.

1. For each entry in your input manifest file, you use `NumberOfHumanWorkersPerDataObject` to define the number of human workers that label that entry. In this example, you set `NumberOfHumanWorkersPerDataObject` equal to 3. This creates 3 distinct tasks for each entry in your input manifest file, and at least 3 different workers must label an object for it to be marked as successfully labeled. In total, this creates 1,800 tasks (600 x 3) to be completed by workers.

1. You want workers to only see 100 tasks at a time in their queue in the worker portal UI. To do this, you set `MaxConcurrentTaskCount` equal to 100. Ground Truth then fills the worker-portal task queue with up to 100 tasks.

1. What happens next depends on the type of labeling job you are creating, and if it is a streaming labeling job.
   + **Streaming labeling job**: As long as the total number of objects available to workers equals `MaxConcurrentTaskCount`, all remaining dataset objects from your input manifest file, and any that you send in real time using Amazon SNS, are placed on an Amazon SQS queue. When the total number of objects available to workers falls below `MaxConcurrentTaskCount` minus `NumberOfHumanWorkersPerDataObject`, a new data object from the queue is used to create `NumberOfHumanWorkersPerDataObject` tasks, which are sent to workers in real time.
   + **Non-streaming labeling job**: As workers finish labeling one set of objects, up to `MaxConcurrentTaskCount` x `NumberOfHumanWorkersPerDataObject` new tasks are sent to workers. This process repeats until all data objects in the input manifest file are labeled.
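The arithmetic in the example above can be sketched as follows. `task_counts` is a hypothetical helper for illustration, not part of any SageMaker API:

```python
def task_counts(manifest_entries, workers_per_object, max_concurrent):
    """Total tasks created for a labeling job, and the maximum number of new
    tasks released in each batch as workers finish (non-streaming case)."""
    total_tasks = manifest_entries * workers_per_object
    batch_size = max_concurrent * workers_per_object
    return total_tasks, batch_size

# The example above: 600 entries, 3 workers per object, MaxConcurrentTaskCount=100.
total, batch = task_counts(600, 3, 100)  # total=1800, batch=300
```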

## Use Amazon SQS to control the flow of data objects to streaming labeling jobs
<a name="sms-batching-streaming-sqs"></a>

When you create a streaming labeling job, an Amazon SQS queue is automatically created in your account. Data objects are only added to the Amazon SQS queue when the total number of objects sent to workers is above `MaxConcurrentTaskCount`. Otherwise, objects are sent directly to workers.

You can use this queue to manage the flow of data objects to your labeling job. To learn more, see [Manage labeling requests with an Amazon SQS queue](sms-streaming-how-it-works-sqs.md).
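As a sketch, you could poll the queue depth with the AWS SDK for Python (Boto3). The SQS client is injected as a parameter so the helper is easy to test; the queue name is whatever queue Ground Truth created for your job, which you can look up in the Amazon SQS console:

```python
def pending_objects(sqs_client, queue_name):
    """Return the approximate number of data objects waiting in the labeling
    job's Amazon SQS queue (that is, not yet sent to workers)."""
    queue_url = sqs_client.get_queue_url(QueueName=queue_name)["QueueUrl"]
    attrs = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    return int(attrs["Attributes"]["ApproximateNumberOfMessages"])
```

With real credentials, you would call it as `pending_objects(boto3.client("sqs"), "your-queue-name")`.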

# Annotation consolidation
<a name="sms-annotation-consolidation"></a>

An *annotation* is the result of a single worker's labeling task. *Annotation consolidation* combines the annotations of two or more workers into a single label for your data objects. A label, which is assigned to each object in the dataset, is a probabilistic estimate of what the true label should be. Each object in the dataset typically has multiple annotations, but only one label or set of labels.

You decide how many workers annotate each object in your dataset. Using more workers can increase the accuracy of your labels, but also increases the cost of labeling. To learn more about Ground Truth pricing, see [Amazon SageMaker Ground Truth pricing](https://aws.amazon.com/sagemaker/groundtruth/pricing/).

If you use the Amazon SageMaker AI console to create a labeling job, the following are the defaults for the number of workers who can annotate objects: 
+ Text classification—3 workers
+ Image classification—3 workers
+ Bounding boxes—5 workers
+ Semantic segmentation—3 workers
+ Named entity recognition—3 workers

When you use the [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation, you set the number of workers that annotate each data object with the `NumberOfHumanWorkersPerDataObject` parameter. You can override the default number of workers that annotate a data object using the console or the `CreateLabelingJob` operation.

Ground Truth provides an annotation consolidation function for each of its predefined labeling tasks: bounding box, image classification, named entity recognition, semantic segmentation, and text classification. These are the functions:
+ Multi-class annotation consolidation for image and text classification uses a variant of the [Expectation Maximization](https://en.wikipedia.org/wiki/Expectation-maximization_algorithm) approach to annotations. It estimates parameters for each worker and uses Bayesian inference to estimate the true class based on the class annotations from individual workers. 
+ Bounding box annotation consolidates bounding boxes from multiple workers. This function finds the most similar boxes from different workers based on the [Jaccard index](https://en.wikipedia.org/wiki/Jaccard_index), or intersection over union, of the boxes and averages them. 
+ Semantic segmentation annotation consolidation treats each pixel in a single image as a multi-class classification. This function treats the pixel annotations from workers as "votes," with more information from surrounding pixels incorporated by applying a smoothing function to the image.
+ Named entity recognition clusters text selections by Jaccard similarity and calculates selection boundaries based on the mode, or the median if the mode isn't clear. The label resolves to the most assigned entity label in the cluster, breaking ties by random selection.

You can use other algorithms to consolidate annotations. For information, see [Annotation consolidation function creation](consolidation-lambda.md). 

# Annotation consolidation function creation
<a name="consolidation-lambda"></a>

You can choose to use your own annotation consolidation function to determine the final labels for your labeled objects. There are many possible approaches for writing a function and the approach that you take depends on the nature of the annotations to consolidate. Broadly, consolidation functions look at the annotations from workers, measure the similarity between them, and then use some form of probabilistic judgment to determine what the most probable label should be.

If you want to use other algorithms to create annotation consolidation functions, you can find the worker responses in the `[project-name]/annotations/worker-response` folder of the Amazon S3 bucket where you direct the job output.

## Assess similarity
<a name="consolidation-assessing"></a>

To assess the similarity between labels, you can use one of the following strategies, or design your own to meet your data labeling needs:
+ For label spaces that consist of discrete, mutually exclusive categories, such as multi-class classification, assessing similarity can be straightforward. Discrete labels either match or do not match. 
+ For label spaces that don't have discrete values, such as bounding box annotations, find a broad measure of similarity. For bounding boxes, one such measure is the Jaccard index. This is the ratio of the intersection of two boxes to their union, and it measures how similar the boxes are. For example, if there are three annotations, a function can determine which annotations represent the same object and should be consolidated.
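For axis-aligned boxes in the `(left, top, width, height)` form that Ground Truth outputs, the Jaccard index can be computed directly. This is an illustrative sketch, not Ground Truth's internal implementation:

```python
def jaccard_index(box_a, box_b):
    """Jaccard index (intersection over union) of two axis-aligned boxes,
    each given as (left, top, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap along each axis; zero if the boxes do not intersect.
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0
```

A consolidation function might treat two annotations as the same object when their Jaccard index exceeds some cutoff, then average the matched boxes.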

## Assess the most probable label
<a name="consolidation-probable-label"></a>

With one of the strategies detailed in the previous sections in mind, make some sort of probabilistic judgment on what the consolidated label should be. In the case of discrete, mutually exclusive categories, this can be straightforward. One of the most common ways to do this is to take the results of a majority vote between the annotations. This weights the annotations equally. 

Some approaches attempt to estimate the accuracy of different annotators and weight their annotations in proportion to the probability of correctness. An example of this is the Expectation Maximization method, which is used in the default Ground Truth consolidation function for multi-class annotations. 
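The equal-weight majority vote described above can be sketched in a few lines; estimating per-worker accuracy, as the Expectation Maximization method does, is considerably more involved:

```python
from collections import Counter

def majority_vote(annotations):
    """Consolidate class annotations from multiple workers by unweighted
    majority vote. Ties resolve to the first-seen annotation."""
    return Counter(annotations).most_common(1)[0][0]
```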

For more information about creating an annotation consolidation function, see [Processing data in a custom labeling workflow with AWS Lambda](sms-custom-templates-step3.md).

# Automate data labeling
<a name="sms-automated-labeling"></a>

If you choose, Amazon SageMaker Ground Truth can use active learning to automate the labeling of your input data for certain built-in task types. *Active learning* is a machine learning technique that identifies data that should be labeled by your workers. In Ground Truth, this functionality is called automated data labeling. Automated data labeling helps to reduce the cost and time that it takes to label your dataset compared to using only humans. When you use automated labeling, you incur SageMaker training and inference costs. 

We recommend using automated data labeling on large datasets because the neural networks used with active learning require a significant amount of data for every new dataset. Typically, as you provide more data, the potential for high-accuracy predictions goes up. Data is auto-labeled only when the neural network used in the auto-labeling model can achieve an acceptably high level of accuracy, so larger datasets offer more potential for automated labeling. Automated data labeling is most appropriate when you have thousands of data objects. The minimum number of objects allowed for automated data labeling is 1,250, but we strongly suggest providing a minimum of 5,000 objects.

Automated data labeling is available only for the following Ground Truth built-in task types: 
+ [Create an image classification job (Single Label)](sms-image-classification.md)
+ [Identify image contents using semantic segmentation](sms-semantic-segmentation.md)
+ Object detection ([Classify image objects using a bounding box](sms-bounding-box.md))
+ [Categorize text with text classification (Single Label)](sms-text-classification.md)

[Streaming labeling jobs](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-streaming-labeling-job.html) do not support automated data labeling.

To learn how to create a custom active learning workflow using your own model, see [Set up an active learning workflow with your own model](#samurai-automated-labeling-byom).

Input data quotas apply for automated data labeling jobs. See [Input Data Quotas](input-data-limits.md) for information about dataset size, input data size, and resolution limits.

**Note**  
Before you use the automated-labeling model in production, you need to fine-tune it, test it, or both. You might fine-tune the model (or create and tune another supervised model of your choice) on the dataset produced by your labeling job to optimize the model's architecture and hyperparameters. If you decide to use the model for inference without fine-tuning it, we strongly recommend evaluating its accuracy on a representative (for example, randomly selected) subset of the dataset labeled with Ground Truth and confirming that it matches your expectations.

## How it works
<a name="sms-automated-labeling-how-it-works"></a>

You enable automated data labeling when you create a labeling job. This is how it works:

1. When Ground Truth starts an automated data labeling job, it selects a random sample of input data objects and sends them to human workers. If more than 10% of these data objects fail, the labeling job fails. If the labeling job fails, in addition to reviewing any error message Ground Truth returns, check that your input data displays correctly in the worker UI, that your instructions are clear, and that you have given workers enough time to complete tasks.

1. When the labeled data is returned, it is used to create a training set and a validation set. Ground Truth uses these datasets to train and validate the model used for auto-labeling.

1. Ground Truth runs a batch transform job, using the validated model for inference on the validation data. Batch inference produces a confidence score and quality metric for each object in the validation data.

1. The auto-labeling component uses these quality metrics and confidence scores to create a *confidence score threshold* that ensures quality labels. 

1. Ground Truth runs a batch transform job on the unlabeled data in the dataset, using the same validated model for inference. This produces a confidence score for each object. 

1. The Ground Truth auto-labeling component determines whether the confidence score produced in step 5 for each object meets the required threshold determined in step 4. If it does, the expected quality of automatic labeling meets or exceeds the requested level of accuracy, and the object is considered auto-labeled. 

1. Step 6 produces a dataset of unlabeled data with confidence scores. Ground Truth selects data points with low confidence scores from this dataset and sends them to human workers. 

1. Ground Truth uses the existing human-labeled data and this additional labeled data from human workers to update the model.

1. The process is repeated until the dataset is fully labeled or until another stopping condition is met. For example, auto-labeling stops if your human annotation budget is reached.
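The loop above can be sketched as follows. This is schematic only: `label_fn` stands in for human workers, `train_fn` and `score_fn` stand in for model training and batch inference, and the per-iteration threshold search (step 4) is collapsed into a fixed `target_confidence` parameter.

```python
import random

def active_learning_loop(unlabeled, label_fn, train_fn, score_fn,
                         target_confidence, batch_size):
    """Schematic sketch of the iterations above. Returns a dict mapping
    each object to its label (human- or auto-assigned)."""
    labels = {}
    # Step 1: humans label a random sample to bootstrap the model.
    for obj in random.sample(sorted(unlabeled), min(batch_size, len(unlabeled))):
        labels[obj] = label_fn(obj)
        unlabeled.discard(obj)
    while unlabeled:
        model = train_fn(labels)                             # steps 2-4
        scored = {o: score_fn(model, o) for o in unlabeled}  # step 5
        # Step 6: auto-label objects whose confidence clears the threshold.
        for obj, (label, confidence) in scored.items():
            if confidence >= target_confidence:
                labels[obj] = label
                unlabeled.discard(obj)
        # Step 7: send the lowest-confidence objects to human workers.
        for obj in sorted(unlabeled, key=lambda o: scored[o][1])[:batch_size]:
            labels[obj] = label_fn(obj)                      # steps 8-9
            unlabeled.discard(obj)
    return labels
```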

The preceding steps happen in iterations. Select each tab in the following table to see an example of the processes that happen in each iteration for an object detection automated labeling job. The number of data objects used in a given step in these images (for example, 200) is specific to this example. If there are fewer than 5,000 objects to label, the validation set size is 20% of the whole dataset. If there are more than 5,000 objects in your input dataset, the validation set size is 10% of the whole dataset. You can control the number of human labels collected per active learning iteration by changing the value for [MaxConcurrentTaskCount](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-MaxConcurrentTaskCount) when using the API operation [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html). This value is set to 1,000 when you create a labeling job using the console. In the active learning flow illustrated under the **Active Learning** tab, this value is set to 200.

------
#### [ Model Training ]

![\[Example process of model training.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/auto-labeling/sagemaker-gt-annotate-data-3.png)


------
#### [ Automated Labeling ]

![\[Example process of automated labeling.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/auto-labeling/sagemaker-gt-annotate-data-4.png)


------
#### [ Active Learning ]

![\[Example process of active learning.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/auto-labeling/sagemaker-gt-annotate-data-5.png)


------

### Accuracy of automated labels
<a name="sms-automated-labeling-accuracy"></a>

The definition of *accuracy* depends on the built-in task type that you use with automated labeling. For all task types, these accuracy requirements are pre-determined by Ground Truth and cannot be manually configured.
+ For image classification and text classification, Ground Truth uses logic to find a label-prediction confidence level that corresponds to at least 95% label accuracy. This means Ground Truth expects the accuracy of the automated labels to be at least 95% when compared to the labels that human labelers would provide for those examples.
+ For bounding boxes, the expected mean [Intersection Over Union (IoU) ](https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/) of the auto-labeled images is 0.6. To find the mean IoU, Ground Truth calculates the mean IoU of all the predicted and missed boxes on the image for every class, and then averages these values across classes.
+ For semantic segmentation, the expected mean IoU of the auto-labeled images is 0.7. To find the mean IoU, Ground Truth takes the mean of the IoU values of all the classes in the image (excluding the background).

At every iteration of Active Learning (steps 3-6 in the list above), the confidence threshold is found using the human-annotated validation set so that the expected accuracy of the auto-labeled objects satisfies certain predefined accuracy requirements.
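Ground Truth does not publish its exact threshold-selection logic, but the idea can be illustrated: given a human-labeled validation set scored by the model, choose the lowest confidence value at which the predictions that clear it meet the required accuracy. The helper below is an assumption-laden sketch, not the actual algorithm:

```python
def confidence_threshold(validation, required_accuracy=0.95):
    """Illustrative threshold search. `validation` is a list of
    (confidence, prediction_was_correct) pairs from the human-labeled
    validation set. Returns the lowest confidence value at which the
    retained predictions meet required_accuracy, or None if none does."""
    for threshold in sorted({conf for conf, _ in validation}):
        kept = [correct for conf, correct in validation if conf >= threshold]
        if kept and sum(kept) / len(kept) >= required_accuracy:
            return threshold
    return None
```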

## Create an automated data labeling job (console)
<a name="sms-create-automated-labeling-console"></a>

To create a labeling job that uses automated labeling in the SageMaker AI console, use the following procedure.

**To create an automated data labeling job (console)**

1. Open the Ground Truth **Labeling jobs** section of the SageMaker AI console: [https://console.aws.amazon.com/sagemaker/groundtruth](https://console.aws.amazon.com/sagemaker/groundtruth).

1. Using [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) as a guide, complete the **Job overview** and **Task type** sections. Note that auto labeling is not supported for custom task types.

1. Under **Workers**, choose your workforce type. 

1. In the same section, choose **Enable automated data labeling**. 

1. Using [Configure the Bounding Box Tool](sms-getting-started.md#sms-getting-started-step4) as a guide, create worker instructions in the section ***Task Type* labeling tool**. For example, if you chose **Semantic segmentation** as your labeling job type, this section is called **Semantic segmentation labeling tool**.

1. To preview your worker instructions and dashboard, choose **Preview**.

1. Choose **Create**. This creates and starts your labeling job and the auto labeling process. 

You can see your labeling job appear in the **Labeling jobs** section of the SageMaker AI console. Your output data appears in the Amazon S3 bucket that you specified when creating the labeling job. For more information about the format and file structure of your labeling job output data, see [Labeling job output data](sms-data-output.md).

## Create an automated data labeling job (API)
<a name="sms-create-automated-labeling-api"></a>

To create an automated data labeling job using the SageMaker API, use the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobAlgorithmsConfig.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobAlgorithmsConfig.html) parameter of the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation. To learn how to start a labeling job using the `CreateLabelingJob` operation, see [Create a Labeling Job (API)](sms-create-labeling-job-api.md).

Specify the Amazon Resource Name (ARN) of the algorithm that you are using for automated data labeling in the [LabelingJobAlgorithmSpecificationArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobAlgorithmsConfig.html#SageMaker-Type-LabelingJobAlgorithmsConfig-LabelingJobAlgorithmSpecificationArn) parameter. Choose from one of the four Ground Truth built-in algorithms that are supported with automated labeling:
+ [Create an image classification job (Single Label)](sms-image-classification.md)
+ [Identify image contents using semantic segmentation](sms-semantic-segmentation.md)
+ Object detection ([Classify image objects using a bounding box](sms-bounding-box.md)) 
+ [Categorize text with text classification (Single Label)](sms-text-classification.md)

When an automated data labeling job finishes, Ground Truth returns the ARN of the model it used for the automated data labeling job. Use this model as the starting model for similar auto-labeling job types by providing the ARN, in string format, in the [InitialActiveLearningModelArn](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobAlgorithmsConfig.html#SageMaker-Type-LabelingJobAlgorithmsConfig-InitialActiveLearningModelArn) parameter. To retrieve the model's ARN, use an AWS SDK for Python (Boto3) call similar to the following. 

```
# Fetch the ARN of the model trained in the final iteration of the previous labeling job.
pretrained_model_arn = sagemaker_client.describe_labeling_job(LabelingJobName=job_name)['LabelingJobOutput']['FinalActiveLearningModelArn']
```

To encrypt data on the storage volume attached to the ML compute instance(s) that are used in automated labeling, include an AWS Key Management Service (AWS KMS) key in the `VolumeKmsKeyId` parameter. For information about AWS KMS keys, see [What is AWS Key Management Service?](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html) in the *AWS Key Management Service Developer Guide*.
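Putting these parameters together, the auto-labeling portion of a `CreateLabelingJob` request might look like the following sketch. Every value below is a placeholder, not a real resource; substitute values for your account and Region:

```python
# Sketch of the LabelingJobAlgorithmsConfig parameter for CreateLabelingJob.
# All ARNs and the key ID below are placeholders.
labeling_job_algorithms_config = {
    # Built-in algorithm to use for automated data labeling.
    "LabelingJobAlgorithmSpecificationArn": (
        "arn:aws:sagemaker:<region>:<account>:"
        "labeling-job-algorithm-specification/image-classification"
    ),
    # Optional: warm-start from the model a previous, similar job produced.
    "InitialActiveLearningModelArn": "<FinalActiveLearningModelArn from a prior job>",
    # Optional: encrypt the storage volume on the training and inference instances.
    "LabelingJobResourceConfig": {
        "VolumeKmsKeyId": "<your KMS key ID>"
    },
}
```

You would pass this dictionary as the `LabelingJobAlgorithmsConfig` argument of `create_labeling_job`, alongside the operation's other required parameters.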

For an example that uses the [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) operation to create an automated data labeling job, see the **object\_detection\_tutorial** example in the **SageMaker AI Examples**, **Ground Truth Labeling Jobs** section of a SageMaker AI notebook instance. To learn how to create and open a notebook instance, see [Create an Amazon SageMaker notebook instance](howitworks-create-ws.md).

## Amazon EC2 instances required for automated data labeling
<a name="sms-auto-labeling-ec2"></a>

The following table lists the Amazon Elastic Compute Cloud (Amazon EC2) instance types that automated data labeling uses for training and batch inference jobs.


| Automated Data Labeling Job Type | Training Instance Type | Inference Instance Type | 
| --- | --- | --- | 
|  Image classification  |  ml.p3.2xlarge\*  |  ml.c5.xlarge  | 
|  Object detection (bounding box)  |  ml.p3.2xlarge\*  |  ml.c5.4xlarge  | 
|  Text classification  |  ml.c5.2xlarge  |  ml.m4.xlarge  | 
|  Semantic segmentation  |  ml.p3.2xlarge\*  |  ml.p3.2xlarge\*  | 

\* In the Asia Pacific (Mumbai) Region (ap-south-1), use ml.p2.8xlarge instead.

 Ground Truth manages the instances that you use for automated data labeling jobs. It creates, configures, and terminates the instances as needed to perform your job. These instances don't appear in your Amazon EC2 instance dashboard.

## Set up an active learning workflow with your own model
<a name="samurai-automated-labeling-byom"></a>

You can create an active learning workflow with your own algorithm to run training and inference in that workflow to auto-label your data. The notebook bring\_your\_own\_model\_for\_sagemaker\_labeling\_workflows\_with\_active\_learning.ipynb demonstrates this using the SageMaker AI built-in algorithm, [BlazingText](https://docs.aws.amazon.com/sagemaker/latest/dg/blazingtext.html). This notebook provides an AWS CloudFormation stack that you can use to execute this workflow using AWS Step Functions. You can find the notebook and supporting files in this [GitHub repository](https://github.com/awslabs/amazon-sagemaker-examples/tree/master/ground_truth_labeling_jobs/bring_your_own_model_for_sagemaker_labeling_workflows_with_active_learning).

# Chaining labeling jobs
<a name="sms-reusing-data"></a>

Amazon SageMaker Ground Truth can reuse datasets from prior jobs in two ways: cloning and chaining.

*Cloning* copies the setup of a prior labeling job and allows you to make additional changes before setting it to run.

*Chaining* uses not only the setup of the prior job, but also the results. This allows you to continue an incomplete job and add labels or data objects to a completed job. Chaining is a more complex operation. 

For data processing: 
+  Cloning uses the prior job's *input* manifest, with optional modifications, as the new job's input manifest. 
+  Chaining uses the prior job's *output* manifest as the new job's input manifest. 

Chaining is useful when you need to:
+ Continue a labeling job that was manually stopped.
+ Continue a labeling job that failed mid-job, after fixing issues.
+ Switch to automated data labeling after manually labeling part of a job (or the other way around).
+ Add more data objects to a completed job and start the job from there.
+ Add another annotation to a completed job. For example, if you have a collection of phrases labeled by topic, you can run the set again to categorize the phrases by each topic's implied audience.

In Amazon SageMaker Ground Truth you can configure a chained labeling job with either the console or the API.

## Key term: label attribute name
<a name="sms-reusing-data-LAN"></a>

The *label attribute name* (`LabelAttributeName` in the API) is a string used as the key for the key-value pair formed with the label that a worker assigns to the data object.

The following rules apply for the label attribute name:
+ It can't end with `-metadata`.
+ The names `source` and `source-ref` are reserved and can't be used.
+ For semantic segmentation labeling jobs, it must end with `-ref`. For all other labeling jobs, it *can't* end with `-ref`. If you use the console to create the job, Amazon SageMaker Ground Truth automatically appends `-ref` to the label attribute name for semantic segmentation jobs.
+ For a chained labeling job that uses auto-labeling, if you specify the same label attribute name that the originating job used and that job ran in auto-labeling mode at any point, Ground Truth uses the model from the originating job.

In an output manifest, the label attribute name appears similar to the following.

```
{
  "source-ref": "<S3 URI>",
  "<label attribute name>": {
    "annotations": [{
      "class_id": 0,
      "width": 99,
      "top": 87,
      "height": 62,
      "left": 175
    }],
    "image_size": [{
      "width": 344,
      "depth": 3,
      "height": 234
    }]
  },
  "<label attribute name>-metadata": {
    "job-name": "<job name>",
    "class-map": {
      "0": "<label attribute name>"
    },
    "human-annotated": "yes",
    "objects": [{
      "confidence": 0.09
    }],
    "creation-date": "<timestamp>",
    "type": "groundtruth/object-detection"
  }
}
```

If you're creating a job in the console and don't explicitly set the label attribute name value, Ground Truth uses the job name as the label attribute name for the job.

## Start a chained job (console)
<a name="sms-reusing-data-console"></a>

Choose a stopped, failed, or completed labeling job from the list of your existing jobs. This enables the **Actions** menu.

From the **Actions** menu, choose **Chain**.

### Job overview panel
<a name="sms-reusing-data-console-job-panel"></a>

In the **Job overview** panel, a new **Job name** is set based on the title of the job from which you are chaining this one. You can change it.

You may also specify a label attribute name different from the labeling job name.

If you're chaining from a completed job, the label attribute name defaults to the name of the new job you're configuring. To change it, select the check box.

If you're chaining from a stopped or failed job, the label attribute name defaults to the name of the job from which you're chaining. Because the check box is already selected, the value is visible and easy to edit.

**Attribute label naming considerations**  
**The default** uses the label attribute name Ground Truth has selected. All data objects without data connected to that label attribute name are labeled.
**Using a label attribute name** not present in the manifest causes the job to process *all* the objects in the dataset.

The **input dataset location** in this case is automatically set to the output manifest of the job from which you're chaining. The input field is unavailable, so you cannot change it.

**Adding data objects to a labeling job**  
You cannot specify an alternate manifest file. Instead, manually edit the output manifest from the previous job to add new items before starting the chained job. The Amazon S3 URI shows where the manifest is stored in your Amazon S3 bucket. Download the manifest file from there, edit it locally on your computer, and then upload the new version to replace it. Make sure you don't introduce errors during editing. We recommend using a JSON linter to check your JSON; many popular text editors and IDEs have linter plugins available.
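As a sketch of that edit cycle using the AWS SDK for Python (Boto3), you might validate every line of the edited manifest as JSON before uploading it. The bucket, key, and helper names below are hypothetical:

```python
import json

def append_data_objects(manifest_text, s3_uris):
    """Append new data objects to an output manifest (one JSON object
    per line) and validate that every line parses as JSON."""
    lines = [line for line in manifest_text.splitlines() if line.strip()]
    lines += [json.dumps({"source-ref": uri}) for uri in s3_uris]
    for line in lines:
        json.loads(line)  # raises ValueError on malformed JSON
    return "\n".join(lines) + "\n"

def update_manifest_in_s3(bucket, key, s3_uris):
    """Download, edit, and re-upload the manifest (bucket and key are
    hypothetical; not run here)."""
    import boto3
    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    s3.put_object(Bucket=bucket, Key=key, Body=append_data_objects(body, s3_uris))
```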

## Start a chained job (API)
<a name="sms-reusing-data-API"></a>

The procedure is almost the same as setting up a new labeling job with `CreateLabelingJob`, except for two primary differences:
+ **Manifest location:** Rather than use your original manifest from the prior job, the value for the `ManifestS3Uri` in the `DataSource` should point to the Amazon S3 URI of the *output manifest* from the prior labeling job.
+ **Label attribute name:** Setting the correct `LabelAttributeName` value is important here. This is the key portion of a key-value pair where labeling data is the value. Sample use cases include:
  + **Adding new or more specific labels to a completed job** — Set a new label attribute name.
  + **Labeling the unlabeled items from a prior job** — Use the label attribute name from the prior job.
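The two differences can be sketched as a partial `CreateLabelingJob` request; the job names, label attribute name, and S3 URI below are placeholders, and the remaining parameters are the same as for a new labeling job:

```python
# Sketch of the parameters that change when chaining a labeling job.
# All names and the S3 URI are placeholders.
chained_job_params = {
    "LabelingJobName": "my-job-chained",
    # Reuse the prior job's label attribute name to label only the
    # unlabeled items, or set a new name to re-annotate everything.
    "LabelAttributeName": "my-job",
    "InputConfig": {
        "DataSource": {
            "S3DataSource": {
                # Point at the *output* manifest of the prior job.
                "ManifestS3Uri": (
                    "s3://<bucket>/<prior-job>/manifests/output/output.manifest"
                )
            }
        }
    },
}
# sagemaker_client.create_labeling_job(**chained_job_params, ...)  # plus the usual parameters
```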

## Use a partially labeled dataset
<a name="sms-reusing-data-newdata"></a>

You can get some chaining benefits if you use an augmented manifest that has already been partially labeled. Check the **Label attribute name** check box and set the name so that it matches the name in your manifest.

If you're using the API, the instructions are the same as those for starting a chained job. However, be sure to upload your manifest to an Amazon S3 bucket and use it instead of using the output manifest from a prior job.

The **Label attribute name** value in the manifest has to conform to the naming considerations discussed earlier.

# Ground Truth Security and Permissions
<a name="sms-security-general"></a>

Use the topics on this page to learn about Ground Truth security features and how to configure AWS Identity and Access Management (IAM) permissions to allow a user or role to create a labeling job. Additionally, learn how to create an *execution role*. An execution role is the role that you specify when you create a labeling job. This role is used to start your labeling job.

If you are a new user and want to get started quickly, or if you do not require granular permissions, see [Use IAM Managed Policies with Ground Truth](sms-security-permissions-get-started.md).

For more information about IAM users and roles, see [Identities (Users, Groups, and Roles)](https://docs.aws.amazon.com/IAM/latest/UserGuide/id.html) in the IAM User Guide. 

To learn more about using IAM with SageMaker AI, see [AWS Identity and Access Management for Amazon SageMaker AI](security-iam.md).

**Topics**
+ [CORS Requirement for Input Image Data](sms-cors-update.md)
+ [Assign IAM Permissions to Use Ground Truth](sms-security-permission.md)
+ [Using Amazon SageMaker Ground Truth in an Amazon Virtual Private Cloud](sms-vpc.md)
+ [Output Data and Storage Volume Encryption](sms-security.md)
+ [Workforce Authentication and Restrictions](sms-security-workforce-authentication.md)

# CORS Requirement for Input Image Data
<a name="sms-cors-update"></a>

Earlier in 2020, widely used browsers like Chrome and Firefox changed their default behavior for rotating images based on image metadata, referred to as [EXIF data](https://en.wikipedia.org/wiki/Exif). Previously, browsers would always display images in exactly the manner in which they are stored on disk, which is typically unrotated. After the change, images now rotate according to a piece of image metadata called *orientation value*. This has important implications for the entire machine learning (ML) community. For example, if applications that annotate images do not consider the EXIF orientation, they may display images in unexpected orientations, resulting in incorrect labels. 

Starting with Chrome 89, AWS can no longer automatically prevent the rotation of images because the web standards group W3C has decided that the ability to control rotation of images violates the web’s Same-origin Policy. Therefore, to ensure human workers annotate your input images in a predictable orientation when you submit requests to create a labeling job, you must add a CORS header policy to the Amazon S3 buckets that contain your input images.

**Important**  
If you do not add a CORS configuration to the Amazon S3 buckets that contain your input data, labeling tasks for those input data objects will fail.

If you create a job through the Ground Truth console, CORS is enabled by default. If any of your input data is located in a different Amazon S3 bucket from your input manifest file, you must add a CORS configuration to every Amazon S3 bucket that contains input data using the following instructions.

If you are using the `CreateLabelingJob` API to create a Ground Truth labeling job, you can add a CORS policy to the Amazon S3 buckets that contain input data using the Amazon S3 console. To set the required CORS headers on the Amazon S3 buckets that contain your input images, follow the directions detailed in [How do I add cross-domain resource sharing with CORS?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-cors-configuration.html). Use the following CORS configuration code for the buckets that host your images. If you use the Amazon S3 console to add the policy to your bucket, you must use the JSON format.

**Important**  
If you create a 3D point cloud or video frame labeling job, you must add additional rules to your CORS configuration. To learn more, see [3D point cloud labeling job permission requirements](sms-security-permission-3d-point-cloud.md) and [Video frame job permission requirements](sms-video-overview.md#sms-security-permission-video-frame) respectively. 

**JSON**

```
[{
   "AllowedHeaders": [],
   "AllowedMethods": ["GET"],
   "AllowedOrigins": ["*"],
   "ExposeHeaders": ["Access-Control-Allow-Origin"]
}]
```

**XML**

```
<CORSConfiguration>
 <CORSRule>
   <AllowedOrigin>*</AllowedOrigin>
   <AllowedMethod>GET</AllowedMethod>
   <ExposeHeader>Access-Control-Allow-Origin</ExposeHeader>
 </CORSRule>
</CORSConfiguration>
```
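If you prefer to apply the configuration programmatically rather than through the S3 console, the following Boto3 sketch sets the same rule; the bucket name is supplied by the caller, who needs `s3:PutBucketCors` permission:

```python
# The CORS rule above, in the shape the S3 API expects.
GROUND_TRUTH_CORS = {
    "CORSRules": [{
        "AllowedHeaders": [],
        "AllowedMethods": ["GET"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": ["Access-Control-Allow-Origin"],
    }]
}

def add_ground_truth_cors(bucket_name):
    # bucket_name is hypothetical; not run here.
    import boto3  # local import keeps the sketch self-contained
    boto3.client("s3").put_bucket_cors(
        Bucket=bucket_name, CORSConfiguration=GROUND_TRUTH_CORS
    )
```

Repeat the call for every bucket that hosts input images.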

The following GIF demonstrates the instructions found in the Amazon S3 documentation to add a CORS header policy using the Amazon S3 console. For written instructions, see **Using the Amazon S3 console** on the documentation page [How do I add cross-domain resource sharing with CORS?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-cors-configuration.html) in the Amazon Simple Storage Service User Guide.

![\[Gif on how to add a CORS header policy using the Amazon S3 console.\]](http://docs.aws.amazon.com/sagemaker/latest/dg/images/sms/gifs/cors-config.gif)


# Assign IAM Permissions to Use Ground Truth
<a name="sms-security-permission"></a>

Use the topics in this section to learn how to use AWS Identity and Access Management (IAM) managed and custom policies to manage access to Ground Truth and associated resources. 

You can use the sections on this page to learn the following: 
+ How to create IAM policies that grant a user or role permission to create a labeling job. Administrators can use IAM policies to restrict access to Amazon SageMaker AI and other AWS services that are specific to Ground Truth.
+ How to create a SageMaker AI *execution role*. An execution role is the role that you specify when you create a labeling job. The role is used to start and manage your labeling job.

The following is an overview of the topics you'll find on this page: 
+ If you are getting started using Ground Truth, or you do not require granular permissions for your use case, it is recommended that you use the IAM managed policies described in [Use IAM Managed Policies with Ground Truth](sms-security-permissions-get-started.md).
+ Learn about the permissions required to use the Ground Truth console in [Grant IAM Permission to Use the Amazon SageMaker Ground Truth Console](sms-security-permission-console-access.md). This section includes policy examples that grant an IAM entity permission to create and modify private work teams, subscribe to vendor work teams, and create custom labeling workflows.
+ When you create a labeling job, you must provide an execution role. Use [Create a SageMaker AI Execution Role for a Ground Truth Labeling Job](sms-security-permission-execution-role.md) to learn about the permissions required for this role.

# Use IAM Managed Policies with Ground Truth
<a name="sms-security-permissions-get-started"></a>

SageMaker AI and Ground Truth provide AWS managed policies that you can use to create a labeling job. If you are getting started using Ground Truth and you do not require granular permissions for your use case, it is recommended that you use the following policies:
+ `[AmazonSageMakerFullAccess](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerFullAccess)` – Use this policy to give a user or role permission to create a labeling job. This is a broad policy that grants an entity permission to use SageMaker AI features, as well as features of necessary AWS services through the console and API. This policy gives the entity permission to create a labeling job and to create and manage workforces using Amazon Cognito. To learn more, see [AmazonSageMakerFullAccess Policy](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonSageMakerFullAccess).
+ `[AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution)` – To create an *execution role*, you can attach the policy `[AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution)` to a role. An execution role is the role that you specify when you create a labeling job and it is used to start your labeling job. This policy allows you to create both streaming and non-streaming labeling jobs, and to create a labeling job using any task type. Note the following limits of this managed policy.
  + **Amazon S3 permissions**: This policy grants an execution role permission to access Amazon S3 buckets with the following strings in the name: `GroundTruth`, `Groundtruth`, `groundtruth`, `SageMaker`, `Sagemaker`, and `sagemaker` or a bucket with an [object tag](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html) that includes `SageMaker` in the name (case insensitive). Make sure your input and output bucket names include these strings, or add additional permissions to your execution role to [grant it permission to access your Amazon S3 buckets](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_rw-bucket.html). You must give this role permission to perform the following actions on your Amazon S3 buckets: `AbortMultipartUpload`, `GetObject`, and `PutObject`.
  + **Custom Workflows**: When you create a [custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html), this execution role is restricted to invoking AWS Lambda functions with one of the following strings as part of the function name: `GtRecipe`, `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`. This applies to both your pre-annotation and post-annotation Lambda functions. If you choose to use names without those strings, you must explicitly provide `lambda:InvokeFunction` permission to the execution role used to create the labeling job.

To learn how to attach an AWS managed policy to a user or role, refer to [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console) in the IAM User Guide.
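For example, the execution policy can be attached to an existing role with the AWS SDK for Python (Boto3); the role name here is hypothetical:

```python
# AWS managed policy for Ground Truth execution roles.
GROUND_TRUTH_EXECUTION_POLICY_ARN = (
    "arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution"
)

def attach_ground_truth_execution_policy(role_name):
    # role_name is hypothetical; the caller needs iam:AttachRolePolicy.
    import boto3  # local import keeps the sketch self-contained
    boto3.client("iam").attach_role_policy(
        RoleName=role_name,
        PolicyArn=GROUND_TRUTH_EXECUTION_POLICY_ARN,
    )
```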

# Grant IAM Permission to Use the Amazon SageMaker Ground Truth Console
<a name="sms-security-permission-console-access"></a>

To use the Ground Truth area of the SageMaker AI console, you need to grant permission to an entity to access SageMaker AI and other AWS services that Ground Truth interacts with. The permissions required to access other AWS services depend on your use case: 
+ Amazon S3 permissions are required for all use cases. These permissions must grant access to the Amazon S3 buckets that contain input and output data. 
+ AWS Marketplace permissions are required to use a vendor workforce.
+ Amazon Cognito permissions are required for private work team setup.
+ AWS KMS permissions are required to view available AWS KMS keys that can be used for output data encryption.
+ IAM permissions are required to either list pre-existing execution roles or to create a new one. Additionally, you must add a `PassRole` permission to allow SageMaker AI to use the execution role chosen to start the labeling job.

The following sections list policies you may want to grant to a role to use one or more functions of Ground Truth. 

**Topics**
+ [Ground Truth Console Permissions](#sms-security-permissions-console-all)
+ [Custom Labeling Workflow Permissions](#sms-security-permissions-custom-workflow)
+ [Private Workforce Permissions](#sms-security-permission-workforce-creation)
+ [Vendor Workforce Permissions](#sms-security-permissions-workforce-creation-vendor)

## Ground Truth Console Permissions
<a name="sms-security-permissions-console-all"></a>

To grant permission to a user or role to use the Ground Truth area of the SageMaker AI console to create a labeling job, attach the following policy to the user or role. This policy gives an IAM entity permission to create a labeling job using a [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html). If you want to create a custom labeling workflow, add the policy in [Custom Labeling Workflow Permissions](#sms-security-permissions-custom-workflow) to the following policy. Each `Statement` included in the policy is described below this code block.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SageMakerApis",
            "Effect": "Allow",
            "Action": [
                "sagemaker:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "KmsKeysForCreateForms",
            "Effect": "Allow",
            "Action": [
                "kms:DescribeKey",
                "kms:ListAliases"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AccessAwsMarketplaceSubscriptions",
            "Effect": "Allow",
            "Action": [
                "aws-marketplace:ViewSubscriptions"
            ],
            "Resource": "*"
        },
        {
            "Sid": "SecretsManager",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:CreateSecret",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecrets"
            ],
            "Resource": "*"
        },
        {
            "Sid": "ListAndCreateExecutionRoles",
            "Effect": "Allow",
            "Action": [
                "iam:ListRoles",
                "iam:CreateRole",
                "iam:CreatePolicy",
                "iam:AttachRolePolicy"
            ],
            "Resource": "*"
        },
        {
            "Sid": "PassRoleForExecutionRoles",
            "Effect": "Allow",
            "Action": [
                "iam:PassRole"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": "sagemaker.amazonaws.com"
                }
            }
        },
        {
            "Sid": "GroundTruthConsole",
            "Effect": "Allow",
            "Action": [
                "groundtruthlabeling:*",
                "lambda:InvokeFunction",
                "lambda:ListFunctions",
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetBucketCors",
                "s3:PutBucketCors",
                "s3:ListAllMyBuckets",
                "cognito-idp:AdminAddUserToGroup",
                "cognito-idp:AdminCreateUser",
                "cognito-idp:AdminDeleteUser",
                "cognito-idp:AdminDisableUser",
                "cognito-idp:AdminEnableUser",
                "cognito-idp:AdminRemoveUserFromGroup",
                "cognito-idp:CreateGroup",
                "cognito-idp:CreateUserPool",
                "cognito-idp:CreateUserPoolClient",
                "cognito-idp:CreateUserPoolDomain",
                "cognito-idp:DescribeUserPool",
                "cognito-idp:DescribeUserPoolClient",
                "cognito-idp:ListGroups",
                "cognito-idp:ListIdentityProviders",
                "cognito-idp:ListUsers",
                "cognito-idp:ListUsersInGroup",
                "cognito-idp:ListUserPoolClients",
                "cognito-idp:ListUserPools",
                "cognito-idp:UpdateUserPool",
                "cognito-idp:UpdateUserPoolClient"
            ],
            "Resource": "*"
        }
    ]
}
```

------

This policy includes the following statements. You can scope down any of these statements by adding specific resources to the `Resource` list for that statement.

`SageMakerApis`

This statement includes `sagemaker:*`, which allows the user to perform all [SageMaker AI API actions](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_Operations.html). You can reduce the scope of this policy by restricting users from performing actions that are not used to create and monitor a labeling job. 

**`KmsKeysForCreateForms`**

You only need to include this statement if you want to grant a user permission to list and select AWS KMS keys in the Ground Truth console to use for output data encryption. The policy above grants a user permission to list and select any key in the account in AWS KMS. To restrict the keys that a user can list and select, specify those key ARNs in `Resource`.

**`SecretsManager`**

This statement gives the user permission to describe, list, and create resources in AWS Secrets Manager required to create the labeling job.

`ListAndCreateExecutionRoles`

This statement gives a user permission to list (`ListRoles`) and create (`CreateRole`) IAM roles in your account. It also grants the user permission to create (`CreatePolicy`) policies and attach (`AttachRolePolicy`) policies to entities. These are required to list, select, and if required, create an execution role in the console. 

If you have already created an execution role, and want to narrow the scope of this statement so that users can only select that role in the console, specify the ARNs of the roles you want the user to have permission to view in `Resource` and remove the actions `CreateRole`, `CreatePolicy`, and `AttachRolePolicy`.

`AccessAwsMarketplaceSubscriptions`

These permissions are required to view and choose vendor work teams that you are already subscribed to when creating a labeling job. To give the user permission to *subscribe* to vendor work teams, add the statement in [Vendor Workforce Permissions](#sms-security-permissions-workforce-creation-vendor) to the policy above.

`PassRoleForExecutionRoles`

This is required to give the labeling job creator permission to preview the worker UI and verify that input data, labels, and instructions display correctly. This statement gives an entity permissions to pass the IAM execution role used to create the labeling job to SageMaker AI to render and preview the worker UI. To narrow the scope of this policy, add the role ARN of the execution role used to create the labeling job under `Resource`.

**`GroundTruthConsole`**
+ `groundtruthlabeling` – This allows a user to perform actions required to use certain features of the Ground Truth console. These include permissions to describe the labeling job status (`DescribeConsoleJob`), list all dataset objects in the input manifest file (`ListDatasetObjects`), filter the dataset if dataset sampling is selected (`RunFilterOrSampleDatasetJob`), and generate input manifest files if automated data setup is used (`RunGenerateManifestByCrawlingJob`). These actions are only available when using the Ground Truth console and cannot be called directly using an API.
+ `lambda:InvokeFunction` and `lambda:ListFunctions` – These actions give users permission to list and invoke Lambda functions that are used to run a custom labeling workflow.
+ `s3:*` – All Amazon S3 permissions included in this statement are used to view Amazon S3 buckets for [automated data setup](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-console-create-manifest-file.html) (`ListAllMyBuckets`), access input data in Amazon S3 (`ListBucket`, `GetObject`), check for and create a CORS policy in Amazon S3 if needed (`GetBucketCors` and `PutBucketCors`), and write labeling job output files to S3 (`PutObject`).
+ `cognito-idp` – These permissions are used to create, view, and manage a private workforce using Amazon Cognito. To learn more about these actions, refer to the [Amazon Cognito API References](https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-reference.html).

## Custom Labeling Workflow Permissions
<a name="sms-security-permissions-custom-workflow"></a>

Add the following statement to a policy similar to the one in [Ground Truth Console Permissions](#sms-security-permissions-console-all) to give a user permission to select pre-existing pre-annotation and post-annotation Lambda functions while [creating a custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html).

```
{
    "Sid": "GroundTruthConsoleCustomWorkflow",
    "Effect": "Allow",
    "Action": [
        "lambda:InvokeFunction",
        "lambda:ListFunctions"
    ],
    "Resource": "*"
}
```

To learn how to give an entity permission to create and test pre-annotation and post-annotation Lambda functions, see [Required Permissions To Use Lambda With Ground Truth](http://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates-step3-lambda-permissions.html).

## Private Workforce Permissions
<a name="sms-security-permission-workforce-creation"></a>

When added to a permissions policy, the following statement grants permission to create and manage a private workforce and work team using Amazon Cognito. These permissions are not required to use an [OIDC IdP workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-create-private-oidc.html#sms-workforce-create-private-oidc-next-steps).

```
{
    "Effect": "Allow",
    "Action": [
        "cognito-idp:AdminAddUserToGroup",
        "cognito-idp:AdminCreateUser",
        "cognito-idp:AdminDeleteUser",
        "cognito-idp:AdminDisableUser",
        "cognito-idp:AdminEnableUser",
        "cognito-idp:AdminRemoveUserFromGroup",
        "cognito-idp:CreateGroup",
        "cognito-idp:CreateUserPool",
        "cognito-idp:CreateUserPoolClient",
        "cognito-idp:CreateUserPoolDomain",
        "cognito-idp:DescribeUserPool",
        "cognito-idp:DescribeUserPoolClient",
        "cognito-idp:ListGroups",
        "cognito-idp:ListIdentityProviders",
        "cognito-idp:ListUsers",
        "cognito-idp:ListUsersInGroup",
        "cognito-idp:ListUserPoolClients",
        "cognito-idp:ListUserPools",
        "cognito-idp:UpdateUserPool",
        "cognito-idp:UpdateUserPoolClient"
    ],
    "Resource": "*"
}
```

To learn more about creating a private workforce using Amazon Cognito, see [Amazon Cognito Workforces](sms-workforce-private-use-cognito.md). 

## Vendor Workforce Permissions
<a name="sms-security-permissions-workforce-creation-vendor"></a>

You can add the following statement to the policy in [Grant IAM Permission to Use the Amazon SageMaker Ground Truth Console](#sms-security-permission-console-access) to grant an entity permission to subscribe to a [vendor workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management-vendor.html).

```
{
    "Sid": "AccessAwsMarketplaceSubscriptions",
    "Effect": "Allow",
    "Action": [
        "aws-marketplace:Subscribe",
        "aws-marketplace:Unsubscribe",
        "aws-marketplace:ViewSubscriptions"
    ],
    "Resource": "*"
}
```

# Create a SageMaker AI Execution Role for a Ground Truth Labeling Job
<a name="sms-security-permission-execution-role"></a>

When you configure your labeling job, you need to provide an *execution role*, which is a role that SageMaker AI has permission to assume to start and run your labeling job.

This role must give Ground Truth permission to access the following: 
+ Amazon S3 to retrieve your input data and write output data to an Amazon S3 bucket. You can either grant the role permission to access an entire bucket by providing the bucket ARN, or grant it access to specific resources in a bucket. For example, the ARN for a bucket may look similar to `arn:aws:s3:::amzn-s3-demo-bucket1`, and the ARN of a resource in an Amazon S3 bucket may look similar to `arn:aws:s3:::amzn-s3-demo-bucket1/prefix/file-name.png`. To apply an action to all resources in an Amazon S3 bucket, you can use the wildcard `*`. For example, `arn:aws:s3:::amzn-s3-demo-bucket1/prefix/*`. For more information, see [Amazon S3 Resources](https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-arn-format.html) in the Amazon Simple Storage Service User Guide.
+ CloudWatch to log worker metrics and labeling job statuses.
+ AWS KMS for data encryption. (Optional)
+ AWS Lambda for processing input and output data when you create a custom workflow. 

Additionally, if you create a [streaming labeling job](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-streaming-labeling-job.html), this role must have permission to access:
+ Amazon SQS to create and interact with an SQS queue used to [manage labeling requests](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-streaming-labeling-job.html#sms-streaming-how-it-works-sqs).
+ Amazon SNS to subscribe to and retrieve messages from your Amazon SNS input topic and to send messages to your Amazon SNS output topic.

All of these permissions can be granted with the [`AmazonSageMakerGroundTruthExecution`](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution) managed policy *except*:
+ Data and storage volume encryption of your Amazon S3 buckets. To learn how to configure these permissions, see [Encrypt Output Data and Storage Volume with AWS KMS](sms-security-kms-permissions.md).
+ Permission to select and invoke Lambda functions that do not include `GtRecipe`, `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction` in the function name.
+ Amazon S3 buckets whose prefix or bucket name does not include `GroundTruth`, `Groundtruth`, `groundtruth`, `SageMaker`, `Sagemaker`, or `sagemaker`, and that do not have an [object tag](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html) that includes `SageMaker` in the name (case insensitive).

If you require more granular permissions than the ones provided in `AmazonSageMakerGroundTruthExecution`, use the following policy examples to create an execution role that fits your specific use case.

**Topics**
+ [Built-In Task Types (Non-streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt)
+ [Built-In Task Types (Streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt-streaming)
+ [Execution Role Requirements for Custom Task Types](#sms-security-permission-execution-role-custom-tt)
+ [Automated Data Labeling Permission Requirements](#sms-security-permission-execution-role-custom-auto-labeling)

## Built-In Task Types (Non-streaming) Execution Role Requirements
<a name="sms-security-permission-execution-role-built-in-tt"></a>

The following policy grants permission to create a labeling job for a [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html). This execution policy does not include permissions for AWS KMS data encryption or decryption. Replace each placeholder ARN, such as `arn:aws:s3:::<input-bucket-name>`, with your own Amazon S3 ARNs.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ViewBuckets",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::<input-bucket-name>",
                "arn:aws:s3:::<output-bucket-name>"
            ]
        },
        {
            "Sid": "S3GetPutObjects",
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<input-bucket-name>/*",
                "arn:aws:s3:::<output-bucket-name>/*"
            ]
        },
        {
            "Sid": "CloudWatch",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "logs:CreateLogStream",
                "logs:CreateLogGroup",
                "logs:DescribeLogStreams",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        }
    ]
}
```

------

## Built-In Task Types (Streaming) Execution Role Requirements
<a name="sms-security-permission-execution-role-built-in-tt-streaming"></a>

If you create a streaming labeling job, you must add a policy similar to the following to the execution role you use to create the labeling job. To narrow the scope of the policy, replace the `*` in `Resource` with specific AWS resources that you want to grant the IAM role permission to access and use.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket/*",
                "arn:aws:s3:::amzn-s3-demo-bucket2/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "*",
            "Condition": {
                "StringEqualsIgnoreCase": {
                    "s3:ExistingObjectTag/SageMaker": "true"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket2"
            ]
        },
        {
            "Sid": "CloudWatch",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "logs:CreateLogStream",
                "logs:CreateLogGroup",
                "logs:DescribeLogStreams",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        },
        {
            "Sid": "StreamingQueue",
            "Effect": "Allow",
            "Action": [
                "sqs:CreateQueue",
                "sqs:DeleteMessage",
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl",
                "sqs:ReceiveMessage",
                "sqs:SendMessage",
                "sqs:SetQueueAttributes"
            ],
            "Resource": "arn:aws:sqs:*:*:*GroundTruth*"
        },
        {
            "Sid": "StreamingTopicSubscribe",
            "Effect": "Allow",
            "Action": "sns:Subscribe",
            "Resource": [
                "arn:aws:sns:us-east-1:111122223333:input-topic-name",
                "arn:aws:sns:us-east-1:111122223333:output-topic-name"
            ],
            "Condition": {
                "StringEquals": {
                    "sns:Protocol": "sqs"
                },
                "StringLike": {
                    "sns:Endpoint": "arn:aws:sns:us-east-1:111122223333:*GroundTruth*"
                }
            }
        },
        {
            "Sid": "StreamingTopic",
            "Effect": "Allow",
            "Action": [
                "sns:Publish"
            ],
            "Resource": [
                "arn:aws:sns:us-east-1:111122223333:input-topic-name",
                "arn:aws:sns:us-east-1:111122223333:output-topic-name"
            ]
        },
        {
            "Sid": "StreamingTopicUnsubscribe",
            "Effect": "Allow",
            "Action": [
                "sns:Unsubscribe"
            ],
            "Resource": [
                "arn:aws:sns:us-east-1:111122223333:input-topic-name",
                "arn:aws:sns:us-east-1:111122223333:output-topic-name"
            ]
        }
    ]
}
```

------

## Execution Role Requirements for Custom Task Types
<a name="sms-security-permission-execution-role-custom-tt"></a>

If you want to create a [custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html), add the following statement to an execution role policy like the ones found in [Built-In Task Types (Non-streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt) or [Built-In Task Types (Streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt-streaming).

This policy gives the execution role permission to invoke your pre-annotation and post-annotation Lambda functions.

```
{
    "Sid": "LambdaFunctions",
    "Effect": "Allow",
    "Action": [
        "lambda:InvokeFunction"
    ],
    "Resource": [
        "arn:aws:lambda:<region>:<account-id>:function:<pre-annotation-lambda-name>",
        "arn:aws:lambda:<region>:<account-id>:function:<post-annotation-lambda-name>"
    ]
}
```

## Automated Data Labeling Permission Requirements
<a name="sms-security-permission-execution-role-custom-auto-labeling"></a>

If you want to create a labeling job with [automated data labeling](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html) enabled, you must 1) add one policy to the IAM policy attached to the execution role and 2) update the trust policy of the execution role. 

The following statement allows the IAM execution role to be passed to SageMaker AI so that it can be used to run the training and inference jobs used for active learning and automated data labeling respectively. Add this statement to an execution role policy like the ones found in [Built-In Task Types (Non-streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt) or [Built-In Task Types (Streaming) Execution Role Requirements](#sms-security-permission-execution-role-built-in-tt-streaming). Replace `arn:aws:iam::<account-number>:role/<role-name>` with the execution role ARN. You can find your IAM role ARN in the IAM console under **Roles**. 

```
{
    "Effect": "Allow",
    "Action": [
        "iam:PassRole"
    ],
    "Resource": "arn:aws:iam::<account-number>:role/<execution-role-name>",
    "Condition": {
        "StringEquals": {
            "iam:PassedToService": [
                "sagemaker.amazonaws.com"
            ]
        }
    }
}
```

The following statement allows SageMaker AI to assume the execution role to create and manage the SageMaker training and inference jobs. This policy must be added to the trust relationship of the execution role. To learn how to add or modify an IAM role trust policy, see [Modifying a role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_manage_modify.html) in the IAM User Guide.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Principal": { "Service": "sagemaker.amazonaws.com" },
        "Action": "sts:AssumeRole"
    }
}
```

------



# Encrypt Output Data and Storage Volume with AWS KMS
<a name="sms-security-kms-permissions"></a>

You can use AWS Key Management Service (AWS KMS) to encrypt the output data from a labeling job by specifying a [customer managed key](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#master_keys) when you create the labeling job. If you use the API operation `CreateLabelingJob` to create a labeling job that uses automated data labeling, you can also use a customer managed key to encrypt the storage volume attached to the ML compute instances that run the training and inference jobs.
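For reference, the two encryption options map to different fields in the `CreateLabelingJob` request. The following partial request is a sketch with hypothetical names and ARNs; required fields not related to encryption are omitted. The key in `OutputConfig.KmsKeyId` encrypts output data, and the key in `LabelingJobResourceConfig.VolumeKmsKeyId` encrypts the automated data labeling storage volume:

```
{
    "LabelingJobName": "example-labeling-job",
    "OutputConfig": {
        "S3OutputPath": "s3://amzn-s3-demo-bucket1/output/",
        "KmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
    },
    "LabelingJobAlgorithmsConfig": {
        "LabelingJobResourceConfig": {
            "VolumeKmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
        }
    }
}
```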

This section describes the IAM policies you must attach to your customer managed key to enable output data encryption and the policies you must attach to your customer managed key and execution role to use storage volume encryption. To learn more about these options, see [Output Data and Storage Volume Encryption](sms-security.md).

## Encrypt Output Data using KMS
<a name="sms-security-kms-permissions-output-data"></a>

If you specify an AWS KMS customer managed key to encrypt output data, you must add an IAM policy similar to the following to that key. This policy gives the IAM execution role that you use to create your labeling job permission to use this key to perform all of the actions listed in `"Action"`. To learn more about these actions, see [AWS KMS permissions](https://docs.aws.amazon.com/kms/latest/developerguide/kms-api-permissions-reference.html) in the AWS Key Management Service Developer Guide.

To use this policy, replace the IAM service-role ARN in `"Principal"` with the ARN of the execution role you use to create the labeling job. When you create a labeling job in the console, this is the role you specify for **IAM Role** under the **Job overview** section. When you create a labeling job using `CreateLabelingJob`, this is the ARN you specify for [`RoleArn`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-RoleArn).

```
{
    "Sid": "AllowUseOfKmsKey",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/service-role/example-role"
    },
    "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
    ],
    "Resource": "*"
}
```

## Encrypt Automated Data Labeling ML Compute Instance Storage Volume
<a name="sms-security-kms-permissions-storage-volume"></a>

If you specify a [`VolumeKmsKeyId`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobResourceConfig.html#sagemaker-Type-LabelingJobResourceConfig-VolumeKmsKeyId) to encrypt the storage volume attached to the ML compute instance used for automated data labeling training and inference, you must do the following:
+ Attach permissions described in [Encrypt Output Data using KMS](#sms-security-kms-permissions-output-data) to the customer managed key.
+ Attach a policy similar to the following to the IAM execution role you use to create your labeling job. This is the IAM role you specify for [`RoleArn`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-RoleArn) in `CreateLabelingJob`. To learn more about the `"kms:CreateGrant"` action that this policy permits, see [CreateGrant](https://docs.aws.amazon.com/kms/latest/APIReference/API_CreateGrant.html) in the AWS Key Management Service API Reference.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kms:CreateGrant"
            ],
            "Resource": "*"
        }
    ]
}
```

------

To learn more about Ground Truth storage volume encryption, see [Use Your KMS Key to Encrypt Automated Data Labeling Storage Volume (API Only)](sms-security.md#sms-security-kms-storage-volume).

# Using Amazon SageMaker Ground Truth in an Amazon Virtual Private Cloud
<a name="sms-vpc"></a>

With [Amazon Virtual Private Cloud](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html) (Amazon VPC) you can launch AWS resources in a logically isolated virtual network that you define. Ground Truth supports running labeling jobs inside an Amazon VPC instead of connecting over the internet. When you launch a labeling job in an Amazon VPC, communication between your VPC and Ground Truth is conducted entirely and securely within the AWS network.

This guide shows how you can use Ground Truth in an Amazon VPC in the following ways:

1. [Run an Amazon SageMaker Ground Truth Labeling Job in an Amazon Virtual Private Cloud](samurai-vpc-labeling-job.md)

1. [Use Amazon VPC Mode from a Private Worker Portal](samurai-vpc-worker-portal.md)

# Run an Amazon SageMaker Ground Truth Labeling Job in an Amazon Virtual Private Cloud
<a name="samurai-vpc-labeling-job"></a>

Ground Truth supports the following functionalities in Amazon VPC.
+ You can use Amazon S3 bucket policies to control access to buckets from specific Amazon VPC endpoints, or specific VPCs. If you launch a labeling job and your input data is located in an Amazon S3 bucket that is restricted to users in your VPC, you can add a bucket policy to also grant a Ground Truth endpoint permission to access the bucket. To learn more, see [Allow Ground Truth to Access VPC Restricted Amazon S3 Buckets](#sms-vpc-permissions-s3).
+ You can launch an [automated data labeling job](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html) in your VPC. You use a VPC configuration to specify VPC subnets and security groups. SageMaker AI uses this configuration to launch the training and inference jobs used for automated data labeling in your VPC. To learn more, see [Create an Automated Data Labeling Job in a VPC](#sms-vpc-permissions-automated-labeling).

You may want to use these options in any of the following ways.
+ You can use both of these methods to launch a labeling job using a VPC-protected Amazon S3 bucket with automated data labeling enabled.
+ You can launch a labeling job using any [built-in task type](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html) using a VPC-protected bucket.
+ You can launch a [custom labeling workflow](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html) using a VPC-protected bucket. Ground Truth interacts with your pre-annotation and post-annotation Lambda functions using an [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/endpoint-services-overview.html) endpoint.

We recommend that you review [Prerequisites for running a Ground Truth labeling job in a VPC](#sms-vpc-gt-prereq) before you create a labeling job in an Amazon VPC.

## Prerequisites for running a Ground Truth labeling job in a VPC
<a name="sms-vpc-gt-prereq"></a>

Review the following prerequisites before you create a Ground Truth labeling job in an Amazon VPC. 
+ If you are a new user of Ground Truth, review [Getting started](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-getting-started.html) to learn how to create a labeling job.
+ If your input data is located in a VPC-protected Amazon S3 bucket, your workers must access the worker portal from your VPC. VPC based labeling jobs require the use of a private work team. To learn more about creating a private work team, see [Use a Private Workforce](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-private.html).
+ The following prerequisites are specific to launching a labeling job in your VPC.
  + Use the instructions in [Create an Amazon S3 VPC Endpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/train-vpc.html#train-vpc-s3) to create an Amazon S3 VPC endpoint for your VPC. Training and inference containers used in the automated data labeling workflow use this endpoint to communicate with your buckets in Amazon S3.
  + Review [Automate Data Labeling](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html) to learn more about this feature. Note that automated data labeling is supported for the following [built-in task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html): [Image Classification (Single Label)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-image-classification.html), [Image Semantic Segmentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-semantic-segmentation.html), [Bounding Box](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-bounding-box.html), and [Text Classification (Single Label)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-text-classification.html). Streaming labeling jobs do not support automated data labeling.
+ Review the [Ground Truth Security and Permissions](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-security-general.html) section and ensure that you have met the following conditions.
  + The user creating the labeling job has all necessary permissions.
  + You have created an IAM execution role with required permissions. If you do not require fine-tuned permissions for your use case, we recommend you use the IAM managed policies described in [Grant General Permissions To Get Started Using Ground Truth](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-security-permission.html#sms-security-permissions-get-started).
  + Allow your VPC to have access to the `sagemaker-labeling-data-region` and `sm-bxcb-region-saved-task-states` S3 buckets. These are system-owned, regionalized S3 buckets that the worker portal accesses while a worker works on a task. Ground Truth uses these buckets to interact with system-managed data.
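The bucket access described in the last prerequisite can be granted through your Amazon S3 VPC endpoint policy. The following statement is a minimal sketch only; the exact set of actions Ground Truth requires on these buckets is an assumption, and `region` is a placeholder for your AWS Region.

```
{
    "Effect": "Allow",
    "Principal": "*",
    "Action": [
        "s3:GetObject",
        "s3:PutObject"
    ],
    "Resource": [
        "arn:aws:s3:::sagemaker-labeling-data-region/*",
        "arn:aws:s3:::sm-bxcb-region-saved-task-states/*"
    ]
}
```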

## Allow Ground Truth to Access VPC Restricted Amazon S3 Buckets
<a name="sms-vpc-permissions-s3"></a>

The following sections provide details about the permissions Ground Truth requires to launch labeling jobs using Amazon S3 buckets that have access restricted to your VPC and VPC endpoints. To learn how to restrict access to an Amazon S3 bucket to a VPC, see [Controlling access from VPC endpoints with bucket policies](https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies-vpc-endpoint.html) in the Amazon Simple Storage Service User Guide. To learn how to add a policy to an S3 bucket, see [Adding a bucket policy using the Amazon S3 console](https://docs.aws.amazon.com/AmazonS3/latest/userguide/add-bucket-policy.html).

**Note**  
Modifying policies on existing buckets can cause `IN_PROGRESS` Ground Truth jobs to fail. We recommend that you start new jobs using a new bucket. If you want to continue using the same bucket, you can do one of the following.  
+ Wait for an `IN_PROGRESS` job to finish.
+ Terminate the job using the console or the AWS CLI.

You can restrict Amazon S3 bucket access to users in your VPC using an [AWS PrivateLink](https://aws.amazon.com/privatelink/) endpoint. For example, the following S3 bucket policy allows access to the bucket `amzn-s3-demo-bucket` from the specified VPC endpoints only. When you modify this policy, replace the example bucket name and endpoint IDs with your own resources and specifications.

**Note**  
The following policy *denies* all entities *other than* users within your VPC permission to perform the actions listed in `Action`. If you do not include an action in this list, it is still accessible to any entity that has access to this bucket and permission to perform that action. For example, if a user has permission to perform `GetBucketLocation` on your Amazon S3 bucket, the policy below does not restrict the user from performing this action outside of your VPC.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Id": "Policy1415115909152",
    "Statement": [
        {
            "Sid": "AccessToSpecificVPCEOnly",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Effect": "Deny",
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:sourceVpce": [
                        "vpce-12345678",
                        "vpce-12345678901234567"
                    ]
                }
            }
        }
    ]
}
```

------

Ground Truth must be able to perform the following Amazon S3 actions on the S3 buckets you use to configure the labeling job.

```
"s3:AbortMultipartUpload",
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
"s3:GetBucketLocation"
```

You can do this by adding a Ground Truth endpoint to a bucket policy like the one previously mentioned. The following table includes Ground Truth service endpoints for each AWS Region. Add the endpoint for the [AWS Region](https://docs.aws.amazon.com/general/latest/gr/rande.html) you use to run your labeling job to your bucket policy.


****  

| AWS Region | Ground Truth endpoint | 
| --- | --- | 
| us-east-2 | vpce-02569ba1c40aad0bc | 
| us-east-1 | vpce-08408e335ebf95b40 | 
| us-west-2 | vpce-0ea07aa498eb78469 | 
| ca-central-1 | vpce-0d46ea4c9ff55e1b7 | 
| eu-central-1 | vpce-0865e7194a099183d | 
| eu-west-2 | vpce-0bccd56798f4c5df0 | 
| eu-west-1 | vpce-0788e7ed8628e595d | 
| ap-south-1 | vpce-0d7fcda14e1783f11 | 
| ap-southeast-2 | vpce-0b7609e6f305a77d4 | 
| ap-southeast-1 | vpce-0e7e67b32e9efed27 | 
| ap-northeast-2 | vpce-007893f89e05f2bbf | 
| ap-northeast-1 | vpce-0247996a1a1807dbd | 

For example, the following policy restricts `GetObject` and `PutObject` actions on the bucket `bucket-name` to the following.
+ Users in your VPC (`vpc-12345678`)
+ Your VPC endpoint and the Ground Truth service endpoint for your Region, both listed in `aws:sourceVpce`

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Id": "1",
    "Statement": [
        {
            "Sid": "DenyAccessFromNonGTandCustomerVPC",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::bucket-name",
                "arn:aws:s3:::bucket-name/*"
            ],
            "Condition": {
              "StringNotEquals": {
                "aws:SourceVpc": "vpc-12345678",
                "aws:sourceVpce": [
                  "vpce-12345678",
                  "vpce-12345678"
                ] 
             }
           }
        }
    ]
}
```

------

If you want a user to have permission to launch a labeling job using the Ground Truth console, you must also add the user's ARN to the bucket policy using the `aws:PrincipalArn` condition. This user must also have permission to perform the following Amazon S3 actions on the bucket you use to launch the labeling job.

```
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
"s3:GetBucketCors",
"s3:PutBucketCors",
"s3:ListAllMyBuckets"
```

The following code is an example of a bucket policy that restricts permission to perform the actions listed in `Action` on the S3 bucket `<bucket-name>` to the following.
+ *<role-name>*
+ The VPC endpoints listed in `aws:sourceVpce`
+ Users within the VPC named *<vpc>*

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Id": "1",
    "Statement": [
        {
            "Sid": "DenyAccessFromNonGTandCustomerVPC",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::bucket-name/*",
                "arn:aws:s3:::bucket-name"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:SourceVpc": "vpc-12345678",
                    "aws:PrincipalArn": "arn:aws:iam::111122223333:role/role-name",
                    "aws:SourceVpce": [
                        "vpce-12345678",
                        "vpce-12345678"
                    ]
                }
            }
        }
    ]
}
```

------

**Note**  
The Amazon VPC interface endpoints and the protected Amazon S3 buckets you use for input and output data must be located in the same AWS Region that you use to create the labeling job.

After you have granted Ground Truth permission to access your Amazon S3 buckets, you can use one of the topics in [Create a Labeling Job](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job.html) to launch a labeling job. Specify the VPC-restricted Amazon S3 buckets for your input and output data buckets.
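If you manage many buckets, you can also generate and attach this policy programmatically. The following is a minimal sketch, assuming Boto3; the helper names and the placeholder bucket, VPC, endpoint, and role values are illustrative, not part of Ground Truth:

```python
import json

def build_vpc_restricted_policy(bucket, vpc_id, vpce_ids, principal_arn):
    """Deny s3:GetObject/PutObject unless the request comes from the given
    VPC, one of the given VPC endpoints, or the given principal."""
    return {
        "Version": "2012-10-17",
        "Id": "1",
        "Statement": [
            {
                "Sid": "DenyAccessFromNonGTandCustomerVPC",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotEquals": {
                        "aws:SourceVpc": vpc_id,
                        "aws:SourceVpce": vpce_ids,
                        "aws:PrincipalArn": principal_arn,
                    }
                },
            }
        ],
    }

def apply_policy(bucket, policy):
    import boto3  # imported lazily so the builder above can be tested offline
    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```

Note that `put_bucket_policy` replaces the bucket's entire policy, so merge this statement into any existing policy before applying it.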

## Create an Automated Data Labeling Job in a VPC
<a name="sms-vpc-permissions-automated-labeling"></a>

To create an automated data labeling job using an Amazon VPC, you provide a VPC configuration using the Ground Truth console or `CreateLabelingJob` API operation. SageMaker AI uses the subnets and security groups you provide to launch the training and inference jobs used for automated labeling. 

**Important**  
Before you launch an automated data labeling job with a VPC configuration, make sure you have created an Amazon S3 VPC endpoint using the VPC you want to use for the labeling job. To learn how, see [Create an Amazon S3 VPC Endpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/train-vpc.html#train-vpc-s3).  
Additionally, if you create an automated data labeling job using a VPC-restricted Amazon S3 bucket, you must follow the instructions in [Allow Ground Truth to Access VPC Restricted Amazon S3 Buckets](#sms-vpc-permissions-s3) to give Ground Truth permission to access the bucket.

Use the following procedures to learn how to add a VPC configuration to your labeling job request.

**Add a VPC configuration to an automated data labeling job (console):**

1. Follow the instructions in [Create a Labeling Job (Console)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job-console.html) and complete each step in the procedure, up to step 15.

1. In the **Workers** section, select the checkbox next to **Enable automated data labeling**.

1. Expand the **VPC configuration** section of the console by selecting the arrow.

1. Specify the **Virtual private cloud (VPC)** that you want to use for your automated data labeling job.

1. Choose the dropdown list under **Subnets** and select one or more subnets.

1. Choose the dropdown list under **Security groups** and select one or more groups.

1. Complete all remaining steps of the procedure in [Create a Labeling Job (Console)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job-console.html).

**Add a VPC configuration to an automated data labeling job (API):**  
To configure a labeling job using the Ground Truth API operation, `CreateLabelingJob`, follow the instructions in [Create an Automated Data Labeling Job (API)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html#sms-create-automated-labeling-api) to configure your request. In addition to the parameters described in this documentation, you must include a `VpcConfig` parameter in `LabelingJobResourceConfig` to specify one or more subnets and security groups using the following schema.

```
"LabelingJobAlgorithmsConfig": { 
      "InitialActiveLearningModelArn": "string",
      "LabelingJobAlgorithmSpecificationArn": "string",
      "LabelingJobResourceConfig": { 
         "VolumeKmsKeyId": "string",
         "VpcConfig": { 
            "SecurityGroupIds": [ "string" ],
            "Subnets": [ "string" ]
         }
      }
}
```

The following is an example of an [AWS Python SDK (Boto3) request](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_labeling_job) to create an automated data labeling job in the US East (N. Virginia) Region using a private workforce. Replace all *red-italicized text* with your labeling job resources and specifications. To learn more about the `CreateLabelingJob` operation, see the [Create a Labeling Job (API)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-create-labeling-job-api.html) tutorial and [CreateLabelingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) API documentation.

```
import boto3
client = boto3.client(service_name='sagemaker')

response = client.create_labeling_job(
    LabelingJobName="example-labeling-job",
    LabelAttributeName="label",
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': "s3://bucket/path/manifest-with-input-data.json"
            }
        }
    },
    LabelingJobAlgorithmsConfig={
        'LabelingJobAlgorithmSpecificationArn': "arn:aws:sagemaker:us-east-1:027400017018:labeling-job-algorithm-specification/tasktype",
        'LabelingJobResourceConfig': {
            'VpcConfig': {
                'SecurityGroupIds': [ "sg-01233456789", "sg-987654321" ],
                'Subnets': [ "subnet-e0123456", "subnet-e7891011" ]
            }
        }
    },
    OutputConfig={
        'S3OutputPath': "s3://bucket/path/file-to-store-output-data",
        'KmsKeyId': "string"
    },
    RoleArn="arn:aws:iam::*:role/*",
    LabelCategoryConfigS3Uri="s3://bucket/path/label-categories.json",
    StoppingConditions={
        'MaxHumanLabeledObjectCount': 123,
        'MaxPercentageOfInputDatasetLabeled': 123
    },
    HumanTaskConfig={
        'WorkteamArn': "arn:aws:sagemaker:region:*:workteam/private-crowd/*",
        'UiConfig': {
            'UiTemplateS3Uri': "s3://bucket/path/custom-worker-task-template.html"
        },
        'PreHumanTaskLambdaArn': "arn:aws:lambda:us-east-1:432418664414:function:PRE-tasktype",
        'TaskKeywords': [
            "Images",
            "Classification",
            "Multi-label"
        ],
        'TaskTitle': "Add task title here",
        'TaskDescription': "Add description of task here for workers",
        'NumberOfHumanWorkersPerDataObject': 1,
        'TaskTimeLimitInSeconds': 3600,
        'TaskAvailabilityLifetimeInSeconds': 21600,
        'MaxConcurrentTaskCount': 1000,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': "arn:aws:lambda:us-east-1:432418664414:function:ACS-tasktype"
        }
    },
    Tags=[
        {
            'Key': "string",
            'Value': "string"
        },
    ]
)
```

# Use Amazon VPC Mode from a Private Worker Portal
<a name="samurai-vpc-worker-portal"></a>

To restrict worker portal access to labelers working inside of your Amazon VPC, you can add a VPC configuration when you create a Ground Truth private workforce. You can also add a VPC configuration to an existing private workforce. Ground Truth automatically creates VPC interface endpoints in your VPC and sets up AWS PrivateLink between your VPC endpoint and the Ground Truth services. The worker portal URL associated with the workforce can be accessed from your VPC. It can also be accessed from the public internet until you restrict public access. When you delete the workforce or remove the VPC configuration from your workforce, Ground Truth automatically deletes the VPC endpoints associated with the workforce.

**Note**  
Only one VPC is supported per workforce.

[Point Cloud](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud.html) and [video](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-video.html) tasks do not support loading through a VPC.

This guide demonstrates how to satisfy the prerequisites, and how to add and remove an Amazon VPC configuration for your workforce.

## Prerequisites
<a name="samurai-vpc-getting-started-prerequisites"></a>

To run a Ground Truth labeling job in an Amazon VPC, review the following prerequisites.
+ You have an Amazon VPC configured that you can use. If you have not configured a VPC, follow these instructions for [creating a VPC](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html#interface-endpoint-shared-subnets).
+ Depending on how a [Worker Task Template](https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-instructions-overview.html) is written, labeling data stored in an Amazon S3 bucket may be accessed directly from Amazon S3 during labeling tasks. In these cases, the VPC network must be configured to allow traffic from the device used by the human labeler to the S3 bucket containing labeling data.
+ Follow [View and update DNS attributes for your VPC](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html#vpc-dns-updating) to enable DNS hostnames and DNS resolution for your VPC.
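The DNS prerequisite above can also be scripted. A minimal sketch, assuming Boto3 and a placeholder VPC ID; note that the EC2 `ModifyVpcAttribute` API accepts only one attribute per call, so enabling both attributes takes two calls:

```python
def dns_attribute_updates(vpc_id):
    """Kwargs for the two modify_vpc_attribute calls (one attribute per call)."""
    return [
        {"VpcId": vpc_id, "EnableDnsSupport": {"Value": True}},
        {"VpcId": vpc_id, "EnableDnsHostnames": {"Value": True}},
    ]

def enable_vpc_dns(vpc_id):
    import boto3  # imported lazily so the helper above can be tested offline
    ec2 = boto3.client("ec2")
    for kwargs in dns_attribute_updates(vpc_id):
        ec2.modify_vpc_attribute(**kwargs)
```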

**Note**  
There are two ways to configure your VPC for your workforce. You can do this through the [console](https://console.aws.amazon.com/sagemaker) or the AWS [CLI](https://aws.amazon.com/cli/).

# Using the SageMaker AI console to manage a VPC config
<a name="samurai-vpc-workforce-console"></a>

You can use the [SageMaker AI console](https://console.aws.amazon.com/sagemaker) to add or remove a VPC configuration. You can also delete an existing workforce.

## Adding a VPC configuration to your workforce
<a name="samurai-add-vpc-workforce"></a>

### Create a private workforce
<a name="samurai-vpc-create-workforce"></a>
+ [Create a private workforce using Amazon Cognito](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-private-use-cognito.html)
+ [Create a private workforce using OpenID Connect (OIDC) Identity Provider(IdP)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-private-use-oidc.html).

After you have created your private workforce, add a VPC configuration to it.

1. Navigate to the [Amazon SageMaker AI console](https://console.aws.amazon.com/sagemaker).

1. Select **Labeling workforces** in the left panel.

1. Select **Private** to access your private workforce. After your **Workforce status** is **Active**, select **Add** next to **VPC**.

1. When you are prompted to configure your VPC, provide the following:

   1. Your **VPC**

   1. **Subnets**

      1. Ensure that your VPC has an existing subnet

   1. **Security groups**

      **Note**  
      You cannot select more than 5 security groups.

   1. After filling in this information, choose **Confirm**.

1. After you choose **Confirm**, you are redirected back to the **Private** page under **Labeling workforces**. You should see a green banner at the top that reads **Your private workforce update with VPC configuration was successfully initialized.** The workforce status is **Updating**. Next to the **Delete workforce** button is the **Refresh** button, which can be used to retrieve the latest **Workforce status**. After the workforce status has changed to **Active**, the VPC endpoint ID is updated as well.

## Removing a VPC configuration from your workforce
<a name="samurai-remove-vpc-workforce"></a>

Use the following information to remove a VPC configuration from your workforce using the console.

1. Navigate to the [Amazon SageMaker AI console](https://console.aws.amazon.com/sagemaker).

1. Select **Labeling workforces** in the left panel.

1. Find and select your workforce.

1. Under **Private workforce summary**, find **VPC** and choose **Remove** next to it.

1. Select **Remove**.

## Deleting a workforce through the console
<a name="samurai-delete-vpc-workforce"></a>

Before you delete a workforce, it must not have any work teams associated with it. You can delete a workforce only if the workforce status is **Active** or **Failed**.

Use the following information to delete a workforce using the console.

1. Navigate to the [Amazon SageMaker AI console](https://console.aws.amazon.com/sagemaker).

1. Select **Labeling workforces** in the left panel.

1. Find and select your workforce.

1. Choose **Delete workforce**.

1. Choose **Delete**.

# Using the SageMaker AI API to manage a VPC config
<a name="samurai-vpc-workforce-cli"></a>

Use the following sections to learn more about managing a VPC configuration while maintaining the right level of access to the work team.

## Create a workforce with a VPC configuration
<a name="samurai-create-vpc-cli"></a>

If the account already has a workforce, you must delete it first, or you can update the existing workforce with a VPC configuration.

```
aws sagemaker create-workforce --workforce-name workforce-name \
    --cognito-config '{"ClientId": "app-client-id", "UserPool": "Pool_ID"}' \
    --workforce-vpc-config '{"VpcId": "vpc-id", "SecurityGroupIds": ["sg-0123456789abcdef0"], "Subnets": ["subnet-0123456789abcdef0"]}'
{
    "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name"
}
```

Describe the workforce and make sure the status is `Initializing`.

```
aws sagemaker describe-workforce --workforce-name workforce-name
{
    "Workforce": {
        "WorkforceName": "workforce-name",
        "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name",
        "LastUpdatedDate": 1622151252.451,
        "SourceIpConfig": {
            "Cidrs": []
        },
        "SubDomain": "subdomain.us-west-2.sagemaker.aws.com",
        "CognitoConfig": {
            "UserPool": "Pool_ID",
            "ClientId": "app-client-id"
        },
        "CreateDate": 1622151252.451,
        "WorkforceVpcConfig": {
            "VpcId": "vpc-id",
            "SecurityGroupIds": [
                "sg-0123456789abcdef0"
            ],
            "Subnets": [
                "subnet-0123456789abcdef0"
            ]
        },
        "Status": "Initializing"
    }
}
```

Navigate to the Amazon VPC console. Select **Endpoints** from the left panel. There should be two VPC endpoints created in your account.
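Because workforce updates are asynchronous, it can help to poll `DescribeWorkforce` until the status you expect appears. The following is a minimal sketch; the helper name is illustrative, and the client is passed in (in practice, `boto3.client("sagemaker")`) so the logic can be exercised without AWS:

```python
import time

def wait_for_workforce_status(client, name, target="Active",
                              poll_seconds=30, max_polls=40):
    """Poll DescribeWorkforce until the workforce reaches the target status."""
    for _ in range(max_polls):
        status = client.describe_workforce(
            WorkforceName=name)["Workforce"]["Status"]
        if status == target:
            return status
        if status == "Failed" and target != "Failed":
            raise RuntimeError(f"workforce {name} entered the Failed state")
        time.sleep(poll_seconds)
    raise TimeoutError(f"workforce {name} did not reach {target}")
```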

## Adding a VPC configuration to your workforce
<a name="samurai-add-vpc-cli"></a>

Update a non-VPC private workforce with a VPC configuration using the following command.

```
aws sagemaker update-workforce --workforce-name workforce-name \
    --workforce-vpc-config '{"VpcId": "vpc-id", "SecurityGroupIds": ["sg-0123456789abcdef0"], "Subnets": ["subnet-0123456789abcdef0"]}'
```

Describe the workforce and make sure the status is `Updating`.

```
aws sagemaker describe-workforce --workforce-name workforce-name
{
    "Workforce": {
        "WorkforceName": "workforce-name",
        "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name",
        "LastUpdatedDate": 1622151252.451,
        "SourceIpConfig": {
            "Cidrs": []
        },
        "SubDomain": "subdomain.us-west-2.sagemaker.aws.com",
        "CognitoConfig": {
            "UserPool": "Pool_ID",
            "ClientId": "app-client-id"
        },
        "CreateDate": 1622151252.451,
        "WorkforceVpcConfig": {
            "VpcId": "vpc-id",
            "SecurityGroupIds": [
                "sg-0123456789abcdef0"
            ],
            "Subnets": [
                "subnet-0123456789abcdef0"
            ]
        },
        "Status": "Updating"
    }
}
```

Navigate to your Amazon VPC console. Select **Endpoints** from the left panel. There should be two VPC endpoints created in your account.

## Removing a VPC configuration from your workforce
<a name="samurai-remove-vpc-cli"></a>

Update a VPC private workforce with an empty VPC configuration to remove VPC resources.

```
aws sagemaker update-workforce --workforce-name workforce-name \
    --workforce-vpc-config '{}'
```

Describe the workforce and make sure the status is `Updating`.

```
aws sagemaker describe-workforce --workforce-name workforce-name
{
    "Workforce": {
        "WorkforceName": "workforce-name",
        "WorkforceArn": "arn:aws:sagemaker:us-west-2:xxxxxxxxx:workforce/workforce-name",
        "LastUpdatedDate": 1622151252.451,
        "SourceIpConfig": {
            "Cidrs": []
        },
        "SubDomain": "subdomain.us-west-2.sagemaker.aws.com",
        "CognitoConfig": {
            "UserPool": "Pool_ID",
            "ClientId": "app-client-id"
        },
        "CreateDate": 1622151252.451,
        "Status": "Updating"
    }
}
```

Navigate to your Amazon VPC console. Select **Endpoints** from the left panel. The two VPC endpoints should be deleted.

## Restrict public access to the worker portal while maintaining access through a VPC
<a name="public-access-vpc"></a>

Workers in a VPC or non-VPC worker portal can see the labeling job tasks assigned to them. Assignment comes from adding workers to a work team through OIDC groups. It is your responsibility to restrict access to your public worker portal by setting the `sourceIpConfig` in your workforce. 

**Note**  
You can restrict access to the worker portal only through the SageMaker API. This cannot be done through the console.

Use the following command to restrict public access to the worker portal.

```
aws sagemaker update-workforce --region us-west-2 \
--workforce-name workforce-demo --source-ip-config '{"Cidrs":["10.0.0.0/16"]}'
```

After the `sourceIpConfig` is set on the workforce, the workers can access the worker portal in VPC but not through public internet.
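Before applying a `sourceIpConfig`, you can check which worker IP addresses a set of CIDR blocks would admit using Python's standard `ipaddress` module. A minimal illustrative sketch:

```python
from ipaddress import ip_address, ip_network

def ip_allowed(ip, cidrs):
    """Return True if ip falls within any of the allowed CIDR blocks."""
    return any(ip_address(ip) in ip_network(cidr) for cidr in cidrs)
```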

**Note**  
You cannot set the `sourceIpConfig` restriction for a worker portal in a VPC.

# Output Data and Storage Volume Encryption
<a name="sms-security"></a>

With Amazon SageMaker Ground Truth, you can label highly sensitive data, stay in control of your data, and employ security best practices. While your labeling job is running, Ground Truth encrypts data in transit and at rest. Additionally, you can use AWS Key Management Service (AWS KMS) with Ground Truth to do the following:
+ Use a [customer managed key](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#master_keys) to encrypt your output data. 
+ Use AWS KMS customer managed key with your automated data labeling job to encrypt the storage volume attached to the compute instance used for model training and inference. 

Use the topics on this page to learn more about these Ground Truth security features.

## Use Your KMS Key to Encrypt Output Data
<a name="sms-security-kms-output-data"></a>

Optionally, you can provide an AWS KMS customer managed key when you create a labeling job, which Ground Truth uses to encrypt your output data. 

If you don't provide a customer managed key, Amazon SageMaker AI uses the default AWS managed key for Amazon S3 for your role's account to encrypt your output data.

If you provide a customer managed key, you must add the required permissions to the key described in [Encrypt Output Data and Storage Volume with AWS KMS](sms-security-kms-permissions.md). When you use the API operation `CreateLabelingJob`, you can specify your customer managed key ID using the parameter `[KmsKeyId](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobOutputConfig.html#sagemaker-Type-LabelingJobOutputConfig-KmsKeyId)`. See the following procedure to learn how to add a customer managed key when you create a labeling job using the console.

**To add an AWS KMS key to encrypt output data (console):**

1. Complete the first 7 steps in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md).

1. In step 8, select the arrow next to **Additional configuration** to expand this section.

1. For **Encryption key**, select the AWS KMS key that you want to use to encrypt output data.

1. Complete the rest of steps in [Create a Labeling Job (Console)](sms-create-labeling-job-console.md) to create a labeling job.

## Use Your KMS Key to Encrypt Automated Data Labeling Storage Volume (API Only)
<a name="sms-security-kms-storage-volume"></a>

When you create a labeling job with automated data labeling using the `CreateLabelingJob` API operation, you have the option to encrypt the storage volume attached to the ML compute instances that run the training and inference jobs. To add encryption to your storage volume, use the parameter `VolumeKmsKeyId` to input an AWS KMS customer managed key. For more information about this parameter, see `[LabelingJobResourceConfig](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_LabelingJobResourceConfig.html#sagemaker-Type-LabelingJobResourceConfig-VolumeKmsKeyId)`.

If you specify a key ID or ARN for `VolumeKmsKeyId`, your SageMaker AI execution role must include permissions to call `kms:CreateGrant`. To learn how to add this permission to an execution role, see [Create a SageMaker AI Execution Role for a Ground Truth Labeling Job](sms-security-permission-execution-role.md).

**Note**  
If you specify an AWS KMS customer managed key when you create a labeling job in the console, that key is *only* used to encrypt your output data. It is not used to encrypt the storage volume attached to the ML compute instances used for automated data labeling.

# Workforce Authentication and Restrictions
<a name="sms-security-workforce-authentication"></a>

Ground Truth enables you to use your own private workforce to work on labeling jobs. A *private workforce* is an abstract concept which refers to a set of people who work for you. Each labeling job is created using a work team, composed of workers in your workforce. Ground Truth supports private workforce creation using Amazon Cognito. 

A Ground Truth workforce maps to an Amazon Cognito user pool. A Ground Truth work team maps to an Amazon Cognito user group. Amazon Cognito manages worker authentication. Amazon Cognito supports OpenID Connect (OIDC), and customers can set up Amazon Cognito federation with their own identity provider (IdP). 

Ground Truth only allows one workforce per account per AWS Region. Each workforce has a dedicated Ground Truth work portal login URL. 

You can also restrict workers to a Classless Inter-Domain Routing (CIDR) block/IP address range. This means annotators must be on a specific network to access the annotation site. You can add up to ten CIDR blocks for one workforce. To learn more, see [Private workforce management using the Amazon SageMaker API](sms-workforce-management-private-api.md).

To learn how you can create a private workforce, see [Create a Private Workforce (Amazon Cognito)](sms-workforce-create-private.md).

## Restrict Access to Workforce Types
<a name="sms-security-permission-condition-keys"></a>

Amazon SageMaker Ground Truth work teams fall into one of three [workforce types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management.html): public (with Amazon Mechanical Turk), private, and vendor. To restrict user access to a specific work team using one of these types or the work team ARN, use the `sagemaker:WorkteamType` and/or the `sagemaker:WorkteamArn` condition keys. For the `sagemaker:WorkteamType` condition key, use [string condition operators](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_condition_operators.html#Conditions_String). For the `sagemaker:WorkteamArn` condition key, use [Amazon Resource Name (ARN) condition operators](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_condition_operators.html#Conditions_ARN). If the user attempts to create a labeling job with a restricted work team, SageMaker AI returns an access denied error. 

The policies below demonstrate different ways to use the `sagemaker:WorkteamType` and `sagemaker:WorkteamArn` condition keys with appropriate condition operators and valid condition values. 

The following example uses the `sagemaker:WorkteamType` condition key with the `StringEquals` condition operator to restrict access to a public work team. It accepts condition values in the following format: `workforcetype-crowd`, where *workforcetype* can equal `public`, `private`, or `vendor`.

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictWorkteamType",
            "Effect": "Deny",
            "Action": "sagemaker:CreateLabelingJob",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "sagemaker:WorkteamType": "public-crowd"
                }
            }
        }
    ]
}
```

------

The following policies show how to restrict access to a public work team using the `sagemaker:WorkteamArn` condition key. The first shows how to use it with a valid IAM regex-variant of the work team ARN and the `ArnLike` condition operator. The second shows how to use it with the `ArnEquals` condition operator and the work team ARN.

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictWorkteamType",
            "Effect": "Deny",
            "Action": "sagemaker:CreateLabelingJob",
            "Resource": "*",
            "Condition": {
                "ArnLike": {
                    "sagemaker:WorkteamArn": "arn:aws:sagemaker:*:*:workteam/public-crowd/*"
                }
            }
        }
    ]
}
```

------

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictWorkteamType",
            "Effect": "Deny",
            "Action": "sagemaker:CreateLabelingJob",
            "Resource": "*",
            "Condition": {
                "ArnEquals": {
                    "sagemaker:WorkteamArn": "arn:aws:sagemaker:us-west-2:394669845002:workteam/public-crowd/default"
                }
            }
        }
    ]
}
```

------
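The three policies above differ only in their condition blocks. As an illustration, the `sagemaker:WorkteamType` variant can be generated for any of the three workforce types; the helper name below is hypothetical:

```python
def workteam_type_deny_policy(workforce_type):
    """Deny CreateLabelingJob for one workforce type: public, private, or vendor."""
    if workforce_type not in ("public", "private", "vendor"):
        raise ValueError("workforce_type must be public, private, or vendor")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RestrictWorkteamType",
                "Effect": "Deny",
                "Action": "sagemaker:CreateLabelingJob",
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        # Condition values use the workforcetype-crowd format
                        "sagemaker:WorkteamType": f"{workforce_type}-crowd"
                    }
                },
            }
        ],
    }
```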

# Monitor Labeling Job Status
<a name="sms-monitor-cloud-watch"></a>

To monitor the status of your labeling jobs, you can set up an [Amazon CloudWatch Events](https://docs.aws.amazon.com/sagemaker/latest/dg/monitoring-cloudwatch.html#cloudwatch-metrics-ground-truth) (CloudWatch Events) rule for Amazon SageMaker Ground Truth (Ground Truth) to send an event to CloudWatch Events when a labeling job status changes to `Completed`, `Failed`, or `Stopped`, or when a worker accepts, declines, submits, or returns a task. 

Once you create a rule, you can add a *target* to it. CloudWatch Events uses this target to invoke another AWS service to process the event. For example, you can create a target using an Amazon Simple Notification Service (Amazon SNS) topic to send a notification to your email when a labeling job status changes.

**Prerequisites:**

To create a CloudWatch Events rule, you need an AWS Identity and Access Management (IAM) role with a trust policy for `events.amazonaws.com` attached. The following is an example of such a trust policy.

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "events.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

------

**Topics**
+ [Send Events to CloudWatch Events](#sms-cloud-watch-event-rule-setup)
+ [Set Up a Target to Process Events](#sms-cloud-watch-events-labelingjob-notifications)
+ [Labeling Job Expiration](#sms-labeling-job-expiration)
+ [Declining Tasks](#sms-decline-tasks)

## Send Events to CloudWatch Events
<a name="sms-cloud-watch-event-rule-setup"></a>

To configure a CloudWatch Events rule to get status updates, or *events*, for your Ground Truth labeling jobs, use the AWS Command Line Interface (AWS CLI) [`put-rule`](https://docs.aws.amazon.com/cli/latest/reference/events/put-rule.html) command. You can filter events that are sent to your rule by status change. For example, you can create a rule that notifies you only if a labeling job status changes to `Completed`. When using the `put-rule` command, specify the following to receive labeling job statuses: 
+ `\"source\":[\"aws.sagemaker\"]`
+ `\"detail-type\":[\"SageMaker Ground Truth Labeling Job State Change\"]`

To configure a CloudWatch Events rule to watch for all status changes, use the following command and replace the placeholder text. For example, replace `"GTLabelingJobStateChanges"` with a unique CloudWatch Events rule name and *`"arn:aws:iam::111122223333:role/MyRoleForThisRule"`* with the Amazon Resource Name (ARN) of an IAM role with an `events.amazonaws.com` trust policy attached. 

```
aws events put-rule --name "GTLabelingJobStateChanges" \
    --event-pattern "{\"source\":[\"aws.sagemaker\"],\"detail-type\":[\"SageMaker Ground Truth Labeling Job State Change\"]}" \
    --role-arn "arn:aws:iam::111122223333:role/MyRoleForThisRule" \
    --region "region"
```

To filter by job status, use the `\"detail\":{\"LabelingJobStatus\":[\"Status\"]}` syntax. Valid values for `Status` are `Completed`, `Failed`, and `Stopped`. 

The following example creates a CloudWatch Events rule that notifies you when a labeling job in us-west-2 (Oregon) changes to `Completed`.

```
aws events put-rule --name "LabelingJobCompleted" \
    --event-pattern "{\"source\":[\"aws.sagemaker\"],\"detail-type\":[\"SageMaker Ground Truth Labeling Job State Change\"], \"detail\":{\"LabelingJobStatus\":[\"Completed\"]}}" \
    --role-arn "arn:aws:iam::111122223333:role/MyRoleForThisRule" \
    --region us-west-2
```

The following example creates a CloudWatch Events rule that notifies you when a labeling job in us-east-1 (Virginia) changes to `Completed` or `Failed`.

```
aws events put-rule --name "LabelingJobCompletedOrFailed" \
    --event-pattern "{\"source\":[\"aws.sagemaker\"],\"detail-type\":[\"SageMaker Ground Truth Labeling Job State Change\"], \"detail\":{\"LabelingJobStatus\":[\"Completed\", \"Failed\"]}}" \
    --role-arn "arn:aws:iam::111122223333:role/MyRoleForThisRule" \
    --region us-east-1
```

 To learn more about the `put-rule` request, see [Event Patterns in CloudWatch Events](https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CloudWatchEventsandEventPatterns.html) in the *Amazon CloudWatch Events User Guide*.

## Set Up a Target to Process Events
<a name="sms-cloud-watch-events-labelingjob-notifications"></a>

After you have created a rule, events similar to the following are sent to CloudWatch Events. In this example, the labeling job `test-labeling-job`'s status changed to `Completed`.

```
{
    "version": "0",
    "id": "111e1111-11d1-111f-b111-1111b11dcb11",
    "detail-type": "SageMaker Ground Truth Labeling Job State Change",
    "source": "aws.sagemaker",
    "account": "111122223333",
    "time": "2018-10-06T12:26:13Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:sagemaker:us-east-1:111122223333:labeling-job/test-labeling-job"
    ],
    "detail": {      
        "LabelingJobStatus": "Completed"
    }
}
```
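A common target for such a rule is an AWS Lambda function. The following sketch (the handler name and return shape are illustrative, not part of Ground Truth) shows how a function might extract the job name and status from an event shaped like the one above.

```python
def handler(event, context=None):
    """Extract the labeling job name and status from a Ground Truth
    labeling job state-change event."""
    # The job name is the last segment of the labeling-job ARN in "resources".
    job_arn = event["resources"][0]
    job_name = job_arn.rsplit("/", 1)[-1]
    status = event["detail"]["LabelingJobStatus"]
    return {"job_name": job_name, "status": status}
```

For the sample event above, this returns `{"job_name": "test-labeling-job", "status": "Completed"}`.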

To process events, you need to set up a target. For example, if you want to receive an email when your labeling job status changes, use a procedure in [Setting Up Amazon SNS Notifications](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/US_SetupSNS.html) in the *Amazon CloudWatch User Guide* to set up an Amazon SNS topic and subscribe your email to it. After you have created a topic, you can use it to create a target. 

**To add a target to your CloudWatch Events rule**

1. Open the CloudWatch console: [https://console.aws.amazon.com/cloudwatch/home](https://console.aws.amazon.com/cloudwatch/home)

1. In the navigation pane, choose **Rules**.

1. Choose the rule that you want to add a target to. 

1. Choose **Actions**, and then choose **Edit**.

1. Under **Targets**, choose **Add Target**, and then choose the AWS service that you want to act when a labeling job status change event is detected. 

1. Configure your target. For instructions, see the topic for configuring a target in the [AWS documentation for that service](https://docs.aws.amazon.com/index.html).

1. Choose **Configure details**.

1. For **Name**, enter a name and, optionally, provide details about the purpose of the rule in **Description**. 

1. Make sure that the check box next to **State** is selected so that your rule is listed as **Enabled**. 

1. Choose **Update rule**.
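The console steps above can also be performed with the SDK. The following sketch attaches an SNS topic as a target to a rule; the topic ARN, target ID, and rule name are placeholders, and the commented-out `boto3` call assumes credentials with permission to call `events:PutTargets`.

```python
def sns_target(target_id, topic_arn):
    """Build one target entry for put_targets(). Each target needs an Id
    that is unique within the rule and the ARN of the resource to invoke."""
    return {"Id": target_id, "Arn": topic_arn}

# Placeholder values; substitute your own topic ARN and IDs.
targets = [
    sns_target(
        "labeling-job-email",
        "arn:aws:sns:us-east-1:111122223333:MyLabelingTopic",
    )
]

# import boto3
# boto3.client("events").put_targets(
#     Rule="LabelingJobCompletedOrFailed",
#     Targets=targets,
# )
```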

## Labeling Job Expiration
<a name="sms-labeling-job-expiration"></a>

If your labeling job is not completed within 30 days, it expires. If your labeling job expires, you can chain the job to create a new labeling job that sends only unlabeled data to workers. For more information, and to learn how to create a labeling job using chaining, see [Chaining labeling jobs](sms-reusing-data.md).

## Declining Tasks
<a name="sms-decline-tasks"></a>

Workers can decline tasks. 

Workers might decline a task if the instructions are unclear, the input data doesn't display correctly, or they encounter some other issue with the task. If a task is declined by as many workers as the number of workers per data object ([NumberOfHumanWorkersPerDataObject](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html#sagemaker-Type-HumanTaskConfig-NumberOfHumanWorkersPerDataObject)), the data object is marked as expired and isn't sent to additional workers.