Adding a dataset to a project - Rekognition

Adding a dataset to a project

You can add a training dataset or a test dataset to an existing project. If you want to replace an existing dataset, first delete the existing dataset. For more information, see Deleting a dataset. Then, add the new dataset.

Adding a dataset to a project (Console)

You can add a training or test dataset to a project by using the Amazon Rekognition Custom Labels console.

To add a dataset to a project
  1. Open the Amazon Rekognition console at https://console.aws.amazon.com/rekognition/.

  2. In the left pane, choose Use Custom Labels. The Amazon Rekognition Custom Labels landing page is shown.

  3. In the left navigation pane, choose Projects. The Projects view is shown.

  4. Choose the project to which you want to add a dataset.

  5. In the left navigation pane, under the project name, choose Datasets.

  6. If the project doesn't have an existing dataset, the Create dataset page is shown. Do the following:

    1. On the Create dataset page, enter the image source information. For more information, see Creating training and test datasets with images.

    2. Choose Create dataset to create the dataset.

  7. If the project has an existing dataset (training or test), the project details page is shown. Do the following:

    1. On the project details page, choose Actions.

    2. If you want to add a training dataset, choose Create training dataset.

    3. If you want to add a test dataset, choose Create test dataset.

    4. On the Create dataset page, enter the image source information. For more information, see Creating training and test datasets with images.

    5. Choose Create dataset to create the dataset.

  8. Add images to your dataset. For more information, see Adding more images (console).

  9. Add labels to your dataset. For more information, see Add new labels (Console).

  10. Add labels to your images. If you're adding image-level labels, see Assigning image-level labels to an image. If you're adding bounding boxes, see Labeling objects with bounding boxes. For more information, see Purposing datasets.

Adding a dataset to a project (SDK)

You can add a train or test dataset to an existing project in the following ways:

Topics
    To add a dataset to a project (SDK)
    1. If you haven't already done so, install and configure the AWS CLI and the AWS SDKs. For more information, see Step 4: Set up the AWS CLI and AWS SDKs.

    2. Use the following examples to add JSON lines to a dataset.

      CLI

      Replace project_arn with the project that you want to add the dataset set to. Replace dataset_type with TRAIN to create a training dataset, or TEST to create a test dataset.

      aws rekognition create-dataset --project-arn project_arn \ --dataset-type dataset_type \ --profile custom-labels-access
      Python

      Use the following code to create a dataset. Supply the following command line options:

      • project_arn — the ARN of the project that you want to add the test dataset to.

      • type — the type of dataset that you want to create (train or test)

      # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. # SPDX-License-Identifier: Apache-2.0 import argparse import logging import time import boto3 from botocore.exceptions import ClientError logger = logging.getLogger(__name__) def create_empty_dataset(rek_client, project_arn, dataset_type): """ Creates an empty Amazon Rekognition Custom Labels dataset. :param rek_client: The Amazon Rekognition Custom Labels Boto3 client. :param project_arn: The ARN of the project in which you want to create a dataset. :param dataset_type: The type of the dataset that you want to create (train or test). """ try: #Create the dataset. logger.info("Creating empty %s dataset for project %s", dataset_type, project_arn) dataset_type=dataset_type.upper() response = rek_client.create_dataset( ProjectArn=project_arn, DatasetType=dataset_type ) dataset_arn=response['DatasetArn'] logger.info("dataset ARN: %s", dataset_arn) finished=False while finished is False: dataset=rek_client.describe_dataset(DatasetArn=dataset_arn) status=dataset['DatasetDescription']['Status'] if status == "CREATE_IN_PROGRESS": logger.info(("Creating dataset: %s ", dataset_arn)) time.sleep(5) continue if status == "CREATE_COMPLETE": logger.info("Dataset created: %s", dataset_arn) finished=True continue if status == "CREATE_FAILED": error_message = f"Dataset creation failed: {status} : {dataset_arn}" logger.exception(error_message) raise Exception(error_message) error_message = f"Failed. Unexpected state for dataset creation: {status} : {dataset_arn}" logger.exception(error_message) raise Exception(error_message) return dataset_arn except ClientError as err: logger.exception("Couldn't create dataset: %s", err.response['Error']['Message']) raise def add_arguments(parser): """ Adds command line arguments to the parser. :param parser: The command line parser. """ parser.add_argument( "project_arn", help="The ARN of the project in which you want to create the empty dataset." ) parser.add_argument( "dataset_type", help="The type of the empty dataset that you want to create (train or test)." ) def main(): logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s") try: # Get command line arguments. parser = argparse.ArgumentParser(usage=argparse.SUPPRESS) add_arguments(parser) args = parser.parse_args() print(f"Creating empty {args.dataset_type} dataset for project {args.project_arn}") # Create the empty dataset. session = boto3.Session(profile_name='custom-labels-access') rekognition_client = session.client("rekognition") dataset_arn=create_empty_dataset(rekognition_client, args.project_arn, args.dataset_type.lower()) print(f"Finished creating empty dataset: {dataset_arn}") except ClientError as err: logger.exception("Problem creating empty dataset: %s", err) print(f"Problem creating empty dataset: {err}") except Exception as err: logger.exception("Problem creating empty dataset: %s", err) print(f"Problem creating empty dataset: {err}") if __name__ == "__main__": main()
      Java V2

      Use the following code to create a dataset. Supply the following command line options:

      • project_arn — the ARN of the project that you want to add the test dataset to.

      • type — the type of dataset that you want to create (train or test)

      /* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: Apache-2.0 */ package com.example.rekognition; import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider; import software.amazon.awssdk.regions.Region; import software.amazon.awssdk.services.rekognition.RekognitionClient; import software.amazon.awssdk.services.rekognition.model.CreateDatasetRequest; import software.amazon.awssdk.services.rekognition.model.CreateDatasetResponse; import software.amazon.awssdk.services.rekognition.model.DatasetDescription; import software.amazon.awssdk.services.rekognition.model.DatasetStatus; import software.amazon.awssdk.services.rekognition.model.DatasetType; import software.amazon.awssdk.services.rekognition.model.DescribeDatasetRequest; import software.amazon.awssdk.services.rekognition.model.DescribeDatasetResponse; import software.amazon.awssdk.services.rekognition.model.RekognitionException; import java.net.URI; import java.util.logging.Level; import java.util.logging.Logger; public class CreateEmptyDataset { public static final Logger logger = Logger.getLogger(CreateEmptyDataset.class.getName()); public static String createMyEmptyDataset(RekognitionClient rekClient, String projectArn, String datasetType) throws Exception, RekognitionException { try { logger.log(Level.INFO, "Creating empty {0} dataset for project : {1}", new Object[] { datasetType.toString(), projectArn }); DatasetType requestDatasetType = null; switch (datasetType) { case "train": requestDatasetType = DatasetType.TRAIN; break; case "test": requestDatasetType = DatasetType.TEST; break; default: logger.log(Level.SEVERE, "Unrecognized dataset type: {0}", datasetType); throw new Exception("Unrecognized dataset type: " + datasetType); } CreateDatasetRequest createDatasetRequest = CreateDatasetRequest.builder().projectArn(projectArn) .datasetType(requestDatasetType).build(); CreateDatasetResponse response = rekClient.createDataset(createDatasetRequest); boolean created = false; //Wait until updates finishes do { DescribeDatasetRequest describeDatasetRequest = DescribeDatasetRequest.builder() .datasetArn(response.datasetArn()).build(); DescribeDatasetResponse describeDatasetResponse = rekClient.describeDataset(describeDatasetRequest); DatasetDescription datasetDescription = describeDatasetResponse.datasetDescription(); DatasetStatus status = datasetDescription.status(); logger.log(Level.INFO, "Creating dataset ARN: {0} ", response.datasetArn()); switch (status) { case CREATE_COMPLETE: logger.log(Level.INFO, "Dataset created"); created = true; break; case CREATE_IN_PROGRESS: Thread.sleep(5000); break; case CREATE_FAILED: String error = "Dataset creation failed: " + datasetDescription.statusAsString() + " " + datasetDescription.statusMessage() + " " + response.datasetArn(); logger.log(Level.SEVERE, error); throw new Exception(error); default: String unexpectedError = "Unexpected creation state: " + datasetDescription.statusAsString() + " " + datasetDescription.statusMessage() + " " + response.datasetArn(); logger.log(Level.SEVERE, unexpectedError); throw new Exception(unexpectedError); } } while (created == false); return response.datasetArn(); } catch (RekognitionException e) { logger.log(Level.SEVERE, "Could not create dataset: {0}", e.getMessage()); throw e; } } public static void main(String args[]) { String datasetType = null; String datasetArn = null; String projectArn = null; final String USAGE = "\n" + "Usage: " + "<project_arn> <dataset_type>\n\n" + "Where:\n" + " project_arn - the ARN of the project that you want to add copy the datast to.\n\n" + " dataset_type - the type of the empty dataset that you want to create (train or test).\n\n"; if (args.length != 2) { System.out.println(USAGE); System.exit(1); } projectArn = args[0]; datasetType = args[1]; try { // Get the Rekognition client RekognitionClient rekClient = RekognitionClient.builder() .credentialsProvider(ProfileCredentialsProvider.create("custom-labels-access")) .region(Region.US_WEST_2) .build(); // Create the dataset datasetArn = createMyEmptyDataset(rekClient, projectArn, datasetType); System.out.println(String.format("Created dataset: %s", datasetArn)); rekClient.close(); } catch (RekognitionException rekError) { logger.log(Level.SEVERE, "Rekognition client error: {0}", rekError.getMessage()); System.exit(1); } catch (Exception rekError) { logger.log(Level.SEVERE, "Error: {0}", rekError.getMessage()); System.exit(1); } } }
    3. Add images to the dataset. For more information, see Adding more images (SDK).