本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。
發佈訓練資料集 (SDK)
Amazon Rekognition 自訂標籤需要訓練資料集和測試資料集來訓練您的模型。
如果您使用 API,則可以使用 DistributeDatasetEntriesAPI 將 20% 的訓練資料集散發到空白的測試資料集中。如果您只有一個資訊清單檔案可用,則散佈訓練資料集會很有用。使用單一資訊清單檔案建立訓練資料集。然後創建一個空的測試數據集並用DistributeDatasetEntries
於填充測試數據集。
注意
如果您使用 Amazon Rekognition 自訂標籤主控台並從單一資料集專案開始,Amazon Rekognition 自訂標籤會在訓練期間分割 (分配) 訓練資料集,以建立測試資料集。20% 的訓練資料集項目會移至測試資料集。
散發訓練資料集 (SDK)
-
若果您尚未這樣做,請先完成安裝AWS CLI並設定和AWS SDK。如需詳細資訊,請參閱步驟 4:設定 AWS CLI 以及 AWS SDKs。
-
建立專案。如需詳細資訊,請參閱建立 Amazon Rekognition 自訂標籤專案 (SDK)。
-
建立您的訓練資料集。如需資料集的相關資訊,請參閱建立培訓和測試資料集。
-
建立空的測試資料集。
-
使用下列範例程式碼,將 20% 的訓練資料集項目散佈至測試資料集。您可以透過呼叫取得專案資料集的 Amazon 資源名稱 (ARN) DescribeProjects。如需範例程式碼,請參閱描述一個項目(SDK)。
- AWS CLI
-
使用您要使
training_dataset-arn
test_dataset_arn
用的資料集的 ARNS 來變更與的值。aws rekognition distribute-dataset-entries --datasets ['{"Arn": "
training_dataset_arn
"}, {"Arn": "test_dataset_arn
"}'] \ --profile custom-labels-access - Python
-
使用下列程式碼。提供以下命令行參數:
-
訓練資料集 — 您從中發佈項目之訓練資料集的 ARN。
-
test_dataset_arn — 您分發項目的測試資料集的 ARN。
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. # SPDX-License-Identifier: Apache-2.0 import argparse import logging import time import json import boto3 from botocore.exceptions import ClientError logger = logging.getLogger(__name__) def check_dataset_status(rek_client, dataset_arn): """ Checks the current status of a dataset. :param rek_client: The Amazon Rekognition Custom Labels Boto3 client. :param dataset_arn: The dataset that you want to check. :return: The dataset status and status message. """ finished = False status = "" status_message = "" while finished is False: dataset = rek_client.describe_dataset(DatasetArn=dataset_arn) status = dataset['DatasetDescription']['Status'] status_message = dataset['DatasetDescription']['StatusMessage'] if status == "UPDATE_IN_PROGRESS": logger.info("Distributing dataset: %s ", dataset_arn) time.sleep(5) continue if status == "UPDATE_COMPLETE": logger.info( "Dataset distribution complete: %s : %s : %s", status, status_message, dataset_arn) finished = True continue if status == "UPDATE_FAILED": logger.exception( "Dataset distribution failed: %s : %s : %s", status, status_message, dataset_arn) finished = True break logger.exception( "Failed. Unexpected state for dataset distribution: %s : %s : %s", status, status_message, dataset_arn) finished = True status_message = "An unexpected error occurred while distributing the dataset" break return status, status_message def distribute_dataset_entries(rek_client, training_dataset_arn, test_dataset_arn): """ Distributes 20% of the supplied training dataset into the supplied test dataset. :param rek_client: The Amazon Rekognition Custom Labels Boto3 client. :param training_dataset_arn: The ARN of the training dataset that you distribute entries from. :param test_dataset_arn: The ARN of the test dataset that you distribute entries to. """ try: # List dataset labels. logger.info("Distributing training dataset entries (%s) into test dataset (%s).", training_dataset_arn,test_dataset_arn) datasets = json.loads( '[{"Arn" : "' + str(training_dataset_arn) + '"},{"Arn" : "' + str(test_dataset_arn) + '"}]') rek_client.distribute_dataset_entries( Datasets=datasets ) training_dataset_status, training_dataset_status_message = check_dataset_status( rek_client, training_dataset_arn) test_dataset_status, test_dataset_status_message = check_dataset_status( rek_client, test_dataset_arn) if training_dataset_status == 'UPDATE_COMPLETE' and test_dataset_status == "UPDATE_COMPLETE": print("Distribution complete") else: print("Distribution failed:") print( f"\ttraining dataset: {training_dataset_status} : {training_dataset_status_message}") print( f"\ttest dataset: {test_dataset_status} : {test_dataset_status_message}") except ClientError as err: logger.exception( "Couldn't distribute dataset: %s",err.response['Error']['Message'] ) raise def add_arguments(parser): """ Adds command line arguments to the parser. :param parser: The command line parser. """ parser.add_argument( "training_dataset_arn", help="The ARN of the training dataset that you want to distribute from." ) parser.add_argument( "test_dataset_arn", help="The ARN of the test dataset that you want to distribute to." ) def main(): logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s") try: # Get command line arguments. parser = argparse.ArgumentParser(usage=argparse.SUPPRESS) add_arguments(parser) args = parser.parse_args() print( f"Distributing training dataset entries ({args.training_dataset_arn}) "\ f"into test dataset ({args.test_dataset_arn}).") # Distribute the datasets. session = boto3.Session(profile_name='custom-labels-access') rekognition_client = session.client("rekognition") distribute_dataset_entries(rekognition_client, args.training_dataset_arn, args.test_dataset_arn) print("Finished distributing datasets.") except ClientError as err: logger.exception("Problem distributing datasets: %s", err) print(f"Problem listing dataset labels: {err}") except Exception as err: logger.exception("Problem distributing datasets: %s", err) print(f"Problem distributing datasets: {err}") if __name__ == "__main__": main()
-
- Java V2
-
使用下列程式碼。提供以下命令行參數:
-
訓練資料集 — 您從中發佈項目之訓練資料集的 ARN。
-
test_dataset_arn — 您分發項目的測試資料集的 ARN。
/* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: Apache-2.0 */ package com.example.rekognition; import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider; import software.amazon.awssdk.regions.Region; import software.amazon.awssdk.services.rekognition.RekognitionClient; import software.amazon.awssdk.services.rekognition.model.DatasetDescription; import software.amazon.awssdk.services.rekognition.model.DatasetStatus; import software.amazon.awssdk.services.rekognition.model.DescribeDatasetRequest; import software.amazon.awssdk.services.rekognition.model.DescribeDatasetResponse; import software.amazon.awssdk.services.rekognition.model.DistributeDataset; import software.amazon.awssdk.services.rekognition.model.DistributeDatasetEntriesRequest; import software.amazon.awssdk.services.rekognition.model.RekognitionException; import java.util.ArrayList; import java.util.logging.Level; import java.util.logging.Logger; public class DistributeDatasetEntries { public static final Logger logger = Logger.getLogger(DistributeDatasetEntries.class.getName()); public static DatasetStatus checkDatasetStatus(RekognitionClient rekClient, String datasetArn) throws Exception, RekognitionException { boolean distributed = false; DatasetStatus status = null; // Wait until distribution completes do { DescribeDatasetRequest describeDatasetRequest = DescribeDatasetRequest.builder().datasetArn(datasetArn) .build(); DescribeDatasetResponse describeDatasetResponse = rekClient.describeDataset(describeDatasetRequest); DatasetDescription datasetDescription = describeDatasetResponse.datasetDescription(); status = datasetDescription.status(); logger.log(Level.INFO, " dataset ARN: {0} ", datasetArn); switch (status) { case UPDATE_COMPLETE: logger.log(Level.INFO, "Dataset updated"); distributed = true; break; case UPDATE_IN_PROGRESS: Thread.sleep(5000); break; case UPDATE_FAILED: String error = "Dataset distribution failed: " + datasetDescription.statusAsString() + " " + datasetDescription.statusMessage() + " " + datasetArn; logger.log(Level.SEVERE, error); break; default: String unexpectedError = "Unexpected distribution state: " + datasetDescription.statusAsString() + " " + datasetDescription.statusMessage() + " " + datasetArn; logger.log(Level.SEVERE, unexpectedError); } } while (distributed == false); return status; } public static void distributeMyDatasetEntries(RekognitionClient rekClient, String trainingDatasetArn, String testDatasetArn) throws Exception, RekognitionException { try { logger.log(Level.INFO, "Distributing {0} dataset to {1} ", new Object[] { trainingDatasetArn, testDatasetArn }); DistributeDataset distributeTrainingDataset = DistributeDataset.builder().arn(trainingDatasetArn).build(); DistributeDataset distributeTestDataset = DistributeDataset.builder().arn(testDatasetArn).build(); ArrayList<DistributeDataset> datasets = new ArrayList(); datasets.add(distributeTrainingDataset); datasets.add(distributeTestDataset); DistributeDatasetEntriesRequest distributeDatasetEntriesRequest = DistributeDatasetEntriesRequest.builder() .datasets(datasets).build(); rekClient.distributeDatasetEntries(distributeDatasetEntriesRequest); DatasetStatus trainingStatus = checkDatasetStatus(rekClient, trainingDatasetArn); DatasetStatus testStatus = checkDatasetStatus(rekClient, testDatasetArn); if (trainingStatus == DatasetStatus.UPDATE_COMPLETE && testStatus == DatasetStatus.UPDATE_COMPLETE) { logger.log(Level.INFO, "Successfully distributed dataset: {0}", trainingDatasetArn); } else { throw new Exception("Failed to distribute dataset: " + trainingDatasetArn); } } catch (RekognitionException e) { logger.log(Level.SEVERE, "Could not distribute dataset: {0}", e.getMessage()); throw e; } } public static void main(String[] args) { String trainingDatasetArn = null; String testDatasetArn = null; final String USAGE = "\n" + "Usage: " + "<training_dataset_arn> <test_dataset_arn>\n\n" + "Where:\n" + " training_dataset_arn - the ARN of the dataset that you want to distribute from.\n\n" + " test_dataset_arn - the ARN of the dataset that you want to distribute to.\n\n"; if (args.length != 2) { System.out.println(USAGE); System.exit(1); } trainingDatasetArn = args[0]; testDatasetArn = args[1]; try { // Get the Rekognition client. RekognitionClient rekClient = RekognitionClient.builder() .credentialsProvider(ProfileCredentialsProvider.create("custom-labels-access")) .region(Region.US_WEST_2) .build(); // Distribute the dataset distributeMyDatasetEntries(rekClient, trainingDatasetArn, testDatasetArn); System.out.println("Datasets distributed."); rekClient.close(); } catch (RekognitionException rekError) { logger.log(Level.SEVERE, "Rekognition client error: {0}", rekError.getMessage()); System.exit(1); } catch (Exception rekError) { logger.log(Level.SEVERE, "Error: {0}", rekError.getMessage()); System.exit(1); } } }
-