向您的数据集中添加图像 - Amazon Lookout for Vision

终止支持通知:2025年10月31日, AWS 将停止对亚马逊 Lookout for Vision 的支持。2025 年 10 月 31 日之后,你将无法再访问 Lookout for Vision 主机或 Lookout for Vision 资源。如需更多信息,请访问此博客文章

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

向您的数据集中添加图像

创建数据集后,您可能会希望向数据集中添加更多图像。例如,如果模型评估表明模型性能不佳,则可以通过添加更多图像来提高模型的质量。如果您创建了测试数据集,添加更多图像可以提高模型性能指标的准确度。

更新数据集后重新训练您的模型。

添加更多图像

您可以从本地计算机上传图像,从而向您的数据集中添加更多图像。要使用添加更多带标签的图像SDK,请使用UpdateDatasetEntries操作。

向数据集中添加更多图像(控制台)
  1. 选择操作,然后选择要向其添加图像的数据集。

  2. 选择要上传到数据集的图像。可以从本地计算机拖动或选择要上传的图像。一次最多可以上传 30 张图像。

  3. 选择上传图像

  4. 选择 Save changes(保存更改)

添加完更多图像后,您需要标注它们,以便它们可用来训练模型。有关更多信息,请参阅 图像分类(控制台)

添加更多图片 (SDK)

要使用添加更多带标签的图像SDK,请使用UpdateDatasetEntries操作。您需要提供一个清单文件,其中包含要添加的图像。您还可以通过在清单文件中该JSON行的source-ref字段中指定图像来更新现有图像。有关更多信息,请参阅 创建清单文件

向数据集添加更多图像 (SDK)
  1. 如果您尚未这样做,请安装并配置 AWS CLI 和 AWS SDKs。有关更多信息,请参阅 第 4 步:设置 AWS CLI 和 AWS SDKs

  2. 使用以下示例代码向数据集中添加更多图像。

    CLI

    更改以下值:

    • project-name 更改为包含要更新的数据集的项目的名称。

    • dataset-type 更改为要更新的数据集类型(traintest)。

    • changes 更改为包含数据集更新的清单文件的位置。

    aws lookoutvision update-dataset-entries\ --project-name project\ --dataset-type train or test\ --changes fileb://manifest file \ --profile lookoutvision-access
    Python

    此代码取自 AWS 文档SDK示例 GitHub 存储库。请在此处查看完整示例。

    @staticmethod def update_dataset_entries(lookoutvision_client, project_name, dataset_type, updates_file): """ Adds dataset entries to an Amazon Lookout for Vision dataset. :param lookoutvision_client: The Amazon Rekognition Custom Labels Boto3 client. :param project_name: The project that contains the dataset that you want to update. :param dataset_type: The type of the dataset that you want to update (train or test). :param updates_file: The manifest file of JSON Lines that contains the updates. """ try: status = "" status_message = "" manifest_file = "" # Update dataset entries logger.info(f"""Updating {dataset_type} dataset for project {project_name} with entries from {updates_file}.""") with open(updates_file) as f: manifest_file = f.read() lookoutvision_client.update_dataset_entries( ProjectName=project_name, DatasetType=dataset_type, Changes=manifest_file, ) finished = False while finished == False: dataset = lookoutvision_client.describe_dataset(ProjectName=project_name, DatasetType=dataset_type) status = dataset['DatasetDescription']['Status'] status_message = dataset['DatasetDescription']['StatusMessage'] if status == "UPDATE_IN_PROGRESS": logger.info( (f"Updating {dataset_type} dataset for project {project_name}.")) time.sleep(5) continue if status == "UPDATE_FAILED_ROLLBACK_IN_PROGRESS": logger.info( (f"Update failed, rolling back {dataset_type} dataset for project {project_name}.")) time.sleep(5) continue if status == "UPDATE_COMPLETE": logger.info( f"Dataset updated: {status} : {status_message} : {dataset_type} dataset for project {project_name}.") finished = True continue if status == "UPDATE_FAILED_ROLLBACK_COMPLETE": logger.info( f"Rollback complated after update failure: {status} : {status_message} : {dataset_type} dataset for project {project_name}.") finished = True continue logger.exception( f"Failed. Unexpected state for dataset update: {status} : {status_message} : {dataset_type} dataset for project {project_name}.") raise Exception( f"Failed. Unexpected state for dataset update: {status} : {status_message} :{dataset_type} dataset for project {project_name}.") logger.info(f"Added entries to dataset.") return status, status_message except ClientError as err: logger.exception( f"Couldn't update dataset: {err.response['Error']['Message']}") raise
    Java V2

    此代码取自 AWS 文档SDK示例 GitHub 存储库。请在此处查看完整示例。

    /** * Updates an Amazon Lookout for Vision dataset from a manifest file. * Returns after Lookout for Vision updates the dataset. * * @param lfvClient An Amazon Lookout for Vision client. * @param projectName The name of the project in which you want to update a * dataset. * @param datasetType The type of the dataset that you want to update (train or * test). * @param manifestFile The name and location of a local manifest file that you want to * use to update the dataset. * @return DatasetStatus The status of the updated dataset. */ public static DatasetStatus updateDatasetEntries(LookoutVisionClient lfvClient, String projectName, String datasetType, String updateFile) throws FileNotFoundException, LookoutVisionException, InterruptedException { logger.log(Level.INFO, "Updating {0} dataset for project {1}", new Object[] { datasetType, projectName }); InputStream sourceStream = new FileInputStream(updateFile); SdkBytes sourceBytes = SdkBytes.fromInputStream(sourceStream); UpdateDatasetEntriesRequest updateDatasetEntriesRequest = UpdateDatasetEntriesRequest.builder() .projectName(projectName) .datasetType(datasetType) .changes(sourceBytes) .build(); lfvClient.updateDatasetEntries(updateDatasetEntriesRequest); boolean finished = false; DatasetStatus status = null; // Wait until update completes. do { DescribeDatasetRequest describeDatasetRequest = DescribeDatasetRequest.builder() .projectName(projectName) .datasetType(datasetType) .build(); DescribeDatasetResponse describeDatasetResponse = lfvClient .describeDataset(describeDatasetRequest); DatasetDescription datasetDescription = describeDatasetResponse.datasetDescription(); status = datasetDescription.status(); switch (status) { case UPDATE_COMPLETE: logger.log(Level.INFO, "{0} Dataset updated for project {1}.", new Object[] { datasetType, projectName }); finished = true; break; case UPDATE_IN_PROGRESS: logger.log(Level.INFO, "{0} Dataset update for project {1} in progress.", new Object[] { datasetType, projectName }); TimeUnit.SECONDS.sleep(5); break; case UPDATE_FAILED_ROLLBACK_IN_PROGRESS: logger.log(Level.SEVERE, "{0} Dataset update failed for project {1}. Rolling back", new Object[] { datasetType, projectName }); TimeUnit.SECONDS.sleep(5); break; case UPDATE_FAILED_ROLLBACK_COMPLETE: logger.log(Level.SEVERE, "{0} Dataset update failed for project {1}. Rollback completed.", new Object[] { datasetType, projectName }); finished = true; break; default: logger.log(Level.SEVERE, "{0} Dataset update failed for project {1}. Unexpected error returned.", new Object[] { datasetType, projectName }); finished = true; } } while (!finished); return status; }
  3. 重复上一步并提供其他数据集类型的值。