将指标归因报告发布到 Amazon S3 - Amazon Personalize

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

将指标归因报告发布到 Amazon S3

对于所有批量数据,如果您在创建指标归因时提供了 Amazon S3 存储桶,则可以选择在每次为交互数据创建数据集导入作业时,将指标报告发布到您的 Amazon S3 存储桶。

要将指标发布到 Amazon S3,您需要在指标归因中提供指向 Amazon S3 存储桶的路径。然后,在创建数据集导入作业时,将报告发布到 Amazon S3。作业完成后,可以在 Amazon S3 存储桶中找到指标。每次发布指标时,Amazon Personalize 都会在您的 Amazon S3 存储桶中创建一个新文件。文件名包括导入方法和日期,如下所示:

AggregatedAttributionMetrics - ImportMethod - Timestamp.csv

以下是指标报告CSV文件前几行的显示示例。此示例中的指标报告两个不同推荐器在 15 分钟间隔内的总单击量。每个推荐者都通过其在 EVENT ATTRIBUTION _ SOURCE 列中的 Amazon 资源名称 (ARN) 进行标识。

METRIC_NAME,EVENT_TYPE,VALUE,MATH_FUNCTION,EVENT_ATTRIBUTION_SOURCE,TIMESTAMP COUNTWATCHES,WATCH,12.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666925124 COUNTWATCHES,WATCH,112.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender2Name,1666924224 COUNTWATCHES,WATCH,10.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666924224 COUNTWATCHES,WATCH,254.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender2Name,1666922424 COUNTWATCHES,WATCH,112.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666922424 COUNTWATCHES,WATCH,100.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender2Name,1666922424 ...... .....

将批量数据的指标发布到 Amazon S3(控制台)

要使用 Amazon Personalize 控制台将指标发布到 Amazon S3 存储桶,请创建数据集导入作业,然后在将事件指标发布到 S3 中选择发布此导入作业的指标

有关 step-by-step 说明,请参阅创建数据集导入任务(控制台)

将批量数据的指标发布到 Amazon S3 (AWS CLI)

要使用 AWS Command Line Interface (AWS CLI) 将指标发布到 Amazon S3 存储桶,请使用以下代码创建数据集导入任务并提供标publishAttributionMetricsToS3志。如果您不想发布特定作业的指标,请忽略标志。有关每个参数的信息,请参阅CreateDatasetImportJob

aws personalize create-dataset-import-job \ --job-name dataset import job name \ --dataset-arn dataset arn \ --data-source dataLocation=s3://amzn-s3-demo-bucket/filename \ --role-arn roleArn \ --import-mode INCREMENTAL \ --publish-attribution-metrics-to-s3

将批量数据的指标发布到 Amazon S3 (AWS SDKs)

要使用将指标发布到 Amazon S3 存储桶 AWS SDKs,请创建数据集导入任务并将其设置publishAttributionMetricsToS3为 true。有关每个参数的信息,请参阅CreateDatasetImportJob

SDK for Python (Boto3)
import boto3 personalize = boto3.client('personalize') response = personalize.create_dataset_import_job( jobName = 'YourImportJob', datasetArn = 'dataset_arn', dataSource = {'dataLocation':'s3://amzn-s3-demo-bucket/file.csv'}, roleArn = 'role_arn', importMode = 'INCREMENTAL', publishAttributionMetricsToS3 = True ) dsij_arn = response['datasetImportJobArn'] print ('Dataset Import Job arn: ' + dsij_arn) description = personalize.describe_dataset_import_job( datasetImportJobArn = dsij_arn)['datasetImportJob'] print('Name: ' + description['jobName']) print('ARN: ' + description['datasetImportJobArn']) print('Status: ' + description['status'])
SDK for Java 2.x
public static String createPersonalizeDatasetImportJob(PersonalizeClient personalizeClient, String jobName, String datasetArn, String s3BucketPath, String roleArn, ImportMode importMode, boolean publishToS3) { long waitInMilliseconds = 60 * 1000; String status; String datasetImportJobArn; try { DataSource importDataSource = DataSource.builder() .dataLocation(s3BucketPath) .build(); CreateDatasetImportJobRequest createDatasetImportJobRequest = CreateDatasetImportJobRequest.builder() .datasetArn(datasetArn) .dataSource(importDataSource) .jobName(jobName) .roleArn(roleArn) .importMode(importMode) .publishAttributionMetricsToS3(publishToS3) .build(); datasetImportJobArn = personalizeClient.createDatasetImportJob(createDatasetImportJobRequest) .datasetImportJobArn(); DescribeDatasetImportJobRequest describeDatasetImportJobRequest = DescribeDatasetImportJobRequest.builder() .datasetImportJobArn(datasetImportJobArn) .build(); long maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60; while (Instant.now().getEpochSecond() < maxTime) { DatasetImportJob datasetImportJob = personalizeClient .describeDatasetImportJob(describeDatasetImportJobRequest) .datasetImportJob(); status = datasetImportJob.status(); System.out.println("Dataset import job status: " + status); if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) { break; } try { Thread.sleep(waitInMilliseconds); } catch (InterruptedException e) { System.out.println(e.getMessage()); } } return datasetImportJobArn; } catch (PersonalizeException e) { System.out.println(e.awsErrorDetails().errorMessage()); } return ""; }
SDK for JavaScript v3
// Get service clients and commands using ES6 syntax. import { CreateDatasetImportJobCommand, PersonalizeClient } from "@aws-sdk/client-personalize"; // create personalizeClient const personalizeClient = new PersonalizeClient({ region: "REGION" }); // Set the dataset import job parameters. export const datasetImportJobParam = { datasetArn: 'DATASET_ARN', /* required */ dataSource: { dataLocation: 's3://amzn-s3-demo-bucket/<folderName>/<CSVfilename>.csv' /* required */ }, jobName: 'NAME', /* required */ roleArn: 'ROLE_ARN', /* required */ importMode: "FULL", /* optional, default is FULL */ publishAttributionMetricsToS3: true /* set to true to publish metrics to Amazon S3 bucket */ }; export const run = async () => { try { const response = await personalizeClient.send(new CreateDatasetImportJobCommand(datasetImportJobParam)); console.log("Success", response); return response; // For unit tests. } catch (err) { console.log("Error", err); } }; run();