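Publishing metric attribution reports to Amazon S3
For all bulk data, if you provided an Amazon S3 bucket when you created your metric attribution, you can choose to publish metric reports to that bucket each time you create a dataset import job for interactions data.
To publish metrics to Amazon S3, you provide a path to your Amazon S3 bucket in your metric attribution. Then, when you create a dataset import job, you choose to publish reports to Amazon S3. When the job completes, you can find the metrics in your Amazon S3 bucket. Each time you publish metrics, Amazon Personalize creates a new file in your Amazon S3 bucket. The file name includes the import method and date, as follows: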
AggregatedAttributionMetrics-ImportMethod-Timestamp.csv
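As a rough illustration of that naming pattern, the following Python sketch splits a report object key into its import method and timestamp parts. The helper name `parse_report_name` is hypothetical, and the sketch assumes that the import method contains no hyphen; the exact format of the timestamp portion is not specified here, so it is returned as an uninterpreted string.

```python
from typing import NamedTuple

REPORT_PREFIX = "AggregatedAttributionMetrics"


class ReportName(NamedTuple):
    import_method: str
    timestamp: str  # kept as text; its exact format is not assumed


def parse_report_name(key: str) -> ReportName:
    """Split a report object key into import method and timestamp."""
    # Drop any S3 "folder" prefix and keep only the file name.
    filename = key.rsplit("/", 1)[-1]
    if not filename.endswith(".csv"):
        raise ValueError(f"not a CSV report: {filename}")
    stem = filename[: -len(".csv")]
    # Assumption: the import method itself contains no hyphen, so the
    # first two hyphens separate prefix, import method, and timestamp.
    prefix, import_method, timestamp = stem.split("-", 2)
    if prefix != REPORT_PREFIX:
        raise ValueError(f"unexpected file name prefix: {prefix}")
    return ReportName(import_method, timestamp)
```

Calling `parse_report_name` on a key that does not follow the pattern raises `ValueError`, which makes it easy to skip unrelated objects in the same bucket.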
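The following example shows how the first few rows of a metric report CSV file might appear. The metric in this example reports total watch events (COUNTWATCHES) over 15-minute intervals for two different recommenders. Each recommender is identified by its Amazon Resource Name (ARN) in the EVENT_ATTRIBUTION_SOURCE column.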
METRIC_NAME,EVENT_TYPE,VALUE,MATH_FUNCTION,EVENT_ATTRIBUTION_SOURCE,TIMESTAMP
COUNTWATCHES,WATCH,12.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666925124
COUNTWATCHES,WATCH,112.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender2Name,1666924224
COUNTWATCHES,WATCH,10.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666924224
COUNTWATCHES,WATCH,254.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender2Name,1666922424
COUNTWATCHES,WATCH,112.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666922424
COUNTWATCHES,WATCH,100.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender2Name,1666922424
...
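Once a report has been downloaded, it can be summarized with a few lines of Python. The following is a minimal sketch, not an official tool: the function name `total_by_source` is hypothetical, and the code assumes that the file uses exactly the column headers shown in the sample above. It sums VALUE for each metric and attribution source across all time intervals.

```python
import csv
import io
from collections import defaultdict


def total_by_source(report_csv: str) -> dict:
    """Sum VALUE per (METRIC_NAME, EVENT_ATTRIBUTION_SOURCE) pair."""
    totals = defaultdict(float)
    reader = csv.DictReader(io.StringIO(report_csv))
    for row in reader:
        key = (row["METRIC_NAME"], row["EVENT_ATTRIBUTION_SOURCE"])
        totals[key] += float(row["VALUE"])
    return dict(totals)


# The first three data rows of the sample report shown above.
sample = """METRIC_NAME,EVENT_TYPE,VALUE,MATH_FUNCTION,EVENT_ATTRIBUTION_SOURCE,TIMESTAMP
COUNTWATCHES,WATCH,12.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666925124
COUNTWATCHES,WATCH,112.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender2Name,1666924224
COUNTWATCHES,WATCH,10.0,samplecount,arn:aws:personalize:us-west-2:acctNum:recommender/recommender1Name,1666924224
"""

for (metric, source), value in total_by_source(sample).items():
    print(metric, source, value)
```

Grouping by the ARN in EVENT_ATTRIBUTION_SOURCE is what lets you compare recommenders side by side; to keep the 15-minute granularity instead, the TIMESTAMP column could be added to the grouping key.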
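Publishing metrics for bulk data to Amazon S3 (console)
To publish metrics to an Amazon S3 bucket with the Amazon Personalize console, create a dataset import job and, under Publish event metrics to S3, choose Publish metrics for this import job.
For step-by-step instructions, see Creating a dataset import job (console).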
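Publishing metrics for bulk data to Amazon S3 (AWS CLI)
To publish metrics to an Amazon S3 bucket with the AWS Command Line Interface (AWS CLI), use the following code to create a dataset import job and provide the publishAttributionMetricsToS3 flag (--publish-attribution-metrics-to-s3 on the command line). If you don't want to publish metrics for a particular job, omit the flag. For information about each parameter, see CreateDatasetImportJob.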
aws personalize create-dataset-import-job \
  --job-name dataset import job name \
  --dataset-arn dataset arn \
  --data-source dataLocation=s3://amzn-s3-demo-bucket/filename \
  --role-arn roleArn \
  --import-mode INCREMENTAL \
  --publish-attribution-metrics-to-s3
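Publishing metrics for bulk data to Amazon S3 (AWS SDKs)
To publish metrics to an Amazon S3 bucket with the AWS SDKs, create a dataset import job and set publishAttributionMetricsToS3 to true. For information about each parameter, see CreateDatasetImportJob.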
- SDK for Python (Boto3)
import boto3

personalize = boto3.client('personalize')

# Create the import job and ask Amazon Personalize to publish metrics to Amazon S3.
response = personalize.create_dataset_import_job(
    jobName = 'YourImportJob',
    datasetArn = 'dataset_arn',
    dataSource = {'dataLocation':'s3://amzn-s3-demo-bucket/file.csv'},
    roleArn = 'role_arn',
    importMode = 'INCREMENTAL',
    publishAttributionMetricsToS3 = True
)

dsij_arn = response['datasetImportJobArn']
print('Dataset Import Job arn: ' + dsij_arn)

# Look up the job that was just created and print its current status.
description = personalize.describe_dataset_import_job(
    datasetImportJobArn = dsij_arn)['datasetImportJob']

print('Name: ' + description['jobName'])
print('ARN: ' + description['datasetImportJobArn'])
print('Status: ' + description['status'])
- SDK for Java 2.x
// Imports needed by this method (place them at the top of the enclosing class file).
import java.time.Instant;
import software.amazon.awssdk.services.personalize.PersonalizeClient;
import software.amazon.awssdk.services.personalize.model.*;

public static String createPersonalizeDatasetImportJob(PersonalizeClient personalizeClient,
                                                       String jobName,
                                                       String datasetArn,
                                                       String s3BucketPath,
                                                       String roleArn,
                                                       ImportMode importMode,
                                                       boolean publishToS3) {

    long waitInMilliseconds = 60 * 1000;
    String status;
    String datasetImportJobArn;

    try {
        DataSource importDataSource = DataSource.builder()
                .dataLocation(s3BucketPath)
                .build();

        CreateDatasetImportJobRequest createDatasetImportJobRequest = CreateDatasetImportJobRequest.builder()
                .datasetArn(datasetArn)
                .dataSource(importDataSource)
                .jobName(jobName)
                .roleArn(roleArn)
                .importMode(importMode)
                .publishAttributionMetricsToS3(publishToS3)
                .build();

        datasetImportJobArn = personalizeClient.createDatasetImportJob(createDatasetImportJobRequest)
                .datasetImportJobArn();

        DescribeDatasetImportJobRequest describeDatasetImportJobRequest = DescribeDatasetImportJobRequest.builder()
                .datasetImportJobArn(datasetImportJobArn)
                .build();

        // Poll once a minute, for up to three hours, until the job succeeds or fails.
        long maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60;

        while (Instant.now().getEpochSecond() < maxTime) {

            DatasetImportJob datasetImportJob = personalizeClient
                    .describeDatasetImportJob(describeDatasetImportJobRequest)
                    .datasetImportJob();

            status = datasetImportJob.status();
            System.out.println("Dataset import job status: " + status);

            if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) {
                break;
            }
            try {
                Thread.sleep(waitInMilliseconds);
            } catch (InterruptedException e) {
                System.out.println(e.getMessage());
            }
        }
        return datasetImportJobArn;

    } catch (PersonalizeException e) {
        System.out.println(e.awsErrorDetails().errorMessage());
    }
    return "";
}
- SDK for JavaScript v3
// Get service clients and commands using ES6 syntax.
import { CreateDatasetImportJobCommand, PersonalizeClient } from "@aws-sdk/client-personalize";

// create personalizeClient
const personalizeClient = new PersonalizeClient({
  region: "REGION"
});

// Set the dataset import job parameters.
export const datasetImportJobParam = {
  datasetArn: 'DATASET_ARN', /* required */
  dataSource: {
    dataLocation: 's3://amzn-s3-demo-bucket/<folderName>/<CSVfilename>.csv' /* required */
  },
  jobName: 'NAME', /* required */
  roleArn: 'ROLE_ARN', /* required */
  importMode: "FULL", /* optional, default is FULL */
  publishAttributionMetricsToS3: true /* set to true to publish metrics to Amazon S3 bucket */
};

export const run = async () => {
  try {
    const response = await personalizeClient.send(new CreateDatasetImportJobCommand(datasetImportJobParam));
    console.log("Success", response);
    return response; // For unit tests.
  } catch (err) {
    console.log("Error", err);
  }
};
run();
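Because the metrics file appears in the bucket only after the import job completes, it can be useful to wait for the job before looking for the report. The Java example above does this with a polling loop; the following is a minimal Python sketch of the same idea. The function name `wait_for_import_job` and its `sleep` parameter are hypothetical conveniences (the latter lets you substitute a no-op in tests); the client is expected to be a Boto3 Amazon Personalize client, and the terminal statuses are the ones checked in the Java example.

```python
import time


def wait_for_import_job(personalize, job_arn,
                        poll_seconds=60,
                        max_seconds=3 * 60 * 60,
                        sleep=time.sleep):
    """Poll a dataset import job until it finishes or the time limit passes.

    Returns the last status seen, for example "ACTIVE" or "CREATE FAILED".
    """
    deadline = time.monotonic() + max_seconds
    while True:
        status = personalize.describe_dataset_import_job(
            datasetImportJobArn=job_arn
        )["datasetImportJob"]["status"]
        # Same terminal statuses that the Java example checks for.
        if status in ("ACTIVE", "CREATE FAILED"):
            return status
        # Give up after the time limit and report the last status seen.
        if time.monotonic() >= deadline:
            return status
        sleep(poll_seconds)
```

With the Python example earlier in this section, this could be called as `wait_for_import_job(personalize, dsij_arn)`. A returned status of "ACTIVE" indicates that the import succeeded, at which point the report should be available in the bucket configured on the metric attribution.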