Preparing input data for batch recommendations - Amazon Personalize


Preparing input data for batch recommendations

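A batch inference job imports your batch input JSON data from an Amazon S3 bucket, uses your custom solution version to generate recommendations, and then exports the item recommendations to an Amazon S3 bucket. Before you can get batch recommendations, you must prepare your JSON files and upload them to an Amazon S3 bucket. We recommend that you create an output folder in your Amazon S3 bucket or use a separate output Amazon S3 bucket. You can then run multiple batch inference jobs against the same input data location.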
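As a rough illustration of this workflow, the sketch below assembles the parameters for the boto3 `create_batch_inference_job` call and rejects an output location that is the same folder as the input file. The helper name `build_job_request`, the bucket and folder names, the ARNs, and the `num_results` value are placeholders invented for this example, not values from this guide.

```python
def build_job_request(job_name, solution_version_arn, role_arn,
                      input_path, output_path, num_results):
    """Assemble keyword arguments for create_batch_inference_job.

    input_path is assumed to be the S3 URI of one input JSON file;
    output_path is assumed to be the S3 URI of a folder.
    """
    # Folder that contains the input file, with a trailing slash.
    input_folder = input_path.rsplit("/", 1)[0] + "/"
    if output_path.rstrip("/") + "/" == input_folder:
        raise ValueError("use an output location separate from the input folder")
    return {
        "jobName": job_name,
        "solutionVersionArn": solution_version_arn,
        "roleArn": role_arn,
        "numResults": num_results,
        "jobInput": {"s3DataSource": {"path": input_path}},
        "jobOutput": {"s3DataDestination": {"path": output_path}},
    }


# Hypothetical names throughout; replace them with your own resources.
request = build_job_request(
    job_name="example-batch-job",
    solution_version_arn="arn:aws:personalize:region:account-id:solution/example/version-id",
    role_arn="arn:aws:iam::account-id:role/ExamplePersonalizeRole",
    input_path="s3://example-bucket/input/users.json",
    output_path="s3://example-bucket/output/",
    num_results=25,
)
print(request["jobOutput"]["s3DataDestination"]["path"])

# With AWS credentials configured, the job could then be started with:
#   import boto3
#   boto3.client("personalize").create_batch_inference_job(**request)
```

Keeping the request as a plain dictionary makes it easy to reuse the same input path with different output folders across several jobs.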

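If you use a filter with placeholder parameters, such as $GENRE, you must provide values for those parameters in a filterValues object in your input JSON. For more information, see Providing filter values in your input JSON.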
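A minimal sketch of attaching a filterValues object to an input record follows. It assumes the encoding described in the filter values topic, where each parameter maps to a single string of comma-separated, double-quoted values; the helper name `with_filter_values` and the genre values are made up for illustration.

```python
import json


def with_filter_values(record, **params):
    """Return a copy of an input record with a filterValues object added.

    Each keyword argument is a placeholder name (without the leading $)
    mapped to a list of string values. Assumption: values are encoded as
    one comma-separated string of double-quoted values.
    """
    encoded = {
        name: ",".join(json.dumps(value) for value in values)
        for name, values in params.items()
    }
    return {**record, "filterValues": encoded}


# One input line for a filter that uses the $GENRE placeholder.
line = json.dumps(with_filter_values({"userId": "4638"}, GENRE=["drama", "horror"]))
print(line)
```

Serializing with `json.dumps` handles the escaping of the inner double quotes, which is easy to get wrong when the line is written by hand.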

To prepare and import data
  1. Format your batch input data according to your recipe. You can't get batch recommendations with the Trending-Now recipe.

    • For USER_PERSONALIZATION recipes and the Popularity-Count recipe, your input data is a JSON file that contains a list of userIds

    • For RELATED_ITEMS recipes, your input data is a list of itemIds

    • For PERSONALIZED_RANKING recipes, your input data is a list of userIds, each paired with a collection of itemIds

    Put each record on its own line. For examples of input data, see Batch inference job input and output JSON examples.

  2. Upload your input JSON to an input folder in your Amazon S3 bucket. For more information, see Uploading files and folders by using drag and drop in the Amazon Simple Storage Service User Guide.

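  3. Create a separate location for your output data, either a folder or a different Amazon S3 bucket. With a separate location for the output JSON, you can run multiple batch inference jobs against the same input data location.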

  4. Create a batch inference job. Amazon Personalize writes the recommendations from your solution version to the output data location.
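Step 1 above can be sketched in Python as follows. This is one possible way to produce the line-delimited input, not the only one; the function names are invented for this example, and the IDs are taken from the samples later on this page.

```python
import json


def user_lines(user_ids):
    """One record per user, for USER_PERSONALIZATION and Popularity-Count."""
    return [json.dumps({"userId": str(u)}) for u in user_ids]


def item_lines(item_ids):
    """One record per item, for RELATED_ITEMS recipes."""
    return [json.dumps({"itemId": str(i)}) for i in item_ids]


def ranking_lines(user_to_items):
    """One record per user with the items to rank, for PERSONALIZED_RANKING."""
    return [
        json.dumps({"userId": str(u), "itemList": [str(i) for i in items]})
        for u, items in user_to_items.items()
    ]


def join_lines(lines):
    """Join records with newlines so each record sits on its own line."""
    return "\n".join(lines) + "\n"


payload = join_lines(user_lines(["4638", "663", "3384"]))
print(payload, end="")

# The payload could then be written to a local file and uploaded, e.g.:
#   with open("users.json", "w", encoding="utf-8") as f:
#       f.write(payload)
```

IDs are converted to strings because every ID in the examples on this page is a JSON string, even when it looks numeric.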

Batch inference job input and output JSON examples

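How you format your input data depends on the recipe you use. If you use a filter with placeholder parameters, such as $GENRE, you must provide values for those parameters in a filterValues object in your input JSON. For more information, see Providing filter values in your input JSON.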

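The following sections list correctly formatted JSON input and output examples for batch inference jobs. You can't get batch recommendations with the Trending-Now recipe.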

USER_PERSONALIZATION recipes

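The following shows correctly formatted JSON input and output examples for USER_PERSONALIZATION recipes. If you use User-Personalization-v2, each recommended item also includes a list of reasons why the item was included in the recommendations. This list can be empty. For information about the possible reasons, see Recommendation reasons with User-Personalization-v2.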

Input

Separate each userId with a new line as follows.

{"userId": "4638"}
{"userId": "663"}
{"userId": "3384"}
...
Output
{"input":{"userId":"4638"},"output":{"recommendedItems":["63992","115149","110102","148626","148888","31685","102445","69526","92535","143355","62374","7451","56171","122882","66097","91542","142488","139385","40583","71530","39292","111360","34048","47099","135137"],"scores":[0.0152238,0.0069081,0.0068222,0.006394,0.0059746,0.0055851,0.0049357,0.0044644,0.0042968,0.004015,0.0038805,0.0037476,0.0036563,0.0036178,0.00341,0.0033467,0.0033258,0.0032454,0.0032076,0.0031996,0.0029558,0.0029021,0.0029007,0.0028837,0.0028316]},"error":null}
{"input":{"userId":"663"},"output":{"recommendedItems":["368","377","25","780","1610","648","1270","6","165","1196","1097","300","1183","608","104","474","736","293","141","2987","1265","2716","223","733","2028"],"scores":[0.0406197,0.0372557,0.0254077,0.0151975,0.014991,0.0127175,0.0124547,0.0116712,0.0091098,0.0085492,0.0079035,0.0078995,0.0075598,0.0074876,0.0072006,0.0071775,0.0068923,0.0066552,0.0066232,0.0062504,0.0062386,0.0061121,0.0060942,0.0060781,0.0059263]},"error":null}
{"input":{"userId":"3384"},"output":{"recommendedItems":["597","21","223","2144","208","2424","594","595","920","104","520","367","2081","39","1035","2054","160","1370","48","1092","158","2671","500","474","1907"],"scores":[0.0241061,0.0119394,0.0118012,0.010662,0.0086972,0.0079428,0.0073218,0.0071438,0.0069602,0.0056961,0.0055999,0.005577,0.0054387,0.0051787,0.0051412,0.0050493,0.0047126,0.0045393,0.0042159,0.0042098,0.004205,0.0042029,0.0040778,0.0038897,0.0038809]},"error":null}
...
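To show how an output file of this shape might be consumed, here is a small sketch that pairs each recommended item with its score. The function name `parse_output_line` is invented for this example, the sample line is a shortened copy of the first record above, and treating any non-null "error" value as a failed record is an assumption, since this page only shows the null case.

```python
import json


def parse_output_line(line):
    """Return (user_id, [(item_id, score), ...], error) for one output line."""
    record = json.loads(line)
    user_id = record["input"]["userId"]
    error = record.get("error")
    if error is not None:
        # Assumption: a non-null error means no usable output for this user.
        return user_id, [], error
    output = record["output"]
    pairs = list(zip(output["recommendedItems"], output["scores"]))
    return user_id, pairs, None


# Shortened to three items for readability.
sample = (
    '{"input":{"userId":"4638"},"output":{"recommendedItems":'
    '["63992","115149","110102"],"scores":[0.0152238,0.0069081,0.0068222]},'
    '"error":null}'
)
user_id, pairs, error = parse_output_line(sample)
print(user_id, pairs[0], error)
```

When reading a whole output file, the same function can be applied to each non-empty line in turn.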

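The following shows correctly formatted JSON input and output examples for the Popularity-Count recipe. You can't get batch recommendations with the Trending-Now recipe.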

Input

Separate each userId with a new line as follows.

{"userId": "12"}
{"userId": "105"}
{"userId": "41"}
...
Output
{"input": {"userId": "12"}, "output": {"recommendedItems": ["105", "106", "441"]}}
{"input": {"userId": "105"}, "output": {"recommendedItems": ["105", "106", "441"]}}
{"input": {"userId": "41"}, "output": {"recommendedItems": ["105", "106", "441"]}}
...

PERSONALIZED_RANKING recipes

The following shows correctly formatted JSON input and output examples for PERSONALIZED_RANKING recipes.

Input

Separate each userId and its list of itemIds to be ranked with a new line as follows.

{"userId": "891", "itemList": ["27", "886", "101"]}
{"userId": "445", "itemList": ["527", "55", "901"]}
{"userId": "71", "itemList": ["29", "351", "199"]}
...
Output
{"input":{"userId":"891","itemList":["27","886","101"]},"output":{"recommendedItems":["27","101","886"],"scores":[0.48421,0.28133,0.23446]}}
{"input":{"userId":"445","itemList":["527","55","901"]},"output":{"recommendedItems":["901","527","55"],"scores":[0.46972,0.31011,0.22017]}}
{"input":{"userId":"71","itemList":["29","351","199"]},"output":{"recommendedItems":["351","29","199"],"scores":[0.68937,0.24829,0.06232]}}
...
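In the ranking examples above, each output record returns the same items that were submitted, reordered by descending score. The sketch below checks a record for that shape; it is a sanity check based only on these examples, not a documented guarantee, and the function name `looks_like_reranking` is invented here.

```python
import json


def looks_like_reranking(line):
    """Check that the ranked items are a reordering of the submitted itemList
    and that the scores are in descending order."""
    record = json.loads(line)
    requested = record["input"]["itemList"]
    ranked = record["output"]["recommendedItems"]
    scores = record["output"]["scores"]
    same_items = sorted(requested) == sorted(ranked)
    descending = all(a >= b for a, b in zip(scores, scores[1:]))
    return same_items and descending and len(ranked) == len(scores)


first = (
    '{"input":{"userId":"891","itemList":["27","886","101"]},'
    '"output":{"recommendedItems":["27","101","886"],'
    '"scores":[0.48421,0.28133,0.23446]}}'
)
print(looks_like_reranking(first))
```

A record that fails the check is worth inspecting by hand, for example when a filter may have removed items from the ranked list.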

The following shows correctly formatted JSON input and output examples for RELATED_ITEMS recipes.

Input

Separate each itemId with a new line as follows.

{"itemId": "105"}
{"itemId": "106"}
{"itemId": "441"}
...
Output
{"input": {"itemId": "105"}, "output": {"recommendedItems": ["106", "107", "49"]}}
{"input": {"itemId": "106"}, "output": {"recommendedItems": ["105", "107", "49"]}}
{"input": {"itemId": "441"}, "output": {"recommendedItems": ["2", "442", "435"]}}
...

The following shows correctly formatted JSON input and output examples for the Similar-Items recipe with themes.

Input

Separate each itemId with a new line as follows.

{"itemId": "40"}
{"itemId": "43"}
...
Output
{"input":{"itemId":"40"},"output":{"recommendedItems":["36","50","44","22","21","29","3","1","2","39"],"theme":"Movies with a strong female lead","itemsThemeRelevanceScores":[0.19994527,0.183059963,0.17478035,0.1618133,0.1574806,0.15468733,0.1499242,0.14353688,0.13531424,0.10291852]}}
{"input":{"itemId":"43"},"output":{"recommendedItems":["50","21","36","3","17","2","39","1","10","5"],"theme":"The best movies of 1995","itemsThemeRelevanceScores":[0.184988,0.1795761,0.11143453,0.0989443,0.08258403,0.07952615,0.07115086,0.0621634,-0.138913,-0.188913]}}
...
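The themed output pairs each recommended item with an entry in itemsThemeRelevanceScores, and the second record above includes negative scores. As a sketch of one way to use that, the code below keeps only items whose relevance score meets a cutoff. The function name `themed_items` and the cutoff of 0.0 are arbitrary choices for this example, and the sample line is a shortened copy of the second record, keeping only its last three items.

```python
import json


def themed_items(line, min_relevance=0.0):
    """Return (theme, [(item_id, relevance), ...]) for one themed output line,
    keeping items whose relevance score is at least min_relevance."""
    output = json.loads(line)["output"]
    pairs = zip(output["recommendedItems"], output["itemsThemeRelevanceScores"])
    kept = [(item, score) for item, score in pairs if score >= min_relevance]
    return output["theme"], kept


# Shortened to the last three items of the itemId 43 record.
sample = (
    '{"input":{"itemId":"43"},"output":{"recommendedItems":["1","10","5"],'
    '"theme":"The best movies of 1995",'
    '"itemsThemeRelevanceScores":[0.0621634,-0.138913,-0.188913]}}'
)
theme, kept = themed_items(sample)
print(theme, kept)
```

Whether a cutoff is appropriate, and what value to use, depends on how the theme is presented to users.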