AWS::Personalize::Dataset
Creates an empty dataset and adds it to the specified dataset group. Use CreateDatasetImportJob to import your training data to a dataset.
There are 5 types of datasets:
- Item interactions
- Items
- Users
- Action interactions (you can't use CloudFormation to create an Action interactions dataset)
- Actions (you can't use CloudFormation to create an Actions dataset)
Each dataset type has an associated schema with required field types.
Only the Item interactions dataset is required to train a model (also referred to as creating a solution).
A dataset can be in one of the following states:
- CREATE PENDING > CREATE IN_PROGRESS > ACTIVE -or- CREATE FAILED
- DELETE PENDING > DELETE IN_PROGRESS
To get the status of the dataset, call DescribeDataset.
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{
  "Type" : "AWS::Personalize::Dataset",
  "Properties" : {
    "DatasetGroupArn" : String,
    "DatasetImportJob" : DatasetImportJob,
    "DatasetType" : String,
    "Name" : String,
    "SchemaArn" : String
  }
}
YAML
Type: AWS::Personalize::Dataset
Properties:
  DatasetGroupArn: String
  DatasetImportJob: DatasetImportJob
  DatasetType: String
  Name: String
  SchemaArn: String
Properties
DatasetGroupArn
The Amazon Resource Name (ARN) of the dataset group.
Required: Yes
Type: String
Pattern: arn:([a-z\d-]+):personalize:.*:.*:.+
Maximum: 256
Update requires: Replacement
DatasetImportJob
Describes a job that imports training data from a data source (Amazon S3 bucket) to an Amazon Personalize dataset. If you specify a dataset import job as part of a dataset, all dataset import job fields are required.
Required: No
Type: DatasetImportJob
Update requires: No interruption
DatasetType
One of the following values:
- Interactions
- Items
- Users
Note: You can't use CloudFormation to create an Action Interactions or Actions dataset.
Required: Yes
Type: String
Allowed values: Interactions | Items | Users
Maximum: 256
Update requires: Replacement
Name
The name of the dataset.
Required: Yes
Type: String
Pattern: ^[a-zA-Z0-9][a-zA-Z0-9\-_]*
Minimum: 1
Maximum: 63
Update requires: Replacement
SchemaArn
The ARN of the associated schema.
Required: Yes
Type: String
Pattern: arn:([a-z\d-]+):personalize:.*:.*:.+
Maximum: 256
Update requires: Replacement
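The constraints listed for these properties can also be checked at the template level, before CloudFormation calls Amazon Personalize. The following is a minimal sketch, not an official pattern: the parameter names (DatasetName, DatasetType, DatasetGroupArn, SchemaArn) and the logical ID MyDataset are hypothetical, and the patterns and length limits are copied from the property descriptions above.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  # Hypothetical parameter names; constraints mirror the documented property limits.
  DatasetName:
    Type: String
    MinLength: 1
    MaxLength: 63
    AllowedPattern: '^[a-zA-Z0-9][a-zA-Z0-9\-_]*'
    ConstraintDescription: Must start with a letter or digit and contain only letters, digits, hyphens, and underscores.
  DatasetType:
    Type: String
    # Action interactions and Actions datasets can't be created with CloudFormation.
    AllowedValues:
      - Interactions
      - Items
      - Users
  DatasetGroupArn:
    Type: String
    MaxLength: 256
    AllowedPattern: 'arn:([a-z\d-]+):personalize:.*:.*:.+'
  SchemaArn:
    Type: String
    MaxLength: 256
    AllowedPattern: 'arn:([a-z\d-]+):personalize:.*:.*:.+'
Resources:
  MyDataset:
    Type: AWS::Personalize::Dataset
    Properties:
      Name: !Ref DatasetName
      DatasetType: !Ref DatasetType
      DatasetGroupArn: !Ref DatasetGroupArn
      SchemaArn: !Ref SchemaArn
```

Because all four of these properties require replacement on update, changing any of the parameter values on an existing stack would replace the dataset rather than modify it in place.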
Return values
Ref
When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the name of the resource.
For more information about using the Ref function, see Ref.
Fn::GetAtt
The Fn::GetAtt intrinsic function returns a value for a specified attribute of this type. The following are the available attributes and sample return values.
For more information about using the Fn::GetAtt intrinsic function, see Fn::GetAtt.
DatasetArn
The Amazon Resource Name (ARN) of the dataset.
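As an illustration of the two return values, the following Outputs fragment exposes both from a stack. It is a sketch that assumes the dataset's logical ID is MyDataset; the output names DatasetName and DatasetArn are hypothetical.

```yaml
Outputs:
  # Hypothetical output names; MyDataset is an assumed logical ID.
  DatasetName:
    Description: 'Value returned by Ref (documented here as the name of the resource)'
    Value: !Ref MyDataset
  DatasetArn:
    Description: 'ARN of the dataset, returned by the DatasetArn attribute'
    Value: !GetAtt MyDataset.DatasetArn
```

The fragment is meant to sit at the top level of a template that also declares MyDataset under Resources.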
Examples
Creating a dataset
The following example creates an Amazon Personalize dataset and a dataset import job. The dataset import job imports data from an Amazon S3 bucket into the dataset.
JSON
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Resources": {
    "MyDataset": {
      "Type": "AWS::Personalize::Dataset",
      "Properties": {
        "Name": "my-dataset-name",
        "DatasetType": "Interactions",
        "DatasetGroupArn": "arn:aws:personalize:us-west-2:123456789012:dataset-group/dataset-group-name",
        "SchemaArn": "arn:aws:personalize:us-west-2:123456789012:schema/schema-name",
        "DatasetImportJob": {
          "JobName": "my-import-job-name",
          "DataSource": {
            "DataLocation": "s3://bucket-name/file-name.csv"
          },
          "RoleArn": "arn:aws:iam::123456789012:role/personalize-role"
        }
      }
    }
  }
}
YAML
AWSTemplateFormatVersion: 2010-09-09
Resources:
  MyDataset:
    Type: 'AWS::Personalize::Dataset'
    Properties:
      Name: my-dataset-name
      DatasetType: Interactions
      DatasetGroupArn: 'arn:aws:personalize:us-west-2:123456789012:dataset-group/dataset-group-name'
      SchemaArn: 'arn:aws:personalize:us-west-2:123456789012:schema/schema-name'
      DatasetImportJob:
        JobName: my-import-job-name
        DataSource:
          DataLocation: 's3://bucket-name/file-name.csv'
        RoleArn: 'arn:aws:iam::123456789012:role/personalize-role'
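The example above hardcodes the dataset group and schema ARNs. When all three resources live in one stack, the ARNs can instead be wired together with Fn::GetAtt. The following is a sketch under stated assumptions: the AWS::Personalize::DatasetGroup and AWS::Personalize::Schema resource types, their DatasetGroupArn and SchemaArn attributes, and the Avro schema fields (USER_ID, ITEM_ID, TIMESTAMP) are not described on this page, so verify them against their own reference pages. All logical IDs and names are hypothetical.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  # Assumed resource type; not documented on this page.
  MyDatasetGroup:
    Type: AWS::Personalize::DatasetGroup
    Properties:
      Name: my-dataset-group-name

  # Assumed resource type; the Schema property is taken to be an Avro schema as a JSON string.
  MySchema:
    Type: AWS::Personalize::Schema
    Properties:
      Name: my-schema-name
      Schema: >-
        {"type": "record", "name": "Interactions",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"}],
        "version": "1.0"}

  MyDataset:
    Type: AWS::Personalize::Dataset
    Properties:
      Name: my-dataset-name
      DatasetType: Interactions
      # GetAtt creates implicit dependencies, so the group and schema are created first.
      DatasetGroupArn: !GetAtt MyDatasetGroup.DatasetGroupArn
      SchemaArn: !GetAtt MySchema.SchemaArn
```

This sketch omits DatasetImportJob, which is optional. If it were added, all of its fields would be required, as noted in the property description.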