CreateDataset
Creates a dataset to upload training or test data for a model associated with a flywheel. For more information about datasets, see Flywheel overview in the Amazon Comprehend Developer Guide.
Request Syntax
{
   "ClientRequestToken": "string",
   "DatasetName": "string",
   "DatasetType": "string",
   "Description": "string",
   "FlywheelArn": "string",
   "InputDataConfig": {
      "AugmentedManifests": [
         {
            "AnnotationDataS3Uri": "string",
            "AttributeNames": [ "string" ],
            "DocumentType": "string",
            "S3Uri": "string",
            "SourceDocumentsS3Uri": "string"
         }
      ],
      "DataFormat": "string",
      "DocumentClassifierInputDataConfig": {
         "LabelDelimiter": "string",
         "S3Uri": "string"
      },
      "EntityRecognizerInputDataConfig": {
         "Annotations": {
            "S3Uri": "string"
         },
         "Documents": {
            "InputFormat": "string",
            "S3Uri": "string"
         },
         "EntityList": {
            "S3Uri": "string"
         }
      }
   },
   "Tags": [
      {
         "Key": "string",
         "Value": "string"
      }
   ]
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- ClientRequestToken
-
A unique identifier for the request. If you don't set the client request token, Amazon Comprehend generates one.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
^[a-zA-Z0-9-]+$
Required: No
- DatasetName
-
Name of the dataset.
Type: String
Length Constraints: Maximum length of 63.
Pattern:
^[a-zA-Z0-9](-*[a-zA-Z0-9])*$
Required: Yes
- DatasetType
-
The dataset type. You can specify that the data in a dataset is for training the model or for testing the model.
Type: String
Valid Values:
TRAIN | TEST
Required: No
- Description
-
Description of the dataset.
Type: String
Length Constraints: Maximum length of 2048.
Pattern:
^([a-zA-Z0-9_])[\\a-zA-Z0-9_@#%*+=:?./!\s-]*$
Required: No
- FlywheelArn
-
The Amazon Resource Name (ARN) of the flywheel to receive the data.
Type: String
Length Constraints: Maximum length of 256.
Pattern:
arn:aws(-[^:]+)?:comprehend:[a-zA-Z0-9-]*:[0-9]{12}:flywheel/[a-zA-Z0-9](-*[a-zA-Z0-9])*
Required: Yes
- InputDataConfig
-
Information about the input data configuration. The type of input data varies based on the format of the input and whether the data is for a classifier model or an entity recognition model.
Type: DatasetInputDataConfig object
Required: Yes
- Tags
-
Tags for the dataset.
Type: Array of Tag objects
Required: No
Response Syntax
{
   "DatasetArn": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- DatasetArn
-
The ARN of the dataset.
Type: String
Length Constraints: Maximum length of 256.
Pattern:
arn:aws(-[^:]+)?:comprehend:[a-zA-Z0-9-]*:[0-9]{12}:flywheel/[a-zA-Z0-9](-*[a-zA-Z0-9])*/dataset/[a-zA-Z0-9](-*[a-zA-Z0-9])*
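Because the dataset ARN embeds both the flywheel name and the dataset name, the pattern above can be used to split a returned `DatasetArn` into its parts. The sketch below is illustrative only: `parse_dataset_arn` and the keys of the dictionary it returns are hypothetical names, and the regular expression is the documented pattern with named groups added.

```python
import re

# The documented DatasetArn pattern, with named groups added for extraction.
DATASET_ARN_RE = re.compile(
    r"arn:aws(?:-[^:]+)?:comprehend:(?P<region>[a-zA-Z0-9-]*):"
    r"(?P<account_id>[0-9]{12}):"
    r"flywheel/(?P<flywheel_name>[a-zA-Z0-9](?:-*[a-zA-Z0-9])*)"
    r"/dataset/(?P<dataset_name>[a-zA-Z0-9](?:-*[a-zA-Z0-9])*)"
)


def parse_dataset_arn(arn):
    """Hypothetical helper: split a dataset ARN into its named components."""
    match = DATASET_ARN_RE.fullmatch(arn)
    # The response documents a maximum length of 256 characters.
    if match is None or len(arn) > 256:
        raise ValueError(f"Not a valid dataset ARN: {arn!r}")
    return match.groupdict()
```

For instance, `parse_dataset_arn(response["DatasetArn"])["flywheel_name"]` would recover the flywheel that owns the new dataset, assuming `response` is the parsed JSON body of a successful call.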
Errors
For information about the errors that are common to all actions, see Common Errors.
- InternalServerException
-
An internal server error occurred. Retry your request.
HTTP Status Code: 500
- InvalidRequestException
-
The request is invalid.
HTTP Status Code: 400
- ResourceInUseException
-
The specified resource name is already in use. Use a different name and try your request again.
HTTP Status Code: 400
- ResourceLimitExceededException
-
The maximum number of resources per account has been exceeded. Review the resources, and then try your request again.
HTTP Status Code: 400
- ResourceNotFoundException
-
The specified resource ARN was not found. Check the ARN and try your request again.
HTTP Status Code: 400
- TooManyRequestsException
-
The number of requests exceeds the limit. Resubmit your request later.
HTTP Status Code: 400
- TooManyTagsException
-
The request contains more tags than can be associated with a resource (50 tags per resource). The maximum number of tags includes both existing tags and those included in your current request.
HTTP Status Code: 400
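Of the errors listed above, only InternalServerException and TooManyRequestsException are described as worth resubmitting unchanged; the others call for a different name, ARN, or resource count. One way to act on that distinction is sketched below. This is not an SDK feature: `call_with_retries`, `get_error_code`, and the backoff parameters are hypothetical, and the exponential backoff schedule is an illustrative choice, not a documented recommendation.

```python
import time

# Error names whose descriptions above say to retry or resubmit the request.
RETRYABLE_CODES = {"InternalServerException", "TooManyRequestsException"}


def call_with_retries(operation, get_error_code, max_attempts=4,
                      base_delay=0.5, sleep=time.sleep):
    """Hypothetical helper: run operation(), retrying only retryable errors.

    operation      -- zero-argument callable that performs the request
    get_error_code -- maps a raised exception to an error name (or None)
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            code = get_error_code(exc)
            # Give up on non-retryable errors and on the final attempt.
            if code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With the AWS SDK for Python, `get_error_code` could be written as `lambda e: getattr(e, "response", {}).get("Error", {}).get("Code")`, since botocore's `ClientError` carries the error name under `response["Error"]["Code"]`. The SDK also has its own built-in retry behavior, so a wrapper like this is only a sketch of the decision logic.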
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: