CreateDataset
Creates a new DataBrew dataset.
Request Syntax
POST /datasets HTTP/1.1
Content-type: application/json
{
   "Format": "string",
   "FormatOptions": {
      "Csv": {
         "Delimiter": "string",
         "HeaderRow": boolean
      },
      "Excel": {
         "HeaderRow": boolean,
         "SheetIndexes": [ number ],
         "SheetNames": [ "string" ]
      },
      "Json": {
         "MultiLine": boolean
      }
   },
   "Input": {
      "DatabaseInputDefinition": {
         "DatabaseTableName": "string",
         "GlueConnectionName": "string",
         "QueryString": "string",
         "TempDirectory": {
            "Bucket": "string",
            "BucketOwner": "string",
            "Key": "string"
         }
      },
      "DataCatalogInputDefinition": {
         "CatalogId": "string",
         "DatabaseName": "string",
         "TableName": "string",
         "TempDirectory": {
            "Bucket": "string",
            "BucketOwner": "string",
            "Key": "string"
         }
      },
      "Metadata": {
         "SourceArn": "string"
      },
      "S3InputDefinition": {
         "Bucket": "string",
         "BucketOwner": "string",
         "Key": "string"
      }
   },
   "Name": "string",
   "PathOptions": {
      "FilesLimit": {
         "MaxFiles": number,
         "Order": "string",
         "OrderedBy": "string"
      },
      "LastModifiedDateCondition": {
         "Expression": "string",
         "ValuesMap": {
            "string" : "string"
         }
      },
      "Parameters": {
         "string" : {
            "CreateColumn": boolean,
            "DatetimeOptions": {
               "Format": "string",
               "LocaleCode": "string",
               "TimezoneOffset": "string"
            },
            "Filter": {
               "Expression": "string",
               "ValuesMap": {
                  "string" : "string"
               }
            },
            "Name": "string",
            "Type": "string"
         }
      }
   },
   "Tags": {
      "string" : "string"
   }
}
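As an illustration of the request syntax, the following Python sketch assembles a small request body for a CSV file stored in Amazon S3 and serializes it to JSON. The helper name build_create_dataset_body, the dataset name, and the bucket and key values are hypothetical placeholders; only the field names come from the syntax above. The sketch does not sign or send the request.

```python
import json


def build_create_dataset_body(name, bucket, key, delimiter=",", header_row=True):
    """Assemble a CreateDataset request body for a CSV file in Amazon S3.

    Hypothetical helper for illustration; only Name and Input are required
    by the API, the remaining fields are optional.
    """
    return {
        "Name": name,
        "Format": "CSV",
        "FormatOptions": {
            "Csv": {"Delimiter": delimiter, "HeaderRow": header_row}
        },
        "Input": {
            "S3InputDefinition": {"Bucket": bucket, "Key": key}
        },
    }


# Placeholder values, not real resources.
body = build_create_dataset_body("sales-2024", "example-bucket", "raw/sales.csv")
payload = json.dumps(body, indent=3)

# Show the request line and header from the syntax section with the JSON body.
print("POST /datasets HTTP/1.1")
print("Content-type: application/json")
print(payload)
```

In practice an AWS SDK usually builds and signs this request for you; the sketch is only meant to show how the nested objects fit together.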
URI Request Parameters
The request does not use any URI parameters.
Request Body
The request accepts the following data in JSON format.
- Input
-
Represents information on how DataBrew can find data, in either the AWS Glue Data Catalog or Amazon S3.
Type: Input object
Required: Yes
- Name
-
The name of the dataset to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 255.
Required: Yes
- Format
-
The file format of a dataset that is created from an Amazon S3 file or folder.
Type: String
Valid Values: CSV | JSON | PARQUET | EXCEL | ORC
Required: No
- FormatOptions
-
Represents a set of options that define the structure of either comma-separated value (CSV), Excel, or JSON input.
Type: FormatOptions object
Required: No
- PathOptions
-
A set of options that defines how DataBrew interprets an Amazon S3 path of the dataset.
Type: PathOptions object
Required: No
- Tags
-
Metadata tags to apply to this dataset.
Type: String to string map
Map Entries: Maximum number of 200 items.
Key Length Constraints: Minimum length of 1. Maximum length of 128.
Value Length Constraints: Maximum length of 256.
Required: No
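The constraints listed above can be checked on the client before a request is sent. The following Python sketch mirrors only the documented rules for Name, Format, and Tags, and the two required fields. The function name validate_create_dataset_body is hypothetical, and the service may apply further validation that this sketch does not cover.

```python
import re

# Valid values and character rules copied from the Request Body descriptions.
VALID_FORMATS = {"CSV", "JSON", "PARQUET", "EXCEL", "ORC"}
NAME_PATTERN = re.compile(r"[A-Za-z0-9.\- ]{1,255}")


def validate_create_dataset_body(body):
    """Return a list of problems; an empty list means these checks passed.

    Hypothetical client-side helper. It does not replace service validation.
    """
    problems = []

    # Input and Name are the only required fields.
    for field in ("Input", "Name"):
        if field not in body:
            problems.append(f"{field} is required")

    name = body.get("Name")
    if name is not None and not (
        isinstance(name, str) and NAME_PATTERN.fullmatch(name)
    ):
        problems.append(
            "Name must be 1-255 characters from A-Z, a-z, 0-9, hyphen, period, space"
        )

    fmt = body.get("Format")
    if fmt is not None and fmt not in VALID_FORMATS:
        problems.append(f"Format must be one of {sorted(VALID_FORMATS)}")

    tags = body.get("Tags", {})
    if len(tags) > 200:
        problems.append("Tags allows at most 200 entries")
    for key, value in tags.items():
        if not 1 <= len(key) <= 128:
            problems.append(f"Tag key {key!r} must be 1-128 characters")
        if len(value) > 256:
            problems.append(f"Tag value for {key!r} exceeds 256 characters")

    return problems
```

For example, validate_create_dataset_body({"Name": "bad_name", "Input": {}}) reports a problem because the underscore is not among the valid characters for Name.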
Response Syntax
HTTP/1.1 200
Content-type: application/json
{
"Name": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- Name
-
The name of the dataset that you created.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 255.
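A minimal Python sketch of reading a successful response follows. It assumes the caller already has the HTTP status code and the raw JSON body as a string; the function name parse_create_dataset_response and the sample dataset name are hypothetical.

```python
import json


def parse_create_dataset_response(status_code, raw_body):
    """Return the created dataset's name from an HTTP 200 response body.

    Hypothetical helper: raises RuntimeError for any other status code.
    """
    if status_code != 200:
        raise RuntimeError(f"CreateDataset failed with HTTP {status_code}")
    return json.loads(raw_body)["Name"]


# Sample response body in the shape shown under Response Syntax.
created = parse_create_dataset_response(200, '{"Name": "sales-2024"}')
print(created)
```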
Errors
For information about the errors that are common to all actions, see Common Errors.
- AccessDeniedException
-
Access to the specified resource was denied.
HTTP Status Code: 403
- ConflictException
-
Updating or deleting a resource can cause an inconsistent state.
HTTP Status Code: 409
- ServiceQuotaExceededException
-
A service quota is exceeded.
HTTP Status Code: 402
- ValidationException
-
The input parameters for this request failed validation.
HTTP Status Code: 400
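The errors listed above can be kept in a small lookup table so that calling code can report them consistently. The following Python sketch uses only the names, status codes, and descriptions from this section; the names DOCUMENTED_ERRORS and describe_error are hypothetical, and how the error name is obtained from a failed call depends on the client being used.

```python
# Error name -> (HTTP status code, description), as listed in this section.
DOCUMENTED_ERRORS = {
    "AccessDeniedException": (
        403, "Access to the specified resource was denied."),
    "ConflictException": (
        409, "Updating or deleting a resource can cause an inconsistent state."),
    "ServiceQuotaExceededException": (
        402, "A service quota is exceeded."),
    "ValidationException": (
        400, "The input parameters for this request failed validation."),
}


def describe_error(error_code):
    """Return the documented status and description for an error name.

    Hypothetical helper: names not listed here get a status of None and a
    pointer to the common errors.
    """
    status, message = DOCUMENTED_ERRORS.get(
        error_code, (None, "Not listed for CreateDataset; see Common Errors.")
    )
    return {"code": error_code, "http_status": status, "message": message}
```

With the AWS SDK for Python, the error name is typically available as response["Error"]["Code"] on a caught botocore ClientError, which could then be passed to describe_error.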
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: