StartDataQualityRulesetEvaluationRun
Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (AWS Glue table). The evaluation computes results which you can retrieve with the GetDataQualityResult API.
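The flow described above (start a run, then retrieve its results) can be sketched roughly as follows in Python. This is a minimal sketch, not a definitive implementation. It assumes a boto3-style Glue client whose method names mirror the API operations, and it assumes that run status and result IDs are polled through GetDataQualityRulesetEvaluationRun, which this page does not describe. The set of terminal statuses, the `run_and_collect` helper, and the polling interval are illustrative choices.

```python
import time

# Assumption: terminal statuses reported by GetDataQualityRulesetEvaluationRun.
# Verify this set against that operation's reference before relying on it.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def run_and_collect(glue, request, poll_seconds=30.0, sleep=time.sleep):
    """Start an evaluation run, wait for it to finish, and fetch its results.

    `glue` is any object exposing the three client methods used below,
    for example boto3.client("glue"). `request` is a dict shaped like the
    Request Syntax on this page.
    """
    run_id = glue.start_data_quality_ruleset_evaluation_run(**request)["RunId"]

    # Poll until the run reaches a terminal status.
    while True:
        run = glue.get_data_quality_ruleset_evaluation_run(RunId=run_id)
        if run["Status"] in TERMINAL_STATUSES:
            break
        sleep(poll_seconds)

    if run["Status"] != "SUCCEEDED":
        raise RuntimeError(f"Run {run_id} ended with status {run['Status']}")

    # Retrieve one result per result ID with GetDataQualityResult.
    return [
        glue.get_data_quality_result(ResultId=result_id)
        for result_id in run.get("ResultIds", [])
    ]
```

Passing the client in as an argument keeps the helper easy to exercise with a stub client before pointing it at a real account.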
Request Syntax
{
   "AdditionalDataSources": {
      "string" : {
         "GlueTable": {
            "AdditionalOptions": {
               "string" : "string"
            },
            "CatalogId": "string",
            "ConnectionName": "string",
            "DatabaseName": "string",
            "TableName": "string"
         }
      }
   },
   "AdditionalRunOptions": {
      "CloudWatchMetricsEnabled": boolean,
      "CompositeRuleEvaluationMethod": "string",
      "ResultsS3Prefix": "string"
   },
   "ClientToken": "string",
   "DataSource": {
      "GlueTable": {
         "AdditionalOptions": {
            "string" : "string"
         },
         "CatalogId": "string",
         "ConnectionName": "string",
         "DatabaseName": "string",
         "TableName": "string"
      }
   },
   "NumberOfWorkers": number,
   "Role": "string",
   "RulesetNames": [ "string" ],
   "Timeout": number
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- AdditionalDataSources
-
A map of reference strings to additional data sources you can specify for an evaluation run.
Type: String to DataSource object map
Key Length Constraints: Minimum length of 1. Maximum length of 255.
Key Pattern:
[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Required: No
- AdditionalRunOptions
-
Additional run options you can specify for an evaluation run.
Type: DataQualityEvaluationRunAdditionalRunOptions object
Required: No
- ClientToken
-
Used for idempotency. Set it to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 255.
Pattern:
[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Required: No
- DataSource
-
The data source (AWS Glue table) associated with this run.
Type: DataSource object
Required: Yes
- NumberOfWorkers
-
The number of G.1X workers to be used in the run. The default is 5.
Type: Integer
Required: No
- Role
-
An IAM role supplied to encrypt the results of the run.
Type: String
Required: Yes
- RulesetNames
-
A list of ruleset names.
Type: Array of strings
Array Members: Minimum number of 1 item. Maximum number of 10 items.
Length Constraints: Minimum length of 1. Maximum length of 255.
Pattern:
[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Required: Yes
- Timeout
-
The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
Type: Integer
Valid Range: Minimum value of 1.
Required: No
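The constraints listed above can be checked client-side before the request is sent. The following Python sketch assembles a request dict with the three required parameters and validates the documented limits (1 to 10 ruleset names, each 1 to 255 characters, and a timeout of at least 1 minute). The `build_request` helper and its argument names are hypothetical, and the sketch only covers a subset of the optional parameters.

```python
import uuid


def build_request(database, table, role, ruleset_names,
                  number_of_workers=None, timeout_minutes=None,
                  client_token=None):
    """Assemble a StartDataQualityRulesetEvaluationRun request as a dict."""
    if not role:
        raise ValueError("Role is required")
    if not 1 <= len(ruleset_names) <= 10:
        raise ValueError("RulesetNames must contain between 1 and 10 items")
    for name in ruleset_names:
        if not 1 <= len(name) <= 255:
            raise ValueError(f"Ruleset name length must be 1-255: {name!r}")
    if timeout_minutes is not None and timeout_minutes < 1:
        raise ValueError("Timeout must be at least 1 minute")

    request = {
        "DataSource": {
            "GlueTable": {"DatabaseName": database, "TableName": table}
        },
        "Role": role,
        "RulesetNames": list(ruleset_names),
        # A random UUID is generated when the caller supplies no token.
        "ClientToken": client_token or str(uuid.uuid4()),
    }
    # Optional parameters are omitted so the service defaults apply
    # (5 workers, 2,880 minutes).
    if number_of_workers is not None:
        request["NumberOfWorkers"] = number_of_workers
    if timeout_minutes is not None:
        request["Timeout"] = timeout_minutes
    return request
```

Because the token is generated once per built request, resending the same dict reuses the same ClientToken, which is what makes a repeated call idempotent.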
Response Syntax
{
"RunId": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- RunId
-
The unique run identifier associated with this run.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 255.
Pattern:
[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Errors
For information about the errors that are common to all actions, see Common Errors.
- ConflictException
-
The CreatePartitions API was called on a table that has indexes enabled.
HTTP Status Code: 400
- EntityNotFoundException
-
A specified entity does not exist.
HTTP Status Code: 400
- InternalServiceException
-
An internal service error occurred.
HTTP Status Code: 500
- InvalidInputException
-
The input provided was not valid.
HTTP Status Code: 400
- OperationTimeoutException
-
The operation timed out.
HTTP Status Code: 400
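One way a caller might handle the errors listed above is sketched below in Python. The status-code table is taken from this page. The retry policy is an assumption rather than documented guidance: it treats only InternalServiceException and OperationTimeoutException as transient. The sketch also assumes exceptions carry a botocore-style `response["Error"]["Code"]` field. The names `error_code` and `call_with_retries` are hypothetical.

```python
import time

# HTTP status codes as listed in the Errors section of this page.
ERROR_HTTP_STATUS = {
    "ConflictException": 400,
    "EntityNotFoundException": 400,
    "InternalServiceException": 500,
    "InvalidInputException": 400,
    "OperationTimeoutException": 400,
}

# Assumption: only these two errors are treated as transient and retried.
RETRYABLE = {"InternalServiceException", "OperationTimeoutException"}


def error_code(exc):
    """Return the service error code from a botocore-style exception, or None."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return None
    return response.get("Error", {}).get("Code")


def call_with_retries(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Invoke `call`, retrying transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Re-raise non-retryable errors, and the last failed attempt.
            if error_code(exc) not in RETRYABLE or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

When retrying a start request this way, sending the same ClientToken on every attempt helps ensure that a retry after a timeout does not start a second run.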
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: