CreateDataQualityJobDefinition - Amazon SageMaker

CreateDataQualityJobDefinition

Creates a definition for a job that monitors data quality and drift. For more information about Model Monitor, see Amazon SageMaker AI Model Monitor.

Request Syntax

{
   "DataQualityAppSpecification": {
      "ContainerArguments": [ "string" ],
      "ContainerEntrypoint": [ "string" ],
      "Environment": {
         "string" : "string"
      },
      "ImageUri": "string",
      "PostAnalyticsProcessorSourceUri": "string",
      "RecordPreprocessorSourceUri": "string"
   },
   "DataQualityBaselineConfig": {
      "BaseliningJobName": "string",
      "ConstraintsResource": {
         "S3Uri": "string"
      },
      "StatisticsResource": {
         "S3Uri": "string"
      }
   },
   "DataQualityJobInput": {
      "BatchTransformInput": {
         "DataCapturedDestinationS3Uri": "string",
         "DatasetFormat": {
            "Csv": {
               "Header": boolean
            },
            "Json": {
               "Line": boolean
            },
            "Parquet": {
            }
         },
         "EndTimeOffset": "string",
         "ExcludeFeaturesAttribute": "string",
         "FeaturesAttribute": "string",
         "InferenceAttribute": "string",
         "LocalPath": "string",
         "ProbabilityAttribute": "string",
         "ProbabilityThresholdAttribute": number,
         "S3DataDistributionType": "string",
         "S3InputMode": "string",
         "StartTimeOffset": "string"
      },
      "EndpointInput": {
         "EndpointName": "string",
         "EndTimeOffset": "string",
         "ExcludeFeaturesAttribute": "string",
         "FeaturesAttribute": "string",
         "InferenceAttribute": "string",
         "LocalPath": "string",
         "ProbabilityAttribute": "string",
         "ProbabilityThresholdAttribute": number,
         "S3DataDistributionType": "string",
         "S3InputMode": "string",
         "StartTimeOffset": "string"
      }
   },
   "DataQualityJobOutputConfig": {
      "KmsKeyId": "string",
      "MonitoringOutputs": [
         {
            "S3Output": {
               "LocalPath": "string",
               "S3UploadMode": "string",
               "S3Uri": "string"
            }
         }
      ]
   },
   "JobDefinitionName": "string",
   "JobResources": {
      "ClusterConfig": {
         "InstanceCount": number,
         "InstanceType": "string",
         "VolumeKmsKeyId": "string",
         "VolumeSizeInGB": number
      }
   },
   "NetworkConfig": {
      "EnableInterContainerTrafficEncryption": boolean,
      "EnableNetworkIsolation": boolean,
      "VpcConfig": {
         "SecurityGroupIds": [ "string" ],
         "Subnets": [ "string" ]
      }
   },
   "RoleArn": "string",
   "StoppingCondition": {
      "MaxRuntimeInSeconds": number
   },
   "Tags": [
      {
         "Key": "string",
         "Value": "string"
      }
   ]
}
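As an illustration of the request syntax above, the following Python sketch assembles a request containing only the top-level parameters this page marks as required, then passes it to the boto3 SageMaker client method create_data_quality_job_definition. The helper names build_data_quality_request and create_definition are hypothetical. The local paths, instance type, and volume size are placeholder assumptions. Which nested fields are mandatory is defined by the nested types rather than by this page, so the selection of nested fields shown here is also an assumption.

```python
def build_data_quality_request(job_name, role_arn, image_uri, endpoint_name, output_s3_uri):
    """Return a minimal request body using only the required top-level parameters."""
    return {
        "JobDefinitionName": job_name,
        "RoleArn": role_arn,
        "DataQualityAppSpecification": {"ImageUri": image_uri},
        "DataQualityJobInput": {
            "EndpointInput": {
                "EndpointName": endpoint_name,
                # Placeholder container path (assumption).
                "LocalPath": "/opt/ml/processing/input",
            }
        },
        "DataQualityJobOutputConfig": {
            "MonitoringOutputs": [
                {
                    "S3Output": {
                        "S3Uri": output_s3_uri,
                        # Placeholder container path (assumption).
                        "LocalPath": "/opt/ml/processing/output",
                    }
                }
            ]
        },
        "JobResources": {
            "ClusterConfig": {
                # Placeholder sizing values (assumption).
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",
                "VolumeSizeInGB": 20,
            }
        },
    }


def create_definition(request, client=None):
    """Send the request and return the JobDefinitionArn from the response."""
    if client is None:
        # Imported lazily so the request builder can be used without boto3 installed.
        import boto3
        client = boto3.client("sagemaker")
    response = client.create_data_quality_job_definition(**request)
    return response["JobDefinitionArn"]
```

Keeping the request builder separate from the call makes it possible to inspect or log the request body before anything is sent to the service.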

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

DataQualityAppSpecification

Specifies the container that runs the monitoring job.

Type: DataQualityAppSpecification object

Required: Yes

DataQualityBaselineConfig

Configures the constraints and baselines for the monitoring job.

Type: DataQualityBaselineConfig object

Required: No

DataQualityJobInput

The input for the monitoring job. As the request syntax shows, the DataQualityJobInput object can carry an EndpointInput or a BatchTransformInput.

Type: DataQualityJobInput object

Required: Yes

DataQualityJobOutputConfig

The output configuration for monitoring jobs.

Type: MonitoringOutputConfig object

Required: Yes

JobDefinitionName

The name for the monitoring job definition.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 63.

Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}$

Required: Yes
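A client-side check of the name constraints can catch an invalid name before the request is sent. The sketch below applies the documented pattern and length limits; the function name is_valid_job_definition_name is hypothetical. The length check is needed in addition to the pattern, because the pattern bounds the number of alphanumeric characters but places no bound on the hyphens between them.

```python
import re

# Pattern copied from the JobDefinitionName constraints.
JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}$")


def is_valid_job_definition_name(name: str) -> bool:
    """Check the documented length (1 to 63) and pattern constraints."""
    return 1 <= len(name) <= 63 and JOB_NAME_PATTERN.fullmatch(name) is not None
```

Under this pattern, a name must start and end with an alphanumeric character, and characters such as underscores are rejected.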

JobResources

Identifies the resources to deploy for a monitoring job.

Type: MonitoringResources object

Required: Yes

NetworkConfig

Specifies networking configuration for the monitoring job.

Type: MonitoringNetworkConfig object

Required: No

RoleArn

The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker AI can assume to perform tasks on your behalf.

Type: String

Length Constraints: Minimum length of 20. Maximum length of 2048.

Pattern: ^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$

Required: Yes
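The RoleArn constraints can be checked locally in the same way. This is a minimal sketch with a hypothetical function name, is_valid_role_arn. It verifies only the documented format; it does not confirm that the role exists or that SageMaker AI is able to assume it.

```python
import re

# Pattern copied from the RoleArn constraints.
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$")


def is_valid_role_arn(arn: str) -> bool:
    """Check the documented length (20 to 2048) and pattern constraints."""
    return 20 <= len(arn) <= 2048 and ROLE_ARN_PATTERN.fullmatch(arn) is not None
```

The pattern requires a 12-digit account ID and the role resource type, so a user ARN or a shortened account ID fails the check.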

StoppingCondition

A time limit for how long the monitoring job is allowed to run before stopping.

Type: MonitoringStoppingCondition object

Required: No

Tags

(Optional) An array of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.

Type: Array of Tag objects

Array Members: Minimum number of 0 items. Maximum number of 50 items.

Required: No
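Tags are often kept as a plain mapping in application code, while the request expects an array of Key/Value objects. The following sketch converts one form to the other and enforces the documented maximum of 50 items. The function name to_tag_list is hypothetical, and the sketch does not check any per-key or per-value constraints, which are defined by the Tag type.

```python
MAX_TAGS = 50  # Documented maximum number of array members.


def to_tag_list(tags: dict) -> list:
    """Convert a {key: value} mapping into the Tags array shape."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags are allowed, got {len(tags)}")
    return [{"Key": key, "Value": value} for key, value in tags.items()]
```

An empty mapping yields an empty list, which is within the documented minimum of 0 items.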

Response Syntax

{ "JobDefinitionArn": "string" }

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

JobDefinitionArn

The Amazon Resource Name (ARN) of the job definition.

Type: String

Length Constraints: Maximum length of 256.

Pattern: .*

Errors

For information about the errors that are common to all actions, see Common Errors.

ResourceInUse

Resource being accessed is in use.

HTTP Status Code: 400

ResourceLimitExceeded

You have exceeded a SageMaker resource limit. For example, you might have created too many training jobs.

HTTP Status Code: 400
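When calling this action through boto3, service errors surface as botocore ClientError exceptions that carry the error code in exc.response["Error"]["Code"]. The sketch below sorts an exception into one of the two errors listed above or a generic bucket. The function name classify_error and the returned labels are hypothetical, and the sketch assumes the error code strings match the error names on this page.

```python
def classify_error(exc: Exception) -> str:
    """Map an exception to a coarse label based on its service error code."""
    # ClientError exposes a response dict; other exceptions have no such attribute.
    code = getattr(exc, "response", {}).get("Error", {}).get("Code")
    if code == "ResourceInUse":
        return "in_use"
    if code == "ResourceLimitExceeded":
        return "limit_exceeded"
    return "other"
```

A caller might use the label to decide between choosing a different job definition name, requesting a limit increase, or re-raising the exception.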

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: