CreateAIBenchmarkJob - Amazon SageMaker

CreateAIBenchmarkJob

Creates a benchmark job that runs performance benchmarks against inference infrastructure using a predefined AI workload configuration. The benchmark job measures metrics such as latency, throughput, and cost for your generative AI inference endpoints.

Request Syntax

{
   "AIBenchmarkJobName": "string",
   "AIWorkloadConfigIdentifier": "string",
   "BenchmarkTarget": { ... },
   "NetworkConfig": {
      "VpcConfig": {
         "SecurityGroupIds": [ "string" ],
         "Subnets": [ "string" ]
      }
   },
   "OutputConfig": {
      "S3OutputLocation": "string"
   },
   "RoleArn": "string",
   "Tags": [
      {
         "Key": "string",
         "Value": "string"
      }
   ]
}
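As an illustration of the request syntax, the following Python sketch assembles a request body as a plain dictionary and serializes it to JSON. It is not an SDK call. The helper name `build_request_body` and every example value are hypothetical, and because the members of `BenchmarkTarget` are elided in the syntax, the target is passed through as an opaque dictionary supplied by the caller.

```python
import json
from typing import Optional


def build_request_body(
    job_name: str,
    workload_config_identifier: str,
    benchmark_target: dict,
    s3_output_location: str,
    role_arn: str,
    network_config: Optional[dict] = None,
    tags: Optional[list] = None,
) -> dict:
    """Assemble a CreateAIBenchmarkJob request body (hypothetical helper)."""
    # Fields marked "Required: Yes" on this page are always included.
    body = {
        "AIBenchmarkJobName": job_name,
        "AIWorkloadConfigIdentifier": workload_config_identifier,
        "BenchmarkTarget": benchmark_target,
        "OutputConfig": {"S3OutputLocation": s3_output_location},
        "RoleArn": role_arn,
    }
    # Fields marked "Required: No" are added only when the caller supplies them.
    if network_config is not None:
        body["NetworkConfig"] = network_config
    if tags:
        body["Tags"] = tags
    return body


if __name__ == "__main__":
    example = build_request_body(
        job_name="llm-latency-run-01",                # hypothetical value
        workload_config_identifier="my-workload-config",  # hypothetical value
        benchmark_target={"PlaceholderMember": "placeholder"},  # member names are not shown in the syntax
        s3_output_location="s3://amzn-s3-demo-bucket/benchmarks/",  # hypothetical value
        role_arn="arn:aws:iam::123456789012:role/ExampleBenchmarkRole",  # hypothetical value
    )
    print(json.dumps(example, indent=2))
```

Leaving optional keys out of the dictionary, rather than sending them as null, keeps the serialized body limited to the fields the caller actually set.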

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

AIBenchmarkJobName

The name of the AI benchmark job. The name must be unique within your AWS account in the current AWS Region.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 63.

Pattern: [a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}

Required: Yes
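The length and pattern constraints above can be checked on the client before a request is sent. The following Python sketch is one way to do that. The function name `is_valid_job_name` is hypothetical, and the sketch assumes the pattern is meant to match the entire string.

```python
import re

# Pattern and maximum length copied from the AIBenchmarkJobName constraints.
JOB_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")
JOB_NAME_MAX_LENGTH = 63


def is_valid_job_name(name: str) -> bool:
    """Return True if the name meets the documented length and pattern constraints."""
    if not 1 <= len(name) <= JOB_NAME_MAX_LENGTH:
        return False
    # fullmatch anchors the pattern to the whole string (an assumption of this sketch).
    return JOB_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    print(is_valid_job_name("llm-latency-run-01"))   # True
    print(is_valid_job_name("-starts-with-hyphen"))  # False
    print(is_valid_job_name("ends-with-hyphen-"))    # False
```

Under this reading of the pattern, a name must begin and end with an alphanumeric character, and hyphens can appear only between alphanumeric characters.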

AIWorkloadConfigIdentifier

The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this benchmark job.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 256.

Pattern: (arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:[a-z\-]*/)?([a-zA-Z0-9]([a-zA-Z0-9\-]){0,62})(?<!-)

Required: Yes
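Because this parameter accepts either a name or an ARN, a single check against the documented pattern covers both forms. The sketch below is illustrative only. The function name is hypothetical, whole-string matching is assumed, and the resource type `ai-workload-config` in the sample ARN is a placeholder (the pattern only requires lowercase letters and hyphens in that position).

```python
import re

# Pattern copied from the AIWorkloadConfigIdentifier constraints.
# The trailing (?<!-) lookbehind rejects identifiers that end with a hyphen.
WORKLOAD_CONFIG_ID_PATTERN = re.compile(
    r"(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:[a-z\-]*/)?"
    r"([a-zA-Z0-9]([a-zA-Z0-9\-]){0,62})(?<!-)"
)
WORKLOAD_CONFIG_ID_MAX_LENGTH = 256


def is_valid_workload_config_identifier(identifier: str) -> bool:
    """Return True if the identifier is a valid name or ARN per the documented constraints."""
    if not 1 <= len(identifier) <= WORKLOAD_CONFIG_ID_MAX_LENGTH:
        return False
    return WORKLOAD_CONFIG_ID_PATTERN.fullmatch(identifier) is not None


if __name__ == "__main__":
    samples = [
        "my-workload-config",
        # Placeholder resource type; the real value is not given on this page.
        "arn:aws:sagemaker:us-east-1:123456789012:ai-workload-config/my-workload-config",
        "my-workload-config-",
    ]
    for sample in samples:
        print(sample, is_valid_workload_config_identifier(sample))
```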

BenchmarkTarget

The target endpoint to benchmark. Specify a SageMaker endpoint by providing its name or Amazon Resource Name (ARN).

Type: AIBenchmarkTarget object

Note: This object is a Union. Only one member of this object can be specified or returned.

Required: Yes
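The union rule above (exactly one member set) can be enforced with a small client-side check. The following Python sketch is a generic illustration. The function name `single_union_member` and the member names `MemberA` and `MemberB` are placeholders, since the actual members of `AIBenchmarkTarget` are not listed on this page.

```python
def single_union_member(union_object: dict) -> str:
    """Return the name of the one member that is set, or raise ValueError."""
    # Treat keys whose value is None as unset.
    members = [key for key, value in union_object.items() if value is not None]
    if len(members) != 1:
        raise ValueError(
            f"expected exactly one union member, found {len(members)}: {members}"
        )
    return members[0]


if __name__ == "__main__":
    # Placeholder member names, for illustration only.
    print(single_union_member({"MemberA": "value", "MemberB": None}))  # MemberA
    try:
        single_union_member({"MemberA": "value", "MemberB": "value"})
    except ValueError as error:
        print(error)
```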

NetworkConfig

The network configuration for the benchmark job, including VPC settings.

Type: AIBenchmarkNetworkConfig object

Required: No

OutputConfig

The output configuration for the benchmark job, including the Amazon S3 location where benchmark results are stored.

Type: AIBenchmarkOutputConfig object

Required: Yes

RoleArn

The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf.

Type: String

Length Constraints: Minimum length of 20. Maximum length of 2048.

Pattern: arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+

Required: Yes
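A role ARN can be checked against the documented length and pattern before the request is made. This Python sketch assumes whole-string matching. The function name and the sample role names are hypothetical.

```python
import re

# Pattern copied from the RoleArn constraints. re.ASCII restricts \d to 0-9.
ROLE_ARN_PATTERN = re.compile(
    r"arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+", re.ASCII
)
ROLE_ARN_MIN_LENGTH = 20
ROLE_ARN_MAX_LENGTH = 2048


def is_valid_role_arn(role_arn: str) -> bool:
    """Return True if the ARN meets the documented length and pattern constraints."""
    if not ROLE_ARN_MIN_LENGTH <= len(role_arn) <= ROLE_ARN_MAX_LENGTH:
        return False
    return ROLE_ARN_PATTERN.fullmatch(role_arn) is not None


if __name__ == "__main__":
    # Hypothetical role names, for illustration only.
    print(is_valid_role_arn("arn:aws:iam::123456789012:role/ExampleBenchmarkRole"))  # True
    print(is_valid_role_arn("arn:aws:iam::12345:role/ExampleBenchmarkRole"))         # False
```

A passing check only confirms that the string is well formed. It does not confirm that the role exists or that it grants the permissions the benchmark job needs.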

Tags

The metadata that you apply to AWS resources to help you categorize and organize them. Each tag consists of a key and a value, both of which you define.

Type: Array of Tag objects

Array Members: Minimum number of 0 items. Maximum number of 50 items.

Required: No
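Tags are often kept as a simple key-to-value mapping in application code. The sketch below converts such a mapping into the array of `Key` and `Value` objects shown in the request syntax and enforces the 50-item maximum. The function name `to_tag_list` and the sample tags are hypothetical.

```python
MAX_TAGS = 50  # Maximum number of items from the Array Members constraint.


def to_tag_list(tags: dict) -> list:
    """Convert a {key: value} mapping into a list of {"Key": ..., "Value": ...} objects."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags are allowed, got {len(tags)}")
    return [{"Key": key, "Value": value} for key, value in tags.items()]


if __name__ == "__main__":
    print(to_tag_list({"team": "ml-platform", "stage": "test"}))
```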

Response Syntax

{ "AIBenchmarkJobArn": "string" }

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

AIBenchmarkJobArn

The Amazon Resource Name (ARN) of the created benchmark job.

Type: String

Length Constraints: Minimum length of 0. Maximum length of 256.

Pattern: arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:ai-benchmark-job/[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}
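The returned ARN encodes the partition, Region, account ID, and job name, so these can be recovered by parsing it. The following Python sketch adds named groups to the documented pattern without changing what it matches. The function name `parse_job_arn` and the sample ARN values are hypothetical.

```python
import re

# The documented AIBenchmarkJobArn pattern, with named groups added for extraction.
JOB_ARN_PATTERN = re.compile(
    r"arn:(?P<partition>aws[a-z\-]*):sagemaker:(?P<region>[a-z0-9\-]*):"
    r"(?P<account_id>[0-9]{12}):ai-benchmark-job/"
    r"(?P<job_name>[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62})"
)
JOB_ARN_MAX_LENGTH = 256


def parse_job_arn(arn: str) -> dict:
    """Split a benchmark job ARN into its components, or raise ValueError."""
    match = JOB_ARN_PATTERN.fullmatch(arn) if len(arn) <= JOB_ARN_MAX_LENGTH else None
    if match is None:
        raise ValueError(f"not a valid AI benchmark job ARN: {arn!r}")
    # groupdict() returns only the named groups.
    return match.groupdict()


if __name__ == "__main__":
    parts = parse_job_arn(
        "arn:aws:sagemaker:us-west-2:123456789012:ai-benchmark-job/llm-latency-run-01"
    )
    print(parts["region"], parts["job_name"])  # us-west-2 llm-latency-run-01
```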

Errors

For information about the errors that are common to all actions, see Common Error Types.

ResourceInUse

Resource being accessed is in use.

HTTP Status Code: 400

ResourceLimitExceeded

You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created.

HTTP Status Code: 400

ResourceNotFound

Resource being accessed is not found.

HTTP Status Code: 400

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: