CreateAIBenchmarkJob
Creates a benchmark job that runs performance benchmarks against inference infrastructure using a predefined AI workload configuration. The benchmark job measures metrics such as latency, throughput, and cost for your generative AI inference endpoints.
Request Syntax
{
   "AIBenchmarkJobName": "string",
   "AIWorkloadConfigIdentifier": "string",
   "BenchmarkTarget": { ... },
   "NetworkConfig": {
      "VpcConfig": {
         "SecurityGroupIds": [ "string" ],
         "Subnets": [ "string" ]
      }
   },
   "OutputConfig": {
      "S3OutputLocation": "string"
   },
   "RoleArn": "string",
   "Tags": [
      {
         "Key": "string",
         "Value": "string"
      }
   ]
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- AIBenchmarkJobName
-
The name of the AI benchmark job. The name must be unique within your AWS account in the current AWS Region.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 63.
Pattern:
[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}
Required: Yes
- AIWorkloadConfigIdentifier
-
The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this benchmark job.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 256.
Pattern:
(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:[a-z\-]*/)?([a-zA-Z0-9]([a-zA-Z0-9\-]){0,62})(?<!-)
Required: Yes
- BenchmarkTarget
-
The target endpoint to benchmark. Specify a SageMaker endpoint by providing its name or Amazon Resource Name (ARN).
Type: AIBenchmarkTarget object
Note: This object is a Union. Only one member of this object can be specified or returned.
Required: Yes
- NetworkConfig
-
The network configuration for the benchmark job, including VPC settings.
Type: AIBenchmarkNetworkConfig object
Required: No
- OutputConfig
-
The output configuration for the benchmark job, including the Amazon S3 location where benchmark results are stored.
Type: AIBenchmarkOutputConfig object
Required: Yes
- RoleArn
-
The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 2048.
Pattern:
arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+
Required: Yes
- Tags
-
The metadata that you apply to AWS resources to help you categorize and organize them. Each tag consists of a key and a value, both of which you define.
Type: Array of Tag objects
Array Members: Minimum number of 0 items. Maximum number of 50 items.
Required: No
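The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper names `validate_request` and `_check_string` are hypothetical, the patterns and limits are copied from the parameter descriptions, and `BenchmarkTarget` and `OutputConfig` are only checked for presence because their members are documented elsewhere.

```python
import re

# Patterns copied from the parameter descriptions in this section.
JOB_NAME = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")
WORKLOAD_ID = re.compile(
    r"(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:[a-z\-]*/)?"
    r"([a-zA-Z0-9]([a-zA-Z0-9\-]){0,62})(?<!-)"
)
ROLE_ARN = re.compile(r"arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+")

# Parameters marked "Required: Yes".
REQUIRED = (
    "AIBenchmarkJobName",
    "AIWorkloadConfigIdentifier",
    "BenchmarkTarget",
    "OutputConfig",
    "RoleArn",
)


def _check_string(request, key, pattern, min_len, max_len, problems):
    """Append a problem if request[key] violates its length or pattern."""
    value = request.get(key)
    if value is None:
        return  # missing required keys are reported separately
    if not isinstance(value, str):
        problems.append(f"{key}: must be a string")
    elif not (min_len <= len(value) <= max_len):
        problems.append(f"{key}: length must be {min_len}-{max_len}")
    elif not pattern.fullmatch(value):
        problems.append(f"{key}: does not match the documented pattern")


def validate_request(request: dict) -> list:
    """Return a list of human-readable problems; empty if none were found."""
    problems = [f"{key}: required" for key in REQUIRED if key not in request]
    _check_string(request, "AIBenchmarkJobName", JOB_NAME, 1, 63, problems)
    _check_string(request, "AIWorkloadConfigIdentifier", WORKLOAD_ID, 1, 256, problems)
    _check_string(request, "RoleArn", ROLE_ARN, 20, 2048, problems)
    if len(request.get("Tags", [])) > 50:
        problems.append("Tags: at most 50 items")
    return problems
```

A local check like this only catches malformed input early. The service remains the authority on whether a request is accepted, for example whether the job name is unique in the account and Region.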
Response Syntax
{
   "AIBenchmarkJobArn": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- AIBenchmarkJobArn
-
The Amazon Resource Name (ARN) of the created benchmark job.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 256.
Pattern:
arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:ai-benchmark-job/[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}
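The returned ARN follows the pattern above, so its components can be extracted with a regular expression. This is an illustrative sketch: the function name `parse_job_arn` and the named groups are assumptions added for readability, while the pattern itself is the documented one.

```python
import re

# The documented AIBenchmarkJobArn pattern, with named groups added.
JOB_ARN = re.compile(
    r"arn:(?P<partition>aws[a-z\-]*):sagemaker:(?P<region>[a-z0-9\-]*):"
    r"(?P<account>[0-9]{12}):ai-benchmark-job/"
    r"(?P<name>[a-zA-Z0-9](?:-*[a-zA-Z0-9]){0,62})"
)


def parse_job_arn(arn: str) -> dict:
    """Split a benchmark job ARN into partition, region, account, and name."""
    match = JOB_ARN.fullmatch(arn)
    if match is None:
        raise ValueError(f"not a benchmark job ARN: {arn!r}")
    return match.groupdict()
```

For instance, `parse_job_arn("arn:aws:sagemaker:us-west-2:123456789012:ai-benchmark-job/my-benchmark-1")["name"]` evaluates to `"my-benchmark-1"`. The account ID and job name in that call are placeholders.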
Errors
For information about the errors that are common to all actions, see Common Error Types.
- ResourceInUse
-
The resource being accessed is in use.
HTTP Status Code: 400
- ResourceLimitExceeded
-
You have exceeded a SageMaker resource limit. For example, you might have created too many training jobs.
HTTP Status Code: 400
- ResourceNotFound
-
The resource being accessed was not found.
HTTP Status Code: 400
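All three errors listed above return HTTP status code 400, so a caller has to tell them apart by error code rather than by status. The sketch below assumes an error payload shaped like `{"Error": {"Code": ..., "Message": ...}}`, which is how botocore exposes errors on `ClientError.response`. Other SDKs structure errors differently, and the helper name `summarize_error` is hypothetical.

```python
# Error codes listed for this action; each is returned with HTTP 400.
DOCUMENTED_ERRORS = {
    "ResourceInUse": "The resource being accessed is in use.",
    "ResourceLimitExceeded": "A SageMaker resource limit was exceeded.",
    "ResourceNotFound": "The resource being accessed was not found.",
}


def summarize_error(error_response: dict) -> str:
    """Build a one-line summary from a botocore-style error payload."""
    error = error_response.get("Error", {})
    code = error.get("Code", "Unknown")
    message = error.get("Message", "")
    description = DOCUMENTED_ERRORS.get(code)
    if description is None:
        # Not one of the action-specific errors; it may be a common error.
        return f"{code}: {message} (not listed for this action)"
    return f"{code}: {description} Service message: {message}"
```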
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: