CreateProvisionedModelThroughput
Creates dedicated throughput for a base or custom model with the model units and for the duration that you specify. For pricing details, see Amazon Bedrock Pricing
Request Syntax
POST /provisioned-model-throughput HTTP/1.1
Content-type: application/json
{
"clientRequestToken": "string
",
"commitmentDuration": "string
",
"modelId": "string
",
"modelUnits": number
,
"provisionedModelName": "string
",
"tags": [
{
"key": "string
",
"value": "string
"
}
]
}
URI Request Parameters
The request does not use any URI parameters.
Request Body
The request accepts the following data in JSON format.
- clientRequestToken
-
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency in the Amazon S3 User Guide.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 256.
Pattern:
^[a-zA-Z0-9](-*[a-zA-Z0-9])*$
Required: No
- commitmentDuration
-
The commitment duration requested for the Provisioned Throughput. Billing occurs hourly and is discounted for longer commitment terms. To request a no-commit Provisioned Throughput, omit this field.
Custom models support all levels of commitment. To see which base models support no commitment, see Supported regions and models for Provisioned Throughput in the Amazon Bedrock User Guide
Type: String
Valid Values:
OneMonth | SixMonths
Required: No
- modelId
-
The Amazon Resource Name (ARN) or name of the model to associate with this Provisioned Throughput. For a list of models for which you can purchase Provisioned Throughput, see Amazon Bedrock model IDs for purchasing Provisioned Throughput in the Amazon Bedrock User Guide.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 2048.
Pattern:
^arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:(([0-9]{12}:custom-model/[a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}(([:][a-z0-9-]{1,63}){0,2})?/[a-z0-9]{12})|(:foundation-model/([a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([.]?[a-z0-9-]{1,63})([:][a-z0-9-]{1,63}){0,2})))|(([a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([.]?[a-z0-9-]{1,63})([:][a-z0-9-]{1,63}){0,2}))|(([0-9a-zA-Z][_-]?)+)$
Required: Yes
- modelUnits
-
Number of model units to allocate. A model unit delivers a specific throughput level for the specified model. The throughput level of a model unit specifies the total number of input and output tokens that it can process and generate within a span of one minute. By default, your account has no model units for purchasing Provisioned Throughputs with commitment. You must first visit the AWS support center
to request MUs. For model unit quotas, see Provisioned Throughput quotas in the Amazon Bedrock User Guide.
For more information about what an MU specifies, contact your AWS account manager.
Type: Integer
Valid Range: Minimum value of 1.
Required: Yes
- provisionedModelName
-
The name for this Provisioned Throughput.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 63.
Pattern:
^([0-9a-zA-Z][_-]?)+$
Required: Yes
-
Tags to associate with this Provisioned Throughput.
Type: Array of Tag objects
Array Members: Minimum number of 0 items. Maximum number of 200 items.
Required: No
Response Syntax
HTTP/1.1 201
Content-type: application/json
{
"provisionedModelArn": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 201 response.
The following data is returned in JSON format by the service.
- provisionedModelArn
-
The Amazon Resource Name (ARN) for this Provisioned Throughput.
Type: String
Pattern:
^arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:[0-9]{12}:provisioned-model/[a-z0-9]{12}$
Errors
For information about the errors that are common to all actions, see Common Errors.
- AccessDeniedException
-
The request is denied because of missing access permissions.
HTTP Status Code: 403
- InternalServerException
-
An internal server error occurred. Retry your request.
HTTP Status Code: 500
- ResourceNotFoundException
-
The specified resource Amazon Resource Name (ARN) was not found. Check the Amazon Resource Name (ARN) and try your request again.
HTTP Status Code: 404
- ServiceQuotaExceededException
-
The number of requests exceeds the service quota. Resubmit your request later.
HTTP Status Code: 400
- ThrottlingException
-
The number of requests exceeds the limit. Resubmit your request later.
HTTP Status Code: 429
- TooManyTagsException
-
The request contains more tags than can be associated with a resource (50 tags per resource). The maximum number of tags includes both existing tags and those included in your current request.
HTTP Status Code: 400
- ValidationException
-
Input validation failed. Check your request parameters and retry the request.
HTTP Status Code: 400
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: