Code examples for Provisioned Throughput
The following code examples demonstrate how to create, use, and manage a Provisioned Throughput with the AWS CLI and the Python SDK.
- AWS CLI
Create a no-commitment Provisioned Throughput called `MyPT` based on a custom model called `MyCustomModel` that was customized from the Anthropic Claude v2.1 model by running the following command in a terminal.

```shell
aws bedrock create-provisioned-model-throughput \
    --model-units 1 \
    --provisioned-model-name MyPT \
    --model-id arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel
```
The response returns a `provisioned-model-arn`. Allow some time for the creation to complete. To check its status, provide the name or ARN of the provisioned model as the `provisioned-model-id` in the following command.

```shell
aws bedrock get-provisioned-model-throughput \
    --provisioned-model-id MyPT
```
Change the name of the Provisioned Throughput and associate it with a different model customized from Anthropic Claude v2.1.
```shell
aws bedrock update-provisioned-model-throughput \
    --provisioned-model-id MyPT \
    --desired-provisioned-model-name MyPT2 \
    --desired-model-id arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel2
```
Run inference with your updated provisioned model with the following command. You must provide the ARN of the provisioned model, returned in the `UpdateProvisionedModelThroughput` response, as the `model-id`. The output is written to a file named `output.txt` in your current folder.

```shell
aws bedrock-runtime invoke-model \
    --model-id ${provisioned-model-arn} \
    --body '{"inputText": "What is AWS?", "textGenerationConfig": {"temperature": 0.5}}' \
    --cli-binary-format raw-in-base64-out \
    output.txt
```

Delete the Provisioned Throughput using the following command. You'll no longer be charged for the Provisioned Throughput.

```shell
aws bedrock delete-provisioned-model-throughput \
    --provisioned-model-id MyPT2
```
- Python (Boto)
Create a no-commitment Provisioned Throughput called `MyPT` based on a custom model called `MyCustomModel` that was customized from the Anthropic Claude v2.1 model by running the following code snippet.

```python
import boto3

bedrock = boto3.client(service_name='bedrock')
bedrock.create_provisioned_model_throughput(
    modelUnits=1,
    provisionedModelName='MyPT',
    modelId='arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel'
)
```
The response returns a `provisionedModelArn`. Allow some time for the creation to complete. You can check its status with the following code snippet. You can provide either the name of the Provisioned Throughput or the ARN returned from the `CreateProvisionedModelThroughput` response as the `provisionedModelId`.

```python
bedrock.get_provisioned_model_throughput(provisionedModelId='MyPT')
```
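Creation can take several minutes. If you want your script to block until the Provisioned Throughput is ready, a small polling loop over `get_provisioned_model_throughput` works. The sketch below is illustrative, not part of the SDK: `wait_for_provisioned_model` is a hypothetical helper, and the delay and attempt counts are assumptions you can tune.

```python
import time


def wait_for_provisioned_model(bedrock, provisioned_model_id,
                               delay=30, max_attempts=40):
    """Poll GetProvisionedModelThroughput until the status leaves 'Creating'.

    Hypothetical convenience helper (not an official waiter). Returns the
    final status string, for example 'InService' or 'Failed'.
    """
    for _ in range(max_attempts):
        status = bedrock.get_provisioned_model_throughput(
            provisionedModelId=provisioned_model_id)['status']
        if status != 'Creating':
            return status
        time.sleep(delay)
    raise TimeoutError(
        f"{provisioned_model_id} still creating after {max_attempts} checks")


# Usage (requires AWS credentials and an existing Provisioned Throughput):
# bedrock = boto3.client(service_name='bedrock')
# print(wait_for_provisioned_model(bedrock, 'MyPT'))
```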
Change the name of the Provisioned Throughput and associate it with a different model customized from Anthropic Claude v2.1. Then send a GetProvisionedModelThroughput request and save the ARN of the provisioned model to a variable to use for inference.
```python
bedrock.update_provisioned_model_throughput(
    provisionedModelId='MyPT',
    desiredProvisionedModelName='MyPT2',
    desiredModelId='arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel2'
)
arn_MyPT2 = bedrock.get_provisioned_model_throughput(
    provisionedModelId='MyPT2').get('provisionedModelArn')
```
Run inference with your updated provisioned model with the following code snippet. You must provide the ARN of the provisioned model as the `modelId`.

```python
import json
import logging

import boto3
from botocore.exceptions import ClientError


class ImageError(Exception):
    "Custom exception for errors returned by the model"

    def __init__(self, message):
        self.message = message


logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def generate_text(model_id, body):
    """
    Generate text using your provisioned custom model.
    Args:
        model_id (str): The model ID to use.
        body (str): The request body to use.
    Returns:
        response (json): The response from the model.
    """
    logger.info(
        "Generating text with your provisioned custom model %s", model_id)

    brt = boto3.client(service_name='bedrock-runtime')
    accept = "application/json"
    content_type = "application/json"

    response = brt.invoke_model(
        body=body, modelId=model_id, accept=accept, contentType=content_type
    )
    response_body = json.loads(response.get("body").read())

    finish_reason = response_body.get("error")
    if finish_reason is not None:
        raise ImageError(f"Text generation error. Error is {finish_reason}")

    logger.info(
        "Successfully generated text with provisioned custom model %s", model_id)
    return response_body


def main():
    """
    Entrypoint for example.
    """
    try:
        logging.basicConfig(level=logging.INFO,
                            format="%(levelname)s: %(message)s")

        model_id = arn_MyPT2
        body = json.dumps({
            "inputText": "what is AWS?"
        })

        response_body = generate_text(model_id, body)
        print(f"Input token count: {response_body['inputTextTokenCount']}")

        for result in response_body['results']:
            print(f"Token count: {result['tokenCount']}")
            print(f"Output text: {result['outputText']}")
            print(f"Completion reason: {result['completionReason']}")

    except ClientError as err:
        message = err.response["Error"]["Message"]
        logger.error("A client error occurred: %s", message)
        print("A client error occurred: " + format(message))

    except ImageError as err:
        logger.error(err.message)
        print(err.message)

    else:
        print(
            f"Finished generating text with your provisioned custom model {model_id}.")


if __name__ == "__main__":
    main()
```
Delete the Provisioned Throughput with the following code snippet. You'll no longer be charged for the Provisioned Throughput.
```python
bedrock.delete_provisioned_model_throughput(provisionedModelId='MyPT2')
```
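If your cleanup code might run more than once, the delete call raises `ResourceNotFoundException` when the Provisioned Throughput is already gone. One way to make deletion idempotent is to catch that exception via the client's `exceptions` attribute; `delete_if_exists` below is a hypothetical wrapper, not an SDK method.

```python
def delete_if_exists(bedrock, provisioned_model_id):
    """Delete a Provisioned Throughput, tolerating a missing resource.

    Hypothetical convenience wrapper: returns True if the delete call
    succeeded, False if the Provisioned Throughput no longer exists.
    """
    try:
        bedrock.delete_provisioned_model_throughput(
            provisionedModelId=provisioned_model_id)
        return True
    except bedrock.exceptions.ResourceNotFoundException:
        return False


# Usage (requires AWS credentials):
# bedrock = boto3.client(service_name='bedrock')
# delete_if_exists(bedrock, 'MyPT2')
```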