Create an application inference profile
You can create an application inference profile with one or more Regions to track usage and costs when invoking a model.
- To create an application inference profile for one Region, specify a foundation model. Usage and costs for requests made to that Region with that model will be tracked.
- To create an application inference profile for multiple Regions, specify a cross-Region (system-defined) inference profile. The application inference profile routes requests to the Regions defined in the cross-Region (system-defined) inference profile that you choose. Usage and costs for requests made to the Regions in the inference profile will be tracked.
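As an illustration of the two cases above, the following minimal sketch classifies a source ARN as single-Region or multi-Region based on its resource type. The helper name `profile_scope` is hypothetical. The ARN shapes (`foundation-model/...` for a foundation model, `inference-profile/...` for a system-defined inference profile), the model ID, and the account ID are assumptions used for illustration, so check them against the ARNs shown in your own account.

```python
def profile_scope(model_source_arn: str) -> str:
    """Return 'single-region' or 'multi-region' for a model source ARN.

    Assumes the resource part of the ARN starts with 'foundation-model/'
    for a foundation model and 'inference-profile/' for a cross-Region
    (system-defined) inference profile.
    """
    # An ARN has six colon-separated parts; the resource is the last one
    # and may itself contain colons (for example a model version suffix).
    parts = model_source_arn.split(":", 5)
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {model_source_arn!r}")

    resource = parts[5]
    if resource.startswith("foundation-model/"):
        return "single-region"
    if resource.startswith("inference-profile/"):
        return "multi-region"
    raise ValueError(f"Unsupported resource type: {resource!r}")


# Placeholder ARNs for illustration only.
FOUNDATION_MODEL_ARN = (
    "arn:aws:bedrock:us-east-1::foundation-model/"
    "anthropic.claude-3-sonnet-20240229-v1:0"
)
SYSTEM_PROFILE_ARN = (
    "arn:aws:bedrock:us-east-1:123456789012:inference-profile/"
    "us.anthropic.claude-3-sonnet-20240229-v1:0"
)

print(profile_scope(FOUNDATION_MODEL_ARN))
print(profile_scope(SYSTEM_PROFILE_ARN))
```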
Currently, you can only create an inference profile using the Amazon Bedrock API.
To create an inference profile, send a CreateInferenceProfile request with an Amazon Bedrock control plane endpoint.
The following fields are required:
Field | Use case |
---|---|
inferenceProfileName | To specify a name for the inference profile. |
modelSource | To specify the foundation model or cross-Region (system-defined) inference profile that defines the model and Regions for which you want to track costs and usage. |
The following fields are optional:
Field | Use case |
---|---|
description | To provide a description for the inference profile. |
tags | To attach tags to the inference profile. For more information, see Tagging Amazon Bedrock resources and Organizing and tracking costs using AWS cost allocation tags. |
clientRequestToken | To ensure the API request completes only once. For more information, see Ensuring idempotency. |
The response returns an inferenceProfileArn that can be used in other inference profile-related actions, in model invocation, and with Amazon Bedrock resources.
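The request described above can be sketched with the AWS SDK for Python (boto3). This is a minimal sketch, not a definitive implementation. It assumes that boto3 is installed and credentials are configured, that `modelSource` is passed as an object with a `copyFrom` key holding the source ARN, and that tags are a list of `key`/`value` pairs. The helper names, the profile name, the Region, and the ARN are hypothetical placeholders.

```python
def build_create_profile_request(name, model_source_arn, description=None,
                                 tags=None, client_request_token=None):
    """Build the parameters for a CreateInferenceProfile request."""
    # Required fields. The copyFrom key is an assumption about the
    # modelSource shape; verify it against the API reference.
    request = {
        "inferenceProfileName": name,
        "modelSource": {"copyFrom": model_source_arn},
    }
    # Optional fields are only included when a value is supplied.
    if description is not None:
        request["description"] = description
    if tags:
        request["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    if client_request_token is not None:
        request["clientRequestToken"] = client_request_token
    return request


def create_profile(request, region="us-east-1"):
    """Send the request to the Amazon Bedrock control plane; return the ARN."""
    import boto3  # imported here so the builder works without boto3 installed

    client = boto3.client("bedrock", region_name=region)
    response = client.create_inference_profile(**request)
    return response["inferenceProfileArn"]


# Placeholder values for illustration only.
example_request = build_create_profile_request(
    name="my-app-profile",
    model_source_arn=(
        "arn:aws:bedrock:us-east-1::foundation-model/"
        "anthropic.claude-3-sonnet-20240229-v1:0"
    ),
    description="Tracks usage for the example application",
    tags={"project": "example"},
)
print(example_request)
```

Calling `create_profile(example_request)` sends the request and returns the `inferenceProfileArn`. Keeping the request builder separate from the network call makes it possible to inspect or log the parameters before anything is created.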