Using the Converse API
To use the Converse API, you call the Converse or ConverseStream operations to send messages to a model. To call Converse, you require permission for the bedrock:InvokeModel operation. To call ConverseStream, you require permission for the bedrock:InvokeModelWithResponseStream operation.
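The following is a minimal sketch of calling Converse with the AWS SDK for Python (Boto3). The model ID and AWS Region are example values, and credentials are assumed to be configured in your environment.

```python
import boto3

# Create an Amazon Bedrock Runtime client (Region is an example value).
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# The model ID is an example; use any model that supports the Converse API.
response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Create a list of 3 pop songs."}]}
    ],
)

# The generated message is in the output field.
print(response["output"]["message"]["content"][0]["text"])
```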
Request
When you make a Converse request with an Amazon Bedrock runtime endpoint, you can include the following fields:
- modelId – A required parameter that specifies the model, or other Amazon Bedrock resource such as a prompt from Prompt management, to use for inference.
- The following fields let you customize the prompt:
  - messages – Use to specify the content and role of the prompts.
  - system – Use to specify system prompts, which define instructions or context for the model.
  - inferenceConfig – Use to specify inference parameters that are common to all models. Inference parameters influence the generation of the response.
  - additionalModelRequestFields – Use to specify inference parameters that are specific to the model that you run inference with.
  - promptVariables – (If you use a prompt from Prompt management) Use this field to define the variables in the prompt to fill in and the values with which to fill them.
- The following fields let you customize how the response is returned:
  - guardrailConfig – Use this field to include a guardrail to apply to the entire prompt.
  - toolConfig – Use this field to include a tool to help a model generate responses.
  - additionalModelResponseFieldPaths – Use this field to specify fields to return as a JSON pointer object.
- requestMetadata – Use this field to include metadata that can be filtered on when using invocation logs.
Note
The following restrictions apply when you use a prompt from Prompt management with Converse or ConverseStream:
- You can't include the additionalModelRequestFields, inferenceConfig, system, or toolConfig fields.
- If you include the messages field, the messages are appended after the messages defined in the prompt.
- If you include the guardrailConfig field, the guardrail is applied to the entire prompt. If you include guardContent blocks in the ContentBlock field, the guardrail is applied only to those blocks.
The following sections describe each field in the Converse request body in more detail.
The messages field is an array of Message objects, each of which defines a message between the user and the model. A Message object contains the following fields:
- role – Defines whether the message is from the user (the prompt sent to the model) or the assistant (the model response).
- content – Defines the content in the prompt.
Note
Amazon Bedrock doesn't store any text, images, or documents that you provide as content. The data is only used to generate the response.
You can maintain conversation context by including all the messages in the conversation in subsequent Converse requests and using the role field to specify whether the message is from the user or the model.
The content field maps to an array of ContentBlock objects. Within each ContentBlock, you can specify a content field such as text, image, or document. To see which models support which modalities, see Supported models and model features.
Note
The following restrictions pertain to the content field:
- You can include up to 20 images. Each image's size, height, and width must be no more than 3.75 MB, 8,000 px, and 8,000 px, respectively.
- You can include up to five documents. Each document's size must be no more than 4.5 MB.
- You can include images and documents only if the role is user.
In the following messages example, the user asks for a list of three pop songs, and the model generates a list of songs.
[ { "role": "user", "content": [ { "text": "Create a list of 3 pop songs." } ] }, { "role": "assistant", "content": [ { "text": "Here is a list of 3 pop songs by artists from the United Kingdom:\n\n1. \"As It Was\" by Harry Styles\n2. \"Easy On Me\" by Adele\n3. \"Unholy\" by Sam Smith and Kim Petras" } ] } ]
A system prompt is a type of prompt that provides instructions or context to the model about the task it should perform, or the persona it should adopt during the conversation. You can specify a list of system prompts for the request in the system (SystemContentBlock) field, as shown in the following example.
[ { "text": "You are an app that creates play lists for a radio station that plays rock and pop music. Only return song names and the artist. " } ]
The Converse API supports a base set of inference parameters that you set in the inferenceConfig field (InferenceConfiguration). The base inference parameters are:
- maxTokens – The maximum number of tokens to allow in the generated response.
- stopSequences – A list of stop sequences. A stop sequence is a sequence of characters that causes the model to stop generating the response.
- temperature – The likelihood of the model selecting higher-probability options while generating a response.
- topP – The percentage of most-likely candidates that the model considers for the next token.
For more information, see Influence response generation with inference parameters.
The following example JSON sets the temperature inference parameter.

```json
{"temperature": 0.5}
```
If the model you are using has additional inference parameters, you can set those parameters by specifying them as JSON in the additionalModelRequestFields field. The following example JSON shows how to set top_k, which is available in Anthropic Claude models, but isn't a base inference parameter in the messages API.
{"top_k": 200}
If you specify a prompt from Prompt management in the modelId field as the resource to run inference on, use this field to fill in the prompt variables with actual values. The promptVariables field maps to a JSON object with keys that correspond to variables defined in the prompt and values to replace the variables with.

For example, let's say that you have a prompt that says "Make me a {{genre}} playlist consisting of the following number of songs: {{number}}." The prompt's ID is PROMPT12345 and its version is 1. You could send the following Converse request to replace the variables:
```
POST /model/arn:aws:bedrock:us-east-1:111122223333:prompt/PROMPT12345:1/converse HTTP/1.1
Content-type: application/json

{
    "promptVariables": {
        "genre": {"text": "pop"},
        "number": {"text": "3"}
    }
}
```
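The equivalent Boto3 call passes the prompt's ARN as the modelId. This is a sketch; the account ID, prompt ID, and version are the example values from above, and the variable values assume the PromptVariableValues text form.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    # ARN of the Prompt management prompt, including its version.
    modelId="arn:aws:bedrock:us-east-1:111122223333:prompt/PROMPT12345:1",
    promptVariables={
        "genre": {"text": "pop"},
        "number": {"text": "3"},
    },
)
print(response["output"]["message"]["content"][0]["text"])
```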
You can apply a guardrail that you created with Amazon Bedrock Guardrails by including this field. To apply the guardrail to a specific message in the conversation, include the message in a GuardrailConverseContentBlock. If you don't include any GuardrailConverseContentBlock objects in the request body, the guardrail is applied to all the messages in the messages field. For an example, see Include a guardrail with Converse API.
This field lets you define a tool that the model can use to help generate a response. For more information, see Use a tool to complete an Amazon Bedrock model response.
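As a sketch, a tool is defined by a toolSpec with a name, a description, and a JSON schema for its input. The weather tool below is a hypothetical example.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "get_weather",  # hypothetical tool name
                "description": "Get the current weather for a city.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "city": {"type": "string", "description": "City name"}
                        },
                        "required": ["city"],
                    }
                },
            }
        }
    ]
}

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "What's the weather in Seattle?"}]}
    ],
    toolConfig=tool_config,
)

# If the model chose to call the tool, stopReason is "tool_use".
print(response["stopReason"])
```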
You can specify the paths for additional model parameters in the additionalModelResponseFieldPaths field, as shown in the following example.

```json
[
    "/stop_sequence"
]
```
The API returns the additional fields that you request in the additionalModelResponseFields field.
This field maps to a JSON object. Within this object, you can specify metadata keys and the values that they map to. You can use request metadata to help you filter model invocation logs.
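As a sketch, the keys and values below are arbitrary examples of metadata you might attach for log filtering; choose names that suit your own log filters.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Create a list of 3 pop songs."}]}
    ],
    # Example key/value pairs recorded alongside the invocation log entry.
    requestMetadata={"project": "playlist-app", "environment": "test"},
)
```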
You can also optionally add cache checkpoints to the system or tools fields to use prompt caching, depending on which model you're using. For more information, see Prompt caching for faster model inference.

Note
Amazon Bedrock prompt caching is currently only available to a select number of customers. To learn more about participating in the preview, see Amazon Bedrock prompt caching.
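As a sketch, and assuming your account has access to the preview, a cache checkpoint is inserted as a cachePoint block after the content you want cached, for example in the system field:

```python
system = [
    {"text": "You are an app that creates play lists for a radio station."},
    # Cache everything in the system field up to this checkpoint.
    {"cachePoint": {"type": "default"}},
]
```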
Response
The response you get from the Converse API depends on which operation you call, Converse or ConverseStream.
Converse response
In the response from Converse, the output field (ConverseOutput) contains the message (Message) that the model generates. The message content is in the content (ContentBlock) field, and the role (user or assistant) that the message corresponds to is in the role field.
If you used prompt caching, then in the usage field, cacheReadInputTokensCount and cacheWriteInputTokensCount tell you how many total tokens were read from the cache and written to the cache, respectively.
The metrics field (ConverseMetrics) includes metrics for the call. To determine why the model stopped generating content, check the stopReason field. You can get information about the tokens passed to the model in the request, and the tokens generated in the response, by checking the usage field (TokenUsage). If you specified additional response fields in the request, the API returns them as JSON in the additionalModelResponseFields field.
The following example shows the response from Converse when you pass the prompt discussed in Request.
{ "output": { "message": { "role": "assistant", "content": [ { "text": "Here is a list of 3 pop songs by artists from the United Kingdom:\n\n1. \"Wannabe\" by Spice Girls\n2. \"Bitter Sweet Symphony\" by The Verve \n3. \"Don't Look Back in Anger\" by Oasis" } ] } }, "stopReason": "end_turn", "usage": { "inputTokens": 125, "outputTokens": 60, "totalTokens": 185 }, "metrics": { "latencyMs": 1175 } }
ConverseStream response
If you call ConverseStream to stream the response from a model, the stream is returned in the stream response field. The stream emits the following events in the following order.
- messageStart (MessageStartEvent). The start event for a message. Includes the role for the message.
- contentBlockStart (ContentBlockStartEvent). A content block start event. Tool use only.
- contentBlockDelta (ContentBlockDeltaEvent). A content block delta event. Includes one of the following:
  - text – The partial text that the model generates.
  - reasoningContent – The partial reasoning carried out by the model to generate the response. You must submit the returned signature, in addition to all previous messages, in subsequent Converse requests. If any of the messages are changed, the response throws an error.
  - toolUse – The partial input JSON object for tool use.
- contentBlockStop (ContentBlockStopEvent). A content block stop event.
- messageStop (MessageStopEvent). The stop event for the message. Includes the reason why the model stopped generating output.
- metadata (ConverseStreamMetadataEvent). Metadata for the request. The metadata includes the token usage in usage (TokenUsage) and metrics for the call in metrics (ConverseStreamMetrics).
ConverseStream streams a complete content block as a ContentBlockStartEvent event, one or more ContentBlockDeltaEvent events, and a ContentBlockStopEvent event. Use the contentBlockIndex field as an index to correlate the events that make up a content block.
The following example is a partial response from ConverseStream.
```
{'messageStart': {'role': 'assistant'}}
{'contentBlockDelta': {'delta': {'text': ''}, 'contentBlockIndex': 0}}
{'contentBlockDelta': {'delta': {'text': ' Title'}, 'contentBlockIndex': 0}}
{'contentBlockDelta': {'delta': {'text': ':'}, 'contentBlockIndex': 0}}
...
{'contentBlockDelta': {'delta': {'text': ' The'}, 'contentBlockIndex': 0}}
{'messageStop': {'stopReason': 'max_tokens'}}
{'metadata': {'usage': {'inputTokens': 47, 'outputTokens': 20, 'totalTokens': 67}, 'metrics': {'latencyMs': 100.0}}}
```
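The following is a minimal sketch of consuming the stream with Boto3; the model ID and prompt are example values. Each iteration yields one event dictionary, in the order described above.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse_stream(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Create a list of 3 pop songs."}]}
    ],
)

# Iterate over the event stream and print partial text as it arrives.
for event in response["stream"]:
    if "contentBlockDelta" in event:
        print(event["contentBlockDelta"]["delta"].get("text", ""), end="")
    elif "messageStop" in event:
        print("\nStop reason:", event["messageStop"]["stopReason"])
```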