Submit a single prompt with InvokeModel

Run inference on a model through the API by sending an InvokeModel or InvokeModelWithResponseStream request. To check if a model supports streaming, send a GetFoundationModel or ListFoundationModels request and check the value in the responseStreamingSupported field.
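For example, you might check streaming support with the AWS SDK for Python (Boto3) as in the following sketch. It assumes the Amazon Bedrock control-plane client (service name bedrock, not bedrock-runtime) and uses the Anthropic Claude model ID from the examples later in this topic.

import boto3

# The control-plane client (service name 'bedrock') exposes model metadata;
# runtime inference uses the separate 'bedrock-runtime' client.
bedrock = boto3.client(service_name='bedrock')

# GetFoundationModel returns details for a single model, including whether it
# supports streaming responses.
model = bedrock.get_foundation_model(modelIdentifier='anthropic.claude-v2')

if model['modelDetails'].get('responseStreamingSupported'):
    print('Use InvokeModel or InvokeModelWithResponseStream')
else:
    print('Use InvokeModel only')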

The following fields are required:

modelId - Specifies the model, inference profile, or prompt from Prompt management to use. To learn how to find this value, see Submit prompts and generate responses using the API.
body - Specifies the inference parameters for a model. To see inference parameters for different models, see Inference request parameters and response fields for foundation models. If you specify a prompt from Prompt management in the modelId field, omit this field (if you include it, it is ignored).

The following fields are optional:

accept - Specifies the media type for the response body. For more information, see Media Types on the Swagger website.
contentType - Specifies the media type for the request body. For more information, see Media Types on the Swagger website.
explicitPromptCaching - Specifies whether prompt caching is enabled or disabled. For more information, see Prompt caching for faster model inference.
guardrailIdentifier - Specifies a guardrail to apply to the prompt and response. For more information, see Test a guardrail and the usage sketch after this table.
guardrailVersion - Specifies the version of the guardrail to apply to the prompt and response. For more information, see Test a guardrail.
trace - Specifies whether to return the trace for the guardrail that you specify. For more information, see Test a guardrail.
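For illustration, a minimal Boto3 sketch that passes several of these optional fields to InvokeModel might look like the following. The guardrail identifier and version are placeholder values that you would replace with a guardrail you have created; the model ID and request body match the examples later in this topic.

import boto3
import json

brt = boto3.client(service_name='bedrock-runtime')

body = json.dumps({
    "prompt": "\n\nHuman: story of two dogs\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

# 'your-guardrail-id' and the version are placeholders; replace them with a
# guardrail that exists in your account.
response = brt.invoke_model(
    modelId='anthropic.claude-v2',
    body=body,
    contentType='application/json',   # media type of the request body
    accept='application/json',        # media type expected for the response body
    guardrailIdentifier='your-guardrail-id',
    guardrailVersion='1',
    trace='ENABLED',                  # include the guardrail trace in the response
)

print(json.loads(response['body'].read()))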

Invoke model code examples

The following examples show how to run inference with the InvokeModel API. For examples with different models, see the inference parameter reference for the desired model (Inference request parameters and response fields for foundation models).

CLI

The following example saves the response generated for the prompt "story of two dogs" to a file called invoke-model-output.txt.

aws bedrock-runtime invoke-model \
    --model-id anthropic.claude-v2 \
    --body '{"prompt": "\n\nHuman: story of two dogs\n\nAssistant:", "max_tokens_to_sample": 300}' \
    --cli-binary-format raw-in-base64-out \
    invoke-model-output.txt
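The --cli-binary-format raw-in-base64-out flag lets you pass the --body value as plain JSON text instead of base64-encoding it yourself. After the command completes, the model's JSON response is in invoke-model-output.txt; for an Anthropic Claude text-completion model, the generated text is in the response's completion field.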
Python

The following example returns a generated response to the prompt "explain black holes to 8th graders".

import boto3
import json

brt = boto3.client(service_name='bedrock-runtime')

body = json.dumps({
    "prompt": "\n\nHuman: explain black holes to 8th graders\n\nAssistant:",
    "max_tokens_to_sample": 300,
    "temperature": 0.1,
    "top_p": 0.9,
})

modelId = 'anthropic.claude-v2'
accept = 'application/json'
contentType = 'application/json'

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)

response_body = json.loads(response.get('body').read())

# text
print(response_body.get('completion'))

Invoke model with streaming code example

Note

The AWS CLI does not support streaming.

The following example shows how to use the InvokeModelWithResponseStream API to generate streaming text with Python, using the prompt "write an essay for living on mars in 1000 words".

import boto3
import json

brt = boto3.client(service_name='bedrock-runtime')

body = json.dumps({
    'prompt': '\n\nHuman: write an essay for living on mars in 1000 words\n\nAssistant:',
    'max_tokens_to_sample': 4000
})

response = brt.invoke_model_with_response_stream(
    modelId='anthropic.claude-v2',
    body=body
)

stream = response.get('body')

if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            print(json.loads(chunk.get('bytes').decode()))
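Each decoded chunk is a JSON object. For Anthropic Claude text-completion models such as the one in this example, the text generated for that chunk is in the completion field, so you could instead print json.loads(chunk.get('bytes').decode()).get('completion') to stream only the generated text as it arrives.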