InvokeModelWithResponseStreamCommand

Invoke the specified Amazon Bedrock model to run inference using the prompt and inference parameters provided in the request body. The response is returned in a stream.

To see if a model supports streaming, call GetFoundationModel and check the responseStreamingSupported field in the response.
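For example, a minimal version of that check might look like the following sketch. It uses GetFoundationModelCommand from the separate control-plane package @aws-sdk/client-bedrock; the Region and model ID shown are placeholder assumptions.

import { BedrockClient, GetFoundationModelCommand } from "@aws-sdk/client-bedrock";

const bedrock = new BedrockClient({ region: "us-east-1" }); // placeholder Region

const { modelDetails } = await bedrock.send(
  new GetFoundationModelCommand({ modelIdentifier: "anthropic.claude-3-haiku-20240307-v1:0" }), // placeholder model ID
);

if (modelDetails?.responseStreamingSupported) {
  // The model can be invoked with InvokeModelWithResponseStream.
}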

The AWS CLI doesn't support streaming operations in Amazon Bedrock, including InvokeModelWithResponseStream.

For example code, see Invoke model with streaming code example in the Amazon Bedrock User Guide.

This operation requires permissions to perform the bedrock:InvokeModelWithResponseStream action.

To deny all inference access to resources that you specify in the modelId field, you need to deny access to the bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream actions. Doing this also denies access to the resource through the Converse API actions (Converse and ConverseStream). For more information, see Deny access for inference on specific models.
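As a sketch of that deny pattern, an identity-based policy statement might look like the following; the foundation model ARN is a placeholder, and an allow policy takes the same shape with "Effect": "Allow".

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInferenceOnSpecificModel",
      "Effect": "Deny",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
    }
  ]
}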

For troubleshooting some of the common errors you might encounter when using the InvokeModelWithResponseStream API, see Troubleshooting Amazon Bedrock API Error Codes in the Amazon Bedrock User Guide.

Example Syntax

Use a bare-bones client and the command you need to make an API call.

import { BedrockRuntimeClient, InvokeModelWithResponseStreamCommand } from "@aws-sdk/client-bedrock-runtime"; // ES Modules import
// const { BedrockRuntimeClient, InvokeModelWithResponseStreamCommand } = require("@aws-sdk/client-bedrock-runtime"); // CommonJS import
const client = new BedrockRuntimeClient(config);
const input = { // InvokeModelWithResponseStreamRequest
  body: new Uint8Array(), // e.g. Buffer.from("") or new TextEncoder().encode("")
  contentType: "STRING_VALUE",
  accept: "STRING_VALUE",
  modelId: "STRING_VALUE", // required
  trace: "ENABLED" || "DISABLED",
  guardrailIdentifier: "STRING_VALUE",
  guardrailVersion: "STRING_VALUE",
  performanceConfigLatency: "standard" || "optimized",
};
const command = new InvokeModelWithResponseStreamCommand(input);
const response = await client.send(command);
// { // InvokeModelWithResponseStreamResponse
//   body: { // ResponseStream Union: only one key present
//     chunk: { // PayloadPart
//       bytes: new Uint8Array(),
//     },
//     internalServerException: { // InternalServerException
//       message: "STRING_VALUE",
//     },
//     modelStreamErrorException: { // ModelStreamErrorException
//       message: "STRING_VALUE",
//       originalStatusCode: Number("int"),
//       originalMessage: "STRING_VALUE",
//     },
//     validationException: { // ValidationException
//       message: "STRING_VALUE",
//     },
//     throttlingException: { // ThrottlingException
//       message: "STRING_VALUE",
//     },
//     modelTimeoutException: { // ModelTimeoutException
//       message: "STRING_VALUE",
//     },
//     serviceUnavailableException: { // ServiceUnavailableException
//       message: "STRING_VALUE",
//     },
//   },
//   contentType: "STRING_VALUE", // required
//   performanceConfigLatency: "standard" || "optimized",
// };
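The body field of the response is an async iterable that yields ResponseStream events, with exactly one key present per event. The following sketch shows one way to consume it; the Region, model ID, and request body shape (Anthropic's Messages format here) are assumptions, since every model defines its own body format and chunk payloads.

import { BedrockRuntimeClient, InvokeModelWithResponseStreamCommand } from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" }); // placeholder Region

const response = await client.send(
  new InvokeModelWithResponseStreamCommand({
    modelId: "anthropic.claude-3-haiku-20240307-v1:0", // placeholder model ID
    contentType: "application/json",
    accept: "application/json",
    // The body format is model-specific; this assumes Anthropic's Messages API shape.
    body: new TextEncoder().encode(JSON.stringify({
      anthropic_version: "bedrock-2023-05-31",
      max_tokens: 256,
      messages: [{ role: "user", content: "Write a haiku about streams." }],
    })),
  }),
);

const decoder = new TextDecoder();
for await (const event of response.body ?? []) {
  if (event.chunk?.bytes) {
    // Each chunk carries a model-specific JSON payload; for Claude models,
    // incremental text arrives in content_block_delta events.
    const payload = JSON.parse(decoder.decode(event.chunk.bytes));
    if (payload.type === "content_block_delta") {
      process.stdout.write(payload.delta?.text ?? "");
    }
  } else if (event.modelStreamErrorException ?? event.internalServerException) {
    // The union can also surface in-stream errors as events.
    throw event.modelStreamErrorException ?? event.internalServerException;
  }
}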

InvokeModelWithResponseStreamCommand Input

See InvokeModelWithResponseStreamCommandInput for more details.
InvokeModelWithResponseStreamCommandInput extends InvokeModelWithResponseStreamCommandInputType

InvokeModelWithResponseStreamCommand Output

$metadata (Required)
Type: ResponseMetadata
Metadata pertaining to this request.

body (Required)
Type: AsyncIterable<ResponseStream> | undefined
Inference response from the model in the format specified by the contentType header. To see the format and content of this field for different models, refer to Inference parameters.

contentType (Required)
Type: string | undefined
The MIME type of the inference result.

performanceConfigLatency
Type: PerformanceConfigLatency | undefined
Model performance settings for the request.

Throws

AccessDeniedException (client fault)
The request is denied because you do not have sufficient permissions to perform the requested action. For troubleshooting this error, see AccessDeniedException in the Amazon Bedrock User Guide.

InternalServerException (server fault)
An internal server error occurred. For troubleshooting this error, see InternalFailure in the Amazon Bedrock User Guide.

ModelErrorException (client fault)
The request failed due to an error while processing the model.

ModelNotReadyException (client fault)
The model specified in the request is not ready to serve inference requests. The AWS SDK will automatically retry the operation up to 5 times. For information about configuring automatic retries, see Retry behavior in the AWS SDKs and Tools Reference Guide; the sketch after this list shows the maxAttempts setting.

ModelStreamErrorException (client fault)
An error occurred while streaming the response. Retry your request.

ModelTimeoutException (client fault)
The request took too long to process. Processing time exceeded the model timeout length.

ResourceNotFoundException (client fault)
The specified resource ARN was not found. For troubleshooting this error, see ResourceNotFound in the Amazon Bedrock User Guide.

ServiceQuotaExceededException (client fault)
Your request exceeds the service quota for your account. You can view your quotas at Viewing service quotas. You can resubmit your request later.

ServiceUnavailableException (server fault)
The service isn't currently available. For troubleshooting this error, see ServiceUnavailable in the Amazon Bedrock User Guide.

ThrottlingException (client fault)
Your request was denied due to exceeding the account quotas for Amazon Bedrock. For troubleshooting this error, see ThrottlingException in the Amazon Bedrock User Guide.

ValidationException (client fault)
The input fails to satisfy the constraints specified by Amazon Bedrock. For troubleshooting this error, see ValidationError in the Amazon Bedrock User Guide.

BedrockRuntimeServiceException
Base exception class for all service exceptions from the BedrockRuntime service.
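As a sketch of handling these faults, the snippet below assumes the modeled exception classes exported by @aws-sdk/client-bedrock-runtime and the standard maxAttempts retry setting of the AWS SDK for JavaScript v3; the model ID and request body are placeholders (see the consumption sketch above for a realistic body).

import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
  ThrottlingException,
  ModelNotReadyException,
  BedrockRuntimeServiceException,
} from "@aws-sdk/client-bedrock-runtime";

// maxAttempts is the SDK-wide retry setting; raising it gives transient
// faults such as ModelNotReadyException more chances to succeed.
const client = new BedrockRuntimeClient({ maxAttempts: 5 });

try {
  const response = await client.send(
    new InvokeModelWithResponseStreamCommand({
      modelId: "anthropic.claude-3-haiku-20240307-v1:0", // placeholder model ID
      body: new TextEncoder().encode("{}"), // placeholder body
    }),
  );
  // ... consume response.body as shown earlier ...
} catch (err) {
  if (err instanceof ThrottlingException) {
    console.warn("Throttled; back off and resubmit later:", err.message);
  } else if (err instanceof ModelNotReadyException) {
    console.warn("Model not ready; the SDK's automatic retries were exhausted:", err.message);
  } else if (err instanceof BedrockRuntimeServiceException) {
    // Any other modeled service error; $fault distinguishes client from server faults.
    console.error(err.name, err.$fault, err.message);
  } else {
    throw err; // not a service error (e.g. a network or programming error)
  }
}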