RetrieveAndGenerate - Amazon Bedrock

RetrieveAndGenerate

Queries a knowledge base and generates a response from the retrieved results, using the specified foundation model or inference profile. The response cites only the sources that are relevant to the query.

Request Syntax

POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json

{
  "input": {
    "text": "string"
  },
  "retrieveAndGenerateConfiguration": {
    "externalSourcesConfiguration": {
      "generationConfiguration": {
        "additionalModelRequestFields": {
          "string" : JSON value
        },
        "guardrailConfiguration": {
          "guardrailId": "string",
          "guardrailVersion": "string"
        },
        "inferenceConfig": {
          "textInferenceConfig": {
            "maxTokens": number,
            "stopSequences": [ "string" ],
            "temperature": number,
            "topP": number
          }
        },
        "performanceConfig": {
          "latency": "string"
        },
        "promptTemplate": {
          "textPromptTemplate": "string"
        }
      },
      "modelArn": "string",
      "sources": [
        {
          "byteContent": {
            "contentType": "string",
            "data": blob,
            "identifier": "string"
          },
          "s3Location": {
            "uri": "string"
          },
          "sourceType": "string"
        }
      ]
    },
    "knowledgeBaseConfiguration": {
      "generationConfiguration": {
        "additionalModelRequestFields": {
          "string" : JSON value
        },
        "guardrailConfiguration": {
          "guardrailId": "string",
          "guardrailVersion": "string"
        },
        "inferenceConfig": {
          "textInferenceConfig": {
            "maxTokens": number,
            "stopSequences": [ "string" ],
            "temperature": number,
            "topP": number
          }
        },
        "performanceConfig": {
          "latency": "string"
        },
        "promptTemplate": {
          "textPromptTemplate": "string"
        }
      },
      "knowledgeBaseId": "string",
      "modelArn": "string",
      "orchestrationConfiguration": {
        "additionalModelRequestFields": {
          "string" : JSON value
        },
        "inferenceConfig": {
          "textInferenceConfig": {
            "maxTokens": number,
            "stopSequences": [ "string" ],
            "temperature": number,
            "topP": number
          }
        },
        "performanceConfig": {
          "latency": "string"
        },
        "promptTemplate": {
          "textPromptTemplate": "string"
        },
        "queryTransformationConfiguration": {
          "type": "string"
        }
      },
      "retrievalConfiguration": {
        "vectorSearchConfiguration": {
          "filter": { ... },
          "implicitFilterConfiguration": {
            "metadataAttributes": [
              {
                "description": "string",
                "key": "string",
                "type": "string"
              }
            ],
            "modelArn": "string"
          },
          "numberOfResults": number,
          "overrideSearchType": "string",
          "rerankingConfiguration": {
            "bedrockRerankingConfiguration": {
              "metadataConfiguration": {
                "selectionMode": "string",
                "selectiveModeConfiguration": { ... }
              },
              "modelConfiguration": {
                "additionalModelRequestFields": {
                  "string" : JSON value
                },
                "modelArn": "string"
              },
              "numberOfRerankedResults": number
            },
            "type": "string"
          }
        }
      }
    },
    "type": "string"
  },
  "sessionConfiguration": {
    "kmsKeyArn": "string"
  },
  "sessionId": "string"
}

URI Request Parameters

The request does not use any URI parameters.

Request Body

The request accepts the following data in JSON format.

input

Contains the query to be made to the knowledge base.

Type: RetrieveAndGenerateInput object

Required: Yes

retrieveAndGenerateConfiguration

Contains configurations for the knowledge base query and retrieval process. For more information, see Query configurations.

Type: RetrieveAndGenerateConfiguration object

Required: No

sessionConfiguration

Contains details about the session with the knowledge base.

Type: RetrieveAndGenerateSessionConfiguration object

Required: No

sessionId

The unique identifier of the session. When you first make a RetrieveAndGenerate request, Amazon Bedrock automatically generates this value. You must reuse this value for all subsequent requests in the same conversational session. This value allows Amazon Bedrock to maintain context and knowledge from previous interactions. You can't explicitly set the sessionId yourself.

Type: String

Length Constraints: Minimum length of 2. Maximum length of 100.

Pattern: ^[0-9a-zA-Z._:-]+$

Required: No
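
The sessionId constraints above can be checked on the client before a follow-up request is sent. The following is a minimal Python sketch, not part of the API: the helper names is_valid_session_id and build_request_body are hypothetical, and the body it builds uses only the top-level fields listed in this section.

```python
import re
from typing import Any, Dict, Optional

# Constraints copied from the sessionId description above.
SESSION_ID_PATTERN = re.compile(r"^[0-9a-zA-Z._:-]+$")
SESSION_ID_MIN_LEN = 2
SESSION_ID_MAX_LEN = 100


def is_valid_session_id(session_id: str) -> bool:
    """Return True if session_id meets the documented length and pattern."""
    if not SESSION_ID_MIN_LEN <= len(session_id) <= SESSION_ID_MAX_LEN:
        return False
    return SESSION_ID_PATTERN.fullmatch(session_id) is not None


def build_request_body(
    text: str,
    configuration: Optional[Dict[str, Any]] = None,
    session_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body (hypothetical helper).

    Leave session_id unset on the first turn so that the service generates
    one; pass the returned value back on later turns of the conversation.
    """
    body: Dict[str, Any] = {"input": {"text": text}}
    if configuration is not None:
        body["retrieveAndGenerateConfiguration"] = configuration
    if session_id is not None:
        if not is_valid_session_id(session_id):
            raise ValueError(f"invalid sessionId: {session_id!r}")
        body["sessionId"] = session_id
    return body
```

On the first turn, call build_request_body with only the query text (and, if needed, a configuration). On later turns, pass the sessionId value from the previous response.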

Response Syntax

HTTP/1.1 200
Content-type: application/json

{
  "citations": [
    {
      "generatedResponsePart": {
        "textResponsePart": {
          "span": {
            "end": number,
            "start": number
          },
          "text": "string"
        }
      },
      "retrievedReferences": [
        {
          "content": {
            "byteContent": "string",
            "row": [
              {
                "columnName": "string",
                "columnValue": "string",
                "type": "string"
              }
            ],
            "text": "string",
            "type": "string"
          },
          "location": {
            "confluenceLocation": {
              "url": "string"
            },
            "customDocumentLocation": {
              "id": "string"
            },
            "kendraDocumentLocation": {
              "uri": "string"
            },
            "s3Location": {
              "uri": "string"
            },
            "salesforceLocation": {
              "url": "string"
            },
            "sharePointLocation": {
              "url": "string"
            },
            "sqlLocation": {
              "query": "string"
            },
            "type": "string",
            "webLocation": {
              "url": "string"
            }
          },
          "metadata": {
            "string" : JSON value
          }
        }
      ]
    }
  ],
  "guardrailAction": "string",
  "output": {
    "text": "string"
  },
  "sessionId": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

citations

A list of segments of the generated response that are based on sources in the knowledge base, alongside information about the sources.

Type: Array of Citation objects

guardrailAction

Specifies whether a guardrail intervened in the response.

Type: String

Valid Values: INTERVENED | NONE

output

Contains the response generated from querying the knowledge base.

Type: RetrieveAndGenerateOutput object

sessionId

The unique identifier of the session. When you first make a RetrieveAndGenerate request, Amazon Bedrock automatically generates this value. You must reuse this value for all subsequent requests in the same conversational session. This value allows Amazon Bedrock to maintain context and knowledge from previous interactions. You can't explicitly set the sessionId yourself.

Type: String

Length Constraints: Minimum length of 2. Maximum length of 100.

Pattern: ^[0-9a-zA-Z._:-]+$
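
To illustrate how the response elements above fit together, the following Python sketch pairs each cited segment of the generated text with identifiers of its sources. It assumes the response has already been parsed into a dictionary with the shape shown under Response Syntax. The function names source_identifier and summarize_citations are hypothetical and are not part of any SDK.

```python
from typing import Any, Dict, List, Optional


def source_identifier(location: Dict[str, Any]) -> Optional[str]:
    """Return the uri, url, id, or query from whichever location object is set."""
    for key, value in location.items():
        # "type" is a plain string; the other keys hold single-field objects.
        if key == "type" or not isinstance(value, dict):
            continue
        for identifier in value.values():
            return str(identifier)
    return None


def summarize_citations(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each cited span of the generated text with its source identifiers."""
    summaries = []
    for citation in response.get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        span = part.get("span", {})
        sources = []
        for reference in citation.get("retrievedReferences", []):
            identifier = source_identifier(reference.get("location", {}))
            if identifier is not None:
                sources.append(identifier)
        summaries.append(
            {
                "text": part.get("text", ""),
                "start": span.get("start"),
                "end": span.get("end"),
                "sources": sources,
            }
        )
    return summaries
```

Every lookup uses a default, so a response without citations yields an empty list instead of raising an error.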

Errors

For information about the errors that are common to all actions, see Common Errors.

AccessDeniedException

The request is denied because of missing access permissions. Check your permissions and retry your request.

HTTP Status Code: 403

BadGatewayException

A dependency failed because of a server issue. Retry your request.

HTTP Status Code: 502

ConflictException

There was a conflict performing an operation. Resolve the conflict and retry your request.

HTTP Status Code: 409

DependencyFailedException

There was an issue with a dependency. Check the resource configurations and retry the request.

HTTP Status Code: 424

InternalServerException

An internal server error occurred. Retry your request.

HTTP Status Code: 500

ResourceNotFoundException

The specified resource was not found. Check the Amazon Resource Name (ARN) and retry your request.

HTTP Status Code: 404

ServiceQuotaExceededException

The number of requests exceeds the service quota. Resubmit your request later.

HTTP Status Code: 400

ThrottlingException

The number of requests exceeds the limit. Resubmit your request later.

HTTP Status Code: 429

ValidationException

Input validation failed. Check your request parameters and retry the request.

HTTP Status Code: 400
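
A client can use the error names above to decide whether to resend a request unchanged. The following Python sketch is one possible policy, not a documented one. Which errors go into the retry set is an assumption drawn from the wording of the descriptions ("Retry your request", "Resubmit your request later"). The names should_retry and backoff_seconds and the default delay values are hypothetical.

```python
from typing import Dict

# Status codes as listed in the Errors section above.
ERROR_STATUS: Dict[str, int] = {
    "AccessDeniedException": 403,
    "BadGatewayException": 502,
    "ConflictException": 409,
    "DependencyFailedException": 424,
    "InternalServerException": 500,
    "ResourceNotFoundException": 404,
    "ServiceQuotaExceededException": 400,
    "ThrottlingException": 429,
    "ValidationException": 400,
}

# Assumption: only errors whose description asks for a plain retry or a
# later resubmission are retried without changing the request.
RETRY_UNCHANGED = frozenset(
    {
        "BadGatewayException",
        "InternalServerException",
        "ServiceQuotaExceededException",
        "ThrottlingException",
    }
)


def should_retry(error_name: str) -> bool:
    """Return True if the request may be resent as-is after a delay."""
    return error_name in RETRY_UNCHANGED


def backoff_seconds(attempt: int, base: float = 0.5, cap: float = 20.0) -> float:
    """Exponential backoff delay for a zero-based attempt number, capped at cap."""
    return min(cap, base * (2 ** attempt))
```

The policy keys on the error name instead of the status code because ServiceQuotaExceededException and ValidationException both return 400 but call for different handling.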

Examples

Send a basic query

The following example uses the minimum required fields to generate a response after querying a knowledge base.

Sample Request

POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json

{
  "input": {
    "text": "What is AWS?"
  },
  "retrieveAndGenerateConfiguration": {
    "knowledgeBaseConfiguration": {
      "knowledgeBaseId": "KB12345678",
      "modelArn": "anthropic.claude-v2:1"
    },
    "type": "KNOWLEDGE_BASE"
  }
}
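
The same request body can be assembled in code. The following Python sketch builds it as a dictionary and prints it as JSON. The helper name knowledge_base_request is hypothetical, and the knowledge base ID and model identifier are the placeholder values from the sample request.

```python
import json
from typing import Any, Dict


def knowledge_base_request(
    text: str, knowledge_base_id: str, model_arn: str
) -> Dict[str, Any]:
    """Build the minimal body for a KNOWLEDGE_BASE query (hypothetical helper)."""
    return {
        "input": {"text": text},
        "retrieveAndGenerateConfiguration": {
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
            "type": "KNOWLEDGE_BASE",
        },
    }


request = knowledge_base_request("What is AWS?", "KB12345678", "anthropic.claude-v2:1")
print(json.dumps(request, indent=2))
```

In the AWS SDK for Python (Boto3), these top-level keys are expected to map to keyword arguments of the retrieve_and_generate method on the bedrock-agent-runtime client. Check the SDK reference before relying on that mapping.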

Send a query and include filters

To include filters in a knowledge base query, at least one file in the data source must be accompanied by a .metadata.json file. For example, if your data source contains a file of articles called articles.pdf, accompanied by a metadata file called articles.metadata.json, you could tag it with genre, year, and author. In the RetrieveAndGenerate request, you could then apply the following filter to return all entertainment articles written after 2018, as well as cooking or sports articles written by authors whose names start with C.

Sample Request

POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json

{
  "input": {
    "text": "What is AWS?"
  },
  "retrieveAndGenerateConfiguration": {
    "knowledgeBaseConfiguration": {
      "knowledgeBaseId": "KB12345678",
      "modelArn": "anthropic.claude-v2:1",
      "retrievalConfiguration": {
        "vectorSearchConfiguration": {
          "numberOfResults": 5,
          "filter": {
            "orAll": [
              {
                "andAll": [
                  { "equals": { "key": "genre", "value": "entertainment" } },
                  { "greaterThan": { "key": "year", "value": 2018 } }
                ]
              },
              {
                "andAll": [
                  { "in": { "key": "genre", "value": ["cooking", "sports"] } },
                  { "startsWith": { "key": "author", "value": "C" } }
                ]
              }
            ]
          }
        }
      }
    },
    "type": "KNOWLEDGE_BASE"
  }
}
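
Nested filters like the one in this sample are easier to read when composed from small helpers. The following Python sketch rebuilds the same filter object. The helper functions (equals, greater_than, in_list, starts_with, and_all, or_all) are hypothetical conveniences and cover only the operators used in the sample.

```python
from typing import Any, Dict, List

Filter = Dict[str, Any]


def _attribute(operator: str, key: str, value: Any) -> Filter:
    """Build a single-attribute condition such as {"equals": {...}}."""
    return {operator: {"key": key, "value": value}}


def equals(key: str, value: Any) -> Filter:
    return _attribute("equals", key, value)


def greater_than(key: str, value: Any) -> Filter:
    return _attribute("greaterThan", key, value)


def in_list(key: str, values: List[Any]) -> Filter:
    return _attribute("in", key, list(values))


def starts_with(key: str, prefix: str) -> Filter:
    return _attribute("startsWith", key, prefix)


def and_all(*filters: Filter) -> Filter:
    """Match only when every nested filter matches."""
    return {"andAll": list(filters)}


def or_all(*filters: Filter) -> Filter:
    """Match when at least one nested filter matches."""
    return {"orAll": list(filters)}


# Entertainment articles after 2018, or cooking/sports articles
# by authors whose names start with "C".
article_filter = or_all(
    and_all(equals("genre", "entertainment"), greater_than("year", 2018)),
    and_all(in_list("genre", ["cooking", "sports"]), starts_with("author", "C")),
)
```

The resulting article_filter dictionary can be placed under the filter key of vectorSearchConfiguration.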

Use an inference profile when generating a response

The following example uses an inference profile to generate a response after querying a knowledge base.

Sample Request

POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json

{
  "input": {
    "text": "What is AWS?"
  },
  "retrieveAndGenerateConfiguration": {
    "knowledgeBaseConfiguration": {
      "knowledgeBaseId": "KB12345678",
      "modelArn": "arn:aws:bedrock:us-west-2:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0"
    },
    "type": "KNOWLEDGE_BASE"
  }
}
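
The modelArn value in this sample follows a regular shape, so it can be assembled or recognized programmatically. The following Python sketch is inferred only from the sample ARN above. It assumes the aws partition and a 12-digit account ID, and the function names are hypothetical.

```python
import re

# Shape inferred from the sample ARN above; the "aws" partition and the
# 12-digit account ID are assumptions, not documented constraints.
INFERENCE_PROFILE_ARN = re.compile(
    r"arn:aws:bedrock:(?P<region>[a-z0-9-]+):(?P<account>\d{12})"
    r":inference-profile/(?P<profile_id>.+)"
)


def inference_profile_arn(region: str, account_id: str, profile_id: str) -> str:
    """Assemble an inference profile ARN from its parts."""
    return f"arn:aws:bedrock:{region}:{account_id}:inference-profile/{profile_id}"


def is_inference_profile_arn(model_arn: str) -> bool:
    """Return True if model_arn has the inference profile shape sketched above."""
    return INFERENCE_PROFILE_ARN.fullmatch(model_arn) is not None
```

A check like is_inference_profile_arn can help tell a bare model identifier such as anthropic.claude-v2:1 apart from an inference profile when logging or validating configuration.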
