# Test your knowledge base with queries and responses

After you set up your knowledge base, you can test its behavior in the following ways:
+ Send queries and retrieve relevant information from your data sources by using the [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) operation.
+ Send queries and generate responses based on the information retrieved from your data sources by using the [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) operation.
+ Use a reranking model instead of the default Amazon Bedrock Knowledge Bases ranking model to retrieve more relevant sources when using either [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) or [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html).
+ Use optional metadata filters with the `Retrieve` or `RetrieveAndGenerate` API to specify which documents in your data source can be used.

When you are satisfied with your knowledge base's behavior, you can set up your application to query the knowledge base or attach the knowledge base to an agent by proceeding to [Deploy your knowledge base for your AI application](knowledge-base-deploy.md).

Select a topic to learn more about it.

**Topics**
+ [Query a knowledge base and retrieve data](kb-test-retrieve.md)
+ [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md)
+ [Generate a query for structured data](knowledge-base-generate-query.md)
+ [Query a knowledge base connected to an Amazon Kendra GenAI index](kb-test-kendra.md)
+ [Query a knowledge base connected to an Amazon Neptune Analytics graph](kb-test-neptune.md)
+ [Configure and customize queries and response generation](kb-test-config.md)
+ [Configure response generation for reasoning models with Knowledge Bases](kb-test-configure-reasoning.md)

# Query a knowledge base and retrieve data

**Important**
Guardrails are applied only to the input and the generated response from the LLM. They are not applied to the references retrieved from Knowledge Bases at runtime.

After your knowledge base is set up, you can query it and retrieve chunks from your source data that are relevant to the query by using the [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) API operation. You can also [use a reranking model](rerank.md) instead of the default Amazon Bedrock Knowledge Bases ranker to rank source chunks for relevance during retrieval.

To learn how to query your knowledge base, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To test your knowledge base**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. In the left navigation pane, choose **Knowledge bases**.

1. In the **Knowledge bases** section, do one of the following actions:
   + Choose the radio button next to the knowledge base you want to test and select **Test knowledge base**.
A test window expands from the right. + Choose the knowledge base that you want to test. A test window expands from the right. 1. In the test window, clear **Generate responses for your query** to return information retrieved directly from your knowledge base. 1. (Optional) Select the configurations icon (![\[Three horizontal sliders with adjustable circular controls for settings or parameters.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/configurations.png)) to open up **Configurations**. For information about configurations, see [Configure and customize queries and response generation](kb-test-config.md). 1. Enter a query in the text box in the chat window and select **Run** to return responses from the knowledge base. 1. The source chunks are returned directly in order of relevance. Images extracted from your data source can also be returned as a source chunk. 1. To see details about the returned chunks, select **Show source details**. + To see the configurations that you set for query, expand **Query configurations**. + To view details about a source chunk, expand it by choosing the right arrow (![\[Play button icon with a triangular shape pointing to the right.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/caret-right-filled.png)) next to it. You can see the following information: + The raw text from the source chunk. To copy this text, choose the copy icon (![\[Icon representing a crop or resize function, with two overlapping rectangles.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/copy.png)). If you used Amazon S3 to store your data, choose the external link icon (![\[Icon of a square with an arrow pointing outward from its top-right corner.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/external.png)) to navigate to the S3 object containing the file. + The metadata associated with the source chunk, if you used Amazon S3 to store your data. 
The attribute/field keys and values are defined in the `.metadata.json` file that's associated with the source document. For more information, see the **Metadata and filtering** section in [Configure and customize queries and response generation](kb-test-config.md).

**Chat options**
+ Switch to generating responses based on the retrieved source chunks by turning on **Generate responses**. If you change the setting, the text in the chat window will be completely cleared.
+ To clear the chat window, select the broom icon (![\[Magnifying glass icon with a checkmark inside, symbolizing search or inspection.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/broom.png)).
+ To copy all the output in the chat window, select the copy icon (![\[Icon representing a crop or resize function, with two overlapping rectangles.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/copy.png)).

------
#### [ API ]

To query a knowledge base and only return relevant text from data sources, send a [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) request with an [Agents for Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-rt).

The following fields are required:

| Field | Basic description |
| --- | --- |
| knowledgeBaseId | To specify the knowledge base to query. |
| retrievalQuery | Contains a text field to specify the query. |

The following fields are optional:

| Field | Use case |
| --- | --- |
| guardrailConfiguration | To use your guardrail with the request. Include the guardrailId and guardrailVersion fields. |
| nextToken | To return the next batch of responses (see response fields below). |
| retrievalConfiguration | To include [query configurations](kb-test-config.md) for customizing the vector search. See [KnowledgeBaseVectorSearchConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseVectorSearchConfiguration.html) for more information. |

You can use a reranking model instead of the default Amazon Bedrock Knowledge Bases ranking model by including the `rerankingConfiguration` field in the [KnowledgeBaseVectorSearchConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseVectorSearchConfiguration.html). The `rerankingConfiguration` field maps to a [VectorSearchRerankingConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_VectorSearchRerankingConfiguration.html) object, in which you can specify the reranking model to use, any additional request fields to include, metadata attributes to filter out documents during reranking, and the number of results to return after reranking. For more information, see [VectorSearchRerankingConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_VectorSearchRerankingConfiguration.html).

**Note**
If the `numberOfRerankedResults` value that you specify is greater than the `numberOfResults` value in the [KnowledgeBaseVectorSearchConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseVectorSearchConfiguration.html), the maximum number of results that will be returned is the value for `numberOfResults`. An exception is if you use query decomposition (for more information, see the **Query modifications** section in [Configure and customize queries and response generation](kb-test-config.md)). If you use query decomposition, the `numberOfRerankedResults` can be up to five times the `numberOfResults`.

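As a rough illustration of the request fields described above, the following Python sketch assembles the keyword arguments for a `Retrieve` call and follows `nextToken` across batches. The helper names (`build_retrieve_request`, `retrieve_all`) are hypothetical. The nested reranking keys (`type`, `bedrockRerankingConfiguration`, `modelConfiguration`, `modelArn`) are written from memory of the `VectorSearchRerankingConfiguration` reference and are assumptions to verify there. The `client` argument is assumed to be an object with a `retrieve` method, such as a boto3 `bedrock-agent-runtime` client.

```python
def build_retrieve_request(knowledge_base_id, query_text, number_of_results=5,
                           reranker_model_arn=None, number_of_reranked_results=None):
    """Build keyword arguments for a Retrieve call (all values are illustrative)."""
    vector_search = {"numberOfResults": number_of_results}
    if reranker_model_arn is not None:
        requested = number_of_reranked_results or number_of_results
        vector_search["rerankingConfiguration"] = {
            # Assumed key names; check VectorSearchRerankingConfiguration.
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": {
                "modelConfiguration": {"modelArn": reranker_model_arn},
                # Without query decomposition, at most numberOfResults chunks
                # are returned, so cap the reranked count to make that explicit.
                "numberOfRerankedResults": min(requested, number_of_results),
            },
        }
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query_text},
        "retrievalConfiguration": {"vectorSearchConfiguration": vector_search},
    }


def retrieve_all(client, request):
    """Call client.retrieve repeatedly, passing nextToken until none is returned."""
    results, token = [], None
    while True:
        kwargs = dict(request, nextToken=token) if token else dict(request)
        response = client.retrieve(**kwargs)
        results.extend(response.get("retrievalResults", []))
        token = response.get("nextToken")
        if not token:
            return results
```

With boto3 installed and credentials configured, usage might look like `retrieve_all(boto3.client("bedrock-agent-runtime"), build_retrieve_request("EXAMPLE123", "Find similar shoes"))`. Capping `numberOfRerankedResults` on the client side is a design choice of this sketch, not a requirement of the API, and it should be relaxed if you use query decomposition.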
The response returns the source chunks from the data source as an array of [KnowledgeBaseRetrievalResult](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalResult.html) objects in the `retrievalResults` field. Each [KnowledgeBaseRetrievalResult](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalResult.html) contains the following fields:

| Field | Description |
| --- | --- |
| content | Contains a text source chunk in the text field or an image source chunk in the byteContent field. If the content is an image, the data URI of the base64-encoded content is returned in the following format: `data:image/jpeg;base64,${base64-encoded string}`. |
| metadata | Contains each metadata attribute as a key and the metadata value as a JSON value that the key maps to. |
| location | Contains the URI or URL of the document that the source chunk belongs to. |
| score | The relevancy score of the document. You can use this score to analyze the ranking of results. |

If the number of source chunks exceeds what can fit in the response, a value is returned in the `nextToken` field. Use that value in another request to return the next batch of results.

If the retrieved data contains images, the response also returns the following response headers, which contain metadata for source chunks returned in the response:
+ `x-amz-bedrock-kb-byte-content-source` – Contains the Amazon S3 URI of the image.
+ `x-amz-bedrock-kb-description` – Contains the base64-encoded string for the image.

**Note**
You can't filter on these metadata response headers when [configuring metadata filters](kb-test-config.md).

**Multimodal queries**

For knowledge bases using multimodal embedding models, you can query with either text or images.
The `retrievalQuery` field supports a `multimodalInputList` field for image queries.

**Note**
For comprehensive guidance on setting up and working with multimodal knowledge bases, including choosing between Nova and BDA approaches, see [Build a knowledge base for multimodal content](kb-multimodal.md).

You can query with images by using the `multimodalInputList` field:

```
{
    "knowledgeBaseId": "EXAMPLE123",
    "retrievalQuery": {
        "multimodalInputList": [
            {
                "content": {
                    "byteContent": "base64-encoded-image-data"
                },
                "modality": "IMAGE"
            }
        ]
    }
}
```

Or you can query with text only by using the `text` field:

```
{
    "knowledgeBaseId": "EXAMPLE123",
    "retrievalQuery": {
        "text": "Find similar shoes"
    }
}
```

**Common multimodal query patterns**

Following are some common query patterns:
+ **Image-to-image search:** Upload an image to find visually similar images. Example: Upload a photo of a red Nike shoe to find similar shoes in your product catalog.
+ **Text-based search:** Use text queries to find relevant content. Example: "Find similar shoes" to search your product catalog using text descriptions.
+ **Visual document search:** Search for charts, diagrams, or visual elements within documents. Example: Upload a chart image to find similar charts in your document collection.

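The two request shapes above can be produced programmatically. The following Python sketch is one way to do it, assuming the JSON body is sent as shown, with the image supplied as a base64 string. The function names are hypothetical, and the placeholder bytes stand in for a real image file. If you call the API through an SDK, check whether it expects raw bytes instead of a pre-encoded string.

```python
import base64
import json


def build_image_query(knowledge_base_id, image_bytes):
    """Return a request body for an image-to-image search with a single image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {
            "multimodalInputList": [
                {"content": {"byteContent": encoded}, "modality": "IMAGE"}
            ]
        },
    }


def build_text_query(knowledge_base_id, text):
    """Return a request body for a text-based search."""
    return {"knowledgeBaseId": knowledge_base_id, "retrievalQuery": {"text": text}}


# Placeholder bytes; in practice, read them with open(path, "rb").read().
body = build_image_query("EXAMPLE123", b"\x89PNG placeholder bytes")
print(json.dumps(body, indent=2))
```

Keeping the two builders separate mirrors the patterns listed above: one request carries either an image or a text query, and the rest of the request stays the same.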
**Choosing between Nova and BDA for multimodal content**

When working with multimodal content, choose your approach based on your content type and query patterns:

**Nova vs BDA Decision Matrix**

| Content Type | Use Nova Multimodal Embeddings | Use Bedrock Data Automation (BDA) Parser |
| --- | --- | --- |
| Video Content | Visual storytelling focus (sports, ads, demonstrations), queries on visual elements, minimal speech content | Important speech/narration (presentations, meetings, tutorials), queries on spoken content, need transcripts |
| Audio Content | Music or sound effects identification, non-speech audio analysis | Podcasts, interviews, meetings, any content with speech requiring transcription |
| Image Content | Visual similarity searches, image-to-image retrieval, visual content analysis | Text extraction from images, document processing, OCR requirements |

**Note**
Nova multimodal embeddings cannot process speech content directly. If your audio or video files contain important spoken information, use the BDA parser to convert speech to text first, or choose a text embedding model instead.

**Multimodal query limitations**

Following are some limitations with multimodal queries:
+ Maximum of one image per query in the current release
+ Image queries are only supported with multimodal embedding models (Titan G1 or Cohere Embed v3)
+ RetrieveAndGenerate API is not supported for knowledge bases with multimodal embedding models and S3 content buckets
+ If you provide an image query to a knowledge base using text-only embedding models, a 4xx error will be returned

**Multimodal API response structure**

Retrieval responses for multimodal content include additional metadata:
+ **Source URI:** Points to your original S3 bucket location
+ **Supplemental URI:** Points to the copy in your multimodal storage bucket
+ **Timestamp metadata:** Included for video and audio chunks to enable precise playback positioning

**Note**
When using the API or SDK, you'll need to handle file retrieval and timestamp navigation in your application. The console handles this automatically with enhanced video playback and automatic timestamp navigation.

------

**Note**
If you receive an error that the prompt exceeds the character limit while generating responses, you can shorten the prompt in the following ways:
+ Reduce the maximum number of retrieved results (this shortens what is filled in for the `$search_results$` placeholder in the [Knowledge base prompt templates: orchestration & generation](kb-test-config.md#kb-test-config-prompt-template)).
+ Recreate the data source with a chunking strategy that uses smaller chunks (this shortens what is filled in for the `$search_results$` placeholder in the [Knowledge base prompt templates: orchestration & generation](kb-test-config.md#kb-test-config-prompt-template)).
+ Shorten the prompt template.
+ Shorten the user query (this shortens what is filled in for the `$query$` placeholder in the [Knowledge base prompt templates: orchestration & generation](kb-test-config.md#kb-test-config-prompt-template)).

# Query a knowledge base and generate responses based off the retrieved data

**Important**
Guardrails are applied only to the input and the generated response from the LLM. They are not applied to the references retrieved from Knowledge Bases at runtime.

After your knowledge base is set up, you can query it and generate responses based on the chunks retrieved from your source data by using the [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) API operation. The responses are returned with citations to the original source data. You can also [use a reranking model](rerank.md) instead of the default Amazon Bedrock Knowledge Bases ranker to rank source chunks for relevance during retrieval.

**Multimodal content limitations**
`RetrieveAndGenerate` has limited support for multimodal content. When using Nova Multimodal Embeddings, RAG functionality is restricted to text content only. For full multimodal support including audio and video processing, use BDA with text embedding models. For details, see [Build a knowledge base for multimodal content](kb-multimodal.md).

**Note**
Images returned from the `Retrieve` response during the `RetrieveAndGenerate` flow are included in the prompt for response generation. The `RetrieveAndGenerate` response can't include images, but it can cite the sources that contain the images.

To learn how to query your knowledge base, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To test your knowledge base**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. In the left navigation pane, choose **Knowledge bases**.

1. In the **Knowledge bases** section, do one of the following actions:
   + Choose the radio button next to the knowledge base you want to test and select **Test knowledge base**. A test window expands from the right.
   + Choose the knowledge base that you want to test. A test window expands from the right.

1. To generate responses based on information retrieved from your knowledge base, turn on **Generate responses for your query**. Amazon Bedrock generates responses based on your data sources and cites the information it provides with footnotes.

1. To choose a model to use for response generation, choose **Select model**. Then select **Apply**.

1. (Optional) Select the configurations icon (![\[Three horizontal sliders with adjustable circular controls for settings or parameters.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/configurations.png)) to open up **Configurations**. For information about configurations, see [Configure and customize queries and response generation](kb-test-config.md).

1. Enter a query in the text box in the chat window and select **Run** to return responses from the knowledge base.

1. Select a footnote to see an excerpt from the cited source for that part of the response. Choose the link to navigate to the S3 object containing the file.

1. To see details about the returned chunks, select **Show source details**.
   + To see the configurations that you set for the query, expand **Query configurations**.
   + To view details about a source chunk, expand it by choosing the right arrow (![\[Play button icon with a triangular shape pointing to the right.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/caret-right-filled.png)) next to it. You can see the following information:
     + The raw text from the source chunk. To copy this text, choose the copy icon (![\[Icon representing a crop or resize function, with two overlapping rectangles.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/copy.png)).
       If you used Amazon S3 to store your data, choose the external link icon (![\[Icon of a square with an arrow pointing outward from its top-right corner.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/external.png)) to navigate to the S3 object containing the file.
     + The metadata associated with the source chunk, if you used Amazon S3 to store your data. The attribute/field keys and values are defined in the `.metadata.json` file that's associated with the source document. For more information, see the **Metadata and filtering** section in [Configure and customize queries and response generation](kb-test-config.md).

**Chat options**
+ To use a different model for response generation, select **Change model**. If you change the model, the text in the chat window will be completely cleared.
+ Switch to retrieving source chunks directly by clearing **Generate responses**. If you change the setting, the text in the chat window will be completely cleared.
+ To clear the chat window, select the broom icon (![\[Magnifying glass icon with a checkmark inside, symbolizing search or inspection.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/broom.png)).
+ To copy all the output in the chat window, select the copy icon (![\[Icon representing a crop or resize function, with two overlapping rectangles.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/copy.png)).

------
#### [ API ]

To query a knowledge base and use a foundation model to generate responses based off the results from the data sources, send a [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request with an [Agents for Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-rt).

The [RetrieveAndGenerateStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateStream.html) API returns data in a streaming format and allows you to access the generated responses in chunks without waiting for the entire result.

**Note**
The API response contains citation events. The `citation` member has been deprecated. We recommend that you use the `generatedResponse` and `retrievedReferences` fields instead. For reference, see [CitationEvent](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_CitationEvent.html).

The following fields are required:

| Field | Basic description |
| --- | --- |
| input | Contains a text field to specify the query. |
| retrieveAndGenerateConfiguration | Contains a [RetrieveAndGenerateConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateConfiguration.html), which specifies configurations for retrieval and generation. See below for more details. |

The following fields are optional:

| Field | Use case |
| --- | --- |
| sessionId | Use the same value as a previous session to continue that session and maintain context from it for the model. |
| sessionConfiguration | To include a custom KMS key for encryption of the session. |

Include the `knowledgeBaseConfiguration` field in the [RetrieveAndGenerateConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateConfiguration.html).
This field maps to a [KnowledgeBaseRetrieveAndGenerateConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrieveAndGenerateConfiguration.html) object, which contains the following fields: + The following fields are required: **** [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/bedrock/latest/userguide/kb-test-retrieve-generate.html) + The following fields are optional: **** [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/bedrock/latest/userguide/kb-test-retrieve-generate.html) You can use a reranking model over the default Amazon Bedrock Knowledge Bases ranking model by including the `rerankingConfiguration` field in the [KnowledgeBaseVectorSearchConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseVectorSearchConfiguration.html) within the [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html). The `rerankingConfiguration` field maps to a [VectorSearchRerankingConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_VectorSearchRerankingConfiguration.html) object, in which you can specify the reranking model to use, any additional request fields to include, metadata attributes to filter out documents during reranking, and the number of results to return after reranking. For more information, see [VectorSearchRerankingConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_VectorSearchRerankingConfiguration.html). 
**Note**
If the `numberOfRerankedResults` value that you specify is greater than the `numberOfResults` value in the [KnowledgeBaseVectorSearchConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseVectorSearchConfiguration.html), the maximum number of results that will be returned is the value for `numberOfResults`. An exception is if you use query decomposition (for more information, see the **Query modifications** section in [Configure and customize queries and response generation](kb-test-config.md)). If you use query decomposition, the `numberOfRerankedResults` can be up to five times the `numberOfResults`.

The response returns the generated response in the `output` field and the cited source chunks as an array in the `citations` field. Each [Citation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Citation.html) object contains the following fields.

| Field | Basic description |
| --- | --- |
| generatedResponsePart | In the textResponsePart field, the text that the citation pertains to is included. The span field provides the indexes for the beginning and end of the part of the output that has a citation. |
| retrievedReferences | An array of [RetrievedReference](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievedReference.html) objects, each of which contains the content of a source chunk, metadata associated with the document, and the URI or URL location of the document in the data source. If the content is an image, the data URI of the base64-encoded content is returned in the following format: `data:image/jpeg;base64,${base64-encoded string}`. |

The response also returns a `sessionId` value, which you can reuse in another request to maintain the same conversation.

If you included a `guardrailConfiguration` in the request, the `guardrailAction` field informs you whether the content was blocked.

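To make the request and response shapes above more concrete, here is a minimal Python sketch. It assumes that `knowledgeBaseConfiguration` takes a `knowledgeBaseId` and a `modelArn` and that the configuration `type` is `KNOWLEDGE_BASE`; those required fields are not listed on this page, so confirm them in the API reference. The function names and the sample response used in the usage note are hypothetical.

```python
def build_rag_request(knowledge_base_id, model_arn, query_text, session_id=None):
    """Build keyword arguments for a RetrieveAndGenerate call."""
    request = {
        "input": {"text": query_text},
        "retrieveAndGenerateConfiguration": {
            # Assumed values; see RetrieveAndGenerateConfiguration.
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }
    if session_id:
        # Reuse a sessionId from an earlier response to continue that conversation.
        request["sessionId"] = session_id
    return request


def summarize_citations(response):
    """Pair each cited part of the output with the locations of its sources."""
    summary = []
    for citation in response.get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        span = part.get("span", {})
        summary.append({
            "text": part.get("text", ""),
            "span": (span.get("start"), span.get("end")),
            "locations": [
                ref.get("location")
                for ref in citation.get("retrievedReferences", [])
            ],
        })
    return summary
```

With a boto3 `bedrock-agent-runtime` client, the request could be sent as `client.retrieve_and_generate(**build_rag_request(...))` and the returned dictionary passed to `summarize_citations`. The helper reads fields defensively with `.get` because a response might contain no citations at all.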
If the retrieved data contains images, the response also returns the following response headers, which contain metadata for source chunks returned in the response:
+ `x-amz-bedrock-kb-byte-content-source` – Contains the Amazon S3 URI of the image.
+ `x-amz-bedrock-kb-description` – Contains the base64-encoded string for the image.

**Note**
You can't filter on these metadata response headers when [configuring metadata filters](kb-test-config.md).

------

**Note**
If you receive an error that the prompt exceeds the character limit while generating responses, you can shorten the prompt in the following ways:
+ Reduce the maximum number of retrieved results (this shortens what is filled in for the `$search_results$` placeholder in the [Knowledge base prompt templates: orchestration & generation](kb-test-config.md#kb-test-config-prompt-template)).
+ Recreate the data source with a chunking strategy that uses smaller chunks (this shortens what is filled in for the `$search_results$` placeholder in the [Knowledge base prompt templates: orchestration & generation](kb-test-config.md#kb-test-config-prompt-template)).
+ Shorten the prompt template.
+ Shorten the user query (this shortens what is filled in for the `$query$` placeholder in the [Knowledge base prompt templates: orchestration & generation](kb-test-config.md#kb-test-config-prompt-template)).

# Generate a query for structured data

When you connect a structured data store to your knowledge base, your knowledge base can query it by converting the natural language query provided by the user into an SQL query, based on the structure of the data source being queried. When you use:
+ [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html): The response returns the result of the SQL query execution.
+ [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html): The generated response is based on the result of the SQL query execution.
+ [GenerateQuery](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerateQuery.html): Amazon Bedrock Knowledge Bases decouples the conversion of the query from the retrieval process. You can use this API operation to transform a query into SQL.

## Using the `GenerateQuery` API

You can use the response from the `GenerateQuery` API operation with a subsequent `Retrieve` or `RetrieveAndGenerate` action, or insert it into other workflows. `GenerateQuery` transforms natural language queries into SQL queries by taking into consideration the structure of your knowledge base's data source.

To turn a natural language query into an SQL query, submit a [GenerateQuery](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerateQuery.html) request with an [Agents for Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-rt). The `GenerateQuery` request contains the following fields:
+ queryGenerationInput – Specify `TEXT` as the `type` and include the query in the `text` field.
**Note**
Queries must be written in English.
+ transformationConfiguration – Specify `TEXT_TO_SQL` as the `mode`. In the `textToSqlConfiguration` field, specify `KNOWLEDGE_BASE` as the `type`. Then, specify the ARN of the knowledge base.

The response returns an array containing a [GeneratedQuery](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GeneratedQuery.html) object in the `queries` field. The object contains the SQL query generated for the input query in the `sql` field.

## Key considerations

The following are some key considerations when generating a query using structured data.
+ **Cross-Region inference and structured data retrieval**

  Structured data retrieval uses cross-Region inference to select the optimal AWS Region within your geography to process your inference request. This doesn't incur any additional charges, and improves customer experience by maximizing available resources and model availability.

  Cross-Region inference requests are kept within the AWS Regions that are part of the geography where the data originally resides. Your data remains stored within the source Region but the input prompts and output results might move outside of this Region. All data will be transmitted encrypted across Amazon’s secure network.

  For more information, see [Increase throughput with cross-Region inference](cross-region-inference.md).
+ **Accuracy of generated SQL queries**

  The accuracy of a generated SQL query can vary depending on context, table schemas, and the intent of a user query. Evaluate the generated queries to ensure that they suit your use case before using them in your workload.
+ **Number of retrieved results**

  The following limitations apply when generating the response.
  + When using the `InvokeAgent`, `RetrieveAndGenerate`, and `RetrieveAndGenerateStream` API operations, only 10 retrieved results are used when generating the response.
  + When using the `InvokeAgent` API, if there are more than 10 rows of retrieved results, the total number of retrieved rows is not passed to the agent for generating the response. If you use the `RetrieveAndGenerate` API instead, the total number of rows is included in the prompt for generating the final response.
+ **`GenerateQuery` API quota**

  The `GenerateQuery` API has a quota of 2 requests per second.

## Grant a role permissions to access generated queries

For your knowledge base that's connected to a structured data source, if you want to perform additional operations on the generated queries, you must grant permissions to perform the `GenerateQuery` API action.

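For context on what that permission covers, the following Python sketch builds a `GenerateQuery` request from the fields described under "Using the `GenerateQuery` API" and pulls the SQL text out of a response. The nesting of the ARN under `knowledgeBaseConfiguration.knowledgeBaseArn` is an assumption written from memory of the API reference, and the function names and sample values are hypothetical.

```python
def build_generate_query_request(knowledge_base_arn, question):
    """Build keyword arguments for a GenerateQuery call (question must be in English)."""
    return {
        "queryGenerationInput": {"type": "TEXT", "text": question},
        "transformationConfiguration": {
            "mode": "TEXT_TO_SQL",
            "textToSqlConfiguration": {
                "type": "KNOWLEDGE_BASE",
                # Assumed key names for the knowledge base ARN.
                "knowledgeBaseConfiguration": {"knowledgeBaseArn": knowledge_base_arn},
            },
        },
    }


def extract_sql(response):
    """Return the SQL strings from the queries array of a GenerateQuery response."""
    return [query["sql"] for query in response.get("queries", []) if "sql" in query]
```

With a boto3 `bedrock-agent-runtime` client, the call might look like `extract_sql(client.generate_query(**build_generate_query_request(arn, "How many orders shipped last week?")))`. Because the accuracy of generated SQL can vary, review the returned statements before running them in your workload.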
To allow your IAM role to query a knowledge base connected to a structured data store, attach the following policy to the role: ------ #### [ JSON ] **** ``` { "Version":"2012-10-17", "Statement": [ { "Sid": "GetKB", "Effect": "Allow", "Action": [ "bedrock:GetKnowledgeBase" ], "Resource": [ "arn:aws:bedrock:us-east-1:123456789012:knowledge-base/KnowledgeBaseId" ] }, { "Sid": "GenerateQueryAccess", "Effect": "Allow", "Action": [ "bedrock:GenerateQuery", "sqlworkbench:GetSqlRecommendations" ], "Resource": "*" }, { "Sid": "Retrieve", "Effect": "Allow", "Action": [ "bedrock:Retrieve" ], "Resource": [ "arn:aws:bedrock:us-east-1:123456789012:knowledge-base/KnowledgeBaseId" ] }, { "Sid": "RetrieveAndGenerate", "Effect": "Allow", "Action": [ "bedrock:RetrieveAndGenerate" ], "Resource": [ "*" ] } ] } ``` ------ You can remove statements that you don't need, depending on your use case: + The `GetKB` and `GenerateQuery` statements are required to call [GenerateQuery](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerateQuery.html) to generate SQL queries that take into account user queries and your connected data source. + The `Retrieve` statement is required to call [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) to retrieve data from your structured data store. + The `RetrieveAndGenerate` statement is required to call [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) to retrieve data from your structured data store and generate responses based off the data. # Query a knowledge base connected to an Amazon Kendra GenAI index You can query a knowledge base that uses an Amazon Kendra GenAI index, and return only relevant text from data sources. 
For this query, send a [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) request with an [Agents for Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-rt), as you would with a standard knowledge base. The structure of a response returned from a knowledge base with an Amazon Kendra GenAI index is the same as a standard [KnowledgeBaseRetrievalResult](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalResult.html). However, the response also includes a few additional fields from Amazon Kendra. The following table describes the fields from Amazon Kendra that you might see in a returned response. Amazon Bedrock gets these fields from the Amazon Kendra response. If that response doesn't contain these fields, then the returned query result from Amazon Bedrock won't have these fields either. | Field | Description | | --- | --- | | x-amz-kendra-document-title | The title of the returned document. | | x-amz-kendra-score-confidence | A relative ranking of how relevant the response is to the query. Possible values are VERY_HIGH, HIGH, MEDIUM, LOW, and NOT_AVAILABLE. | | x-amz-kendra-passage-id | The ID of the returned passage. | | x-amz-kendra-document-id | The ID of the returned document. | | DocumentAttributes | Document attributes or metadata fields from Amazon Kendra. The returned query result from the knowledge base stores these as metadata key-value pairs. You can filter the results with metadata filtering from Amazon Bedrock. For more information, see [DocumentAttribute](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttribute.html). | # Query a knowledge base connected to an Amazon Neptune Analytics graph You can query a knowledge base that uses an Amazon Neptune Analytics graph, and return only relevant text from data sources.
For this query, send a [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) request with an [Agents for Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-rt), as you would with a standard knowledge base. For information about querying a knowledge base, retrieving data, and generating responses, see: + [Query a knowledge base and retrieve data](kb-test-retrieve.md) + [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md) The structure of a response returned from a knowledge base with an Amazon Neptune Analytics graph is the same as a standard [KnowledgeBaseRetrievalResult](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalResult.html). However, the response also includes a few additional fields from Amazon Neptune. The following table describes the fields from Neptune Analytics that you might see in a returned response. Amazon Bedrock gets these fields from the Neptune Analytics response. If that response doesn't contain these fields, then the returned query result from Amazon Bedrock won't have these fields either. | Field | Description | | --- | --- | | x-amz-bedrock-kb-source-uri | The Amazon S3 URL of the returned document. | | score | A distance measure that indicates how closely a response matches the provided query, where lower values indicate better matches. | | x-amz-bedrock-kb-data-source-id | The ID of the data source used for the knowledge base. | | x-amz-bedrock-kb-chunk-id | The ID of the chunk that was used to retrieve the information for the query and generate the response. | | DocumentAttributes | Document attributes or metadata fields from Amazon Kendra. The returned query result from the knowledge base stores these as metadata key-value pairs.
You can filter the results with metadata filtering from Amazon Bedrock. | ## Using metadata and filtering When you query the knowledge base and generate responses, you can filter on metadata to find more relevant documents. For example, you can filter based on the publication date of the document. You can use the Amazon Bedrock console or the runtime API [RetrievalFilter](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html) for this purpose, which can specify some general filter conditions. The following are some considerations for using the `RetrievalFilter` API for Neptune Analytics graphs. + The `startsWith` and `listContains` filters are not supported. + The list variant of the `stringContains` filter is not supported. The following shows an example: ``` "vectorSearchConfiguration": { "numberOfResults": 5, "filter": { "orAll": [ { "andAll": [ { "equals": { "key": "genre", "value": "entertainment" } }, { "greaterThan": { "key": "year", "value": 2018 } } ] }, { "andAll": [ { "startsWith": { "key": "author", "value": "C" } } ] } ] } } ``` # Configure and customize queries and response generation You can configure and customize retrieval and response generation, further improving the relevancy of responses. For example, you can apply filters to document metadata fields/attributes to use the most recently updated documents or documents with recent modification times. **Note** All of the following configurations, except for **Orchestration and generation**, are only applicable to unstructured data sources. To learn more about these configurations in the console or the API, select from the following topics: ## Number of source chunks When you query a knowledge base, Amazon Bedrock returns up to five results in the response by default. Each result corresponds to a source chunk.
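As a rough illustration of changing that default, the following Python sketch builds a `Retrieve` request body that asks for a different maximum number of results. The helper names `retrieval_configuration` and `retrieve_request`, the knowledge base ID, and the query text are hypothetical. The sketch only checks that the value is a positive integer, so consult the API reference for the actual accepted range.

```python
import json


def retrieval_configuration(number_of_results: int) -> dict:
    """Build a retrievalConfiguration that caps the number of returned source chunks."""
    if isinstance(number_of_results, bool) or not isinstance(number_of_results, int):
        raise TypeError("number_of_results must be an integer")
    if number_of_results < 1:
        raise ValueError("number_of_results must be at least 1")
    return {"vectorSearchConfiguration": {"numberOfResults": number_of_results}}


def retrieve_request(knowledge_base_id: str, query_text: str, number_of_results: int) -> dict:
    """Combine the query and the retrieval configuration into one request dictionary."""
    return {
        "knowledgeBaseId": knowledge_base_id,  # placeholder ID supplied by the caller
        "retrievalQuery": {"text": query_text},
        "retrievalConfiguration": retrieval_configuration(number_of_results),
    }


example = retrieve_request("KnowledgeBaseId", "What is our refund policy?", 10)
print(json.dumps(example, indent=2))
```

Keep in mind that the value is an upper bound, so a response can legitimately contain fewer results than requested.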
**Note** The actual number of results in the response might be less than the specified `numberOfResults` value, since this parameter sets the maximum number of results to return. If you have configured hierarchical chunking for your chunking strategy, the `numberOfResults` parameter maps to the number of child chunks that the knowledge base will retrieve. Since child chunks that share the same parent chunk are replaced with the parent chunk in the final response, the number of results returned might be less than the requested amount. To modify the maximum number of results to return, choose the tab for your preferred method, and then follow the steps: ------ #### [ Console ] Follow the console steps at [Query a knowledge base and retrieve data](kb-test-retrieve.md) or [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). In the **Configurations** pane, expand the **Source chunks** section and enter the maximum number of source chunks to return. ------ #### [ API ] When you make a [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) or [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request, include a `retrievalConfiguration` field, mapped to a [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html) object. 
To see the location of this field, refer to the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) and [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request bodies in the API reference. The following JSON object shows the minimal fields required in the [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html) object to set the maximum number of results to return: ``` "retrievalConfiguration": { "vectorSearchConfiguration": { "numberOfResults": number } } ``` Specify the maximum number of retrieved results (see the `numberOfResults` field in [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html) for the range of accepted values) to return in the `numberOfResults` field. ------ ## Search type The search type defines how data sources in the knowledge base are queried. The following search types are possible: **Note** Hybrid search is only supported for Amazon RDS, Amazon OpenSearch Serverless, and MongoDB vector stores that contain a filterable text field. If you use a different vector store or your vector store doesn't contain a filterable text field, the query uses semantic search. + **Default** – Amazon Bedrock decides the search strategy for you. + **Hybrid** – Combines searching vector embeddings (semantic search) with searching through the raw text. + **Semantic** – Only searches vector embeddings. 
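The three options above can be sketched as a small Python helper that either sets `overrideSearchType` or leaves it out so that Amazon Bedrock chooses the strategy. The function name `search_type_configuration` is hypothetical, and the sketch assumes that omitting the field is how the default behavior is requested.

```python
from typing import Optional

# The two values that can be sent in overrideSearchType.
VALID_SEARCH_TYPES = {"HYBRID", "SEMANTIC"}


def search_type_configuration(search_type: Optional[str] = None) -> dict:
    """Build a retrievalConfiguration for the requested search type.

    Passing None omits overrideSearchType, which leaves the choice to Amazon Bedrock.
    """
    vector_search = {}
    if search_type is not None:
        normalized = search_type.upper()
        if normalized not in VALID_SEARCH_TYPES:
            raise ValueError(f"search_type must be one of {sorted(VALID_SEARCH_TYPES)}")
        vector_search["overrideSearchType"] = normalized
    return {"vectorSearchConfiguration": vector_search}


print(search_type_configuration("hybrid"))
print(search_type_configuration())
```

If the vector store doesn't support hybrid search, the query falls back to semantic search as described in the note, so a caller can't rely on `HYBRID` always being honored.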
To learn how to define the search type, choose the tab for your preferred method, and then follow the steps: ------ #### [ Console ] Follow the console steps at [Query a knowledge base and retrieve data](kb-test-retrieve.md) or [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). When you open the **Configurations** pane, expand the **Search type** section, turn on **Override default search**, and select an option. ------ #### [ API ] When you make a [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) or [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request, include a `retrievalConfiguration` field, mapped to a [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html) object. To see the location of this field, refer to the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) and [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request bodies in the API reference. 
The following JSON object shows the minimal fields required in the [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html) object to set search type configurations: ``` "retrievalConfiguration": { "vectorSearchConfiguration": { "overrideSearchType": "HYBRID | SEMANTIC" } } ``` Specify the search type in the `overrideSearchType` field. You have the following options: + If you don't specify a value, Amazon Bedrock decides which search strategy is best suited for your vector store configuration. + **HYBRID** – Amazon Bedrock queries the knowledge base using both the vector embeddings and the raw text. + **SEMANTIC** – Amazon Bedrock queries the knowledge base using its vector embeddings. ------ ## Streaming ------ #### [ Console ] Follow the console steps at [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). When you open the **Configurations** pane, expand the **Streaming preference** section and turn on **Stream response**. ------ #### [ API ] To stream responses, use the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateStream.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateStream.html) API. For more details about filling out the fields, see the **API** tab at [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). ------ ## Manual metadata filtering You can apply filters to document fields/attributes to help you further improve the relevancy of responses. Your data sources can include document metadata attributes/fields to filter on and can specify which fields to include in the embeddings. For example, "epoch_modification_time" represents the time, in seconds since January 1, 1970 (UTC), when the document was last updated.
You can filter on the most recent data, where "epoch_modification_time" is *greater than* a certain number. These most recent documents can be used for the query. To use filters when querying a knowledge base, check that your knowledge base fulfills the following requirements: + When configuring your data source connector, most connectors crawl the main metadata fields of your documents. If you're using an Amazon S3 bucket as your data source, the bucket must include at least one `fileName.extension.metadata.json` for the file or document it's associated with. See **Document metadata fields** in [Connection configuration](s3-data-source-connector.md#configuration-s3-connector) for more information about configuring the metadata file. + If your knowledge base's vector index is in an Amazon OpenSearch Serverless vector store, check that the vector index is configured with the `faiss` engine. If the vector index is configured with the `nmslib` engine, you'll have to do one of the following: + [Create a new knowledge base](knowledge-base-create.md) in the console and let Amazon Bedrock automatically create a vector index in Amazon OpenSearch Serverless for you. + [Create another vector index](knowledge-base-setup.md) in the vector store and select `faiss` as the **Engine**. Then [Create a new knowledge base](knowledge-base-create.md) and specify the new vector index. + If your knowledge base uses a vector index in an S3 vector bucket, you cannot use the `startsWith` and `stringContains` filters. + If you're adding metadata to an existing vector index in an Amazon Aurora database cluster, we recommend that you provide the field name of the custom metadata column to store all your metadata in a single column. During [data ingestion](kb-data-source-sync-ingest.md), this column will be used to populate all the information in your metadata files from your data sources. If you choose to provide this field, you must create an index on this column.
+ When you [create a new knowledge base](knowledge-base-create.md) in the console and let Amazon Bedrock configure your Amazon Aurora database, it will automatically create a single column for you and populate it with the information from your metadata files. + When you choose to [create another vector index](knowledge-base-setup.md) in the vector store, you must provide the custom metadata field name to store information from your metadata files. If you don't provide this field name, you must create a column for each metadata attribute in your files and specify the data type (text, number, or boolean). For example, if the attribute `genre` exists in your data source, you would add a column named `genre` and specify `text` as the data type. During ingestion, these separate columns will be populated with the corresponding attribute values. If you have PDF documents in your data source and use Amazon OpenSearch Serverless for your vector store, Amazon Bedrock knowledge bases will generate document page numbers and store them in a metadata field/attribute called *x-amz-bedrock-kb-document-page-number*. Note that storing page numbers in a metadata field isn't supported if you choose no chunking for your documents.
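To make the recency example concrete, the following Python sketch builds a filter that keeps only documents whose `epoch_modification_time` is later than a cutoff, and optionally combines it with other conditions. The helper names `modified_since_filter`, `equals_filter`, and `and_all` are hypothetical, and the sketch assumes that your metadata stores `epoch_modification_time` as a number of seconds.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def modified_since_filter(days: int, now: Optional[datetime] = None) -> dict:
    """Match documents last modified within the past `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = int((now - timedelta(days=days)).timestamp())  # seconds since the epoch
    return {"greaterThan": {"key": "epoch_modification_time", "value": cutoff}}


def equals_filter(key: str, value) -> dict:
    """Match documents whose metadata attribute `key` equals `value`."""
    return {"equals": {"key": key, "value": value}}


def and_all(*filters: dict) -> dict:
    """Require every condition; a single condition is returned unchanged."""
    if not filters:
        raise ValueError("at least one filter is required")
    if len(filters) > 5:
        raise ValueError("a group can contain at most 5 filters")
    if len(filters) == 1:
        return filters[0]
    return {"andAll": list(filters)}


recent_entertainment = and_all(
    modified_since_filter(30),
    equals_filter("genre", "entertainment"),
)
print(json.dumps({"vectorSearchConfiguration": {"filter": recent_entertainment}}, indent=2))
```

Computing the cutoff at request time keeps the filter relative to the current date instead of hard-coding a timestamp that goes stale.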
You can use the following filtering operators to filter results when you query: **Filtering operators** | Operator | Console | API filter name | Supported attribute data types | Filtered results | | --- | --- | --- | --- | --- | | Equals | = | [equals](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-equals) | string, number, boolean | Attribute matches the value you provide | | Not equals | != | [notEquals](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-notEquals) | string, number, boolean | Attribute doesn’t match the value you provide | | Greater than | > | [greaterThan](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-greaterThan) | number | Attribute is greater than the value you provide | | Greater than or equals | >= | [greaterThanOrEquals](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-greaterThanOrEquals) | number | Attribute is greater than or equal to the value you provide | | Less than | < | [lessThan](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-lessThan) | number | Attribute is less than the value you provide | | Less than or equals | <= | [lessThanOrEquals](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-lessThanOrEquals) | number | Attribute is less than or equal to the value you provide | | In | : | [in](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-in) | string list | Attribute is in the list you provide (currently best
supported with Amazon OpenSearch Serverless and Neptune Analytics GraphRAG vector stores) | | Not in | !: | [notIn](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-notIn) | string list | Attribute isn’t in the list you provide (currently best supported with Amazon OpenSearch Serverless and Neptune Analytics GraphRAG vector stores) | | String contains | Not available | [stringContains](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-stringContains) | string | Attribute must be a string. The attribute name matches the key, and its value is a string that contains the value you provide as a substring, or a list with a member that contains the value you provide as a substring (currently best supported with Amazon OpenSearch Serverless vector stores; the Neptune Analytics GraphRAG vector store supports the string variant but not the list variant of this filter). | | List contains | Not available | [listContains](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-listContains) | string | Attribute must be a string list. The attribute name matches the key, and its value is a list that contains the value you provide as one of its members (currently best supported with Amazon OpenSearch Serverless vector stores).
| To combine filtering operators, you can use the following logical operators: **Logical operators** | Operator | Console | API filter field name | Filtered results | | --- | --- | --- | --- | | And | and | [andAll](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-andAll) | Results fulfill all of the filtering expressions in the group | | Or | or | [orAll](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrievalFilter.html#bedrock-Type-agent-runtime_RetrievalFilter-orAll) | Results fulfill at least one of the filtering expressions in the group | To learn how to filter results using metadata, choose the tab for your preferred method, and then follow the steps: ------ #### [ Console ] Follow the console steps at [Query a knowledge base and retrieve data](kb-test-retrieve.md) or [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). When you open the **Configurations** pane, you'll see a **Filters** section. The following procedures describe different use cases: + To add a filter, create a filtering expression by entering a metadata attribute, filtering operator, and value in the box. Separate each part of the expression with a space. Press **Enter** to add the filter. For a list of accepted filtering operators, see the **Filtering operators** table above. You can also see a list of filtering operators when you add a space after the metadata attribute. **Note** You must surround strings with quotation marks. For example, you can filter for results from source documents that contain a `genre` metadata attribute whose value is `"entertainment"` by adding the following filter: **genre = "entertainment"**. ![\[Add one filter.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-one.png) + To add another filter, enter another filtering expression in the box and press **Enter**.
You can add up to 5 filters in the group. ![\[Add another filter.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-more.png) + By default, the query will return results that fulfill all the filtering expressions you provide. To return results that fulfill at least one of the filtering expressions, choose the **and** dropdown menu between any two filtering operations and select **or**. ![\[Change the logical operation between filters.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-logical.png) + To combine different logical operators, select **+ Add Group** to add a filter group. Enter filtering expressions in the new group. You can add up to 5 filter groups. ![\[Add a filter group to combine different logical operators.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-group.png) + To change the logical operator used between all the filtering groups, choose the **AND** dropdown menu between any two filter groups and select **OR**. ![\[Change the logical operation between filter groups.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-group-logical.png) + To edit a filter, select it, modify the filtering operation, and choose **Apply**. ![\[Edit a filter.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-edit.png) + To remove a filter group, choose the trash can icon (![\[Trash can icon.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/trash.png)) next to the group. To remove a filter, choose the delete icon (![\[Close or cancel icon represented by an "X" symbol.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/icons/close.png)) next to the filter.
![\[Delete a filter or filter group.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-delete.png) The following image shows an example filter configuration that returns all documents written after **2018** whose genre is **"entertainment"**, in addition to documents whose genre is **"cooking"** or **"sports"** and whose author starts with **"C"**. ![\[Example filter configuration.\]](http://docs.aws.amazon.com/bedrock/latest/userguide/images/kb/filter-example.png) ------ #### [ API ] When you make a [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) or [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request, include a `retrievalConfiguration` field, mapped to a [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html) object. To see the location of this field, refer to the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) and [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request bodies in the API reference. The following JSON objects show the minimal fields required in the [KnowledgeBaseRetrievalConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_KnowledgeBaseRetrievalConfiguration.html) object to set filters for different use cases: 1. Use one filtering operator (see the **Filtering operators** table above). 
``` "retrievalConfiguration": { "vectorSearchConfiguration": { "filter": { "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] } } } } ``` 1. Use a logical operator (see the **Logical operators** table above) to combine up to 5 filtering operators. ``` "retrievalConfiguration": { "vectorSearchConfiguration": { "filter": { "andAll | orAll": [ "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, ... ] } } } ``` 1. Use a logical operator to combine up to 5 filtering operators into a filter group, and a second logical operator to combine that filter group with another filtering operator. ``` "retrievalConfiguration": { "vectorSearchConfiguration": { "filter": { "andAll | orAll": [ "andAll | orAll": [ "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, ... ], "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] } ] } } } ``` 1. Combine up to 5 filter groups by embedding them within another logical operator. You can create one level of embedding. ``` "retrievalConfiguration": { "vectorSearchConfiguration": { "filter": { "andAll | orAll": [ "andAll | orAll": [ "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, ... ], "andAll | orAll": [ "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, "<filter-type>": { "key": "string", "value": "string" | number | boolean | ["string", "string", ...] }, ...
] ] } } } ``` The following table describes the filter types that you can use: **** | Field | Supported value data types | Filtered results | | --- | --- | --- | | equals | string, number, boolean | Attribute matches the value you provide | | notEquals | string, number, boolean | Attribute doesn't match the value you provide | | greaterThan | number | Attribute is greater than the value you provide | | greaterThanOrEquals | number | Attribute is greater than or equal to the value you provide | | lessThan | number | Attribute is less than the value you provide | | lessThanOrEquals | number | Attribute is less than or equal to the value you provide | | in | list of strings | Attribute is in the list you provide | | notIn | list of strings | Attribute isn't in the list you provide | | startsWith | string | Attribute starts with the string you provide (only supported for Amazon OpenSearch Serverless vector stores) | To combine filter types, you can use one of the following logical operators: **** | Field | Maps to | Filtered results | | --- | --- | --- | | andAll | List of up to 5 filter types | Results fulfill all of the filtering expressions in the group | | orAll | List of up to 5 filter types | Results fulfill at least one of the filtering expressions in the group | For examples, see [Send a query and include filters (Retrieve)](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html#API_agent-runtime_Retrieve_Example_2) and [Send a query and include filters (RetrieveAndGenerate)](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html#API_agent-runtime_RetrieveAndGenerate_Example_2). ------ ## Implicit metadata filtering Amazon Bedrock Knowledge Base generates and applies a retrieval filter based on the user query and a metadata schema. **Note** The feature currently only works with Anthropic Claude 3.5 Sonnet. 
The `implicitFilterConfiguration` is specified in the `vectorSearchConfiguration` of the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) request body. Include the following fields: + `metadataAttributes` – In this array, provide schemas describing metadata attributes that the model will generate a filter for. + `modelArn` – The ARN of the model to use. The following shows an example of metadata schemas that you can add to the array in `metadataAttributes`. ``` [ { "key": "company", "type": "STRING", "description": "The full name of the company. E.g. `Amazon.com, Inc.`, `Alphabet Inc.`, etc" }, { "key": "ticker", "type": "STRING", "description": "The ticker name of a company in the stock market, e.g. AMZN, AAPL" }, { "key": "pe_ratio", "type": "NUMBER", "description": "The price to earning ratio of the company. This is a measure of valuation of a company. The lower the pe ratio, the cheaper the company stock is considered." }, { "key": "is_us_company", "type": "BOOLEAN", "description": "Indicates whether the company is a US company." }, { "key": "tags", "type": "STRING_LIST", "description": "Tags of the company, indicating its main business. E.g. `E-commerce`, `Search engine`, `Artificial intelligence`, `Cloud computing`, etc" } ] ``` ## Guardrails You can implement safeguards for your knowledge base for your use cases and responsible AI policies. You can create multiple guardrails tailored to different use cases and apply them across multiple request and response conditions, providing a consistent user experience and standardizing safety controls across your knowledge base. You can configure denied topics to disallow undesirable topics and content filters to block harmful content in model inputs and responses. For more information, see [Detect and filter harmful content by using Amazon Bedrock Guardrails](guardrails.md).
**Note**
Using guardrails with contextual grounding for knowledge bases is currently not supported on Claude 3 Sonnet and Haiku.

For general prompt engineering guidelines, see [Prompt engineering concepts](prompt-engineering-guidelines.md).

Choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

Follow the console steps at [Query a knowledge base and retrieve data](kb-test-retrieve.md) or [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). In the test window, turn on **Generate responses**. Then, in the **Configurations** pane, expand the **Guardrails** section.

1. In the **Guardrails** section, choose the **Name** and the **Version** of your guardrail. If you would like to see the details for your chosen guardrail and version, choose **View**.

   Alternatively, you can create a new one by choosing the **Guardrail** link.

1. When you're finished editing, choose **Save changes**. To exit without saving, choose **Discard changes**.

------
#### [ API ]

When you make a [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request, include the `guardrailConfiguration` field within the `generationConfiguration` to use your guardrail with the request. To see the location of this field, refer to the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request body in the API reference.
The following JSON object shows the minimal fields required in the [GenerationConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerationConfiguration.html) to set the `guardrailConfiguration`:

```
"generationConfiguration": {
    "guardrailConfiguration": {
        "guardrailId": "string",
        "guardrailVersion": "string"
    }
}
```

Specify the `guardrailId` and `guardrailVersion` of your chosen guardrail.

------

## Reranking

You can use a reranker model to rerank results from a knowledge base query. Follow the console steps at [Query a knowledge base and retrieve data](kb-test-retrieve.md) or [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). When you open the **Configurations** pane, expand the **Reranking** section. Select a reranker model, update permissions if necessary, and modify any additional options. Enter a prompt and select **Run** to test the results after reranking.

## Query decomposition

Query decomposition is a technique used to break down complex queries into smaller, more manageable sub-queries. This approach can help in retrieving more accurate and relevant information, especially when the initial query is multifaceted or too broad. Enabling this option may result in multiple queries being executed against your knowledge base, which may aid in a more accurate final response.

For example, for a question like *“Who scored higher in the 2022 FIFA World Cup, Argentina or France?”*, Amazon Bedrock Knowledge Bases may first generate the following sub-queries before generating a final answer:

1. *How many goals did Argentina score in the 2022 FIFA World Cup final?*

1. *How many goals did France score in the 2022 FIFA World Cup final?*

------
#### [ Console ]

1. Create and sync a data source or use an existing knowledge base.

1. Go to the test window and open the configuration panel.

1. Enable query decomposition.
------
#### [ API ]

```
POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json

{
    "input": {
        "text": "string"
    },
    "retrieveAndGenerateConfiguration": {
        "knowledgeBaseConfiguration": {
            "orchestrationConfiguration": {
                // Query decomposition
                "queryTransformationConfiguration": {
                    "type": "string" // enum of QUERY_DECOMPOSITION
                }
            },
            ...
        }
    }
}
```

------

## Inference parameters

When generating responses based on retrieved information, you can use [inference parameters](inference-parameters.md) to gain more control over the model's behavior during inference and influence the model's outputs.

To learn how to modify the inference parameters, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To modify inference parameters when querying a knowledge base** – Follow the console steps at [Query a knowledge base and retrieve data](kb-test-retrieve.md) or [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). When you open the **Configurations** pane, you'll see an **Inference parameters** section. Modify the parameters as necessary.

**To modify inference parameters when chatting with your document** – Follow the steps at [Chat with your document without a knowledge base configured](knowledge-base-chatdoc.md). In the **Configurations** pane, expand the **Inference parameters** section and modify the parameters as necessary.

------
#### [ API ]

You provide the model parameters in the call to the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) API. You can customize the model by providing inference parameters in the `inferenceConfig` field of either the `knowledgeBaseConfiguration` (if you query a knowledge base) or the `externalSourcesConfiguration` (if you [chat with your document](knowledge-base-chatdoc.md)).
Within the `inferenceConfig` field is a `textInferenceConfig` field that contains the following parameters that you can set:
+ temperature
+ topP
+ maxTokens
+ stopSequences

These parameters are available in the `inferenceConfig` field of both `externalSourcesConfiguration` and `knowledgeBaseConfiguration`.

For a detailed explanation of the function of each of these parameters, see [Influence response generation with inference parameters](inference-parameters.md).

Additionally, you can provide custom parameters not supported by `textInferenceConfig` via the `additionalModelRequestFields` map. Use this map to provide parameters that are unique to specific models. For the unique parameters, see [Inference request parameters and response fields for foundation models](model-parameters.md).

If a parameter is omitted from `textInferenceConfig`, a default value is used. Any parameters not recognized in `textInferenceConfig` are ignored, while any parameters not recognized in `additionalModelRequestFields` cause an exception.

A validation exception is thrown if the same parameter appears in both `additionalModelRequestFields` and `textInferenceConfig`.

**Using model parameters in RetrieveAndGenerate**

The following is an example of the structure for `inferenceConfig` and `additionalModelRequestFields` under the `generationConfiguration` in the `RetrieveAndGenerate` request body:

```
"inferenceConfig": {
    "textInferenceConfig": {
        "temperature": 0.5,
        "topP": 0.5,
        "maxTokens": 2048,
        "stopSequences": ["\nObservation"]
    }
},
"additionalModelRequestFields": {
    "top_k": 50
}
```

The preceding example sets a `temperature` of 0.5, a `topP` of 0.5, and `maxTokens` of 2048, stops generation if it encounters the string "\nObservation" in the generated response, and passes a custom `top_k` value of 50.
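The rules above can also be checked on the client before a request is sent. The following Python sketch is illustrative only: the helper name `build_generation_config` is hypothetical, the set of accepted keys is taken from the list in this section, and the duplicate check compares exact key names, which is an assumption about how the service detects duplicates. Unlike the service, which ignores unrecognized `textInferenceConfig` parameters, this helper rejects them so that typos surface early.

```python
from typing import Any, Dict, Optional

# Keys listed in this section for textInferenceConfig (assumed complete here).
TEXT_INFERENCE_KEYS = {"temperature", "topP", "maxTokens", "stopSequences"}


def build_generation_config(
    text_inference: Dict[str, Any],
    additional_fields: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble the inferenceConfig and additionalModelRequestFields entries.

    Hypothetical helper: it only builds a dictionary and performs local checks.
    """
    additional_fields = additional_fields or {}

    # Stricter than the service: reject unknown keys instead of ignoring them.
    unknown = set(text_inference) - TEXT_INFERENCE_KEYS
    if unknown:
        raise ValueError(f"Unrecognized textInferenceConfig keys: {sorted(unknown)}")

    # The same parameter must not appear in both places.
    duplicates = set(text_inference) & set(additional_fields)
    if duplicates:
        raise ValueError(f"Parameters set in both places: {sorted(duplicates)}")

    config: Dict[str, Any] = {
        "inferenceConfig": {"textInferenceConfig": dict(text_inference)}
    }
    if additional_fields:
        config["additionalModelRequestFields"] = dict(additional_fields)
    return config


if __name__ == "__main__":
    import json

    example = build_generation_config(
        {"temperature": 0.5, "topP": 0.5, "maxTokens": 2048,
         "stopSequences": ["\nObservation"]},
        {"top_k": 50},
    )
    print(json.dumps(example, indent=4))
```

The returned dictionary mirrors the structure of the JSON example and could be merged into the `generationConfiguration` of a request body.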
------

## Knowledge base prompt templates: orchestration & generation

When you query a knowledge base and request response generation, Amazon Bedrock uses a prompt template that combines instructions and context with the user query to construct the generation prompt that's sent to the model for response generation. You can also customize the orchestration prompt, which turns the user's prompt into a search query. You can engineer the prompt templates with the following tools:
+ **Prompt placeholders** – Pre-defined variables in Amazon Bedrock Knowledge Bases that are dynamically filled in at runtime during a knowledge base query. In the system prompt, you'll see these placeholders surrounded by the `$` symbol. The following list describes the placeholders you can use:
**Note**
The `$output_format_instructions$` placeholder is a required field for citations to be displayed in the response.

  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/bedrock/latest/userguide/kb-test-config.html)
+ **XML tags** – Anthropic models support the use of XML tags to structure and delineate your prompts. Use descriptive tag names for optimal results. For example, in the default system prompt, you'll see an XML tag used to delineate a database of previously asked questions. For more information, see [Use XML tags](https://docs.anthropic.com/claude/docs/use-xml-tags) in the [Anthropic user guide](https://docs.anthropic.com/en/docs/welcome).

For general prompt engineering guidelines, see [Prompt engineering concepts](prompt-engineering-guidelines.md).

Choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

Follow the console steps at [Query a knowledge base and retrieve data](kb-test-retrieve.md) or [Query a knowledge base and generate responses based off the retrieved data](kb-test-retrieve-generate.md). In the test window, turn on **Generate responses**.
Then, in the **Configurations** pane, expand the **Knowledge base prompt template** section.

1. Choose **Edit**.

1. Edit the system prompt in the text editor, including prompt placeholders and XML tags as necessary. To revert to the default prompt template, choose **Reset to default**.

1. When you're finished editing, choose **Save changes**. To exit without saving the system prompt, choose **Discard changes**.

------
#### [ API ]

When you make a [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request, include a `generationConfiguration` field, mapped to a [GenerationConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerationConfiguration.html) object. To see the location of this field, refer to the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) request body in the API reference.

The following JSON object shows the minimal fields required in the [GenerationConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerationConfiguration.html) object to set a custom prompt template:

```
"generationConfiguration": {
    "promptTemplate": {
        "textPromptTemplate": "string"
    }
}
```

Enter your custom prompt template in the `textPromptTemplate` field, including prompt placeholders and XML tags as necessary. For the maximum number of characters allowed in the system prompt, see the `textPromptTemplate` field in [GenerationConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerationConfiguration.html).
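As a rough illustration of the structure above, the following Python sketch wraps a custom template in a `generationConfiguration` entry and checks for the `$output_format_instructions$` placeholder, which this page states is required for citations. The helper name `build_prompt_template_config`, the template wording, the `<context>` tag, and the `$search_results$` placeholder are assumptions made for the example; consult the placeholder list for the names that apply to your model.

```python
from typing import Any, Dict

# Placeholder that this page says is required for citations to be displayed.
CITATION_PLACEHOLDER = "$output_format_instructions$"


def build_prompt_template_config(template: str) -> Dict[str, Any]:
    """Wrap a custom template in the generationConfiguration structure.

    Hypothetical helper: raises ValueError if the citation placeholder is missing.
    """
    if CITATION_PLACEHOLDER not in template:
        raise ValueError(
            f"Template should include {CITATION_PLACEHOLDER} so citations are displayed"
        )
    return {
        "generationConfiguration": {
            "promptTemplate": {"textPromptTemplate": template}
        }
    }


# Illustrative template; $search_results$ and <context> are assumed names.
EXAMPLE_TEMPLATE = (
    "You are a question answering agent. Answer using only the search results.\n"
    "<context>\n$search_results$\n</context>\n"
    "$output_format_instructions$"
)

if __name__ == "__main__":
    import json

    print(json.dumps(build_prompt_template_config(EXAMPLE_TEMPLATE), indent=4))
```

Checking the placeholder locally is a design choice for this sketch; it catches a template that would otherwise return responses without citations.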
------

# Configure response generation for reasoning models with Knowledge Bases

Certain foundation models can perform model reasoning, where they take a larger, complex task and break it down into smaller, simpler steps. This process, often referred to as chain of thought (CoT) reasoning, can improve model accuracy by giving the model a chance to think before it responds. Model reasoning is most useful for tasks such as multi-step analysis, math problems, and complex reasoning tasks. For more information, see [Enhance model responses with model reasoning](inference-reasoning.md).

**Note**
This page describes how to use the reasoning configuration specifically for Amazon Bedrock Knowledge Bases. For information about configuring reasoning for direct model invocation using the `InvokeModel` API, see [Enhance model responses with model reasoning](inference-reasoning.md).

Enabling model reasoning can improve accuracy and citation results, but it can also increase latency. The following are some considerations when you query the data sources and generate responses using reasoning models with Amazon Bedrock Knowledge Bases.

**Topics**
+ [Reasoning models](#kb-test-reasoning-models)
+ [Using model reasoning for Claude 3.7 Sonnet](#kb-test-reasoning-using)
+ [General considerations](#kb-test-reasoning-general-considerations)
+ [Retrieve and generate API considerations](#kb-test-reasoning-api-considerations)

## Reasoning models

Model reasoning is available for the following models.

| Foundation Model | Model ID | Number of tokens | Reasoning configuration |
| --- | --- | --- | --- |
| Anthropic Claude Opus 4 | anthropic.claude-opus-4-20250514-v1:0 | This model will have 32,768 tokens, which includes both output and reasoning tokens. | Reasoning can be enabled or disabled for this model using a configurable token budget. By default, reasoning is disabled. |
| Anthropic Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 | This model will have 65,536 tokens, which includes both output and reasoning tokens. | Reasoning can be enabled or disabled for this model using a configurable token budget. By default, reasoning is disabled. |
| Anthropic Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 | This model will have 65,536 tokens, which includes both output and reasoning tokens. | Reasoning can be enabled or disabled for this model using a configurable token budget. By default, reasoning is disabled. |
| DeepSeek DeepSeek-R1 | deepseek.r1-v1:0 | This model will have 8192 tokens, which includes both output and reasoning tokens. The number of thinking tokens cannot be configured and the maximum number of output tokens must not be greater than 8192. | Reasoning is always enabled for this model. The model does not support toggling the reasoning capability on and off. |

## Using model reasoning for Claude 3.7 Sonnet

**Note**
Model reasoning is always enabled for the DeepSeek-R1 model. The model does not support toggling the reasoning capability on and off.

When using the Claude 3.7 Sonnet model, model reasoning can be enabled or disabled using the `additionalModelRequestFields` parameter of the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) API. This parameter accepts any key-value pairs. For example, you can add a `reasoningConfig` field and use a `type` key to enable or disable reasoning, as shown below.
```
{
    "input": {
        "text": "string"
    },
    "retrieveAndGenerateConfiguration": {
        "knowledgeBaseConfiguration": {
            "generationConfiguration": {
                "additionalModelRequestFields": {
                    "reasoningConfig": {
                        "type": "enabled",
                        "budget_tokens": INT_VAL // required when enabled
                    }
                }
            },
            "knowledgeBaseId": "string"
        },
        "type": "string"
    },
    "sessionId": "string"
}
```

## General considerations

The following are some general considerations for using the reasoning models for Knowledge Bases.
+ The reasoning models have up to five minutes to respond to a query. If the model takes more than five minutes to respond, the request times out.
+ To avoid exceeding the five-minute timeout, model reasoning is enabled only at the generation step when you configure your queries and response generation. The orchestration step cannot use model reasoning.
+ The reasoning models can use up to 8192 tokens to respond to queries, which includes both the output and thinking tokens. Any request that sets the maximum number of output tokens above this limit results in an error.

## Retrieve and generate API considerations

The following are some considerations when using the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) API for the reasoning models.
+ When reasoning is disabled, the temperature defaults to zero for all models, including Claude 3.7 Sonnet. When reasoning is enabled, the temperature must be set to one.

  ```
  "inferenceConfig": {
      "textInferenceConfig": {
          "maxTokens": 8192,
          "temperature": 1
      }
  }
  ```
+ The Top P parameter must be disabled when reasoning is enabled for the Claude 3.7 Sonnet model. Top P is an additional model request field that determines the percentile of possible tokens to select from during generation. By default, the Top P value for other Anthropic Claude models is one. For the Claude 3.7 Sonnet model, this value will be disabled by default.
+ Model reasoning can increase latency. When using this API operation and the [https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateStream.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateStream.html) API operation, you might notice a delay in receiving the response from the API.
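The considerations in this section can be combined into a single request body. The following Python sketch is a minimal illustration, not a definitive implementation: the helper name `build_reasoning_request` is hypothetical, the `modelArn` field and the `KNOWLEDGE_BASE` type value are assumptions not shown in the examples on this page, and the check that the thinking budget stays below `maxTokens` is inferred from the statement that the token limit covers both output and thinking tokens.

```python
from typing import Any, Dict


def build_reasoning_request(
    query: str,
    knowledge_base_id: str,
    model_arn: str,
    budget_tokens: int,
    max_tokens: int = 8192,
) -> Dict[str, Any]:
    """Build a RetrieveAndGenerate request body with reasoning enabled.

    Hypothetical helper: it only assembles a dictionary and calls no AWS API.
    """
    # Assumption: the budget must leave room for the answer, because the
    # token limit includes both output and thinking tokens.
    if not 0 < budget_tokens < max_tokens:
        raise ValueError("budget_tokens must be positive and less than max_tokens")

    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",  # assumed value for knowledge base queries
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,  # assumed field identifying the model
                "generationConfiguration": {
                    "inferenceConfig": {
                        "textInferenceConfig": {
                            "maxTokens": max_tokens,
                            # Temperature must be 1 when reasoning is enabled;
                            # topP is deliberately left unset.
                            "temperature": 1,
                        }
                    },
                    "additionalModelRequestFields": {
                        "reasoningConfig": {
                            "type": "enabled",
                            "budget_tokens": budget_tokens,
                        }
                    },
                },
            },
        },
    }


if __name__ == "__main__":
    import json

    # Placeholder identifiers; replace with your own values.
    body = build_reasoning_request("string", "string", "string", budget_tokens=2048)
    print(json.dumps(body, indent=4))
```

Because reasoning applies only to the generation step, the sketch places `reasoningConfig` under `generationConfiguration` and leaves the orchestration configuration untouched.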