Important
Guardrails are applied only to the input and the generated response from the LLM. They are not applied to the references retrieved from Knowledge Bases at runtime.
After your knowledge base is set up, you can query it and generate responses based on the chunks retrieved from your source data by using the RetrieveAndGenerate API operation. The responses are returned with citations to the original source data. You can also use a reranking model instead of the default Amazon Bedrock Knowledge Bases ranker to rank source chunks for relevance during retrieval.
Note
Images returned from the Retrieve response during the RetrieveAndGenerate flow are included in the prompt for response generation. The RetrieveAndGenerate response can't include images, but it can cite the sources that contain the images.
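When you query programmatically, the RetrieveAndGenerate operation described above can be called through an AWS SDK. The following sketch uses the AWS SDK for Python (boto3); the knowledge base ID and model ARN are placeholder values you would replace with your own, and the service call itself is shown commented out so the request shape can be read on its own:

```python
# Placeholder identifiers -- substitute your own knowledge base ID and model ARN.
KB_ID = "YOUR_KB_ID"
MODEL_ARN = "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"

def build_rag_request(query: str, kb_id: str = KB_ID, model_arn: str = MODEL_ARN) -> dict:
    """Build the parameter dict for a bedrock-agent-runtime RetrieveAndGenerate call."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve_and_generate(**build_rag_request("What does the onboarding guide cover?"))
# print(response["output"]["text"])        # the generated answer
# for citation in response["citations"]:   # citations back to source chunks
#     for ref in citation["retrievedReferences"]:
#         print(ref["location"])
```

The response includes the generated text under `output` and the citations to the original source data under `citations`, mirroring the footnotes shown in the console test window.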
To learn how to query your knowledge base, choose the tab for your preferred method, and then follow the steps:
To test your knowledge base
- Sign in to the AWS Management Console using an IAM role with Amazon Bedrock permissions, and open the Amazon Bedrock console at https://console.aws.amazon.com/bedrock/.
- In the left navigation pane, choose Knowledge bases.
- In the Knowledge bases section, do one of the following:
  - Choose the radio button next to the knowledge base that you want to test, and then select Test knowledge base. A test window expands from the right.
  - Choose the knowledge base that you want to test. A test window expands from the right.
- To generate responses based on information retrieved from your knowledge base, turn on Generate responses for your query. Amazon Bedrock generates responses based on your data sources and cites the information it provides with footnotes.
- To choose a model for response generation, choose Select model, and then select Apply.
- (Optional) Select the configurations icon to open Configurations. For information about configurations, see Configure and customize queries and response generation.
- Enter a query in the text box in the chat window and select Run to return responses from the knowledge base.
- Select a footnote to see an excerpt from the cited source for that part of the response. Choose the link to navigate to the S3 object containing the file.
- To see details about the returned chunks, select Show source details.
- To see the configurations that you set for the query, expand Query configurations.
- To view details about a source chunk, expand it by choosing the right arrow next to it. You can see the following information:
  - The raw text from the source chunk. To copy this text, choose the copy icon. If you used Amazon S3 to store your data, choose the external link icon to navigate to the S3 object containing the file.
  - The metadata associated with the source chunk, if you used Amazon S3 to store your data. The attribute/field keys and values are defined in the .metadata.json file that's associated with the source document. For more information, see the Metadata and filtering section in Configure and customize queries and response generation.
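As an illustration, a .metadata.json file stored in S3 alongside its source document (for example, next to report.pdf) uses the metadataAttributes structure; the attribute names and values below are hypothetical:

```json
{
  "metadataAttributes": {
    "department": "finance",
    "year": 2024
  }
}
```

These keys and values then appear as the metadata shown for each source chunk and can be used for filtering during retrieval.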
Chat options
- To use a different model for response generation, select Change model. If you change the model, the text in the chat window is completely cleared.
- To retrieve source chunks directly instead of generating responses, clear Generate responses. If you change this setting, the text in the chat window is completely cleared.
- To clear the chat window, select the broom icon.
- To copy all the output in the chat window, select the copy icon.
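Clearing Generate responses in the console corresponds to the Retrieve API operation, which returns the source chunks themselves without generating a response. A minimal boto3 sketch, assuming a placeholder knowledge base ID, with the service call commented out:

```python
def build_retrieve_request(query: str, kb_id: str = "YOUR_KB_ID", max_results: int = 5) -> dict:
    """Build the parameter dict for a bedrock-agent-runtime Retrieve call."""
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": max_results}
        },
    }

# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve(**build_retrieve_request("refund policy"))
# for result in response["retrievalResults"]:
#     print(result["content"]["text"])  # raw text of the retrieved chunk
#     print(result.get("location"))     # e.g., the S3 location of the source
```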
Note
If you receive an error that the prompt exceeds the character limit while generating responses, you can shorten the prompt in the following ways:
- Reduce the maximum number of retrieved results (this shortens what is filled in for the $search_results$ placeholder in the Knowledge base prompt templates: orchestration & generation).
- Recreate the data source with a chunking strategy that uses smaller chunks (this also shortens what is filled in for the $search_results$ placeholder).
- Shorten the prompt template.
- Shorten the user query (this shortens what is filled in for the $query$ placeholder in the Knowledge base prompt templates: orchestration & generation).
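Two of these options, reducing the retrieved-result count and shortening the prompt template, can also be set directly in a programmatic RetrieveAndGenerate request. A sketch with placeholder values; the template text below is a deliberately short illustration, not the service default:

```python
def build_kb_configuration(kb_id: str, model_arn: str, max_results: int = 3) -> dict:
    """Knowledge base configuration tuned to keep the assembled prompt short."""
    return {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": kb_id,
            "modelArn": model_arn,
            # Fewer retrieved results -> less text fills $search_results$.
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {"numberOfResults": max_results}
            },
            # A shorter template reduces the fixed portion of the prompt.
            "generationConfiguration": {
                "promptTemplate": {
                    "textPromptTemplate": (
                        "Answer only from these search results:\n"
                        "$search_results$\n\nQuestion: $query$"
                    )
                }
            },
        },
    }
```

This dict is passed as the retrieveAndGenerateConfiguration parameter of the RetrieveAndGenerate call; the chunking strategy and the user query itself are controlled outside this configuration.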