Retrieving passages
You can use the Retrieve API as a retriever for retrieval augmented generation (RAG) systems.
RAG systems use generative artificial intelligence to build question-answering applications. A RAG system consists of a retriever and a large language model (LLM). Given a query, the retriever identifies the most relevant chunks of text from a corpus of documents and feeds them to the LLM. The LLM then analyzes the relevant text chunks, or passages, and generates a comprehensive response to the query.
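The hand-off from retriever to LLM described above can be sketched as a prompt-assembly step. This is a minimal illustration, not part of Amazon Kendra: the function name `build_rag_prompt`, the `title` and `text` keys, and the prompt wording are all assumptions made for the example.

```python
from typing import Dict, List


def build_rag_prompt(
    question: str,
    passages: List[Dict[str, str]],
    max_passages: int = 3,
) -> str:
    """Assemble an LLM prompt from retrieved passages.

    Each passage is assumed to be a dict with "title" and "text" keys
    (a hypothetical shape chosen for this sketch).
    """
    # Keep only the top-ranked passages so the prompt stays short.
    selected = passages[:max_passages]

    # Number each passage so the LLM can refer back to its source.
    context_blocks = [
        f"[{i}] {p['title']}\n{p['text']}"
        for i, p in enumerate(selected, start=1)
    ]
    context = "\n\n".join(context_blocks)

    return (
        "Answer the question using only the passages below.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The returned string would then be sent to whichever LLM the application uses; that call is outside the scope of this sketch.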
The Retrieve API looks at chunks of text or excerpts that are referred to as passages and returns the top passages that are most relevant to the query.
Like the Query API, the Retrieve API also searches for relevant information using semantic search. Semantic search takes into account the search query's context, plus all the available information from the indexed documents. However, by default, the Query API only returns excerpt passages of up to 100 token words. With the Retrieve API, you can retrieve longer passages of up to 200 token words and up to 100 semantically relevant passages. This doesn't include question-answer or FAQ type responses from your index. The passages are text excerpts that can be semantically extracted from multiple documents and multiple parts of the same document. If, in extreme cases, your documents produce zero passages using the Retrieve API, you can alternatively use the Query API and its types of responses.
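The fallback from the Retrieve API to the Query API might look like the following sketch. It assumes `client` behaves like a boto3 Amazon Kendra client, with `retrieve` and `query` methods that accept `IndexId` and `QueryText` and return a `ResultItems` list. The helper name `retrieve_with_fallback` and the normalized output shape are hypothetical, and the response field names should be checked against the API reference.

```python
from typing import Any, Dict, List


def retrieve_with_fallback(
    client: Any, index_id: str, query_text: str
) -> List[Dict[str, str]]:
    """Return passages from Retrieve, or excerpts from Query if none are found."""
    response = client.retrieve(IndexId=index_id, QueryText=query_text)
    items = response.get("ResultItems", [])
    if items:
        # Retrieve results are assumed to carry the passage text in "Content".
        return [
            {
                "source": "retrieve",
                "title": item.get("DocumentTitle", ""),
                "text": item.get("Content", ""),
            }
            for item in items
        ]

    # Zero passages: fall back to the Query API and its excerpt responses.
    response = client.query(IndexId=index_id, QueryText=query_text)
    return [
        {
            "source": "query",
            # Query results are assumed to nest text under a "Text" key.
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "text": item.get("DocumentExcerpt", {}).get("Text", ""),
        }
        for item in response.get("ResultItems", [])
    ]
```

In an application, `client` would typically be created with `boto3.client("kendra")`. Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object.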
You can also do the following with the Retrieve API:
- Override boosting at the index level
- Filter based on document fields or attributes
- Filter based on the user or their group access to documents
- View the confidence score bucket for a retrieved passage result. The confidence bucket provides a relative ranking that indicates how confident Amazon Kendra is that the response is relevant to the query.
Note
Confidence score buckets are currently available only for English.
You can also include certain fields in the response that might provide useful additional information.
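The options above can be combined in one request. The sketch below builds keyword arguments for a boto3 Amazon Kendra `retrieve` call and filters returned items by confidence bucket. The helper names are hypothetical, and the parameter names (`AttributeFilter`, `UserContext`, `RequestedDocumentAttributes`, `ScoreAttributes`) and the `_language_code` attribute reflect an understanding of the API that should be verified against the API reference.

```python
from typing import Any, Dict, Iterable, List, Optional


def build_retrieve_request(
    index_id: str,
    query_text: str,
    language_code: Optional[str] = None,
    user_id: Optional[str] = None,
    groups: Optional[List[str]] = None,
    extra_fields: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a Retrieve call (assumed request shape)."""
    request: Dict[str, Any] = {"IndexId": index_id, "QueryText": query_text}

    if language_code:
        # Filter on a document attribute; "_language_code" is assumed here.
        request["AttributeFilter"] = {
            "EqualsTo": {
                "Key": "_language_code",
                "Value": {"StringValue": language_code},
            }
        }

    if user_id or groups:
        # Filter on the user's or their groups' access to documents.
        user_context: Dict[str, Any] = {}
        if user_id:
            user_context["UserId"] = user_id
        if groups:
            user_context["Groups"] = list(groups)
        request["UserContext"] = user_context

    if extra_fields:
        # Ask for additional document fields in the response.
        request["RequestedDocumentAttributes"] = list(extra_fields)

    return request


def filter_by_confidence(
    items: List[Dict[str, Any]],
    accepted: Iterable[str] = ("VERY_HIGH", "HIGH"),
) -> List[Dict[str, Any]]:
    """Keep only result items whose confidence bucket is in `accepted`."""
    accepted_set = set(accepted)
    return [
        item
        for item in items
        if item.get("ScoreAttributes", {}).get("ScoreConfidence") in accepted_set
    ]
```

A caller would pass the result as `client.retrieve(**request)` and then apply `filter_by_confidence` to the returned `ResultItems`. Because confidence buckets are a relative ranking, the accepted set is a tuning choice rather than a fixed threshold.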
The Retrieve API currently doesn't support all features supported by the Query API. The following features are not supported: querying using advanced query syntax, suggested spell corrections for queries, faceting, query suggestions to autocomplete search queries, and incremental learning. Note that not all features apply to the Retrieve API. Any future releases of the Retrieve API will be documented in this guide.
The Retrieve API shares the number of query capacity units that you set for your index. For more information on what's included in a single capacity unit and the default base capacity for an index, see Adjusting capacity.
Note
You can't add capacity if you are using the Amazon Kendra Developer Edition; you can only add capacity when using Amazon Kendra Enterprise Edition. For more information on what's included in the Developer and Enterprise Editions, see Amazon Kendra Editions.
The following is an example of using the Retrieve API to retrieve the top 100 most relevant passages from documents in an index for the query "how does amazon kendra work?"