
Build your own RAG


When you build your own retrieval augmented generation (RAG) system, you combine a retriever system and a generator system. The retriever can be an embedding model that identifies the relevant chunks in a vector database based on similarity scores. The generator can be a large language model (LLM) that answers questions based on the retrieved results (also known as chunks). The following sections provide tips on how to optimize the prompts for your RAG system.
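As a minimal sketch of this two-stage flow, the following Python code embeds a query and a set of chunks with an Amazon Titan text embeddings model and ranks the chunks by cosine similarity. The model ID, Region, and function names are illustrative assumptions rather than values prescribed by this guide, and in practice the chunk embeddings would come from a vector database rather than being computed per query.

import json
import math

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text):
    # Example embeddings call; the model ID is an assumption and may differ
    # in your account or Region.
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query, chunks, top_k=3):
    # Rank the chunks by similarity to the query and keep the top_k results.
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]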

Tip

Leverage the system prompt: As with other functionalities, enhancing the system prompt can be beneficial. You can define the RAG system's description in the system prompt, outlining the desired persona and behavior for the model.
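For example, with the Amazon Bedrock Converse API, the system prompt is passed as a list containing a text block. The persona wording below is a placeholder, not a prescribed description.

# Example system prompt describing the RAG system's persona and behavior.
# The wording is an illustrative placeholder.
system_prompt = (
    "You are a question answering assistant for an internal knowledge base. "
    "You answer user questions using only the search results provided to you."
)

# Later passed to the Converse API as the system parameter.
system = [{"text": system_prompt}]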

Tip

Use Model Instructions: Additionally, you can include a dedicated "Model Instructions:" section within the system prompt, where you can provide specific guidelines for the model to follow.

For instance, you can list instructions such as: "In this example session, the model has access to search results and a user's question. Its job is to answer the user's question using only information from the search results."

Model Instructions:
- You should provide concise answers to simple questions when the answer is directly contained in the search results, but when it comes to yes/no questions, provide some details.
- If the question requires multi-hop reasoning, you should find relevant information from the search results and summarize the answer based on that information with logical reasoning.
- If the search results do not contain information that can answer the question, state that you could not find an exact answer to the question. If the search results are completely irrelevant, say that you could not find an exact answer, then summarize the search results.
- Remember to add citations to your response using markers like %[1]%, %[2]%, %[3]%, etc., where each marker corresponds to the passage that supports the response.
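One way to assemble such a prompt in code is to append a "Model Instructions:" section, with one line per guideline, to the system description. The helper below is a sketch; its name and structure are assumptions, and the instruction list simply reuses the guidelines above.

def build_system_prompt(description, instructions):
    # Append a "Model Instructions:" section, one guideline per line.
    lines = [description, "", "Model Instructions:"]
    lines.extend("- " + instruction for instruction in instructions)
    return "\n".join(lines)

system_prompt = build_system_prompt(
    "In this session, the model has access to search results and a user's "
    "question. Its job is to answer the user's question using only "
    "information from the search results.",
    [
        "Provide concise answers to simple questions when the answer is "
        "directly contained in the search results.",
        "If the search results do not contain information that can answer "
        "the question, state that you could not find an exact answer.",
        "Add citations to your response using markers like %[1]%, %[2]%, %[3]%.",
    ],
)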
Tip

Avoid hallucination by restricting the instructions: Bring more focus to the instructions by clearly stating "DO NOT USE INFORMATION THAT IS NOT IN SEARCH RESULTS!" as a model instruction so that answers are grounded in the provided context.

- DO NOT USE INFORMATION THAT IS NOT IN SEARCH RESULTS!
Tip

Provide the input query followed by search results: Pass the input query, followed by the retriever's search results or contextual chunks. The model works best when the chunk results are provided after "Resource: Search Results:", as in the following template.

{query}
Resource: Search Results:
{rag_chunks_retriever_results}
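A sketch of building the user turn in this format, assuming the retrieved chunks are already available as a list of strings; numbering the chunks so they can be cited is an illustrative choice, not a documented requirement.

def build_user_message(query, chunks):
    # Place the query first, then the chunks after "Resource: Search Results:".
    numbered = "\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return f"{query}\nResource: Search Results:\n{numbered}"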

You can combine all of the previous recommendations in the following prompt template. This template generates answers based only on the retrieved chunks.

In this session, the model has access to search results and a user's question. Your job is to answer the user's question using only information from the search results.

Model Instructions:
- You should provide concise answers to simple questions when the answer is directly contained in the search results, but when it comes to yes/no questions, provide some details.
- If the question requires multi-hop reasoning, you should find relevant information from the search results and summarize the answer based on that information with logical reasoning.
- If the search results do not contain information that can answer the question, state that you could not find an exact answer to the question. If the search results are completely irrelevant, say that you could not find an exact answer, then summarize the search results.
- Remember to add a citation to the end of your response using markers like %[1]%, %[2]%, %[3]%, etc., where each marker corresponds to the passage that supports the response.
- DO NOT USE INFORMATION THAT IS NOT IN SEARCH RESULTS!

{Query}
Resource: {search_results}
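Putting the pieces together, the following sketch sends this template to an Amazon Nova model through the Amazon Bedrock Converse API. The model ID, Region, and inference settings are example values that you should adjust for your own setup.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def answer(system_prompt, query, search_results):
    # system_prompt is the combined template above; the query and retrieved
    # search results fill in {Query} and {search_results}.
    response = bedrock_runtime.converse(
        modelId="us.amazon.nova-lite-v1:0",  # example Amazon Nova model ID
        system=[{"text": system_prompt}],
        messages=[
            {
                "role": "user",
                "content": [{"text": f"{query}\nResource: {search_results}"}],
            }
        ],
        inferenceConfig={"temperature": 0.0, "maxTokens": 1024},
    )
    return response["output"]["message"]["content"][0]["text"]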

Multimodal RAG

When you create a multimodal RAG, there are a few additional best practices you should observe.

  • Use images directly if they are not text-heavy (that is, natural scenes, text-sparse slides, infographics, and so on). Amazon Nova has been optimized to handle images that are not text-heavy, so you do not need to pass an additional text summary for these images during grounded generation.

  • Enhance text-heavy images with text summaries (for example, PDF reports and papers). For text-heavy PDFs, the best approach is to retrieve both the images (PDFs) and their corresponding text summaries. The text summaries help the model identify the relevant information in the large amount of text in the original image.

  • Let the model know that you are passing images. In the instructions, you can add a sentence like "You will be provided with images and text from search results," as shown in the sketch after this list.
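As an example of these practices, the sketch below builds a multimodal user turn for the Converse API that pairs a retrieved page image with its text summary. The file path, summary, and question are placeholders, and the sentence about receiving images would go in the system prompt's model instructions.

# Sketch of a multimodal user message: a retrieved page image plus its text
# summary and the user's question. All literal values are placeholders.
with open("retrieved_page.png", "rb") as f:
    page_bytes = f.read()

messages = [
    {
        "role": "user",
        "content": [
            {"image": {"format": "png", "source": {"bytes": page_bytes}}},
            {"text": "Summary of the page: ..."},
            {"text": "Question: ..."},
        ],
    }
]
# Passed to bedrock_runtime.converse(modelId=..., system=..., messages=messages)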

