You can use computed metrics to evaluate how effectively a Retrieval Augmented Generation (RAG) system retrieves relevant information from your data sources, and how well the generated responses answer questions. The results of a RAG evaluation let you compare different Amazon Bedrock Knowledge Bases and other RAG sources, so you can choose the best Knowledge Base or RAG system for your application.
You can set up two different types of RAG evaluation jobs.
- Retrieve only – In a retrieve-only RAG evaluation job, the report is based on the data retrieved from your RAG source. You can either evaluate an Amazon Bedrock Knowledge Base, or you can bring your own inference response data from an external RAG source.
- Retrieve and generate – In a retrieve-and-generate RAG evaluation job, the report is based on the data retrieved from your knowledge base and the responses generated from that retrieved data, as judged by the evaluator model. You can either use an Amazon Bedrock Knowledge Base and evaluator model, or you can bring your own inference response data from an external RAG source.
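The difference between the two job types shows up in how the RAG source is configured when a job is created. The following sketch illustrates the request shapes as they might be passed to the boto3 `bedrock` client's `create_evaluation_job` operation; the knowledge base ID, role ARN, model ARN, and S3 bucket are hypothetical placeholders, and field names should be verified against the current Bedrock API reference.

```python
import json


def rag_eval_request(job_name: str, retrieve_and_generate: bool = False) -> dict:
    """Build an illustrative CreateEvaluationJob request body (assumed field names)."""
    if retrieve_and_generate:
        # Retrieve and generate: the report covers the retrieved data plus
        # the responses generated from it.
        kb_config = {
            "retrieveAndGenerateConfig": {
                "type": "KNOWLEDGE_BASE",
                "knowledgeBaseConfiguration": {
                    "knowledgeBaseId": "KBEXAMPLE123",  # placeholder
                    "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/example-model",  # placeholder
                },
            }
        }
    else:
        # Retrieve only: the report covers the retrieved data alone.
        kb_config = {"retrieveConfig": {"knowledgeBaseId": "KBEXAMPLE123"}}  # placeholder

    return {
        "jobName": job_name,
        "roleArn": "arn:aws:iam::111122223333:role/BedrockEvalRole",  # placeholder
        "applicationType": "RagEvaluation",
        "inferenceConfig": {"ragConfigs": [{"knowledgeBaseConfig": kb_config}]},
        "outputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/eval-results/"},  # placeholder
        # evaluationConfig (metrics, evaluator model, prompt dataset) omitted for brevity
    }


# A real job would then be started with something like:
#   boto3.client("bedrock").create_evaluation_job(**rag_eval_request("my-rag-eval"))
print(json.dumps(rag_eval_request("my-rag-eval", retrieve_and_generate=True), indent=2))
```

Only the `knowledgeBaseConfig` payload differs between the two job types; everything else about the request, including the output location and IAM role, stays the same.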
Use the following topics to learn how to create and manage RAG evaluation jobs, and which performance metrics you can use.