Knowledge base evaluation of retrieval or response generation - Amazon Bedrock

You can evaluate a knowledge base on retrieval only, or on retrieval with response generation. A retrieval only evaluation assesses how well your knowledge base retrieves highly relevant information from your data sources. A retrieval with response generation evaluation assesses how well your knowledge base generates useful, appropriate responses based on the information it retrieves.

Note

Knowledge base evaluation is in preview mode and is subject to change.

The following list summarizes the retrieval only and retrieval with response generation evaluation types. Each metric is followed by its programmatic input.

Evaluation type: Retrieve information only

Quality metrics:

  • Context relevance: Builtin.ContextRelevance

  • Context coverage: Builtin.ContextCoverage

Evaluation type: Retrieve information and generate responses

Quality metrics:

  • Correctness: Builtin.Correctness

  • Completeness: Builtin.Completeness

  • Helpfulness: Builtin.Helpfulness

  • Logical coherence: Builtin.LogicalCoherence

  • Faithfulness: Builtin.Faithfulness

Responsible AI metrics:

  • Harmfulness: Builtin.Harmfulness

  • Stereotyping: Builtin.Stereotyping

  • Refusal: Builtin.Refusal
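As an illustration of how the programmatic inputs above group by evaluation type, the following minimal Python sketch checks a requested set of metrics against the evaluation type before you pass them to an evaluation job. The `Builtin.*` identifiers are taken from the summary above. The dictionary keys (`retrieve_only`, `retrieve_and_generate`) and the `select_metrics` function are hypothetical names chosen for this example and are not part of any AWS SDK.

```python
# Hypothetical helper: only the "Builtin.*" identifiers come from the
# documentation; the keys and function name below are illustrative.
RETRIEVE_ONLY = "retrieve_only"
RETRIEVE_AND_GENERATE = "retrieve_and_generate"

METRICS_BY_EVALUATION_TYPE = {
    RETRIEVE_ONLY: {
        "quality": [
            "Builtin.ContextRelevance",
            "Builtin.ContextCoverage",
        ],
    },
    RETRIEVE_AND_GENERATE: {
        "quality": [
            "Builtin.Correctness",
            "Builtin.Completeness",
            "Builtin.Helpfulness",
            "Builtin.LogicalCoherence",
            "Builtin.Faithfulness",
        ],
        "responsible_ai": [
            "Builtin.Harmfulness",
            "Builtin.Stereotyping",
            "Builtin.Refusal",
        ],
    },
}


def select_metrics(evaluation_type, requested=None):
    """Return the metric identifiers to use for an evaluation type.

    With no `requested` list, every metric for the type is returned.
    Otherwise each requested metric must belong to that evaluation type.
    """
    if evaluation_type not in METRICS_BY_EVALUATION_TYPE:
        raise ValueError(f"Unknown evaluation type: {evaluation_type!r}")

    groups = METRICS_BY_EVALUATION_TYPE[evaluation_type]
    # Flatten the metric groups into one list, preserving their order.
    allowed = [name for names in groups.values() for name in names]

    if requested is None:
        return allowed

    unsupported = [name for name in requested if name not in allowed]
    if unsupported:
        raise ValueError(
            f"Metrics not available for {evaluation_type!r}: {unsupported}"
        )
    return list(requested)
```

For example, `select_metrics("retrieve_only")` returns the two context metrics, while `select_metrics("retrieve_only", ["Builtin.Correctness"])` raises a `ValueError`, because correctness applies only when responses are generated.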

Use the following topics to learn more about each metric for retrieval only and retrieval with response generation evaluations.