Model evaluation notebook tutorials

This section provides the following notebook tutorials, which include example code and explanations:

Additional notebooks

The fmeval GitHub directory contains the following additional example notebooks:

bedrock-claude-factual-knowledge.ipynb – Evaluates an Anthropic Claude 2 model hosted on Amazon Bedrock for factual knowledge.
byo-model-outputs.ipynb – Evaluates a Falcon 7B model hosted on JumpStart for factual knowledge, using model outputs that you supply yourself instead of sending inference requests to the model.
custom_model_runner_chat_gpt.ipynb – Evaluates a custom ChatGPT 3.5 model hosted on Hugging Face for factual knowledge.
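
The idea of bringing your own model outputs can be illustrated without any evaluation library. The following is a minimal sketch, not the fmeval API: it assumes a factual knowledge score is 1 when an acceptable answer appears in the model output and 0 otherwise. The record fields (question, target, model_output), the "<OR>" delimiter between acceptable answers, and both function names are assumptions made for illustration.

```python
import json

# Hypothetical JSON Lines records (illustrative data, not from any notebook):
# each line holds a prompt, the acceptable answer(s), and an output that was
# already collected from a model, so no inference request is sent here.
RECORDS = """\
{"question": "The capital of France is", "target": "Paris", "model_output": "Paris is the capital."}
{"question": "The largest planet is", "target": "Jupiter", "model_output": "I think it is Saturn."}
{"question": "Water freezes at", "target": "0<OR>zero", "model_output": "Zero degrees Celsius."}
"""


def contains_target(model_output: str, target: str, delimiter: str = "<OR>") -> int:
    """Return 1 if any acceptable answer appears in the output, else 0.

    The case-insensitive substring check and the delimiter are assumptions.
    """
    answers = [t.strip().lower() for t in target.split(delimiter)]
    lowered = model_output.lower()
    return int(any(a and a in lowered for a in answers))


def score_records(jsonl_text: str) -> list[int]:
    """Score every non-empty JSON Lines record in the given text."""
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    return [contains_target(r["model_output"], r["target"]) for r in rows]


scores = score_records(RECORDS)
mean_score = sum(scores) / len(scores)
print(scores, round(mean_score, 3))
```

Because the outputs are precomputed, the same records can be rescored offline as often as needed. The notebooks listed above remain the reference for how the fmeval library itself loads data and computes scores.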
