Automatic model evaluation
You can create an automatic model evaluation in Studio or by using the fmeval library inside your own code. Studio uses a wizard to create the model evaluation job. The fmeval library provides tools to customize your workflow further.
Both types of automatic model evaluation jobs support publicly available JumpStart models, as well as JumpStart models that you previously deployed to an endpoint. If you use a JumpStart model that has not been previously deployed, SageMaker AI creates the necessary resources and shuts them down once the model evaluation job has finished.
To use text-based LLMs from other AWS services, or a model hosted outside of AWS, you must use the fmeval library.
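For models hosted outside of AWS, fmeval works through a model runner: an object whose predict method takes a prompt and returns the generated text together with an optional log probability. The sketch below shows that shape in self-contained form, assuming a hypothetical ExternalModelClient standing in for your hosting provider's API client; a real implementation would subclass fmeval's ModelRunner base class instead.

```python
from typing import Optional, Tuple


class ExternalModelClient:
    """Hypothetical stand-in for an externally hosted model's API client."""

    def generate(self, text: str) -> str:
        # Placeholder: a real client would call the hosted model's
        # inference endpoint here.
        return f"echo: {text}"


class ExternalModelRunner:
    """Wraps an external model in the predict() shape fmeval expects.

    In real code this would subclass
    fmeval.model_runners.model_runner.ModelRunner.
    """

    def __init__(self, client: ExternalModelClient):
        self._client = client

    def predict(self, prompt: str) -> Tuple[Optional[str], Optional[float]]:
        output = self._client.generate(prompt)
        # The second element is a log probability; return None if the
        # hosted model does not expose one.
        return output, None


runner = ExternalModelRunner(ExternalModelClient())
text, logprob = runner.predict("Summarize: model evaluation")
print(text)
```

An instance wrapped this way can then be passed wherever an fmeval evaluation algorithm accepts a model runner.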
When your jobs are completed, the results are saved in the Amazon S3 bucket that you specified when you created the job. To learn how to interpret your results, see Understand the results of your model evaluation job.