Requirements for custom prompt datasets in model evaluation job that uses a model as judge - Amazon Bedrock

Requirements for custom prompt datasets in a model evaluation job that uses a model as judge

To create a model evaluation job that uses a model as judge, you must specify a prompt dataset. The prompts are used during inference with the model you select to evaluate. This prompt dataset uses the same format as automatic model evaluation jobs. Some key-value pairs are required when you use the Correctness (Builtin.Correctness) metric or the Completeness (Builtin.Completeness) metric.

You must create a custom prompt dataset for a model evaluation job that uses a model as judge. Custom prompt datasets must be stored in Amazon S3, use the JSON Lines format, and have the .jsonl file extension. Each line must be a valid JSON object. There can be up to 1000 prompts in your dataset per automatic evaluation job.
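The file-level requirements above can be checked locally before the dataset is uploaded to Amazon S3. The following is a minimal sketch, not an official tool: the function name `check_dataset_file` and the constant `MAX_PROMPTS` are hypothetical, and treating a blank line as an error is an assumption based on the rule that each line must be a valid JSON object.

```python
import json
from pathlib import Path

MAX_PROMPTS = 1000  # limit stated in the requirements above


def check_dataset_file(path):
    """Return a list of problems found in a prompt dataset file (empty if none)."""
    path = Path(path)
    problems = []

    if path.suffix != ".jsonl":
        problems.append(f"expected a .jsonl extension, got {path.suffix!r}")

    count = 0
    with path.open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                # Assumption: a blank line is not a valid JSON object.
                problems.append(f"line {number}: blank line")
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as err:
                problems.append(f"line {number}: invalid JSON ({err.msg})")
                continue
            if not isinstance(record, dict):
                problems.append(f"line {number}: not a JSON object")
                continue
            count += 1

    if count > MAX_PROMPTS:
        problems.append(f"{count} prompts exceeds the limit of {MAX_PROMPTS}")
    return problems
```

An empty list means the file passed these local checks. It does not guarantee that the evaluation job accepts the dataset.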

For jobs created using the console, you must update the Cross Origin Resource Sharing (CORS) configuration on the S3 bucket. To learn more about the required CORS permissions, see Required Cross Origin Resource Sharing (CORS) permissions on S3 buckets.

Key-value pairs used in prompt datasets for model evaluation jobs that use a model as judge
  • prompt – required to indicate the input for the following tasks:

    • The prompt that your model should respond to, in general text generation.

    • The question that your model should answer, in question and answer tasks.

    • The text that your model should summarize, in text summarization tasks.

    • The text that your model should classify, in classification tasks.

  • referenceResponse – required to indicate the ground truth response for Completeness and Correctness metrics.

    • The correct response.

    • The complete response.

  • category – (optional) groups prompts so that evaluation scores are reported for each category.
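The key rules in this list can be expressed as a small per-record check. This is a sketch under stated assumptions: `validate_record` and `REFERENCE_METRICS` are hypothetical names, and requiring the values to be non-empty strings is an assumption that the list above does not state explicitly.

```python
# Metrics that need a ground truth response, per the list above.
REFERENCE_METRICS = {"Builtin.Correctness", "Builtin.Completeness"}


def validate_record(record, metrics):
    """Return a list of problems for one dataset record (empty if none).

    `metrics` is the collection of metric names selected for the job.
    """
    problems = []

    # prompt is always required.
    prompt = record.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("'prompt' is required (assumed: non-empty string)")

    # referenceResponse is required only for Correctness or Completeness.
    if REFERENCE_METRICS & set(metrics):
        reference = record.get("referenceResponse")
        if not isinstance(reference, str) or not reference.strip():
            problems.append(
                "'referenceResponse' is required for Correctness and Completeness"
            )

    # category is optional; assumed to be a string when present.
    if "category" in record and not isinstance(record["category"], str):
        problems.append("'category' must be a string when present")

    return problems
```

For example, `validate_record({"prompt": "Bobigny is the capital of"}, ["Builtin.Correctness"])` reports the missing `referenceResponse`, while the same record passes when neither of the two metrics is selected.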

The following prompt is expanded for clarity. In your actual prompt dataset, each line (a prompt) must be a valid JSON object on a single line.

{
    "prompt": "Bobigny is the capital of",
    "referenceResponse": "Seine-Saint-Denis",
    "category": "Capitals"
}
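Because each record must occupy exactly one line in the dataset file, it can help to generate the file programmatically instead of editing it by hand. The sketch below uses Python's standard `json` module. The helper name `to_jsonl` is hypothetical, and `ensure_ascii=False` is a stylistic choice that keeps non-ASCII characters readable, not a documented requirement.

```python
import json

record = {
    "prompt": "Bobigny is the capital of",
    "referenceResponse": "Seine-Saint-Denis",
    "category": "Capitals",
}


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    # json.dumps without `indent` emits no raw newlines; newlines inside
    # string values are escaped, so each record stays on a single line.
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


text = to_jsonl([record])
```

The resulting `text` can be written to a file with the .jsonl extension and then uploaded to the S3 bucket used by the evaluation job.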