HumanEvaluationConfig
Specifies the custom metrics, how tasks are rated, the flow definition ARN, and your custom prompt datasets. Model evaluation jobs that use human workers support only custom prompt datasets. To learn more about custom prompt datasets and the required format, see Custom prompt datasets.
When you create custom metrics in HumanEvaluationCustomMetric, you must specify the metric's name. The list of names specified in the HumanEvaluationCustomMetric array must match the metricNames array of strings specified in EvaluationDatasetMetricConfig. For example, if in the HumanEvaluationCustomMetric array you specified the names "accuracy", "toxicity", and "readability" as custom metrics, then the metricNames array in EvaluationDatasetMetricConfig would need to be ["accuracy", "toxicity", "readability"].
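As an illustration, the following is a minimal sketch of this structure as a Python dict, in the shape the request JSON takes. The nested fields follow the related structures (HumanEvaluationCustomMetric, EvaluationDatasetMetricConfig, and HumanWorkflowConfig); all names, ARNs, S3 URIs, rating methods, and the task type shown here are placeholder choices, not requirements.

```python
# A hypothetical HumanEvaluationConfig, written as a Python dict in the shape
# of the request JSON. Every value below is a placeholder.
human_evaluation_config = {
    "customMetrics": [
        # Each HumanEvaluationCustomMetric must have a name; these three
        # names must also appear in metricNames below.
        {"name": "accuracy", "ratingMethod": "ThumbsUpDown"},
        {"name": "toxicity", "ratingMethod": "ThumbsUpDown"},
        {"name": "readability", "ratingMethod": "IndividualLikertScale"},
    ],
    "datasetMetricConfigs": [
        {
            "taskType": "Generation",
            "dataset": {
                "name": "my-custom-prompts",
                "datasetLocation": {
                    "s3Uri": "s3://amzn-s3-demo-bucket/prompts.jsonl"
                },
            },
            # Must list the same names declared in customMetrics above.
            "metricNames": ["accuracy", "toxicity", "readability"],
        }
    ],
    "humanWorkflowConfig": {
        "flowDefinitionArn": (
            "arn:aws:sagemaker:us-east-1:111122223333:"
            "flow-definition/my-flow-definition"
        ),
        "instructions": "Rate each response for accuracy, toxicity, and readability.",
    },
}
```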
Contents
- datasetMetricConfigs
Use to specify the metrics, task, and prompt dataset to be used in your model evaluation job.
Type: Array of EvaluationDatasetMetricConfig objects
Array Members: Minimum number of 1 item. Maximum number of 5 items.
Required: Yes
- customMetrics
An array of HumanEvaluationCustomMetric objects. Each object contains the name of the metric, how the metric is to be evaluated, and an optional description.
Type: Array of HumanEvaluationCustomMetric objects
Array Members: Minimum number of 1 item. Maximum number of 10 items.
Required: No
- humanWorkflowConfig
The parameters of the human workflow.
Type: HumanWorkflowConfig object
Required: No
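As a usage sketch, this configuration is passed as the human member of evaluationConfig when creating an evaluation job. The following boto3 call is illustrative only: the job name, IAM role, model identifier, inference parameters, and output bucket are placeholders, and it reuses the human_evaluation_config dict sketched above.

```python
import boto3

bedrock = boto3.client("bedrock")

# Illustrative call; all identifiers below are placeholders.
response = bedrock.create_evaluation_job(
    jobName="my-human-eval-job",
    roleArn="arn:aws:iam::111122223333:role/MyBedrockEvalRole",
    evaluationConfig={"human": human_evaluation_config},
    inferenceConfig={
        "models": [
            {
                "bedrockModel": {
                    "modelIdentifier": "anthropic.claude-v2",
                    # JSON string of inference parameters for the model.
                    "inferenceParams": '{"temperature": 0.0}',
                }
            }
        ]
    },
    outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/eval-output/"},
)
print(response["jobArn"])
```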
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: