interface EvaluatorInferenceConfig
| Language | Type name |
|---|---|
| .NET | Amazon.CDK.AWS.Bedrock.Agentcore.Alpha.EvaluatorInferenceConfig |
| Go | github.com/aws/aws-cdk-go/awsbedrockagentcorealpha/v2#EvaluatorInferenceConfig |
| Java | software.amazon.awscdk.services.bedrock.agentcore.alpha.EvaluatorInferenceConfig |
| Python | aws_cdk.aws_bedrock_agentcore_alpha.EvaluatorInferenceConfig |
| TypeScript (source) | @aws-cdk/aws-bedrock-agentcore-alpha » EvaluatorInferenceConfig |
Inference configuration for a custom LLM-as-a-Judge evaluator.
Controls how the foundation model generates evaluation responses.
Example
```ts
// LLM-as-a-Judge with categorical rating scale
const categoricalEvaluator = new agentcore.Evaluator(this, 'CategoricalEvaluator', {
  evaluatorName: 'domain_accuracy_evaluator',
  level: agentcore.EvaluationLevel.SESSION,
  description: 'Evaluates domain-specific accuracy of agent responses',
  evaluatorConfig: agentcore.EvaluatorConfig.llmAsAJudge({
    instructions: 'Evaluate whether the agent response is accurate within the healthcare domain.',
    modelId: 'us.anthropic.claude-sonnet-4-6',
    ratingScale: agentcore.EvaluatorRatingScale.categorical([
      { label: 'Accurate', definition: 'The response contains factually correct healthcare information.' },
      { label: 'Inaccurate', definition: 'The response contains incorrect or misleading healthcare information.' },
    ]),
  }),
});

// LLM-as-a-Judge with numerical rating scale and inference config
const numericalEvaluator = new agentcore.Evaluator(this, 'NumericalEvaluator', {
  evaluatorName: 'response_quality_evaluator',
  level: agentcore.EvaluationLevel.TRACE,
  evaluatorConfig: agentcore.EvaluatorConfig.llmAsAJudge({
    instructions: 'Rate the overall quality of the agent response on a scale of 1 to 5.',
    modelId: 'us.anthropic.claude-sonnet-4-6',
    ratingScale: agentcore.EvaluatorRatingScale.numerical([
      { label: 'Poor', definition: 'Inadequate response.', value: 1 },
      { label: 'Below Average', definition: 'Partially addresses the query.', value: 2 },
      { label: 'Average', definition: 'Adequately addresses the query.', value: 3 },
      { label: 'Good', definition: 'Well-structured and accurate response.', value: 4 },
      { label: 'Excellent', definition: 'Outstanding response exceeding expectations.', value: 5 },
    ]),
    inferenceConfig: {
      maxTokens: 1024,
      temperature: 0.1,
    },
  }),
});
```
Properties
| Name | Type | Description |
|---|---|---|
| maxTokens? | number | The maximum number of tokens to generate in the model response. |
| temperature? | number | The temperature value that controls randomness in the model's responses. |
| topP? | number | The top-p sampling parameter that controls the diversity of the model's responses. |
maxTokens?
Type: number (optional, default: the foundation model's default maximum token limit is used)
The maximum number of tokens to generate in the model response.
temperature?
Type: number (optional, default: the foundation model's default temperature is used)
The temperature value that controls randomness in the model's responses. Higher values produce more diverse outputs. Range: 0.0 to 1.0.
topP?
Type: number (optional, default: the foundation model's default top-p value is used)
The top-p sampling parameter that controls the diversity of the model's responses. Range: 0.0 to 1.0.
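Since all three properties are optional and fall back to model defaults, a common pattern is to validate values before passing them to the construct. The sketch below mirrors the documented ranges; the `InferenceConfig` interface and `validateInferenceConfig` helper are illustrative assumptions, not part of the CDK API (the construct performs its own validation at synthesis time).

```typescript
// Hypothetical client-side validation mirroring the documented constraints
// for EvaluatorInferenceConfig. Names here are illustrative only.
interface InferenceConfig {
  maxTokens?: number;   // positive integer; model default if omitted
  temperature?: number; // 0.0 to 1.0; model default if omitted
  topP?: number;        // 0.0 to 1.0; model default if omitted
}

function validateInferenceConfig(config: InferenceConfig): string[] {
  const errors: string[] = [];
  if (config.maxTokens !== undefined &&
      (!Number.isInteger(config.maxTokens) || config.maxTokens < 1)) {
    errors.push('maxTokens must be a positive integer');
  }
  if (config.temperature !== undefined &&
      (config.temperature < 0 || config.temperature > 1)) {
    errors.push('temperature must be between 0.0 and 1.0');
  }
  if (config.topP !== undefined &&
      (config.topP < 0 || config.topP > 1)) {
    errors.push('topP must be between 0.0 and 1.0');
  }
  return errors;
}
```

A low temperature (such as 0.1 in the example above) is typical for evaluators, since deterministic, repeatable judgments are usually preferable to diverse outputs when scoring agent responses.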
