get_batch_evaluation

BedrockAgentCore.Client.get_batch_evaluation(**kwargs)

Retrieves detailed information about a batch evaluation, including its status, configuration, results, and any error details.

See also: AWS API Documentation

Request Syntax

response = client.get_batch_evaluation(
    batchEvaluationId='string'
)

Parameters:

batchEvaluationId (string) –

[REQUIRED]

The unique identifier of the batch evaluation to retrieve.
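
As a minimal sketch of the call, the helper below wraps `get_batch_evaluation` and rejects an empty identifier before any request is made. The name `fetch_batch_evaluation` is hypothetical, and the local check is an added convenience rather than documented service behavior.

```python
def fetch_batch_evaluation(client, batch_evaluation_id):
    """Return the get_batch_evaluation response for one batch evaluation.

    `client` is any object exposing get_batch_evaluation(batchEvaluationId=...),
    such as a BedrockAgentCore client.
    """
    # batchEvaluationId is required, so fail early on an obviously bad value
    # instead of waiting for the service to reject the request.
    if not isinstance(batch_evaluation_id, str) or not batch_evaluation_id.strip():
        raise ValueError("batch_evaluation_id must be a non-empty string")
    return client.get_batch_evaluation(batchEvaluationId=batch_evaluation_id)
```

With a real client this might be called as `fetch_batch_evaluation(boto3.client("bedrock-agentcore"), "<your-batch-evaluation-id>")`, assuming `bedrock-agentcore` is the service name registered for this client in your boto3 version.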

Return type:

dict

Returns:

Response Syntax

{
    'batchEvaluationId': 'string',
    'batchEvaluationArn': 'string',
    'batchEvaluationName': 'string',
    'status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERRORS'|'FAILED'|'STOPPING'|'STOPPED'|'DELETING',
    'createdAt': datetime(2015, 1, 1),
    'evaluators': [
        {
            'evaluatorId': 'string'
        },
    ],
    'dataSourceConfig': {
        'cloudWatchLogs': {
            'serviceNames': [
                'string',
            ],
            'logGroupNames': [
                'string',
            ],
            'filterConfig': {
                'sessionIds': [
                    'string',
                ],
                'timeRange': {
                    'startTime': datetime(2015, 1, 1),
                    'endTime': datetime(2015, 1, 1)
                }
            }
        }
    },
    'outputConfig': {
        'cloudWatchConfig': {
            'logGroupName': 'string',
            'logStreamName': 'string'
        }
    },
    'evaluationResults': {
        'numberOfSessionsCompleted': 123,
        'numberOfSessionsInProgress': 123,
        'numberOfSessionsFailed': 123,
        'totalNumberOfSessions': 123,
        'numberOfSessionsIgnored': 123,
        'evaluatorSummaries': [
            {
                'evaluatorId': 'string',
                'statistics': {
                    'averageScore': 123.0
                },
                'totalEvaluated': 123,
                'totalFailed': 123
            },
        ]
    },
    'errorDetails': [
        'string',
    ],
    'description': 'string',
    'updatedAt': datetime(2015, 1, 1)
}
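
Because `status` changes over the life of a batch evaluation, a caller will often poll this operation until the job settles. The sketch below shows one way to do that. Treating `COMPLETED`, `COMPLETED_WITH_ERRORS`, `FAILED`, and `STOPPED` as terminal is an assumption, since this reference lists the status values without saying which are final. The function name, default intervals, and the injectable `sleep` and `clock` parameters are all hypothetical.

```python
import time

# Assumption: these statuses are final. The reference only enumerates the
# possible values; it does not state which ones are terminal.
TERMINAL_STATUSES = {"COMPLETED", "COMPLETED_WITH_ERRORS", "FAILED", "STOPPED"}


def wait_for_batch_evaluation(client, batch_evaluation_id,
                              poll_seconds=30.0, timeout_seconds=3600.0,
                              sleep=time.sleep, clock=time.monotonic):
    """Poll get_batch_evaluation until an assumed-terminal status or a timeout."""
    deadline = clock() + timeout_seconds
    while True:
        response = client.get_batch_evaluation(
            batchEvaluationId=batch_evaluation_id
        )
        if response["status"] in TERMINAL_STATUSES:
            return response
        # Not finished yet: give up if the deadline has passed, else wait.
        if clock() >= deadline:
            raise TimeoutError(
                f"batch evaluation {batch_evaluation_id} still "
                f"{response['status']} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
```

`sleep` and `clock` are parameters only so the loop can be exercised without real waiting. `DELETING` is deliberately left out of the terminal set here, so a caller polling a deleted evaluation should expect the call to eventually raise rather than return.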

Response Structure

  • (dict) –

    • batchEvaluationId (string) –

      The unique identifier of the batch evaluation.

    • batchEvaluationArn (string) –

      The Amazon Resource Name (ARN) of the batch evaluation.

    • batchEvaluationName (string) –

      The name of the batch evaluation.

    • status (string) –

      The current status of the batch evaluation.

    • createdAt (datetime) –

      The timestamp when the batch evaluation was created.

    • evaluators (list) –

      The list of evaluators applied during the batch evaluation.

      • (dict) –

        An evaluator to run against sessions.

        • evaluatorId (string) –

          The unique identifier of the evaluator. Can reference built-in evaluators (e.g., Builtin.Helpfulness) or custom evaluators.

    • dataSourceConfig (dict) –

      The data source configuration specifying where agent traces are pulled from.

      Note

      This is a Tagged Union structure. Only one of the following top-level keys will be set: cloudWatchLogs. If a client receives an unknown member, it will set SDK_UNKNOWN_MEMBER as the top-level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

      'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
      
      • cloudWatchLogs (dict) –

        The configuration for pulling session spans from CloudWatch Logs.

        • serviceNames (list) –

          The list of agent service names to filter traces within the specified log groups.

          • (string) –

        • logGroupNames (list) –

          The list of CloudWatch log group names to read agent traces from. Maximum of 5 log groups.

          • (string) –

        • filterConfig (dict) –

          Optional filter configuration to narrow down which sessions to evaluate.

          • sessionIds (list) –

            A list of specific session IDs to evaluate. If specified, only these sessions are included in the evaluation.

            • (string) –

          • timeRange (dict) –

            The time range filter for selecting sessions to evaluate.

            • startTime (datetime) –

              The start time of the time range. Only sessions with activity at or after this timestamp are included.

            • endTime (datetime) –

              The end time of the time range. Only sessions with activity before this timestamp are included.

    • outputConfig (dict) –

      The output configuration specifying where evaluation results are written.

      Note

      This is a Tagged Union structure. Only one of the following top-level keys will be set: cloudWatchConfig. If a client receives an unknown member, it will set SDK_UNKNOWN_MEMBER as the top-level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

      'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
      
      • cloudWatchConfig (dict) –

        The CloudWatch Logs configuration for writing evaluation results.

        • logGroupName (string) –

          The name of the CloudWatch log group where evaluation results will be written.

        • logStreamName (string) –

          The name of the CloudWatch log stream where evaluation results will be written.

    • evaluationResults (dict) –

      The aggregated evaluation results, including session completion counts and evaluator score summaries.

      • numberOfSessionsCompleted (integer) –

        The number of sessions that have been successfully evaluated.

      • numberOfSessionsInProgress (integer) –

        The number of sessions currently being evaluated.

      • numberOfSessionsFailed (integer) –

        The number of sessions that failed evaluation.

      • totalNumberOfSessions (integer) –

        The total number of sessions included in the batch evaluation.

      • numberOfSessionsIgnored (integer) –

        The number of sessions that were ignored during evaluation.

      • evaluatorSummaries (list) –

        A list of per-evaluator summary statistics.

        • (dict) –

          Summary statistics for a single evaluator within a batch evaluation.

          • evaluatorId (string) –

            The unique identifier of the evaluator.

          • statistics (dict) –

            The aggregated statistics for this evaluator.

            • averageScore (float) –

              The average score across all evaluated sessions for this evaluator.

          • totalEvaluated (integer) –

            The total number of sessions evaluated by this evaluator.

          • totalFailed (integer) –

            The total number of sessions that failed evaluation by this evaluator.

    • errorDetails (list) –

      A list of error details, returned if the batch evaluation encountered failures.

      • (string) –

    • description (string) –

      The description of the batch evaluation.

    • updatedAt (datetime) –

      The timestamp when the batch evaluation was last updated.
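
The response structure above can be condensed for logging or dashboards. The sketch below reads the tagged-union members of `dataSourceConfig` and `outputConfig` (including the `SDK_UNKNOWN_MEMBER` case) and derives a completion fraction and per-evaluator average scores from `evaluationResults`. The helper names are hypothetical, and the defensive `.get` calls reflect an assumption that some fields may be absent from a given response.

```python
def union_member(tagged_union):
    """Return (member_name, value) for a tagged-union dict, or (None, None)."""
    if not tagged_union:
        return None, None
    if "SDK_UNKNOWN_MEMBER" in tagged_union:
        # The SDK did not recognise the member, so only its name is available.
        return tagged_union["SDK_UNKNOWN_MEMBER"].get("name"), None
    # Only one top-level key is set, so the first item is the member.
    name, value = next(iter(tagged_union.items()))
    return name, value


def summarize_batch_evaluation(response):
    """Condense a get_batch_evaluation response into a small flat dict."""
    results = response.get("evaluationResults") or {}
    total = results.get("totalNumberOfSessions", 0)
    completed = results.get("numberOfSessionsCompleted", 0)
    source_kind, _ = union_member(response.get("dataSourceConfig"))
    output_kind, _ = union_member(response.get("outputConfig"))

    # Map each evaluator ID to its average score (None if not reported).
    average_scores = {}
    for summary in results.get("evaluatorSummaries", []):
        statistics = summary.get("statistics") or {}
        average_scores[summary["evaluatorId"]] = statistics.get("averageScore")

    return {
        "status": response.get("status"),
        "source_kind": source_kind,
        "output_kind": output_kind,
        # Avoid dividing by zero when no sessions have been counted yet.
        "completed_fraction": completed / total if total else None,
        "average_scores": average_scores,
        "errors": list(response.get("errorDetails", [])),
    }
```

Returning the member name rather than branching on `cloudWatchLogs` directly keeps the helper usable if further union members are added later.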

Exceptions

  • BedrockAgentCore.Client.exceptions.UnauthorizedException

  • BedrockAgentCore.Client.exceptions.ValidationException

  • BedrockAgentCore.Client.exceptions.AccessDeniedException

  • BedrockAgentCore.Client.exceptions.ThrottlingException

  • BedrockAgentCore.Client.exceptions.ResourceNotFoundException

  • BedrockAgentCore.Client.exceptions.InternalServerException
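
The modeled exceptions listed above are reachable through `client.exceptions`, so they can be caught by name. The sketch below returns `None` for an unknown ID and retries a throttled call with exponential backoff. The function name, attempt count, and delays are hypothetical choices, and all other exceptions are left to propagate.

```python
import time


def get_batch_evaluation_or_none(client, batch_evaluation_id,
                                 max_attempts=4, base_delay=1.0,
                                 sleep=time.sleep):
    """Return the response, or None if the batch evaluation does not exist.

    Retries on ThrottlingException, doubling the delay after each attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return client.get_batch_evaluation(
                batchEvaluationId=batch_evaluation_id
            )
        except client.exceptions.ResourceNotFoundException:
            # Unknown ID: report absence instead of raising.
            return None
        except client.exceptions.ThrottlingException:
            # Re-raise once the retry budget is used up.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

botocore already applies its own configurable retries to throttling errors, so an explicit loop like this is mainly useful when a caller wants control over the retry budget or wants to log each attempt.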