ListInferenceRecommendationsJobSteps
Returns a list of the subtasks for an Inference Recommender job.
The supported subtasks are benchmarks, which evaluate the performance of your model on different instance types.
Request Syntax
{
   "JobName": "string",
   "MaxResults": number,
   "NextToken": "string",
   "Status": "string",
   "StepType": "string"
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- JobName
-
The name for the Inference Recommender job.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,63}
Required: Yes
- MaxResults
-
The maximum number of results to return.
Type: Integer
Valid Range: Minimum value of 1. Maximum value of 100.
Required: No
- NextToken
-
A token that you can specify to return more results from the list. Specify this field if you have a token that was returned from a previous request.
Type: String
Length Constraints: Maximum length of 8192.
Pattern:
.*
Required: No
- Status
-
A filter to return benchmarks of a specified status. If this field is left empty, then all benchmarks are returned.
Type: String
Valid Values:
PENDING | IN_PROGRESS | COMPLETED | FAILED | STOPPING | STOPPED | DELETING | DELETED
Required: No
- StepType
-
A filter to return details about the specified type of subtask.
BENCHMARK
: Evaluate the performance of your model on different instance types.
Type: String
Valid Values:
BENCHMARK
Required: No
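The JobName pattern above can be checked client-side before the request is sent, avoiding a round trip for an obviously invalid name. This is an illustrative sketch (the helper name is ours, not part of the API):

```python
import re

# Pattern from the JobName constraints above: 1-64 characters,
# alphanumeric segments optionally separated by hyphens, and the
# name must start and end with an alphanumeric character.
JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,63}$")

def is_valid_job_name(name: str) -> bool:
    """Return True if `name` satisfies the JobName constraints."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_job_name("my-job-1")` passes, while a name that starts with a hyphen or exceeds 64 characters fails.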
Response Syntax
{
"NextToken": "string",
"Steps": [
{
"InferenceBenchmark": {
"EndpointConfiguration": {
"EndpointName": "string",
"InitialInstanceCount": number,
"InstanceType": "string",
"ServerlessConfig": {
"MaxConcurrency": number,
"MemorySizeInMB": number,
"ProvisionedConcurrency": number
},
"VariantName": "string"
},
"EndpointMetrics": {
"MaxInvocations": number,
"ModelLatency": number
},
"FailureReason": "string",
"InvocationEndTime": number,
"InvocationStartTime": number,
"Metrics": {
"CostPerHour": number,
"CostPerInference": number,
"CpuUtilization": number,
"MaxInvocations": number,
"MemoryUtilization": number,
"ModelLatency": number,
"ModelSetupTime": number
},
"ModelConfiguration": {
"CompilationJobName": "string",
"EnvironmentParameters": [
{
"Key": "string",
"Value": "string",
"ValueType": "string"
}
],
"InferenceSpecificationName": "string"
}
},
"JobName": "string",
"Status": "string",
"StepType": "string"
}
]
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- NextToken
-
A token that you can specify in your next request to return more results from the list.
Type: String
Length Constraints: Maximum length of 8192.
Pattern:
.*
- Steps
-
A list of all subtask details in Inference Recommender.
Type: Array of InferenceRecommendationsJobStep objects
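The NextToken fields above drive pagination: pass the token from one response into the next request until no token is returned. A minimal sketch, where `list_steps` is a stand-in for whatever SDK call issues the request (it must accept JobName/NextToken keyword arguments and return a dict shaped like the Response Syntax above):

```python
def collect_all_steps(list_steps, job_name):
    """Collect every subtask across all pages of results.

    `list_steps` is a hypothetical stand-in for the SDK method that
    calls ListInferenceRecommendationsJobSteps.
    """
    steps, token = [], None
    while True:
        kwargs = {"JobName": job_name}
        if token:
            kwargs["NextToken"] = token
        response = list_steps(**kwargs)
        steps.extend(response.get("Steps", []))
        token = response.get("NextToken")
        if not token:  # no token means this was the last page
            return steps
```

The same loop works with a Status or StepType filter by adding those keys to `kwargs`.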
Errors
For information about the errors that are common to all actions, see Common Errors.
- ResourceNotFound
-
The resource being accessed was not found.
HTTP Status Code: 400
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: