DescribeInferenceRecommendationsJob
Provides the results of the Inference Recommender job. One or more recommendation jobs are returned.
Request Syntax
{
"JobName": "string"
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- JobName
-
The name of the job. The name must be unique within an AWS Region in the AWS account.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,63}
Required: Yes
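The JobName constraints above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the API: the helper name `is_valid_job_name` is hypothetical, and using a full match is an assumption, because the documented pattern has no trailing `$` anchor and the service may apply it differently.

```python
import re

# Pattern and length limits copied from the JobName constraints above.
JOB_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,63}")
MAX_JOB_NAME_LENGTH = 64


def is_valid_job_name(name: str) -> bool:
    """Return True if name satisfies the documented length and pattern constraints."""
    if not 1 <= len(name) <= MAX_JOB_NAME_LENGTH:
        return False
    # Assumption: the whole string must match, so trailing hyphens are rejected.
    return JOB_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("my-job-1", "-leading-hyphen", "trailing-hyphen-", ""):
        print(repr(candidate), is_valid_job_name(candidate))
```

A local check like this only avoids an obviously malformed request; the service remains the authority on whether a name is accepted.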
Response Syntax
{
"CompletionTime": number,
"CreationTime": number,
"EndpointPerformances": [
{
"EndpointInfo": {
"EndpointName": "string"
},
"Metrics": {
"MaxInvocations": number,
"ModelLatency": number
}
}
],
"FailureReason": "string",
"InferenceRecommendations": [
{
"EndpointConfiguration": {
"EndpointName": "string",
"InitialInstanceCount": number,
"InstanceType": "string",
"ServerlessConfig": {
"MaxConcurrency": number,
"MemorySizeInMB": number,
"ProvisionedConcurrency": number
},
"VariantName": "string"
},
"InvocationEndTime": number,
"InvocationStartTime": number,
"Metrics": {
"CostPerHour": number,
"CostPerInference": number,
"CpuUtilization": number,
"MaxInvocations": number,
"MemoryUtilization": number,
"ModelLatency": number,
"ModelSetupTime": number
},
"ModelConfiguration": {
"CompilationJobName": "string",
"EnvironmentParameters": [
{
"Key": "string",
"Value": "string",
"ValueType": "string"
}
],
"InferenceSpecificationName": "string"
},
"RecommendationId": "string"
}
],
"InputConfig": {
"ContainerConfig": {
"DataInputConfig": "string",
"Domain": "string",
"Framework": "string",
"FrameworkVersion": "string",
"NearestModelName": "string",
"PayloadConfig": {
"SamplePayloadUrl": "string",
"SupportedContentTypes": [ "string" ]
},
"SupportedEndpointType": "string",
"SupportedInstanceTypes": [ "string" ],
"SupportedResponseMIMETypes": [ "string" ],
"Task": "string"
},
"EndpointConfigurations": [
{
"EnvironmentParameterRanges": {
"CategoricalParameterRanges": [
{
"Name": "string",
"Value": [ "string" ]
}
]
},
"InferenceSpecificationName": "string",
"InstanceType": "string",
"ServerlessConfig": {
"MaxConcurrency": number,
"MemorySizeInMB": number,
"ProvisionedConcurrency": number
}
}
],
"Endpoints": [
{
"EndpointName": "string"
}
],
"JobDurationInSeconds": number,
"ModelName": "string",
"ModelPackageVersionArn": "string",
"ResourceLimit": {
"MaxNumberOfTests": number,
"MaxParallelOfTests": number
},
"TrafficPattern": {
"Phases": [
{
"DurationInSeconds": number,
"InitialNumberOfUsers": number,
"SpawnRate": number
}
],
"Stairs": {
"DurationInSeconds": number,
"NumberOfSteps": number,
"UsersPerStep": number
},
"TrafficType": "string"
},
"VolumeKmsKeyId": "string",
"VpcConfig": {
"SecurityGroupIds": [ "string" ],
"Subnets": [ "string" ]
}
},
"JobArn": "string",
"JobDescription": "string",
"JobName": "string",
"JobType": "string",
"LastModifiedTime": number,
"RoleArn": "string",
"Status": "string",
"StoppingConditions": {
"FlatInvocations": "string",
"MaxInvocations": number,
"ModelLatencyThresholds": [
{
"Percentile": "string",
"ValueInMilliseconds": number
}
]
}
}
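As an illustration of how the response shape above might be consumed, the sketch below picks the recommendation with the lowest `CostPerInference` from a response that has already been parsed into a dictionary. The function name `cheapest_recommendation` is hypothetical, and the sample values are invented for the example; they are not real job output.

```python
from typing import Any, Dict, Optional


def cheapest_recommendation(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the recommendation with the lowest CostPerInference, or None."""
    recommendations = response.get("InferenceRecommendations") or []
    # Skip entries that lack the metric rather than guessing a value for them.
    priced = [
        rec for rec in recommendations
        if rec.get("Metrics", {}).get("CostPerInference") is not None
    ]
    if not priced:
        return None
    return min(priced, key=lambda rec: rec["Metrics"]["CostPerInference"])


# Illustrative values only, shaped like the Response Syntax above.
sample_response = {
    "JobName": "demo-job",
    "Status": "COMPLETED",
    "InferenceRecommendations": [
        {
            "EndpointConfiguration": {"InstanceType": "ml.m5.xlarge"},
            "Metrics": {"CostPerInference": 0.00004, "ModelLatency": 120},
        },
        {
            "EndpointConfiguration": {"InstanceType": "ml.c5.large"},
            "Metrics": {"CostPerInference": 0.00002, "ModelLatency": 180},
        },
    ],
}

best = cheapest_recommendation(sample_response)
if best is not None:
    print(best["EndpointConfiguration"]["InstanceType"])
```

Cost is only one possible ranking key; the same approach works with `ModelLatency` or `MaxInvocations`, depending on what matters for the workload.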
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- CompletionTime
-
A timestamp that shows when the job completed.
Type: Timestamp
- CreationTime
-
A timestamp that shows when the job was created.
Type: Timestamp
- EndpointPerformances
-
The performance results from running an Inference Recommender job on an existing endpoint.
Type: Array of EndpointPerformance objects
Array Members: Maximum number of 1 item.
- FailureReason
-
If the job fails, provides information about why the job failed.
Type: String
Length Constraints: Maximum length of 1024.
- InferenceRecommendations
-
The recommendations made by Inference Recommender.
Type: Array of InferenceRecommendation objects
Array Members: Minimum number of 1 item. Maximum number of 10 items.
- InputConfig
-
Returns information about the versioned model package Amazon Resource Name (ARN), the traffic pattern, and endpoint configurations you provided when you initiated the job.
Type: RecommendationJobInputConfig object
- JobArn
-
The Amazon Resource Name (ARN) of the job.
Type: String
Length Constraints: Maximum length of 256.
Pattern:
arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:inference-recommendations-job/.*
- JobDescription
-
The job description that you provided when you initiated the job.
Type: String
Length Constraints: Maximum length of 128.
- JobName
-
The name of the job. The name must be unique within an AWS Region in the AWS account.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,63}
- JobType
-
The job type that you provided when you initiated the job.
Type: String
Valid Values:
Default | Advanced
- LastModifiedTime
-
A timestamp that shows when the job was last modified.
Type: Timestamp
- RoleArn
-
The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role you provided when you initiated the job.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 2048.
Pattern:
^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$
- Status
-
The status of the job.
Type: String
Valid Values:
PENDING | IN_PROGRESS | COMPLETED | FAILED | STOPPING | STOPPED | DELETING | DELETED
- StoppingConditions
-
The stopping conditions that you provided when you initiated the job.
Type: RecommendationJobStoppingConditions object
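Because Status changes over the life of a job, callers often poll this action until the job finishes. The sketch below assumes a boto3 SageMaker client (for example, one created with `boto3.client("sagemaker")`) passed in as `client`. The helper name `wait_for_job`, the polling interval, and the choice of which statuses count as terminal are all assumptions; this reference lists the valid values but does not say which ones are final.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these statuses are treated as final for polling purposes.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED", "DELETED"}


def wait_for_job(
    client: Any,
    job_name: str,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll DescribeInferenceRecommendationsJob until a terminal status is seen."""
    for _ in range(max_polls):
        response = client.describe_inference_recommendations_job(JobName=job_name)
        if response["Status"] in TERMINAL_STATUSES:
            return response
        sleep(poll_seconds)
    raise TimeoutError(
        f"{job_name} did not reach a terminal status after {max_polls} polls"
    )
```

When the returned Status is FAILED, the FailureReason element described above is the place to look for details. The `sleep` parameter exists so the loop can be exercised without real waiting.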
Errors
For information about the errors that are common to all actions, see Common Errors.
- ResourceNotFound
-
The resource being accessed is not found.
HTTP Status Code: 400
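One way to handle a ResourceNotFound error is to treat it as "no such job" and let every other error propagate. The sketch below assumes a boto3 SageMaker client, whose service errors carry the error code under `response["Error"]["Code"]`. The helper name `describe_job_or_none` is hypothetical, and the broad `except Exception` is used only to keep the example free of third-party imports; code that imports botocore would typically catch `botocore.exceptions.ClientError` instead.

```python
from typing import Any, Dict, Optional


def describe_job_or_none(client: Any, job_name: str) -> Optional[Dict[str, Any]]:
    """Return the job description, or None if the service reports ResourceNotFound."""
    try:
        return client.describe_inference_recommendations_job(JobName=job_name)
    except Exception as exc:
        # boto3 service errors expose the parsed error body as a dict on `.response`.
        response = getattr(exc, "response", None)
        code = response.get("Error", {}).get("Code") if isinstance(response, dict) else None
        if code == "ResourceNotFound":
            return None
        # Anything else (throttling, access denied, network errors) is re-raised.
        raise
```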
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: