CreateModelInvocationJob - Amazon Bedrock

CreateModelInvocationJob

Creates a batch inference job to invoke a model on multiple prompts. Format your data as described in "Format your inference data" and upload it to an Amazon S3 bucket. For more information, see "Process multiple prompts with batch inference".

The response returns a jobArn that you can use to stop or get details about the job.

Request Syntax

POST /model-invocation-job HTTP/1.1
Content-type: application/json

{
   "clientRequestToken": "string",
   "inputDataConfig": { ... },
   "jobName": "string",
   "modelId": "string",
   "outputDataConfig": { ... },
   "roleArn": "string",
   "tags": [
      {
         "key": "string",
         "value": "string"
      }
   ],
   "timeoutDurationInHours": number,
   "vpcConfig": {
      "securityGroupIds": [ "string" ],
      "subnetIds": [ "string" ]
   }
}

URI Request Parameters

The request does not use any URI parameters.

Request Body

The request accepts the following data in JSON format.

clientRequestToken

A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 256.

Pattern: ^[a-zA-Z0-9]{1,256}(-*[a-zA-Z0-9]){0,256}$

Required: No
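The token constraints above can be checked on the client before sending a request. The following is a minimal sketch, not an official helper: the function names are hypothetical, and using a random UUID as the token is an assumption (its hexadecimal groups separated by hyphens happen to satisfy the documented pattern).

```python
import re
import uuid

# Pattern and length limit copied from the clientRequestToken constraints.
TOKEN_PATTERN = re.compile(r"^[a-zA-Z0-9]{1,256}(-*[a-zA-Z0-9]){0,256}$")
TOKEN_MAX_LENGTH = 256


def is_valid_client_request_token(token: str) -> bool:
    """Return True if the token satisfies the documented length and pattern."""
    if not 1 <= len(token) <= TOKEN_MAX_LENGTH:
        return False
    # fullmatch rejects trailing newlines that "$" alone would tolerate.
    return TOKEN_PATTERN.fullmatch(token) is not None


def new_client_request_token() -> str:
    """Generate a fresh token (assumption: a random UUID is an acceptable choice)."""
    return str(uuid.uuid4())
```

To benefit from idempotency, generate the token once per logical request and reuse the same value when retrying that request, rather than generating a new token on every attempt.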

inputDataConfig

Details about the location of the input to the batch inference job.

Type: ModelInvocationJobInputDataConfig object

Note: This object is a Union. Only one member of this object can be specified or returned.

Required: Yes

jobName

A name to give the batch inference job.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 63.

Pattern: ^[a-zA-Z0-9]{1,63}(-*[a-zA-Z0-9\+\-\.]){0,63}$

Required: Yes
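As an illustration of how the two jobName constraints interact, the sketch below checks both the length and the pattern. The helper name is hypothetical. Note that the pattern on its own accepts strings longer than 63 characters, so the length check is applied separately.

```python
import re

# Pattern and length limit copied from the jobName constraints.
JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9]{1,63}(-*[a-zA-Z0-9\+\-\.]){0,63}$")
JOB_NAME_MAX_LENGTH = 63


def is_valid_job_name(name: str) -> bool:
    """Return True if the name satisfies both the length limit and the pattern."""
    if not 1 <= len(name) <= JOB_NAME_MAX_LENGTH:
        return False
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_job_name("my-batch-job")` returns `True`, while a name containing an underscore or starting with a hyphen is rejected.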

modelId

The unique identifier of the foundation model to use for the batch inference job.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 2048.

Pattern: ^(arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:(([0-9]{12}:custom-model/[a-z0-9-]{1,63}[.]{1}[a-z0-9-:]{1,63}/[a-z0-9]{12}$)|(:foundation-model/[a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}$)))|([a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([.]?[a-z0-9-]{1,63})([:][a-z0-9-]{1,63}){0,2})|(([0-9a-zA-Z][_-]?)+)$

Required: Yes

outputDataConfig

Details about the location of the output of the batch inference job.

Type: ModelInvocationJobOutputDataConfig object

Note: This object is a Union. Only one member of this object can be specified or returned.

Required: Yes

roleArn

The Amazon Resource Name (ARN) of the service role with permissions to carry out and manage batch inference. You can use the console to create a default service role or follow the steps at Create a service role for batch inference.

Type: String

Length Constraints: Minimum length of 0. Maximum length of 2048.

Pattern: ^arn:aws(-[^:]+)?:iam::([0-9]{12})?:role/.+$

Required: Yes

tags

Any tags to associate with the batch inference job. For more information, see Tagging Amazon Bedrock resources.

Type: Array of Tag objects

Array Members: Minimum number of 0 items. Maximum number of 200 items.

Required: No
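Tags are passed as a list of objects with "key" and "value" members, as shown in the request syntax. A minimal sketch of converting a plain dictionary into that shape while enforcing the 200-item limit might look like this (the helper name is hypothetical):

```python
from typing import Dict, List

# Maximum number of array members from the tags constraints.
MAX_TAGS = 200


def to_tag_list(tags: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert {"name": "value"} pairs into the list-of-objects shape used by the request."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags are allowed, got {len(tags)}")
    return [{"key": key, "value": value} for key, value in tags.items()]
```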

timeoutDurationInHours

The number of hours after which to force the batch inference job to time out.

Type: Integer

Valid Range: Minimum value of 24. Maximum value of 168.

Required: No
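The valid range of 24 to 168 hours corresponds to one to seven days. If timeouts are configured in days elsewhere in an application, a small conversion helper (hypothetical, shown only as a sketch) can enforce the range before the request is built:

```python
# Bounds copied from the timeoutDurationInHours valid range.
MIN_TIMEOUT_HOURS = 24
MAX_TIMEOUT_HOURS = 168


def timeout_hours_from_days(days: int) -> int:
    """Convert whole days to hours and check the result against the valid range."""
    hours = days * 24
    if not MIN_TIMEOUT_HOURS <= hours <= MAX_TIMEOUT_HOURS:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT_HOURS} and "
            f"{MAX_TIMEOUT_HOURS} hours, got {hours}"
        )
    return hours
```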

vpcConfig

The configuration of the Virtual Private Cloud (VPC) for the data in the batch inference job. For more information, see Protect batch inference jobs using a VPC.

Type: VpcConfig object

Required: No

Response Syntax

HTTP/1.1 200
Content-type: application/json

{
   "jobArn": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

jobArn

The Amazon Resource Name (ARN) of the batch inference job.

Type: String

Length Constraints: Minimum length of 0. Maximum length of 1011.

Pattern: ^(arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:[0-9]{12}:model-invocation-job/[a-z0-9]{12})$
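Because the jobArn follows a fixed pattern, its parts can be pulled out with a regular expression. The sketch below adapts the documented pattern by adding named groups; the group names (partition, region, account_id, job_id) and the sample ARN in the usage note are illustrative assumptions, not names defined by the API.

```python
import re
from typing import Dict

# The documented jobArn pattern, rewritten with named groups for readability.
JOB_ARN_PATTERN = re.compile(
    r"^arn:(?P<partition>aws(?:-[^:]+)?):bedrock:(?P<region>[a-z0-9-]{1,20}):"
    r"(?P<account_id>[0-9]{12}):model-invocation-job/(?P<job_id>[a-z0-9]{12})$"
)


def parse_job_arn(job_arn: str) -> Dict[str, str]:
    """Split a batch inference job ARN into its components, or raise ValueError."""
    match = JOB_ARN_PATTERN.fullmatch(job_arn)
    if match is None:
        raise ValueError(f"not a model invocation job ARN: {job_arn!r}")
    return match.groupdict()
```

With a made-up ARN such as `arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/abcdef123456`, the function returns a dictionary whose `region` entry is `us-east-1` and whose `job_id` entry is `abcdef123456`.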

Errors

For information about the errors that are common to all actions, see Common Errors.

AccessDeniedException

The request is denied because of missing access permissions.

HTTP Status Code: 403

ConflictException

Error occurred because of a conflict while performing an operation.

HTTP Status Code: 400

InternalServerException

An internal server error occurred. Retry your request.

HTTP Status Code: 500

ResourceNotFoundException

The specified resource Amazon Resource Name (ARN) was not found. Check the ARN and try your request again.

HTTP Status Code: 404

ServiceQuotaExceededException

The number of requests exceeds the service quota. Resubmit your request later.

HTTP Status Code: 400

ThrottlingException

The number of requests exceeds the limit. Resubmit your request later.

HTTP Status Code: 429

ValidationException

Input validation failed. Check your request parameters and retry the request.

HTTP Status Code: 400
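The error descriptions above suggest that some failures are worth retrying and others are not. The following is one possible sketch of a retry wrapper with exponential backoff. Treating only ThrottlingException and InternalServerException as retryable is an assumption based on their descriptions; the function names, attempt count, and delays are hypothetical. The wrapper reads the error code from a `response["Error"]["Code"]` entry on the exception, which is the shape of the `ClientError` raised by the AWS SDK for Python.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Assumption: only these two documented errors are retried automatically.
RETRYABLE_ERROR_CODES = {"ThrottlingException", "InternalServerException"}


def error_code(exc: Exception) -> Optional[str]:
    """Extract the error code from an exception shaped like botocore's ClientError."""
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return response.get("Error", {}).get("Code")
    return None


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying retryable errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise non-retryable errors and the final failed attempt.
            if error_code(exc) not in RETRYABLE_ERROR_CODES or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

When retrying a job creation request this way, passing the same clientRequestToken on every attempt keeps the retries idempotent.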

Examples

Create a batch inference job

This example illustrates one usage of CreateModelInvocationJob.

POST /model-invocation-job HTTP/1.1
Content-type: application/json

{
   "clientRequestToken": "string",
   "inputDataConfig": {
      "s3InputDataConfig": {
         "s3Uri": "s3://input-bucket/abc.jsonl"
      }
   },
   "jobName": "my-batch-job",
   "modelId": "anthropic.claude-3-haiku-20240307-v1:0",
   "outputDataConfig": {
      "s3OutputDataConfig": {
         "s3Uri": "s3://output-bucket/"
      }
   },
   "roleArn": "arn:aws:iam::123456789012:role/MyBatchInferenceRole"
}
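The same request can be sent through the AWS SDK for Python (Boto3), whose `bedrock` client exposes this operation as `create_model_invocation_job`. The sketch below is illustrative only: the helper names and the default Region are assumptions, and the bucket, role, and model values are the placeholders from the example above. The SDK import sits inside the function that makes the call so that the request builder can be used and tested without the SDK or credentials.

```python
from typing import Any, Dict


def build_job_request(
    job_name: str,
    model_id: str,
    role_arn: str,
    input_s3_uri: str,
    output_s3_uri: str,
) -> Dict[str, Any]:
    """Assemble the required request members using S3 input and output locations."""
    return {
        "jobName": job_name,
        "modelId": model_id,
        "roleArn": role_arn,
        "inputDataConfig": {"s3InputDataConfig": {"s3Uri": input_s3_uri}},
        "outputDataConfig": {"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    }


def create_job(request: Dict[str, Any], region_name: str = "us-east-1") -> str:
    """Submit the job and return its jobArn (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily; the Region default is an assumption

    client = boto3.client("bedrock", region_name=region_name)
    response = client.create_model_invocation_job(**request)
    return response["jobArn"]


request = build_job_request(
    job_name="my-batch-job",
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    role_arn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",
    input_s3_uri="s3://input-bucket/abc.jsonl",
    output_s3_uri="s3://output-bucket/",
)
```

Calling `create_job(request)` with valid credentials and real resources returns the jobArn, which can then be used to stop the job or get its details.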

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: