Create a batch inference job

After you've set up an Amazon S3 bucket with files for running model inference, you can create a batch inference job. To learn how, select the tab corresponding to your method of choice and follow the steps.

Console
To create a batch inference job
  1. Sign in to the AWS Management Console using an IAM role with Amazon Bedrock permissions, and open the Amazon Bedrock console at https://console.aws.amazon.com/bedrock/.

  2. From the left navigation pane, select Batch inference.

  3. In the Batch inference jobs section, choose Create job.

  4. In the Job details section, give the batch inference job a Job name and select a model to use for the batch inference job by choosing Select model.

  5. In the Input data section, choose Browse S3 and select the S3 location containing the files for your batch inference job. Check that the files conform to the format described in Format and upload your inference data; a sample input file is sketched after these steps.

  6. In the Output data section, choose Browse S3 and select an S3 location to store the output files from your batch inference job. By default, the output data is encrypted with an AWS managed key. To use a custom KMS key, select Customize encryption settings (advanced) and choose a key. For more information about encryption of batch inference data and setting up a custom KMS key, see LINK.

  7. In the Service access section, select one of the following options:

    • Use an existing service role – Select a service role from the drop-down list. For more information on setting up a custom role with the appropriate permissions, see Required permissions for batch inference.

    • Create and use a new service role – Enter a name for the service role.

  8. (Optional) To associate tags with the batch inference job, expand the Tags section and add a key and optional value for each tag. For more information, see Manage resources using tags.

  9. Choose Create batch inference job.
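
The input files that you select in step 5 are JSONL files in which each line is a JSON object containing an optional recordId and a modelInput whose body matches the InvokeModel request format of the model you selected. The following Python sketch writes a small input file and uploads it to S3 with boto3; the bucket name, object key, prompts, and the Anthropic Claude Messages request body are illustrative assumptions, so substitute the request format of your own model.

```python
import json

import boto3

# Placeholder bucket, key, and prompts -- replace with your own values.
BUCKET = "amzn-s3-demo-bucket"
INPUT_KEY = "batch-input/records.jsonl"

prompts = [
    "Summarize the water cycle in two sentences.",
    "Explain batch inference in one sentence.",
]

# Each line of the file is one record: an optional "recordId" plus a
# "modelInput" whose body matches the InvokeModel request format of the
# model you selected (an Anthropic Claude Messages body is shown here
# purely as an illustration).
with open("records.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        record = {
            "recordId": f"CALL{i:07d}",  # 11-character alphanumeric ID
            "modelInput": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": 256,
                "messages": [
                    {"role": "user", "content": [{"type": "text", "text": prompt}]}
                ],
            },
        }
        f.write(json.dumps(record) + "\n")

# Upload the file to the S3 location you will select in step 5.
boto3.client("s3").upload_file("records.jsonl", BUCKET, INPUT_KEY)
```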

API

To create a batch inference job, send a CreateModelInvocationJob request (see link for request and response formats and field details) with an Amazon Bedrock control plane endpoint.

The following fields are required:

  • jobName – To specify a name for the job.
  • roleArn – To specify the Amazon Resource Name (ARN) of the service role with permissions to create and manage the job. For more information, see Create a service role for batch inference.
  • modelId – To specify the ID or ARN of the model to use in inference.
  • inputDataConfig – To specify the S3 location containing the prompts and configurations to submit to the job. For more information, see Format and upload your inference data.
  • outputDataConfig – To specify the S3 location to write the model responses to.
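
As an illustration, the following Python sketch sends the request through the AWS SDK for Python (Boto3), where CreateModelInvocationJob is exposed as create_model_invocation_job on the bedrock client. The job name, role ARN, model ID, and S3 URIs are placeholders, so substitute your own values; the optional fields listed below can be passed as additional parameters in the same call.

```python
import boto3

# The Amazon Bedrock control plane client is named "bedrock" in Boto3.
bedrock = boto3.client("bedrock")

response = bedrock.create_model_invocation_job(
    jobName="my-batch-job",  # placeholder job name
    roleArn="arn:aws:iam::111122223333:role/MyBatchInferenceRole",  # placeholder service role
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # example model ID
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/batch-input/"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/batch-output/"}
    },
)

print(response["jobArn"])
```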

The following fields are optional:

  • timeoutDurationInHours – To specify the duration in hours after which the job times out.
  • tags – To specify any tags to associate with the job. For more information, see Manage resources using tags.
  • clientRequestToken – To specify a unique identifier that ensures the API request completes no more than one time.

The response returns a jobArn that you can use to refer to the job when carrying out other batch inference-related API calls.
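
For example, a minimal Python sketch that polls the job status with the returned jobArn might look like the following; the job ARN is a placeholder, and the set of terminal status values shown is an assumption based on the statuses the API can return.

```python
import time

import boto3

bedrock = boto3.client("bedrock")

# Placeholder ARN -- use the jobArn returned by CreateModelInvocationJob.
job_arn = "arn:aws:bedrock:us-east-1:111122223333:model-invocation-job/abc123example"

# Poll GetModelInvocationJob until the job reaches a terminal status.
while True:
    job = bedrock.get_model_invocation_job(jobIdentifier=job_arn)
    status = job["status"]
    print(status)
    if status in ("Completed", "PartiallyCompleted", "Failed", "Stopped", "Expired"):
        break
    time.sleep(60)
```

When the job reaches Completed, the model responses are written to the S3 location you specified in outputDataConfig.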