

# Prerequisites for batch inference
<a name="batch-inference-prereq"></a>

To perform batch inference, you must fulfill the following prerequisites:

1. Prepare your dataset and upload it to an Amazon S3 bucket.

1. Create an S3 bucket for your output data.

1. Set up batch inference-related permissions for the relevant IAM identities.

1. (Optional) Set up a VPC to protect the data in your S3 buckets while carrying out batch inference. You can skip this step if you don't need to use a VPC.

To learn how to fulfill these prerequisites, navigate through the following topics:

**Topics**
+ [Format and upload your batch inference data](batch-inference-data.md)
+ [Required permissions for batch inference](batch-inference-permissions.md)
+ [Protect batch inference jobs using a VPC](batch-vpc.md)

# Format and upload your batch inference data
<a name="batch-inference-data"></a>

You must add your batch inference data to an S3 location that you specify when you submit a model invocation job. The S3 location must contain the following items:
+ At least one JSONL file that defines the model inputs. A JSONL file contains one JSON object per line. Your JSONL file must use the `.jsonl` extension and be in the following format:

  ```
  { "recordId" : "alphanumeric string", "modelInput" : {JSON body} }
  ...
  ```

  Each line contains a JSON object with a `recordId` field and a `modelInput` field. The format of the `modelInput` JSON object depends on the model invocation type that you choose when you [create the batch inference job](batch-inference-create.md). If you use the `InvokeModel` type (default), the format must match the `body` field for the model that you use in the `InvokeModel` request (see [Inference request parameters and response fields for foundation models](model-parameters.md)). If you use the `Converse` type, the format must match the request body of the [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API.
**Note**  
If you omit the `recordId` field, Amazon Bedrock adds it in the output.
The order of records in the output JSONL file is not guaranteed to match the order of records in the input JSONL file.
You specify the model that you want to use when you create the [batch inference job](batch-inference-create.md).
+ (If your input content references an Amazon S3 location) Some models let you define input content as an S3 location instead of including it inline. For an example, see [Example video input for Amazon Nova](#batch-inference-data-ex-s3).
**Warning**  
When using S3 URIs in your prompts, all resources must be in the same S3 bucket and folder. The `InputDataConfig` parameter must specify the folder path containing all linked resources (such as videos or images), not just an individual `.jsonl` file. Note that S3 paths are case-sensitive, so ensure your URIs match the exact folder structure.
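To illustrate the record shape described above, the following Python sketch writes records to a `.jsonl` input file. This is a hypothetical helper; the `modelInput` bodies shown are placeholders, not any particular model's request format.

```python
import json

def write_jsonl_records(path, records):
    """Write (record_id, model_input) pairs as one JSON object per line."""
    with open(path, "w") as f:
        for record_id, model_input in records:
            # Each line follows the { "recordId": ..., "modelInput": ... } shape.
            f.write(json.dumps({"recordId": record_id, "modelInput": model_input}) + "\n")

records = [
    ("CALL0000001", {"prompt": "Summarize transcript 1: ..."}),  # placeholder body
    ("CALL0000002", {"prompt": "Summarize transcript 2: ..."}),  # placeholder body
]
write_jsonl_records("input.jsonl", records)
```

After writing the file, you would upload it to the S3 location that you plan to pass to the job.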

Ensure that your inputs conform to the batch inference quotas. You can search for the following quotas at [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock):
+ **Minimum number of records per batch inference job** – The minimum number of records (JSON objects) across JSONL files in the job.
+ **Records per input file per batch inference job** – The maximum number of records (JSON objects) in a single JSONL file in the job.
+ **Records per batch inference job** – The maximum number of records (JSON objects) across JSONL files in the job.
+ **Batch inference input file size** – The maximum size of a single file in the job.
+ **Batch inference job size** – The maximum cumulative size of all input files.
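As a rough illustration of checking these limits before you submit a job, the following Python sketch validates one input file. The quota values here are hypothetical placeholders; look up the real values for your account at the service quotas page linked above.

```python
import json
import os

# Hypothetical quota values for illustration only.
MAX_RECORDS_PER_FILE = 50_000
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024

def check_input_file(path):
    """Return (record_count, size_bytes), raising if a hypothetical limit is exceeded."""
    size = os.path.getsize(path)
    if size > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"{path} exceeds the per-file size quota")
    with open(path) as f:
        count = sum(1 for line in f if line.strip())
    if count > MAX_RECORDS_PER_FILE:
        raise ValueError(f"{path} exceeds the per-file record quota")
    return count, size

# Build a one-record sample file, then validate it.
with open("batch-input.jsonl", "w") as f:
    f.write(json.dumps({"recordId": "R1", "modelInput": {"prompt": "..."}}) + "\n")

count, size = check_input_file("batch-input.jsonl")
```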

To better understand how to set up your batch inference inputs, see the following examples:

## Example text input for Anthropic Claude 3 Haiku
<a name="batch-inference-data-ex-text"></a>

If you plan to run batch inference using the [Messages API](model-parameters-anthropic-claude-messages.md) format for the Anthropic Claude 3 Haiku model, you might provide a JSONL file containing the following JSON object as one of the lines:

```
{
    "recordId": "CALL0000001", 
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 1024,
        "messages": [ 
            { 
                "role": "user", 
                "content": [
                    {
                        "type": "text", 
                        "text": "Summarize the following call transcript: ..." 
                    } 
                ]
            }
        ]
    }
}
```

## Example video input for Amazon Nova
<a name="batch-inference-data-ex-s3"></a>

If you plan to run batch inference on video inputs using the Amazon Nova Lite or Amazon Nova Pro models, you can define each video in bytes or as an S3 location in the JSONL file. For example, you might have an S3 bucket at the path `s3://batch-inference-input-bucket` that contains the following files:

```
s3://batch-inference-input-bucket/
├── videos/
│   ├── video1.mp4
│   ├── video2.mp4
│   ├── ...
│   └── video50.mp4
└── input.jsonl
```

A sample record from the `input.jsonl` file would be the following:

```
{
    "recordId": "RECORD01",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "You are an expert in recipe videos. Describe this video in less than 200 words following these guidelines: ..."
                    },
                    {
                        "video": {
                            "format": "mp4",
                            "source": {
                                "s3Location": {
                                    "uri": "s3://batch-inference-input-bucket/videos/video1.mp4",
                                    "bucketOwner": "111122223333"
                                }
                            }
                        }
                    }
                ]
            }
        ]
    }
}
```

When you create the batch inference job, you must specify the folder path `s3://batch-inference-input-bucket` in your `InputDataConfig` parameter. Batch inference will process the `input.jsonl` file at this location, along with any referenced resources (such as the video files in the `videos` subfolder).
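Reflecting the warning earlier in this topic, you might pre-check that every S3 URI referenced in a record falls under the folder that you plan to pass in `InputDataConfig`. The following Python sketch is a hypothetical validation helper, not part of any AWS SDK:

```python
def uris_under_prefix(record, folder_uri):
    """Return True if every "uri" value nested in the record starts with folder_uri."""
    prefix = folder_uri.rstrip("/") + "/"
    uris = []

    def collect(obj):
        # Walk the record and gather every string "uri" field, however deeply nested.
        if isinstance(obj, dict):
            if isinstance(obj.get("uri"), str):
                uris.append(obj["uri"])
            for v in obj.values():
                collect(v)
        elif isinstance(obj, list):
            for v in obj:
                collect(v)

    collect(record)
    return all(u.startswith(prefix) for u in uris)

record = {
    "recordId": "RECORD01",
    "modelInput": {"messages": [{"role": "user", "content": [
        {"video": {"format": "mp4", "source": {"s3Location": {
            "uri": "s3://batch-inference-input-bucket/videos/video1.mp4"}}}}
    ]}]},
}
ok = uris_under_prefix(record, "s3://batch-inference-input-bucket")
```

A record whose URI points at a different bucket or folder would fail this check, matching the behavior described in the warning.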

The following resources provide more information about submitting video inputs for batch inference:
+ To learn how to validate Amazon S3 URIs in an input request, see the [Amazon S3 URL Parsing blog](https://aws.amazon.com/blogs/devops/s3-uri-parsing-is-now-available-in-aws-sdk-for-java-2-x/).
+ For more information on how to set up invocation records for video understanding with Nova, see [Amazon Nova vision prompting guidelines](https://docs.aws.amazon.com/nova/latest/userguide/prompting-vision-prompting.html).

## Example Converse input
<a name="batch-inference-data-ex-converse"></a>

If you set the model invocation type to `Converse` when creating the batch inference job, the `modelInput` field must use the Converse API request format. The following example shows a JSONL record for a Converse batch inference job:

```
{
    "recordId": "CALL0000001",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "Summarize the following call transcript: ..."
                    }
                ]
            }
        ],
        "inferenceConfig": {
            "maxTokens": 1024
        }
    }
}
```

For the full list of fields supported in the Converse request body, see [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) in the API reference.
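To show how such records might be assembled programmatically, the following Python sketch builds a Converse-format record from a plain text prompt. The helper is hypothetical; the field names mirror the example record above.

```python
import json

def converse_record(record_id, prompt, max_tokens=1024):
    """Build one batch inference record in the Converse request shape shown above."""
    return {
        "recordId": record_id,
        "modelInput": {
            "messages": [{"role": "user", "content": [{"text": prompt}]}],
            "inferenceConfig": {"maxTokens": max_tokens},
        },
    }

# Serialize one record as a JSONL line.
line = json.dumps(converse_record("CALL0000001", "Summarize the following call transcript: ..."))
```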

The following topic describes how to set up S3 access and batch inference permissions for an identity to be able to carry out batch inference.

# Required permissions for batch inference
<a name="batch-inference-permissions"></a>

To carry out batch inference, you must set up permissions for the following IAM identities:
+ The IAM identity that will create and manage batch inference jobs.
+ The batch inference [service role](security-iam-sr.md) that Amazon Bedrock assumes to perform actions on your behalf.

To learn how to set up permissions for each identity, navigate through the following topics:

**Topics**
+ [Required permissions for an IAM identity to submit and manage batch inference jobs](#batch-inference-permissions-user)
+ [Required permissions for a service role to carry out batch inference](#batch-inference-permissions-service)

## Required permissions for an IAM identity to submit and manage batch inference jobs
<a name="batch-inference-permissions-user"></a>

For an IAM identity to use this feature, you must configure it with the necessary permissions. To do so, do one of the following:
+ To allow an identity to carry out all Amazon Bedrock actions, attach the [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) policy to the identity. If you do this, you can skip this topic. This option is less secure.
+ As a security best practice, you should grant only the necessary actions to an identity. This topic describes the permissions that you need for this feature.

To restrict permissions to only actions that are used for batch inference, attach the following identity-based policy to an IAM identity:

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BatchInference",
            "Effect": "Allow",
            "Action": [  
                "bedrock:ListFoundationModels",
                "bedrock:GetFoundationModel",
                "bedrock:ListInferenceProfiles",
                "bedrock:GetInferenceProfile",
                "bedrock:ListCustomModels",
                "bedrock:GetCustomModel",
                "bedrock:TagResource", 
                "bedrock:UntagResource", 
                "bedrock:ListTagsForResource",
                "bedrock:CreateModelInvocationJob",
                "bedrock:GetModelInvocationJob",
                "bedrock:ListModelInvocationJobs",
                "bedrock:StopModelInvocationJob"
            ],
            "Resource": "*"
        }
    ]
}
```

------

To further restrict permissions, you can omit actions, or you can specify resources and condition keys by which to filter permissions. For more information about actions, resources, and condition keys, see the following topics in the *Service Authorization Reference*:
+ [Actions defined by Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) – Learn about actions, the resource types that you can scope them to in the `Resource` field, and the condition keys that you can filter permissions on in the `Condition` field.
+ [Resource types defined by Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) – Learn about the resource types in Amazon Bedrock.
+ [Condition keys for Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-policy-keys) – Learn about the condition keys in Amazon Bedrock.

The following example policy scopes down permissions so that an identity in the account `123456789012` can create batch inference jobs only in the `us-west-2` Region and only with the Anthropic Claude 3 Haiku model:

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CreateBatchInferenceJob",
            "Effect": "Allow",
            "Action": [
                "bedrock:CreateModelInvocationJob"
            ],
            "Resource": [
                "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
                "arn:aws:bedrock:us-west-2:123456789012:model-invocation-job/*"
            ]
        }
    ]
}
```

------

## Required permissions for a service role to carry out batch inference
<a name="batch-inference-permissions-service"></a>

Batch inference is carried out by a [service role](security-iam-sr.md) that Amazon Bedrock assumes to perform actions on your behalf. You can create a service role in the following ways:
+ Let Amazon Bedrock automatically create a service role with the necessary permissions for you by using the AWS Management Console. You can select this option when you create a batch inference job.
+ Create a custom service role for Amazon Bedrock by using AWS Identity and Access Management and attach the necessary permissions. When you submit the batch inference job, you then specify this role. For more information about creating a custom service role for batch inference, see [Create a custom service role for batch inference](batch-iam-sr.md). For more general information about creating service roles, see [Create a role to delegate permissions to an AWS service](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.html) in the IAM User Guide.

**Important**  
If the S3 bucket in which you [uploaded your data for batch inference](batch-inference-data.md) is in a different AWS account, you must configure an S3 bucket policy to allow the service role access to the data. You must manually configure this policy even if you use the console to automatically create a service role. To learn how to configure an S3 bucket policy for Amazon Bedrock resources, see [Attach a bucket policy to an Amazon S3 bucket to allow another account to access it](s3-bucket-access.md#s3-bucket-access-cross-account).
Foundation models in Amazon Bedrock are AWS-managed resources; they are owned and operated by AWS and can't be owned by individual customers. As a result, any IAM policy condition that checks for customer ownership of a resource (such as a condition that uses resource tags, an organization ID, or other ownership attributes) fails when applied to foundation models and can block legitimate access to these services.  
For example, if your policy includes an `aws:ResourceOrgID` condition like this:  

  ```
  {
    "Condition": {
      "StringEqualsIgnoreCase": {
        "aws:ResourceOrgID": ["o-xxxxxxxx"]
      }
    }
  }
  ```
your batch inference job fails with an `AccessDeniedException`. To resolve this, remove the `aws:ResourceOrgID` condition or create a separate policy statement for foundation models.
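To catch this problem before submitting a job, you could scan a policy document for the offending condition key. The following Python sketch is a hypothetical linting helper, not an AWS tool:

```python
def statements_with_resource_org_condition(policy):
    """Return the Sids of statements whose Condition block uses aws:ResourceOrgID."""
    flagged = []
    for stmt in policy.get("Statement", []):
        for keys in stmt.get("Condition", {}).values():
            # Condition keys are case-insensitive in IAM policies.
            if any(k.lower() == "aws:resourceorgid" for k in keys):
                flagged.append(stmt.get("Sid", "<no Sid>"))
    return flagged

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "BatchInference",
        "Effect": "Allow",
        "Action": ["bedrock:CreateModelInvocationJob"],
        "Resource": "*",
        "Condition": {"StringEqualsIgnoreCase": {"aws:ResourceOrgID": ["o-xxxxxxxx"]}},
    }],
}
flagged = statements_with_resource_org_condition(policy)
```

Any flagged statement would need the condition removed, or a separate statement created for foundation models.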

# Protect batch inference jobs using a VPC
<a name="batch-vpc"></a>

When you run a batch inference job, the job accesses your Amazon S3 bucket to download the input data and to write the output data. To control access to your data, we recommend that you use a virtual private cloud (VPC) with [Amazon VPC](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html). You can further protect your data by configuring your VPC so that your data isn't available over the internet, and by creating a VPC interface endpoint with [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/what-is-privatelink.html) to establish a private connection to your data. For more information about how Amazon VPC and AWS PrivateLink integrate with Amazon Bedrock, see [Protect your data using Amazon VPC and AWS PrivateLink](usingVPC.md).

Carry out the following steps to configure and use a VPC for the input prompts and output model responses for your batch inference jobs.

**Topics**
+ [Set up a VPC to protect your data during batch inference](#batch-vpc-setup)
+ [Attach VPC permissions to a batch inference role](#batch-vpc-role)
+ [Add the VPC configuration when submitting a batch inference job](#batch-vpc-config)

## Set up a VPC to protect your data during batch inference
<a name="batch-vpc-setup"></a>

To set up a VPC, follow the steps at [Set up a VPC](usingVPC.md#create-vpc). You can further secure your VPC by setting up an S3 VPC endpoint and using resource-based IAM policies to restrict access to the S3 bucket containing your batch inference data by following the steps at [(Example) Restrict data access to your Amazon S3 data using VPC](vpc-s3.md).

## Attach VPC permissions to a batch inference role
<a name="batch-vpc-role"></a>

After you finish setting up your VPC, attach the following permissions to your [batch inference service role](batch-iam-sr.md) to allow it to access the VPC. Modify this policy to allow access to only the VPC resources that your job needs. Replace *subnet-id* and *security-group-id* with the values from your VPC.

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeVpcs",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "2",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
                "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}",
                "arn:aws:ec2:us-east-1:123456789012:security-group/${{security-group-id}}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/BedrockManaged": [
                        "true"
                    ]
                },
                "ArnEquals": {
                    "aws:RequestTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "3",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface",
                "ec2:DeleteNetworkInterfacePermission"
            ],
            "Resource": [
                "*"
            ],
            "Condition": {
                "StringEquals": {
                    "ec2:Subnet": [
                        "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}"
                    ]
                },
                "ArnEquals": {
                    "ec2:ResourceTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "4",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": [
                        "CreateNetworkInterface"
                    ]
                },
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "BedrockManaged",
                        "BedrockModelInvocationJobArn"
                    ]
                }
            }
        }
    ]
}
```

------

## Add the VPC configuration when submitting a batch inference job
<a name="batch-vpc-config"></a>

After you configure the VPC and the required roles and permissions as described in the previous sections, you can create a batch inference job that uses this VPC.

**Note**  
Currently, when you create a batch inference job, you can configure a VPC only through the API.

When you specify the VPC subnets and security groups for a job, Amazon Bedrock creates *elastic network interfaces* (ENIs) that are associated with your security groups in one of the subnets. ENIs allow the Amazon Bedrock job to connect to resources in your VPC. For information about ENIs, see [Elastic Network Interfaces](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_ElasticNetworkInterfaces.html) in the *Amazon VPC User Guide*. Amazon Bedrock tags ENIs that it creates with `BedrockManaged` and `BedrockModelInvocationJobArn` tags.

We recommend that you provide at least one subnet in each Availability Zone.

You can use security groups to establish rules for controlling Amazon Bedrock access to your VPC resources.

When you submit a [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) request, you can include the `vpcConfig` request parameter to specify the VPC subnets and security groups to use, as in the following example.

```
"vpcConfig": { 
    "securityGroupIds": [
        "sg-0123456789abcdef0"
    ],
    "subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ]
}
```