

# Process multiple prompts with batch inference
<a name="batch-inference"></a>

With batch inference, you can submit multiple prompts and generate responses asynchronously. You can format your input data by using either the `InvokeModel` or `Converse` API format. Batch inference helps you process a large number of requests efficiently by sending a single request and generating the responses in an Amazon S3 bucket. After defining model inputs in files you create, you upload the files to an S3 bucket. You then submit a batch inference request and specify the S3 bucket. After the job is complete, you can retrieve the output files from S3. You can use batch inference to improve the performance of model inference on large datasets.

**Note**  
Batch inference isn't supported for provisioned models.

See the following resources for general information about batch inference:
+ To see pricing for batch inference, see [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/).
+ To see quotas for batch inference, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html) in the AWS General Reference.
+ To receive notifications when batch inference jobs complete or change state instead of polling, see [Monitor Amazon Bedrock job state changes using Amazon EventBridge](monitoring-eventbridge.md).

**Topics**
+ [Supported Regions and models for batch inference](batch-inference-supported.md)
+ [Prerequisites for batch inference](batch-inference-prereq.md)
+ [Create a batch inference job](batch-inference-create.md)
+ [Monitor batch inference jobs](batch-inference-monitor.md)
+ [Stop a batch inference job](batch-inference-stop.md)
+ [View the results of a batch inference job](batch-inference-results.md)
+ [Code example for batch inference](batch-inference-example.md)
+ [Submit a batch of prompts with the OpenAI Batch API](inference-openai-batch.md)

# Supported Regions and models for batch inference
<a name="batch-inference-supported"></a>

The following list provides links to general information about Regional and model support in Amazon Bedrock:
+ For a list of Region codes and endpoints supported in Amazon Bedrock, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bedrock_region).
+ For a list of Amazon Bedrock model IDs to use when calling Amazon Bedrock API operations, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ For a list of Amazon Bedrock inference profile IDs to use when calling Amazon Bedrock API operations, see [Supported cross-Region inference profiles](inference-profiles-support.md#inference-profiles-support-system).

Batch inference can be used with different types of models. The following list describes support for different types of Amazon Bedrock models:
+ **Single-region model support** – Lists Regions that support sending inference requests to a foundation model in one AWS Region. For a full table of models available across Amazon Bedrock, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ **Cross-region inference profile support** – Lists Regions that support using a cross-Region inference profile, which supports sending inference requests to a foundation model in multiple AWS Regions within a geographical area. An inference profile has a prefix preceding the model ID that indicates its geographical area (for example, `us.` or `apac.`). For more information about available inference profiles across Amazon Bedrock, see [Supported Regions and models for inference profiles](inference-profiles-support.md).
+ **Custom model support** – Lists Regions that support sending inference requests to a customized model. For more information about model customization, see [Customize your model to improve its performance for your use case](custom-models.md).

The following table summarizes support for batch inference:


| Provider | Model | Model ID | Single-region model support | Cross-region inference profile support | Custom model support | 
| --- | --- | --- | --- | --- | --- | 
| Amazon | Amazon Nova Multimodal Embeddings | amazon.nova-2-multimodal-embeddings-v1:0 |  us-east-1  |  | N/A | 
| Amazon | Nova 2 Lite | amazon.nova-2-lite-v1:0 | N/A |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Nova Lite | amazon.nova-lite-v1:0 |  me-central-1 us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Nova Micro | amazon.nova-micro-v1:0 |  us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-2  | N/A | 
| Amazon | Nova Premier | amazon.nova-premier-v1:0 | N/A |  us-east-1 us-east-2 us-west-2  | N/A | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0 |  ap-southeast-3 me-central-1 us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Titan Multimodal Embeddings G1 | amazon.titan-embed-image-v1 |  ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  |  |  us-east-1 us-west-2  | 
| Amazon | Titan Text Embeddings V2 | amazon.titan-embed-text-v2:0 |  ap-northeast-1 ap-northeast-2 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 sa-east-1 us-east-1 us-west-2  |  | N/A | 
| Anthropic | Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ca-central-1 eu-central-1 eu-central-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | N/A | N/A | 
| Anthropic | Claude 3 Opus | anthropic.claude-3-opus-20240229-v1:0 |  us-west-2  |  us-east-1  | N/A | 
| Anthropic | Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0 |  ap-northeast-2 ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Anthropic | Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 |  us-west-2  |  us-east-1  | N/A | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0 |  ap-northeast-1 ap-northeast-2 ap-southeast-1 eu-central-1 us-east-1 us-east-2 us-west-2  |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0 |  us-west-2  |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 us-east-1 us-east-2 us-west-2  | N/A | 
| Anthropic | Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 eu-central-1 eu-north-1 eu-west-1 eu-west-3 us-east-1 us-east-2 us-west-2  | N/A | 
| Anthropic | Claude Haiku 4.5 | anthropic.claude-haiku-4-5-20251001-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Opus 4.5 | anthropic.claude-opus-4-5-20251101-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Opus 4.6 | anthropic.claude-opus-4-6-v1 | N/A |  af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 | N/A |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4.5 | anthropic.claude-sonnet-4-5-20250929-v1:0 | N/A |  af-south-1 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-gov-east-1 us-gov-west-1 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4.6 | anthropic.claude-sonnet-4-6 |  eu-west-2  |  af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| DeepSeek | DeepSeek V3.2 | deepseek.v3.2 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| DeepSeek | DeepSeek-V3.1 | deepseek.v3-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 us-east-2 us-west-2  |  | N/A | 
| Google | Gemma 3 12B IT | google.gemma-3-12b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Google | Gemma 3 27B IT | google.gemma-3-27b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Google | Gemma 3 4B IT | google.gemma-3-4b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Meta | Llama 3.1 405B Instruct | meta.llama3-1-405b-instruct-v1:0 |  us-west-2  |  | N/A | 
| Meta | Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0 |  us-west-2  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0 |  us-west-2  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 11B Instruct | meta.llama3-2-11b-instruct-v1:0 |  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 1B Instruct | meta.llama3-2-1b-instruct-v1:0 |  |  eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 3B Instruct | meta.llama3-2-3b-instruct-v1:0 |  |  eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 90B Instruct | meta.llama3-2-90b-instruct-v1:0 |  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.3 70B Instruct | meta.llama3-3-70b-instruct-v1:0 |  us-east-2  |  us-east-1 us-east-2 us-west-2  | N/A | 
| Meta | Llama 4 Maverick 17B Instruct | meta.llama4-maverick-17b-instruct-v1:0 |  |  us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Meta | Llama 4 Scout 17B Instruct | meta.llama4-scout-17b-instruct-v1:0 |  |  us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| MiniMax | MiniMax M2 | minimax.minimax-m2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| MiniMax | MiniMax M2.1 | minimax.minimax-m2.1 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Devstral 2 123B | mistral.devstral-2-123b |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Magistral Small 2509 | mistral.magistral-small-2509 |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 14B 3.0 | mistral.ministral-3-14b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 3 8B | mistral.ministral-3-8b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 3B | mistral.ministral-3-3b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Large (24.07) | mistral.mistral-large-2407-v1:0 |  us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Large 3 | mistral.mistral-large-3-675b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Small (24.02) | mistral.mistral-small-2402-v1:0 |  us-east-1  | N/A | N/A | 
| Mistral AI | Voxtral Mini 3B 2507 | mistral.voxtral-mini-3b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Voxtral Small 24B 2507 | mistral.voxtral-small-24b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Moonshot AI | Kimi K2 Thinking | moonshot.kimi-k2-thinking |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Moonshot AI | Kimi K2.5 | moonshotai.kimi-k2.5 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | NVIDIA Nemotron Nano 12B v2 VL BF16 | nvidia.nemotron-nano-12b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | NVIDIA Nemotron Nano 9B v2 | nvidia.nemotron-nano-9b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | Nemotron Nano 3 30B | nvidia.nemotron-nano-3-30b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | GPT OSS Safeguard 120B | openai.gpt-oss-safeguard-120b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | GPT OSS Safeguard 20B | openai.gpt-oss-safeguard-20b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | gpt-oss-120b | openai.gpt-oss-120b-1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-gov-west-1 us-west-2  | N/A | N/A | 
| OpenAI | gpt-oss-20b | openai.gpt-oss-20b-1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-gov-west-1 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 235B A22B 2507 | qwen.qwen3-235b-a22b-2507-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-2 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 32B (dense) | qwen.qwen3-32b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 Coder 480B A35B Instruct | qwen.qwen3-coder-480b-a35b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 Coder Next | qwen.qwen3-coder-next |  ap-southeast-2 eu-west-2 us-east-1  | N/A | N/A | 
| Qwen | Qwen3 Next 80B A3B | qwen.qwen3-next-80b-a3b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 VL 235B A22B | qwen.qwen3-vl-235b-a22b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3-Coder-30B-A3B-Instruct | qwen.qwen3-coder-30b-a3b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Z.AI | GLM 4.7 | zai.glm-4.7 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Z.AI | GLM 4.7 Flash | zai.glm-4.7-flash |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 

# Prerequisites for batch inference
<a name="batch-inference-prereq"></a>

To perform batch inference, you must fulfill the following prerequisites:

1. Prepare your dataset and upload it to an Amazon S3 bucket.

1. Create an S3 bucket for your output data.

1. Set up batch inference-related permissions for the relevant IAM identities.

1. (Optional) Set up a VPC to protect the data in your S3 bucket while carrying out batch inference. You can skip this step if you don't need to use a VPC.

To learn how to fulfill these prerequisites, navigate through the following topics:

**Topics**
+ [Format and upload your batch inference data](batch-inference-data.md)
+ [Required permissions for batch inference](batch-inference-permissions.md)
+ [Protect batch inference jobs using a VPC](batch-vpc.md)

# Format and upload your batch inference data
<a name="batch-inference-data"></a>

You must add your batch inference data to an S3 location that you'll choose or specify when submitting a model invocation job. The S3 location must contain the following items:
+ At least one JSONL file that defines the model inputs. A JSONL file contains rows of JSON objects. Your JSONL file must end in the extension .jsonl and be in the following format:

  ```
  { "recordId" : "alphanumeric string", "modelInput" : {JSON body} }
  ...
  ```

  Each line contains a JSON object with a `recordId` field and a `modelInput` field. The format of the `modelInput` JSON object depends on the model invocation type that you choose when you [create the batch inference job](batch-inference-create.md). If you use the `InvokeModel` type (default), the format must match the `body` field for the model that you use in the `InvokeModel` request (see [Inference request parameters and response fields for foundation models](model-parameters.md)). If you use the `Converse` type, the format must match the request body of the [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API.
**Note**  
If you omit the `recordId` field, Amazon Bedrock adds it in the output.
The order of records in the output JSONL file is not guaranteed to match the order of records in the input JSONL file.
You specify the model that you want to use when you create the [batch inference job](batch-inference-create.md).
+ (If your input content contains an Amazon S3 location) Some models allow you to define the content of the input as an S3 location. See [Example video input for Amazon Nova](#batch-inference-data-ex-s3).
**Warning**  
When using S3 URIs in your prompts, all resources must be in the same S3 bucket and folder. The `InputDataConfig` parameter must specify the folder path containing all linked resources (such as videos or images), not just an individual `.jsonl` file. Note that S3 paths are case-sensitive, so ensure your URIs match the exact folder structure.

Ensure that your inputs conform to the batch inference quotas. You can search for the following quotas at [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock):
+ **Minimum number of records per batch inference job** – The minimum number of records (JSON objects) across JSONL files in the job.
+ **Records per input file per batch inference job** – The maximum number of records (JSON objects) in a single JSONL file in the job.
+ **Records per batch inference job** – The maximum number of records (JSON objects) across JSONL files in the job.
+ **Batch inference input file size** – The maximum size of a single file in the job.
+ **Batch inference job size** – The maximum cumulative size of all input files.
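
For example, the following Python sketch counts the records in each local JSONL file and checks the cumulative size before you submit a job. The limit values are placeholders; replace them with the actual quota values for your model from the service quotas page.

```
import json
from pathlib import Path

# Placeholder limits - replace them with the actual quota values for your model.
MIN_RECORDS_PER_JOB = 100
MAX_RECORDS_PER_FILE = 50000
MAX_FILE_SIZE_BYTES = 1 * 1024**3   # for example, 1 GB per input file
MAX_JOB_SIZE_BYTES = 5 * 1024**3    # for example, 5 GB per job

def validate_batch_input(input_dir):
    """Check record counts and file sizes for the JSONL files in a local folder."""
    total_records = 0
    total_bytes = 0
    for path in Path(input_dir).glob("*.jsonl"):
        size = path.stat().st_size
        records = 0
        with path.open() as f:
            for line in f:
                if line.strip():
                    # Confirm that each line parses and defines a modelInput field.
                    assert "modelInput" in json.loads(line), f"missing modelInput in {path.name}"
                    records += 1
        if records > MAX_RECORDS_PER_FILE or size > MAX_FILE_SIZE_BYTES:
            raise ValueError(f"{path.name} exceeds a per-file quota")
        total_records += records
        total_bytes += size
    if total_records < MIN_RECORDS_PER_JOB or total_bytes > MAX_JOB_SIZE_BYTES:
        raise ValueError("job-level quotas not satisfied")
    print(f"{total_records} records, {total_bytes} bytes - within the assumed limits")

validate_batch_input("batch_inputs/")
```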

To better understand how to set up your batch inference inputs, see the following examples:

## Example text input for Anthropic Claude 3 Haiku
<a name="batch-inference-data-ex-text"></a>

If you plan to run batch inference using the [Messages API](model-parameters-anthropic-claude-messages.md) format for the Anthropic Claude 3 Haiku model, you might provide a JSONL file containing the following JSON object as one of the lines:

```
{
    "recordId": "CALL0000001", 
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 1024,
        "messages": [ 
            { 
                "role": "user", 
                "content": [
                    {
                        "type": "text", 
                        "text": "Summarize the following call transcript: ..." 
                    } 
                ]
            }
        ]
    }
}
```

## Example video input for Amazon Nova
<a name="batch-inference-data-ex-s3"></a>

If you plan to run batch inference on video inputs using the Amazon Nova Lite or Amazon Nova Pro models, you have the option of defining the video in bytes or as an S3 location in the JSONL file. For example, you might have an S3 bucket whose path is `s3://batch-inference-input-bucket` and contains the following files:

```
s3://batch-inference-input-bucket/
├── videos/
│   ├── video1.mp4
│   ├── video2.mp4
│   ├── ...
│   └── video50.mp4
└── input.jsonl
```

A sample record from the `input.jsonl` file would be the following:

```
{
    "recordId": "RECORD01",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "You are an expert in recipe videos. Describe this video in less than 200 words following these guidelines: ..."
                    },
                    {
                        "video": {
                            "format": "mp4",
                            "source": {
                                "s3Location": {
                                    "uri": "s3://batch-inference-input-bucket/videos/video1.mp4",
                                    "bucketOwner": "111122223333"
                                }
                            }
                        }
                    }
                ]
            }
        ]
    }
}
```

When you create the batch inference job, you must specify the folder path `s3://batch-inference-input-bucket` in your `InputDataConfig` parameter. Batch inference will process the `input.jsonl` file at this location, along with any referenced resources (such as the video files in the `videos` subfolder).
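
For example, a job submission for this layout might look like the following boto3 sketch. The job name, role ARN, and output bucket are placeholders; the input URI is the folder path from the example above.

```
import boto3

bedrock = boto3.client(service_name="bedrock")

# Point the input configuration at the folder, not at input.jsonl directly,
# so the referenced videos under videos/ are accessible to the job.
response = bedrock.create_model_invocation_job(
    jobName="nova-video-batch-job",                                   # placeholder
    roleArn="arn:aws:iam::111122223333:role/MyBatchInferenceRole",    # placeholder
    modelId="amazon.nova-lite-v1:0",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://batch-inference-input-bucket"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://batch-inference-output-bucket/"}  # placeholder
    }
)
print(response["jobArn"])
```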

The following resources provide more information about submitting video inputs for batch inference:
+ To learn how to validate Amazon S3 URIs in an input request, see the [Amazon S3 URL Parsing blog](https://aws.amazon.com/blogs/devops/s3-uri-parsing-is-now-available-in-aws-sdk-for-java-2-x/).
+ For more information on how to set up invocation records for video understanding with Nova, see [Amazon Nova vision prompting guidelines](https://docs.aws.amazon.com/nova/latest/userguide/prompting-vision-prompting.html).

## Example Converse input
<a name="batch-inference-data-ex-converse"></a>

If you set the model invocation type to `Converse` when creating the batch inference job, the `modelInput` field must use the Converse API request format. The following example shows a JSONL record for a Converse batch inference job:

```
{
    "recordId": "CALL0000001",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "Summarize the following call transcript: ..."
                    }
                ]
            }
        ],
        "inferenceConfig": {
            "maxTokens": 1024
        }
    }
}
```

For the full list of fields supported in the Converse request body, see [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) in the API reference.

The following topic describes how to set up S3 access and batch inference permissions for an identity to be able to carry out batch inference.

# Required permissions for batch inference
<a name="batch-inference-permissions"></a>

To carry out batch inference, you must set up permissions for the following IAM identities:
+ The IAM identity that will create and manage batch inference jobs.
+ The batch inference [service role](security-iam-sr.md) that Amazon Bedrock assumes to perform actions on your behalf.

To learn how to set up permissions for each identity, navigate through the following topics:

**Topics**
+ [Required permissions for an IAM identity to submit and manage batch inference jobs](#batch-inference-permissions-user)
+ [Required permissions for a service role to carry out batch inference](#batch-inference-permissions-service)

## Required permissions for an IAM identity to submit and manage batch inference jobs
<a name="batch-inference-permissions-user"></a>

For an IAM identity to use this feature, you must configure it with the necessary permissions. To do so, do one of the following:
+ To allow an identity to carry out all Amazon Bedrock actions, attach the [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) policy to the identity. If you do this, you can skip this topic. This option is less secure.
+ As a security best practice, you should grant only the necessary actions to an identity. This topic describes the permissions that you need for this feature.

To restrict permissions to only actions that are used for batch inference, attach the following identity-based policy to an IAM identity:

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "BatchInference",
            "Effect": "Allow",
            "Action": [  
                "bedrock:ListFoundationModels",
                "bedrock:GetFoundationModel",
                "bedrock:ListInferenceProfiles",
                "bedrock:GetInferenceProfile",
                "bedrock:ListCustomModels",
                "bedrock:GetCustomModel",
                "bedrock:TagResource", 
                "bedrock:UntagResource", 
                "bedrock:ListTagsForResource",
                "bedrock:CreateModelInvocationJob",
                "bedrock:GetModelInvocationJob",
                "bedrock:ListModelInvocationJobs",
                "bedrock:StopModelInvocationJob"
            ],
            "Resource": "*"
        }
    ]   
}
```

------

To further restrict permissions, you can omit actions, or you can specify resources and condition keys by which to filter permissions. For more information about actions, resources, and condition keys, see the following topics in the *Service Authorization Reference*:
+ [Actions defined by Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) – Learn about actions, the resource types that you can scope them to in the `Resource` field, and the condition keys that you can filter permissions on in the `Condition` field.
+ [Resource types defined by Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) – Learn about the resource types in Amazon Bedrock.
+ [Condition keys for Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-policy-keys) – Learn about the condition keys in Amazon Bedrock.

The following example policy scopes down permissions so that an identity in the account `123456789012` can create batch inference jobs only in the `us-west-2` Region, using the Anthropic Claude 3 Haiku model:

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "CreateBatchInferenceJob",
            "Effect": "Allow",
            "Action": [
                "bedrock:CreateModelInvocationJob"
            ],
            "Resource": [
                "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
                "arn:aws:bedrock:us-west-2:123456789012:model-invocation-job/*"
            ]
        }
    ]
}
```

------

## Required permissions for a service role to carry out batch inference
<a name="batch-inference-permissions-service"></a>

Batch inference is carried out by a [service role](security-iam-sr.md) that Amazon Bedrock assumes to perform actions on your behalf. You can create a service role in the following ways:
+ Let Amazon Bedrock automatically create a service role with the necessary permissions for you by using the AWS Management Console. You can select this option when you create a batch inference job.
+ Create a custom service role for Amazon Bedrock by using AWS Identity and Access Management and attach the necessary permissions. When you submit the batch inference job, you then specify this role. For more information about creating a custom service role for batch inference, see [Create a custom service role for batch inference](batch-iam-sr.md). For more general information about creating service roles, see [Create a role to delegate permissions to an AWS service](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.html) in the IAM User Guide.

**Important**  
If the S3 bucket in which you [uploaded your data for batch inference](batch-inference-data.md) is in a different AWS account, you must configure an S3 bucket policy to allow the service role access to the data. You must manually configure this policy even if you use the console to automatically create a service role. To learn how to configure an S3 bucket policy for Amazon Bedrock resources, see [Attach a bucket policy to an Amazon S3 bucket to allow another account to access it](s3-bucket-access.md#s3-bucket-access-cross-account).
Foundation models in Amazon Bedrock are AWS-managed resources that cannot be used with IAM policy conditions requiring customer ownership. These models are owned and operated by AWS, and cannot be owned by individual customers. Any IAM policy condition that checks for customer-owned resources (such as conditions using resource tags, organization ID, or other ownership attributes) will fail when applied to foundation models, potentially blocking legitimate access to these services.  
For example, if your policy includes an `aws:ResourceOrgID` condition like this:  

  ```
  {
    "Condition": {
      "StringEqualsIgnoreCase": {
        "aws:ResourceOrgID": ["o-xxxxxxxx"]
      }
    }
  }
  ```
Your batch inference job will fail with `AccessDeniedException`. Remove the `aws:ResourceOrgID` condition or create separate policy statements for foundation models.
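
For the cross-account case described in the note above, the following is a minimal sketch of a bucket policy that grants the service role read access to the input data, applied with boto3 from the account that owns the bucket. The bucket name, account ID, and role name are placeholders; adjust the actions and resources to what your job actually needs.

```
import json
import boto3

bucket = "amzn-s3-demo-bucket-input"                                       # placeholder
service_role_arn = "arn:aws:iam::123456789012:role/MyBatchInferenceRole"   # placeholder

# Allow the batch inference service role from the other account to read the input data.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowBatchInferenceServiceRole",
            "Effect": "Allow",
            "Principal": {"AWS": service_role_arn},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*"
            ]
        }
    ]
}

s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```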

# Protect batch inference jobs using a VPC
<a name="batch-vpc"></a>

When you run a batch inference job, the job accesses your Amazon S3 bucket to download the input data and to write the output data. To control access to your data, we recommend that you use a virtual private cloud (VPC) with [Amazon VPC](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html). You can further protect your data by configuring your VPC so that your data isn't available over the internet and instead creating a VPC interface endpoint with [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/what-is-privatelink.html) to establish a private connection to your data. For more information about how Amazon VPC and AWS PrivateLink integrate with Amazon Bedrock, see [Protect your data using Amazon VPC and AWS PrivateLink](usingVPC.md).

Carry out the following steps to configure and use a VPC for the input prompts and output model responses for your batch inference jobs.

**Topics**
+ [Set up VPC to protect your data during batch inference](#batch-vpc-setup)
+ [Attach VPC permissions to a batch inference role](#batch-vpc-role)
+ [Add the VPC configuration when submitting a batch inference job](#batch-vpc-config)

## Set up VPC to protect your data during batch inference
<a name="batch-vpc-setup"></a>

To set up a VPC, follow the steps at [Set up a VPC](usingVPC.md#create-vpc). You can further secure your VPC by setting up an S3 VPC endpoint and using resource-based IAM policies to restrict access to the S3 bucket that contains your batch inference data. For the steps, see [(Example) Restrict data access to your Amazon S3 data using VPC](vpc-s3.md).

## Attach VPC permissions to a batch inference role
<a name="batch-vpc-role"></a>

After you finish setting up your VPC, attach the following permissions to your [batch inference service role](batch-iam-sr.md) to allow it to access the VPC. Modify this policy to allow access to only the VPC resources that your job needs. Replace the *subnet-ids* and *security-group-id* with the values from your VPC.

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeVpcs",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "2",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
                "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}",
                "arn:aws:ec2:us-east-1:123456789012:security-group/${{security-group-id}}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/BedrockManaged": [
                        "true"
                    ]
                },
                "ArnEquals": {
                    "aws:RequestTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "3",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface",
                "ec2:DeleteNetworkInterfacePermission"
            ],
            "Resource": [
                "*"
            ],
            "Condition": {
                "StringEquals": {
                    "ec2:Subnet": [
                        "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}"
                    ]
                },
                "ArnEquals": {
                    "ec2:ResourceTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "4",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": [
                        "CreateNetworkInterface"
                    ]
                },
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "BedrockManaged",
                        "BedrockModelInvocationJobArn"
                    ]
                }
            }
        }
    ]
}
```

------

## Add the VPC configuration when submitting a batch inference job
<a name="batch-vpc-config"></a>

After you configure the VPC and the required roles and permissions as described in the previous sections, you can create a batch inference job that uses this VPC.

**Note**  
Currently, when creating a batch inference job, you can only use a VPC through the API.

When you specify the VPC subnets and security groups for a job, Amazon Bedrock creates *elastic network interfaces* (ENIs) that are associated with your security groups in one of the subnets. ENIs allow the Amazon Bedrock job to connect to resources in your VPC. For information about ENIs, see [Elastic Network Interfaces](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_ElasticNetworkInterfaces.html) in the *Amazon VPC User Guide*. Amazon Bedrock tags ENIs that it creates with `BedrockManaged` and `BedrockModelInvocationJobArn` tags.

We recommend that you provide at least one subnet in each Availability Zone.

You can use security groups to establish rules for controlling Amazon Bedrock access to your VPC resources.

When you submit a [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) request, you can include a `VpcConfig` as a request parameter to specify the VPC subnets and security groups to use, as in the following example.

```
"vpcConfig": { 
    "securityGroupIds": [
        "sg-0123456789abcdef0"
    ],
    "subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ]
}
```

# Create a batch inference job
<a name="batch-inference-create"></a>

After you've set up an Amazon S3 bucket with files for running model inference, you can create a batch inference job. Before you begin, check that you set up the files in accordance with the instructions described in [Format and upload your batch inference data](batch-inference-data.md).

**Note**  
To submit a batch inference job using a VPC, you must use the API. Select the API tab to learn how to include the VPC configuration.

To learn how to create a batch inference job, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To create a batch inference job**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Batch inference**.

1. In the **Batch inference jobs** section, choose **Create job**.

1. In the **Job details** section, enter a **Job name** for the batch inference job. Then choose **Select model** to select the model to use for the job.

1. In the **Model invocation type** section, choose the API format for your input data. Choose **InvokeModel** if your input data uses model-specific request formats, or choose **Converse** if your input data uses the Converse API format. The default is **InvokeModel**.

1. In the **Input data** section, choose **Browse S3** and select an S3 location for your batch inference job. Batch inference processes all JSONL and accompanying content files at that S3 location, whether the location is an S3 folder or a single JSONL file.
**Note**  
If the input data is in an S3 bucket that belongs to a different account from the one from which you're submitting the job, you must use the API to submit the batch inference job. To learn how to do this, select the API tab above.

1. In the **Output data** section, choose **Browse S3** and select an S3 location to store the output files from your batch inference job. By default, the output data is encrypted with an AWS managed key. To choose a custom KMS key, select **Customize encryption settings (advanced)** and choose a key. For more information about encryption of Amazon Bedrock resources and setting up a custom KMS key, see [Data encryption](data-encryption.md).
**Note**  
If you plan to write the output data to an S3 bucket that belongs to a different account from the one from which you're submitting the job, you must use the API to submit the batch inference job. To learn how to do this, select the API tab above.

1. In the **Service access** section, select one of the following options:
   + **Use an existing service role** – Select a service role from the drop-down list. For more information on setting up a custom role with the appropriate permissions, see [Required permissions for batch inference](batch-inference-permissions.md).
   + **Create and use a new service role** – Enter a name for the service role.

1. (Optional) To associate tags with the batch inference job, expand the **Tags** section and add a key and optional value for each tag. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

1. Choose **Create batch inference job**.

------
#### [ API ]

To create a batch inference job, send a [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).

The following fields are required:


****  

| Field | Use case | 
| --- | --- | 
| jobName | To specify a name for the job. | 
| roleArn | To specify the Amazon Resource Name (ARN) of the service role with permissions to create and manage the job. For more information, see [Create a custom service role for batch inference](batch-iam-sr.md). | 
| modelId | To specify the ID or ARN of the model to use in inference. | 
| inputDataConfig | To specify the S3 location containing the input data. Batch inference processes all JSONL and accompanying content files at that S3 location, whether the location is an S3 folder or a single JSONL file. For more information, see [Format and upload your batch inference data](batch-inference-data.md). | 
| outputDataConfig | To specify the S3 location to write the model responses to. | 

The following fields are optional:


****  

| Field | Use case | 
| --- | --- | 
| modelInvocationType | To specify the API format of the input data. Set to Converse to use the Converse API format, or InvokeModel (default) to use model-specific request formats. For more information about the Converse request format, see [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html). | 
| timeoutDurationInHours | To specify the duration in hours after which the job will time out. | 
| tags | To specify any tags to associate with the job. For more information, see [Tagging Amazon Bedrock resources](tagging.md). | 
| vpcConfig | To specify the VPC configuration to use to protect your data during the job. For more information, see [Protect batch inference jobs using a VPC](batch-vpc.md). | 
| clientRequestToken | To ensure the API request completes only once. For more information, see [Ensuring idempotency](https://docs.aws.amazon.com/ec2/latest/devguide/ec2-api-idempotency.html). | 

The response returns a `jobArn` that you can use to refer to the job when carrying out other batch inference-related API calls.
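
As a sketch, a request that sets the optional fields might look like the following boto3 call, assuming the field names in the tables above map directly to boto3 parameters. The job name, role ARN, S3 URIs, VPC IDs, tag values, and token are placeholders.

```
import boto3

bedrock = boto3.client(service_name="bedrock")

response = bedrock.create_model_invocation_job(
    jobName="my-converse-batch-job",                                  # placeholder
    roleArn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",    # placeholder
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    modelInvocationType="Converse",        # omit or set to "InvokeModel" for model-specific formats
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket-input/"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket-output/"}
    },
    timeoutDurationInHours=24,
    vpcConfig={                            # optional; see Protect batch inference jobs using a VPC
        "securityGroupIds": ["sg-0123456789abcdef0"],
        "subnets": ["subnet-0123456789abcdef0", "subnet-0123456789abcdef1"]
    },
    tags=[{"key": "project", "value": "transcripts"}],                # placeholder tag
    clientRequestToken="my-idempotency-token-001"                     # placeholder token
)
print(response["jobArn"])
```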

------

# Monitor batch inference jobs
<a name="batch-inference-monitor"></a>

In addition to viewing the configuration that you set for a batch inference job, you can monitor its progress by checking its status. For more information about the possible statuses for a job, see the `status` field in [ModelInvocationJobSummary](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ModelInvocationJobSummary.html).

To track a job's progress, you can use the progress counters that the [GetModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetModelInvocationJob.html) and [ListModelInvocationJobs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListModelInvocationJobs.html) API operations return. These counters show the total number of input records and how many the service has processed. You can monitor completion without checking Amazon S3 output buckets. Alternatively, you can find these numbers in the `manifest.json.out` file in the Amazon S3 bucket that contains the output files. For more information, see [View the results of a batch inference job](batch-inference-results.md). To learn how to download an S3 object, see [Downloading objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html).

**Tip**  
Instead of polling for job status, you can use Amazon EventBridge to receive automatic notifications when a batch inference job completes or changes state. For more information, see [Monitor Amazon Bedrock job state changes using Amazon EventBridge](monitoring-eventbridge.md).

To learn how to view details about batch inference jobs, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To view information about batch inference jobs**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Batch inference**.

1. In the **Batch inference jobs** section, choose a job.

1. On the job details page, you can view information about the job's configuration and monitor its progress by viewing its **Status**.

------
#### [ API ]

To get information about a batch inference job, send a [GetModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetModelInvocationJob.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and provide the ID or ARN of the job in the `jobIdentifier` field.

To list information about multiple batch inference jobs, send a [ListModelInvocationJobs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListModelInvocationJobs.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). You can specify the following optional parameters:


****  

| Field | Short description | 
| --- | --- | 
| maxResults | The maximum number of results to return in a response. | 
| nextToken | If there are more results than the number you specified in the maxResults field, the response returns a nextToken value. To see the next batch of results, send the nextToken value in another request. | 
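
For example, the following sketch pages through all jobs by passing `nextToken` back into the request. The key that holds the job summaries in the response is assumed to be `invocationJobSummaries`.

```
import boto3

bedrock = boto3.client(service_name="bedrock")

# Page through all batch inference jobs, 10 at a time.
jobs = []
next_token = None
while True:
    kwargs = {"maxResults": 10}
    if next_token:
        kwargs["nextToken"] = next_token
    page = bedrock.list_model_invocation_jobs(**kwargs)
    jobs.extend(page.get("invocationJobSummaries", []))
    next_token = page.get("nextToken")
    if not next_token:
        break

print(f"Found {len(jobs)} jobs")
```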

The response for `GetModelInvocationJob` and `ListModelInvocationJobs` includes a `modelInvocationType` field that indicates whether the job uses the `InvokeModel` or `Converse` API format.

The response also includes the following fields that you can use to track the progress of a running job:
+ `totalRecordCount` – The total number of records submitted to the batch inference job.
+ `processedRecordCount` – The number of records processed so far, which includes both successes and errors.
+ `successRecordCount` – The number of records successfully processed so far.
+ `errorRecordCount` – The number of records that have caused errors during processing.

To calculate the percentage of progress for a running job, divide `processedRecordCount` by `totalRecordCount`. The counters return `0` when you submit a job but processing has not yet started. While a job is in progress, the counters might be delayed by up to 1 minute.
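
As a sketch, you might poll the job and report its progress as follows, using the counter fields listed above. The job ARN is a placeholder, and you can adjust the set of statuses that you treat as terminal.

```
import time
import boto3

bedrock = boto3.client(service_name="bedrock")
job_arn = "arn:aws:bedrock:us-west-2:123456789012:model-invocation-job/abcd1234"  # placeholder

while True:
    job = bedrock.get_model_invocation_job(jobIdentifier=job_arn)
    total = job.get("totalRecordCount", 0)
    processed = job.get("processedRecordCount", 0)
    if total:
        print(f"{processed}/{total} records processed ({100 * processed / total:.1f}%)")
    if job["status"] in ("Completed", "PartiallyCompleted", "Failed", "Stopped", "Expired"):
        break
    time.sleep(60)  # counters can be delayed by up to a minute
```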

To list all the tags for a job, send a [ListTagsForResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListTagsForResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and include the Amazon Resource Name (ARN) of the job.

------

# Stop a batch inference job
<a name="batch-inference-stop"></a>

To learn how to stop an ongoing batch inference job, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To stop a batch inference job**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Batch inference**.

1. Select a job to go to the job details page or select the option button next to a job.

1. Choose **Stop job**.

1. Review the message and choose **Stop job** to confirm.
**Note**  
You're charged for tokens that have already been processed.

------
#### [ API ]

To stop a batch inference job, send a [StopModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_StopModelInvocationJob.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and provide the ID or ARN of the job in the `jobIdentifier` field.

If the job was successfully stopped, you receive an HTTP 200 response.

------

# View the results of a batch inference job
<a name="batch-inference-results"></a>

After a batch inference job is `Completed`, you can extract the results of the batch inference job from the files in the Amazon S3 bucket that you specified during creation of the job. To learn how to download an S3 object, see [Downloading objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html). The S3 bucket contains the following files:

1. Amazon Bedrock generates an output JSONL file for each input JSONL file. The output files contain outputs from the model for each input in the following format. An `error` object replaces the `modelOutput` field in any line where there was an error in inference. The format of the `modelOutput` JSON object depends on the model invocation type. For `InvokeModel` jobs, the format matches the `body` field in the `InvokeModel` response (see [Inference request parameters and response fields for foundation models](model-parameters.md)). For `Converse` jobs, the format matches the response body of the [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API.

   ```
   { "recordId" : "string", "modelInput": {JSON body}, "modelOutput": {JSON body} }
   ```

   The following example shows a possible output file.

   ```
   { "recordId" : "3223593EFGH", "modelInput" : {"inputText": "Roses are red, violets are"}, "modelOutput" : {"inputTextTokenCount": 8, "results": [{"tokenCount": 3, "outputText": "blue\n", "completionReason": "FINISH"}]}}
   { "recordId" : "1223213ABCD", "modelInput" : {"inputText": "Hello world"}, "error" : {"errorCode" : 400, "errorMessage" : "bad request" }}
   ```

1. A `manifest.json.out` file containing a summary of the batch inference job.

   ```
   {
       "totalRecordCount" : number, 
       "processedRecordCount" : number,
       "successRecordCount": number,
       "errorRecordCount": number,
       "inputTokenCount": number,
       "outputTokenCount" : number
   }
   ```

   The fields are described below:
   + `totalRecordCount` – The total number of records submitted to the batch inference job.
   + `processedRecordCount` – The number of records processed, which includes both successes and errors.
   + `successRecordCount` – The number of records successfully processed.
   + `errorRecordCount` – The number of records that caused errors during processing.
   + `inputTokenCount` – The total number of input tokens submitted to the batch inference job.
   + `outputTokenCount` – The total number of output tokens generated by the batch inference job.
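
As a sketch, the following code downloads the output files for a job from Amazon S3, separates successful records from errors, and prints the manifest summary. The bucket name and prefix are placeholders, and it assumes that the output files keep a `.jsonl.out` extension alongside the `manifest.json.out` file.

```
import json
import boto3

s3 = boto3.client("s3")
bucket = "amzn-s3-demo-bucket-output"   # placeholder
prefix = "my-job-output-prefix/"        # placeholder; use the actual output location of your job

successes, errors = [], []
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        if key.endswith("manifest.json.out"):
            # Print the job summary counters from the manifest.
            print("Manifest:", json.loads(body))
        elif key.endswith(".jsonl.out"):
            # Each line is either a successful record or one with an error object.
            for line in body.splitlines():
                record = json.loads(line)
                (errors if "error" in record else successes).append(record)

print(f"{len(successes)} successes, {len(errors)} errors")
```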

# Code example for batch inference
<a name="batch-inference-example"></a>

The code example in this chapter shows how to create a batch inference job, view information about it, and stop it. This example uses the `InvokeModel` API format. For information about using the `Converse` API format, see [Format and upload your batch inference data](batch-inference-data.md).

Select a language to see a code example for it:

------
#### [ Python ]

Create a JSONL file named *abc.jsonl* and include a JSON object for each record. Include at least the minimum number of records required for the model (see the **Minimum number of records per batch inference job** quota in [Quotas for Amazon Bedrock](quotas.md)). In this example, you'll use the Anthropic Claude 3 Haiku model. The following example shows the first input JSON in the file:

```
{
    "recordId": "CALL0000001", 
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 1024,
        "messages": [ 
            { 
                "role": "user", 
                "content": [
                    {
                        "type": "text", 
                        "text": "Summarize the following call transcript: ..." 
                    } 
                ]
            }
        ]
    }
}
... 
# Add records until you hit the minimum
```

Create an S3 bucket called *amzn-s3-demo-bucket-input* and upload the file to it. Then create an S3 bucket called *amzn-s3-demo-bucket-output* to write your output files to. For example, you could create the input bucket and upload the file with boto3 as in the following sketch (the Region is a placeholder):
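
```
import boto3

# Create the input bucket and upload the JSONL file. Adjust the Region as needed.
s3 = boto3.client("s3", region_name="us-west-2")
s3.create_bucket(
    Bucket="amzn-s3-demo-bucket-input",
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"}
)
s3.upload_file("abc.jsonl", "amzn-s3-demo-bucket-input", "abc.jsonl")
```

Run the following code snippet to submit a job and get the *jobArn* from the response: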

```
import boto3

bedrock = boto3.client(service_name="bedrock")

# S3 location of the input JSONL file
inputDataConfig = {
    "s3InputDataConfig": {
        "s3Uri": "s3://amzn-s3-demo-bucket-input/abc.jsonl"
    }
}

# S3 location to write the output files to
outputDataConfig = {
    "s3OutputDataConfig": {
        "s3Uri": "s3://amzn-s3-demo-bucket-output/"
    }
}

# Submit the batch inference job
response = bedrock.create_model_invocation_job(
    roleArn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    jobName="my-batch-job",
    inputDataConfig=inputDataConfig,
    outputDataConfig=outputDataConfig
)

jobArn = response.get('jobArn')
```

Return the `status` of the job.

```
bedrock.get_model_invocation_job(jobIdentifier=jobArn)['status']
```

List batch inference jobs whose status is *Failed*.

```
bedrock.list_model_invocation_jobs(
    maxResults=10,
    statusEquals="Failed",
    sortOrder="Descending"
)
```

Stop the job that you started.

```
bedrock.stop_model_invocation_job(jobIdentifier=jobArn)
```

------

# Submit a batch of prompts with the OpenAI Batch API
<a name="inference-openai-batch"></a>

You can run a batch inference job using the [OpenAI Create batch API](https://platform.openai.com/docs/api-reference/batch) with Amazon Bedrock OpenAI models.

You can call the OpenAI Create batch API in the following ways:
+ Make an HTTP request with an Amazon Bedrock Runtime endpoint.
+ Use an OpenAI SDK request with an Amazon Bedrock Runtime endpoint.

Select a topic to learn more:

**Topics**
+ [

## Supported models and Regions for the OpenAI batch API
](#inference-openai-batch-supported)
+ [

## Prerequisites to use the OpenAI batch API
](#inference-openai-batch-prereq)
+ [

## Create an OpenAI batch job
](#inference-openai-batch-create)
+ [

## Retrieve an OpenAI batch job
](#inference-openai-batch-retrieve)
+ [

## List OpenAI batch jobs
](#inference-openai-batch-list)
+ [

## Cancel an OpenAI batch job
](#inference-openai-batch-cancel)

## Supported models and Regions for the OpenAI batch API
<a name="inference-openai-batch-supported"></a>

You can use the OpenAI Create batch API with all OpenAI models supported in Amazon Bedrock and in the AWS Regions that support these models. For more information about supported models and regions, see [Supported foundation models in Amazon Bedrock](models-supported.md).

## Prerequisites to use the OpenAI batch API
<a name="inference-openai-batch-prereq"></a>

To see prerequisites for using the OpenAI batch API operations, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK ]
+ **Authentication** – The OpenAI SDK only supports authentication with an Amazon Bedrock API key. Generate an Amazon Bedrock API key to authenticate your request. To learn about Amazon Bedrock API keys and how to generate them, see the API keys section in the Build chapter.
+ **Endpoint** – Find the endpoint that corresponds to the AWS Region to use in [Amazon Bedrock Runtime endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). If you use an AWS SDK, you might only need to specify the region code and not the whole endpoint when you set up the client.
+ **Model access** – Request access to an Amazon Bedrock model that supports this feature. For more information, see [Manage model access using SDK and CLI](model-access.md#model-access-modify).
+ **Install an OpenAI SDK** – For more information, see [Libraries](https://platform.openai.com/docs/libraries) in the OpenAI documentation.
+ **Batch JSONL file uploaded to S3** – Follow the steps at [Prepare your batch file](https://platform.openai.com/docs/guides/batch#1-prepare-your-batch-file) in the OpenAI documentation to prepare your batch file with the correct format. Then upload it to an Amazon S3 bucket. A sketch of a single record appears after these tabs.
+ **IAM permissions** – Make sure that you have the following IAM identities with the proper permissions:
  + An IAM identity that you authenticate with can carry out batch inference-related API operations. For more information, see [Required permissions for an IAM identity to submit and manage batch inference jobs](batch-inference-permissions.md).
  + The batch inference service role that you use can assume your identity, invoke the OpenAI model that you use, and access your batch JSONL file in S3. For more information, see [Service roles](security-iam-sr.md).

------
#### [ HTTP request ]
+ **Authentication** – You can authenticate with either your AWS credentials or an Amazon Bedrock API key.

  Set up your AWS credentials or generate an Amazon Bedrock API key to authenticate your request.
  + To learn about setting up your AWS credentials, see [Programmatic access with AWS security credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds-programmatic-access.html).
  + To learn about Amazon Bedrock API keys and how to generate them, see the API keys section in the Build chapter.
+ **Endpoint** – Find the endpoint that corresponds to the AWS Region to use in [Amazon Bedrock Runtime endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). If you use an AWS SDK, you might only need to specify the region code and not the whole endpoint when you set up the client.
+ **Model access** – Request access to an Amazon Bedrock model that supports this feature. For more information, see [Manage model access using SDK and CLI](model-access.md#model-access-modify).
+ **Batch JSONL file uploaded to S3** – Follow the steps at [Prepare your batch file](https://platform.openai.com/docs/guides/batch#1-prepare-your-batch-file) in the OpenAI documentation to prepare your batch file with the correct format. Then upload it to an Amazon S3 bucket. A sketch of a single record appears after these tabs.
+ **IAM permissions** – Make sure that you have the following IAM identities with the proper permissions:
  + An IAM identity that you authenticate with can carry out batch inference-related API operations. For more information, see [Required permissions for an IAM identity to submit and manage batch inference jobs](batch-inference-permissions.md).
  + The batch inference service role that you use can assume your identity, invoke the OpenAI model that you use, and access your batch JSONL file in S3. For more information, see [Service roles](security-iam-sr.md).

------
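
Both methods require a batch JSONL file in Amazon S3. The following is a minimal sketch of a single record that follows the format described in the OpenAI batch guide; the `custom_id` and prompt are placeholders, the body follows the OpenAI Create chat completion request format, and the model that Amazon Bedrock invokes is the one you specify in the `X-Amzn-Bedrock-ModelId` header when you create the job:

```
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "openai.gpt-oss-20b-1:0", "max_tokens": 1024, "messages": [{"role": "user", "content": "Summarize the following call transcript: ..."}]}}
```

Each line in the file is one such record.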

## Create an OpenAI batch job
<a name="inference-openai-batch-create"></a>

For details about the OpenAI Create batch API, refer to the following resources in the OpenAI documentation:
+ [Create batch](https://platform.openai.com/docs/api-reference/batch/create) – Details both the request and response.
+ [The request output object](https://platform.openai.com/docs/api-reference/batch/request-output) – Details the fields of the generated output from the batch job. Refer to this documentation when interpreting the results in your S3 bucket.

**Form the request**  
When forming the batch inference request, note the following Amazon Bedrock-specific fields and values:

**Request headers**
+ X-Amzn-Bedrock-RoleArn (required) – The Amazon Resource Name (ARN) of the batch inference service role. For more information, see [Create a custom service role for batch inference](batch-iam-sr.md).
+ X-Amzn-Bedrock-ModelId (required) – The ID of the foundation model to use in inference. For more information, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ X-Amzn-Bedrock-OutputEncryptionKeyId (optional) – The ID of a KMS key that you want to use to encrypt the output S3 files. For more information, see [Specifying server-side encryption with AWS KMS (SSE-KMS)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-kms-encryption.html).
+ X-Amzn-Bedrock-Tags (optional) – A dictionary of keys and values that indicate tags to attach to the output. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

**Request body parameters**
+ endpoint – Must be `/v1/chat/completions`.
+ input_file_id – Specify the S3 URI of your batch JSONL file.

**Find the generated results**  
The creation response includes a batch ID. The results and error logging of the batch inference job are written to the S3 folder containing the input file. The results will be in a folder with the same name as the batch ID, as in the following folder structure:

```
---- {batch_input_folder}
        |---- {batch_input}.jsonl
        |---- {batch_id}
                |---- {batch_input}.jsonl.out
                |---- {batch_input}.jsonl.err
```
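
After the job completes, you can download the generated output file from the batch ID folder. The following is a minimal boto3 sketch that assumes the bucket, input file name, and batch ID placeholders used elsewhere on this page:

```
import boto3

s3 = boto3.client("s3")

# Placeholder values; use your own bucket, batch ID, and input file name.
bucket = "amzn-s3-demo-bucket"
batch_id = "batch_abc123"

s3.download_file(bucket, f"{batch_id}/openai-input.jsonl.out", "openai-input.jsonl.out")
```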

To see examples of using the OpenAI Create batch API with different methods, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK (Python) ]

To create a batch job with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix the Amazon Bedrock Runtime endpoint to `/openai/v1`, as in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the `extra_headers` field when making a specific API call.

1. Use the [batches.create()](https://platform.openai.com/docs/api-reference/batch/create) method with the client.

Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ X-Amzn-Bedrock-RoleArn – Replace *arn:aws:iam::123456789012:role/BatchServiceRole* with the actual batch inference service role you set up.
+ input_file_id – Replace *s3://amzn-s3-demo-bucket/openai-input.jsonl* with the actual S3 URI to which you uploaded your batch JSONL file.

The example calls the OpenAI Create batch job API in `us-west-2` and includes one piece of metadata.

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK", # Replace with actual API key
    default_headers={
        "X-Amzn-Bedrock-RoleArn": "arn:aws:iam::123456789012:role/BatchServiceRole" # Replace with actual service role ARN
    }
)

job = client.batches.create(
    input_file_id="s3://amzn-s3-demo-bucket/openai-input.jsonl", # Replace with actual S3 URI
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "description": "test input"
    },
    extra_headers={
        "X-Amzn-Bedrock-ModelId": "openai.gpt-oss-20b-1:0",
    }
)
print(job)
```

------
#### [ HTTP request ]

To create a batch job with a direct HTTP request, do the following:

1. Use the POST method and specify the URL by prefixing the Amazon Bedrock Runtime endpoint to `/openai/v1/batches`, as in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches
   ```

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before running the following example, first replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ X-Amzn-Bedrock-RoleArn – Replace *arn:aws:iam::123456789012:role/BatchServiceRole* with the actual batch inference service role you set up.
+ input_file_id – Replace *s3://amzn-s3-demo-bucket/openai-input.jsonl* with the actual S3 URI to which you uploaded your batch JSONL file.

The following example calls the OpenAI Create batch API in `us-west-2` and includes one piece of metadata:

```
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK' \
    -H 'Content-Type: application/json' \
    -H 'X-Amzn-Bedrock-ModelId: openai.gpt-oss-20b-1:0' \
    -H 'X-Amzn-Bedrock-RoleArn: arn:aws:iam::123456789012:role/BatchServiceRole' \
    -d '{
    "input_file_id": "s3://amzn-s3-demo-bucket/openai-input.jsonl",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "test input"}
}'
```

------

## Retrieve an OpenAI batch job
<a name="inference-openai-batch-retrieve"></a>

For details about the OpenAI Retrieve batch API request and response, refer to [Retrieve batch](https://platform.openai.com/docs/api-reference/batch/retrieve).

When you make the request, you specify the ID of the batch job for which to get information. The response returns information about a batch job, including the output and error file names that you can look up in your S3 buckets.

To see examples of using the OpenAI Retrieve batch API with different methods, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK (Python) ]

To retrieve a batch job with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix the Amazon Bedrock Runtime endpoint to `/openai/v1`, as in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the `extra_headers` field when making a specific API call.

1. Use the [batches.retrieve()](https://platform.openai.com/docs/api-reference/batch/retrieve) method with the client and specify the ID of the batch for which to retrieve information.

Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_id – Replace *batch_abc123* with the actual ID of your batch job.

The example calls the OpenAI Retrieve batch job API in `us-west-2` on a batch job whose ID is *batch_abc123*.

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.retrieve(batch_id="batch_abc123") # Replace with actual ID

print(job)
```

------
#### [ HTTP request ]

To retrieve a batch job with a direct HTTP request, do the following:

1. Use the GET method and specify the URL by prefixing the Amazon Bedrock Runtime endpoint to `/openai/v1/batches/${batch_id}`, as in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123
   ```

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before running the following example, first replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_abc123 – In the path, replace this value with the actual ID of your batch job.

The following example calls the OpenAI Retrieve batch API in `us-west-2` on a batch job whose ID is *batch_abc123*.

```
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------

## List OpenAI batch jobs
<a name="inference-openai-batch-list"></a>

For details about the OpenAI List batches API request and response, refer to [List batches](https://platform.openai.com/docs/api-reference/batch/list). The response returns an array of information about your batch jobs.

When you make the request, you can include query parameters to filter the results. The response returns information about a batch job, including the output and error file names that you can look up in your S3 buckets.

To see examples of using the OpenAI List batches API with different methods, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK (Python) ]

To list batch jobs with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix the Amazon Bedrock Runtime endpoint to `/openai/v1`, as in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the `extra_headers` field when making a specific API call.

1. Use the [batches.list()](https://platform.openai.com/docs/api-reference/batch/list) method with the client. You can include any of the optional parameters.

Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.

The example calls the OpenAI List batch jobs API in `us-west-2` and specifies a limit of 2 results to return.

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.list(limit=2)

print(job)
```

------
#### [ HTTP request ]

To list batch jobs with a direct HTTP request, do the following:

1. Use the GET method and specify the URL by prefixing the Amazon Bedrock Runtime endpoint to `/openai/v1/batches`, as in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches
   ```

   You can include any of the optional query parameters.

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before running the following example, first replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.

The following example calls the OpenAI List batches API in `us-west-2` and specifies a limit of 2 results to return.

```
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches?limit=2' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------

## Cancel an OpenAI batch job
<a name="inference-openai-batch-cancel"></a>

For details about the OpenAI Cancel batch API request and response, refer to [Cancel batch](https://platform.openai.com/docs/api-reference/batch/cancel). The response returns information about the cancelled batch job.

When you make the request, you specify the ID of the batch job that you want to cancel.

To see examples of using the OpenAI Cancel batch API with different methods, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK (Python) ]

To cancel a batch job with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix the Amazon Bedrock Runtime endpoint to `/openai/v1`, as in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the `extra_headers` field when making a specific API call.

1. Use the [batches.cancel()](https://platform.openai.com/docs/api-reference/batch/cancel) method with the client and specify the ID of the batch job that you want to cancel.

Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_id – Replace *batch_abc123* with the actual ID of your batch job.

The example calls the OpenAI Cancel batch job API in `us-west-2` on a batch job whose ID is *batch_abc123*.

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.cancel(batch_id="batch_abc123") # Replace with actual ID

print(job)
```

------
#### [ HTTP request ]

To cancel a batch job with a direct HTTP request, do the following:

1. Use the POST method and specify the URL by prefixing the Amazon Bedrock Runtime endpoint to `/openai/v1/batches/${batch_id}/cancel`, as in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123/cancel
   ```

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before running the following example, first replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_abc123 – In the path, replace this value with the actual ID of your batch job.

The following example calls the OpenAI Cancel batch API in `us-west-2` on a batch job whose ID is *batch_abc123*.

```
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123/cancel' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------