

# What is AWS Batch?
<a name="what-is-batch"></a>

AWS Batch helps you run batch computing workloads on the AWS Cloud. Batch computing is a common way for developers, scientists, and engineers to access large amounts of compute resources. Like traditional batch computing software, AWS Batch removes the undifferentiated heavy lifting of configuring and managing the required infrastructure. The service can efficiently provision resources in response to submitted jobs, eliminating capacity constraints, reducing compute costs, and delivering results quickly.

As a fully managed service, AWS Batch helps you to run batch computing workloads of any scale. AWS Batch automatically provisions compute resources and optimizes the workload distribution based on the quantity and scale of the workloads. With AWS Batch, there's no need to install or manage batch computing software, so you can focus your time on analyzing results and solving problems.

![Showing the layers of AWS Batch for workloads, orchestration, and capacity](http://docs.aws.amazon.com/batch/latest/userguide/images/batch-diagram.png)


AWS Batch provides all of the necessary functionality to run high-scale, compute-intensive workloads on top of AWS managed container orchestration services: Amazon ECS and Amazon EKS. AWS Batch can scale compute capacity across Amazon EC2 instances and AWS Fargate resources.

AWS Batch provides a fully managed service for batch workloads, and delivers the operational capabilities to optimize these types of workloads for throughput, speed, resource efficiency, and cost. 

AWS Batch also provides queuing for SageMaker Training jobs. Data scientists and ML engineers can submit Training jobs with priorities to configurable queues, so ML workloads run automatically as soon as resources become available. This eliminates the need for manual coordination and improves resource utilization. You can configure queues with specific policies to optimize cost, performance, and resource allocation for your ML Training workloads.

![Workflow diagram showing administrator setting up roles, data scientist creating service environment and job queue, submitting SageMaker training jobs, and monitoring jobs in both AWS Batch queue and SageMaker AI execution](http://docs.aws.amazon.com/batch/latest/userguide/images/Batch-SageMaker-Diagram-Light-Mode.png)


This provides a shared responsibility model where administrators set up the infrastructure and permissions, while data scientists can focus on submitting and monitoring their ML training workloads. Jobs are automatically queued and executed based on configured priorities and resource availability.

## Are you a first-time AWS Batch user?
<a name="first-time-user"></a>

If you are a first-time user of AWS Batch, we recommend that you begin by reading the following sections:
+ [Components of AWS Batch](batch_components.md)
+ [Create IAM account and administrative user](create-an-iam-account.md)
+ [Setting up AWS Batch](get-set-up-for-aws-batch.md)
+ [Getting started with AWS Batch tutorials](Batch_GetStarted.md)
+ [Getting started with AWS Batch on SageMaker AI](getting-started-sagemaker.md) 

## Related services
<a name="related-services"></a>

AWS Batch is a fully managed batch computing service that plans, schedules, and runs your containerized batch ML, simulation, and analytics workloads across the full range of AWS compute offerings, such as Amazon ECS, Amazon EKS, AWS Fargate, and Spot or On-Demand Instances. For more information about each managed compute service, see:
+ [Amazon EC2 *User Guide*](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html)
+ [AWS Fargate *Developer Guide*](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/AWS_Fargate.html)
+ [Amazon EKS *User Guide*](https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html)
+ [Amazon SageMaker AI *Developer Guide*](https://docs.aws.amazon.com/sagemaker/latest/dg/gs.html)

## Accessing AWS Batch
<a name="acessing-servicename"></a>

You can access AWS Batch using the following:

**AWS Batch console**  
The web interface where you create and manage resources.

**AWS Command Line Interface**  
Interact with AWS services using commands in your command line shell. The AWS Command Line Interface is supported on Windows, macOS, and Linux. For more information about the AWS CLI, see [AWS Command Line Interface User Guide](https://docs.aws.amazon.com/cli/latest/userguide/). You can find the AWS Batch commands in the [AWS CLI Command Reference](https://docs.aws.amazon.com/cli/latest/reference/).

**AWS SDKs**  
If you prefer to build applications using language-specific APIs instead of submitting a request over HTTP or HTTPS, use the libraries, sample code, tutorials, and other resources provided by AWS. These libraries provide basic functions that automate tasks, such as cryptographically signing your requests, retrying requests, and handling error responses. These functions make it more efficient for you to get started. For more information, see [Tools to Build on AWS](https://aws.amazon.com/developer/tools/).

# Components of AWS Batch
<a name="batch_components"></a>

AWS Batch simplifies running batch jobs across multiple Availability Zones within a Region. You can create AWS Batch compute environments within a new or existing VPC. After a compute environment is up and associated with a job queue, you can create job definitions that specify which Docker container images run your jobs. Container images are stored in and pulled from container registries, which can exist within or outside of your AWS infrastructure.

![Showing the components of AWS Batch and how they integrate together](http://docs.aws.amazon.com/batch/latest/userguide/images/batch-components.png)


## Compute environment
<a name="component_compute_environment"></a>

A compute environment is a set of managed or unmanaged compute resources that are used to run jobs. With managed compute environments, you can specify the desired compute type (Fargate or EC2) at several levels of detail. You can set up compute environments that use a particular type of EC2 instance, such as `c5.2xlarge` or `m4.10xlarge`. Or, you can specify only that you want to use the newest instance types. You can also specify the minimum, desired, and maximum number of vCPUs for the environment, the amount that you're willing to pay for a Spot Instance as a percentage of the On-Demand Instance price, and a target set of VPC subnets. AWS Batch efficiently launches, manages, and terminates compute resources as needed. You can also manage your own compute environments; in that case, you're responsible for setting up and scaling the instances in the Amazon ECS cluster that AWS Batch creates for you. For more information, see [Compute environments for AWS Batch](compute_environments.md).
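For example, a managed Spot compute environment might be created with CLI input like the following sketch (`aws batch create-compute-environment --cli-input-json file://ce.json`). The environment name, subnets, security group, and instance role are placeholders:

```json
{
  "computeEnvironmentName": "my-managed-ce",
  "type": "MANAGED",
  "state": "ENABLED",
  "computeResources": {
    "type": "SPOT",
    "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
    "minvCpus": 0,
    "desiredvCpus": 4,
    "maxvCpus": 128,
    "instanceTypes": ["optimal"],
    "bidPercentage": 60,
    "subnets": ["subnet-aaaa1111", "subnet-bbbb2222"],
    "securityGroupIds": ["sg-cccc3333"],
    "instanceRole": "ecsInstanceRole"
  }
}
```

Here `bidPercentage` caps the Spot price at 60% of the On-Demand price, and `minvCpus` of 0 lets the environment scale to zero when no jobs are runnable.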

## Job queues
<a name="component_job_queue"></a>

When you submit an AWS Batch job, you submit it to a particular job queue, where the job resides until it's scheduled onto a compute environment. You associate one or more compute environments with a job queue. You can also assign priority values for these compute environments and even across job queues themselves. For example, you can have a high priority queue that you submit time-sensitive jobs to, and a low priority queue for jobs that can run anytime when compute resources are cheaper. For more information, see [Job queues](job_queues.md).
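For example, a high priority queue attached to one compute environment might be created with input like the following sketch (`aws batch create-job-queue --cli-input-json file://queue.json`); the names are placeholders:

```json
{
  "jobQueueName": "high-priority",
  "state": "ENABLED",
  "priority": 10,
  "computeEnvironmentOrder": [
    { "order": 1, "computeEnvironment": "my-managed-ce" }
  ]
}
```

A queue with a higher `priority` value is scheduled ahead of lower-priority queues that share the same compute environments, and `computeEnvironmentOrder` controls which environment the scheduler tries first.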

## Job definitions
<a name="component_job_definition"></a>

A job definition specifies how jobs are to be run. You can think of a job definition as a blueprint for the resources in your job. You can supply your job with an IAM role to provide access to other AWS resources. You also specify both memory and CPU requirements. The job definition can also control container properties, environment variables, and mount points for persistent storage. Many of the specifications in a job definition can be overridden by specifying new values when submitting individual jobs. For more information, see [Job definitions](job_definitions.md).
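A minimal container job definition might look like the following sketch (`aws batch register-job-definition --cli-input-json file://jobdef.json`); the image, role ARN, and account ID are placeholders:

```json
{
  "jobDefinitionName": "my-job-def",
  "type": "container",
  "containerProperties": {
    "image": "public.ecr.aws/amazonlinux/amazonlinux:latest",
    "command": ["echo", "hello"],
    "jobRoleArn": "arn:aws:iam::123456789012:role/my-job-role",
    "resourceRequirements": [
      { "type": "VCPU", "value": "1" },
      { "type": "MEMORY", "value": "2048" }
    ],
    "environment": [
      { "name": "STAGE", "value": "test" }
    ]
  }
}
```

The `command`, `environment`, and resource requirements set here act as defaults that individual job submissions can override.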

## Jobs
<a name="component_job"></a>

A unit of work (such as a shell script, a Linux executable, or a Docker container image) that you submit to AWS Batch. It has a name, and runs as a containerized application on AWS Fargate or Amazon EC2 resources in your compute environment, using parameters that you specify in a job definition. Jobs can reference other jobs by name or by ID, and can be dependent on the successful completion of other jobs or the availability of [resources](resource-aware-scheduling.md) you specify. For more information, see [Jobs](jobs.md).
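For example, a job that depends on another job and overrides an environment variable might be submitted with input like the following sketch (`aws batch submit-job --cli-input-json file://job.json`); the queue, definition, and job ID are placeholders:

```json
{
  "jobName": "nightly-report",
  "jobQueue": "high-priority",
  "jobDefinition": "my-job-def",
  "dependsOn": [
    { "jobId": "11111111-2222-3333-4444-555555555555" }
  ],
  "containerOverrides": {
    "environment": [
      { "name": "STAGE", "value": "prod" }
    ]
  }
}
```

The job stays in the queue until every job listed in `dependsOn` has succeeded, then becomes runnable.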

## Scheduling policy
<a name="component_scheduling_policy"></a>

You can use scheduling policies to configure how compute resources in a job queue are allocated between users or workloads. Using fair-share scheduling policies, you can assign different share identifiers to workloads or users. The AWS Batch job scheduler defaults to a first-in, first-out (FIFO) strategy. For more information, see [Fair-share scheduling policies](job_scheduling.md).
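For example, a fair-share policy that splits capacity between two teams might be created with input like the following sketch (`aws batch create-scheduling-policy --cli-input-json file://policy.json`); the name and share identifiers are placeholders:

```json
{
  "name": "team-fair-share",
  "fairsharePolicy": {
    "shareDecaySeconds": 3600,
    "computeReservation": 10,
    "shareDistribution": [
      { "shareIdentifier": "teamA", "weightFactor": 1.0 },
      { "shareIdentifier": "teamB", "weightFactor": 0.5 }
    ]
  }
}
```

You attach the policy to a job queue, and jobs supply a `shareIdentifier` when they're submitted. A lower weight factor gives that identifier a larger share of the queue's resources, and `computeReservation` holds back a percentage of capacity for share identifiers that aren't currently active.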

## Consumable resources
<a name="component_consumable_resource"></a>

A consumable resource is a resource that your jobs need to run, such as a third-party license token, database access bandwidth, or throttled calls to a third-party API. You specify the consumable resources that a job needs to run, and AWS Batch takes these resource dependencies into account when it schedules the job. You can reduce underutilization of compute resources by running only the jobs that have all of their required resources available. For more information, see [Resource-aware scheduling](resource-aware-scheduling.md).
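As a sketch of how this fits together, a pool of license tokens might be declared once and then required by a job definition. The field names below reflect our reading of the resource-aware scheduling API (`CreateConsumableResource` and the job definition's `consumableResourceProperties`) and should be verified against the current API reference:

```json
{
  "consumableResourceName": "license-tokens",
  "resourceType": "REPLENISHABLE",
  "totalQuantity": 50
}
```

A job definition would then declare how many tokens each job consumes, for example:

```json
{
  "consumableResourceProperties": {
    "consumableResourceList": [
      { "consumableResource": "license-tokens", "quantity": 1 }
    ]
  }
}
```

With a replenishable resource, the quantity a job holds is returned to the pool when the job finishes, so at most 50 such jobs run concurrently in this sketch.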

## Service environment
<a name="component_service_environment"></a>

A service environment defines how AWS Batch integrates with SageMaker AI for job execution. Service environments enable AWS Batch to submit and manage jobs on SageMaker while providing the queuing, scheduling, and priority management capabilities of AWS Batch. Service environments define capacity limits for specific service types, such as SageMaker Training jobs. These capacity limits control the maximum resources that service jobs in the environment can use. For more information, see [Service environments for AWS Batch](service-environments.md).
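As a sketch, a service environment for SageMaker Training with a capacity limit might be created with input like the following. The field names reflect our reading of the `CreateServiceEnvironment` API and should be verified against the current API reference; the name is a placeholder:

```json
{
  "serviceEnvironmentName": "sm-training-env",
  "serviceEnvironmentType": "SAGEMAKER_TRAINING",
  "state": "ENABLED",
  "capacityLimits": [
    { "maxCapacity": 10, "capacityUnit": "NUM_INSTANCES" }
  ]
}
```

In this sketch, at most 10 instances' worth of SageMaker Training jobs from this environment run at once; additional service jobs wait in the queue.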

## Service job
<a name="component_service_job"></a>

A service job is a unit of work that you submit to AWS Batch to run on a service environment. Service jobs use AWS Batch's queuing and scheduling capabilities while delegating actual execution to the external service. For example, SageMaker Training jobs submitted as service jobs are queued and prioritized by AWS Batch, but the Training job execution occurs within SageMaker AI infrastructure. This integration enables data scientists and ML engineers to benefit from AWS Batch's automated workload management and priority queuing for their SageMaker AI Training workloads. Service jobs can reference other jobs by name or ID and support job dependencies. For more information, see [Service jobs in AWS Batch](service-jobs.md).