- Languages supported
Fargate: AWS Fargate is a serverless compute engine for containers that works with orchestrators such as Amazon ECS, meaning that it supports any programming language or runtime environment that can be packaged into a Docker container. This flexibility lets developers use virtually any language, framework, or library that suits their application. Whether you're using Python, Java, Node.js, Go, .NET, Ruby, PHP, or even custom languages and environments, Fargate can run them as long as they are encapsulated in a container. This broad language support makes Fargate ideal for running diverse applications, including legacy systems, multi-language microservices, and modern cloud-native applications.
Lambda: AWS Lambda offers native
support for a more limited set of languages compared to Fargate,
specifically designed for event-driven functions. As of now, Lambda
officially supports the following languages:
- Node.js
- Python
- Java
- Go
- Ruby
- C#
- PowerShell
Lambda also supports custom runtimes, which allow you to bring your own language or
runtime environment, but this requires more setup and management compared to using the
natively supported options. If you choose to deploy your Lambda function from a container
image, you can write your function in Rust by using an AWS OS-only base image and
including the Rust runtime client in your image. If you're using a language that doesn't
have an AWS-provided runtime interface client, you must create your own.
- Event-driven invocation
Lambda is inherently designed for
event-driven computing. Lambda functions are triggered in response
to a variety of events, including changes in data, user actions, or
scheduled tasks. It integrates seamlessly with many AWS services,
such as Amazon S3 (for example, invoking a function when a file is uploaded),
DynamoDB (for example, triggering on data updates), and API Gateway (for example,
handling HTTP requests). The Lambda event-driven architecture is ideal
for applications that need to respond immediately to events without
requiring persistent compute resources.
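As an illustration, the following is a minimal sketch of a Python handler for an S3 object-created notification. The event shape is the standard S3 notification format; the processing logic is a placeholder.

```python
# Hypothetical handler illustrating an S3-triggered Lambda function.
def lambda_handler(event, context):
    # Each record describes one S3 object-created event.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Placeholder: replace with your processing logic.
        print(f"New object: s3://{bucket}/{key}")
    return {"statusCode": 200}
```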
Fargate is not natively event-driven, but with some additional boilerplate logic it can integrate with event sources such as Amazon SQS and Kinesis. Whereas Lambda handles the bulk of this integration logic for you, with Fargate you must implement the integration yourself using the APIs for these services, as sketched below.
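A minimal sketch of that pattern, running inside a Fargate container, might look like the following. The queue URL is hypothetical and process is a placeholder for your business logic.

```python
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # hypothetical

def process(body: str) -> None:
    # Placeholder for your business logic.
    print("processing:", body)

def poll_forever() -> None:
    while True:
        # Long polling (WaitTimeSeconds) reduces empty receives and API calls.
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,
        )
        for msg in resp.get("Messages", []):
            process(msg["Body"])
            # Delete only after successful processing so failures are retried.
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

if __name__ == "__main__":
    poll_forever()
```

This polling loop, error handling, and message lifecycle are exactly the boilerplate that Lambda's managed event source mappings would otherwise handle for you.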
- Runtime/use cases
Fargate is designed to run
containerized applications, providing a flexible runtime
environment where you can define the CPU, memory, and networking
settings for your containers. Since Fargate operates on a
container-based model, it supports long-running processes,
persistent services, and applications with specific runtime
requirements. The containers in Fargate can run indefinitely, as
there is no hard limit on execution time, making it ideal for
applications that need to be up and running continuously.
Lambda, on the other hand, is
optimized for short-lived, event-driven tasks. Lambda functions
are executed in a stateless environment where the maximum
execution time is capped at 15 minutes. This makes Lambda
well-suited for scenarios like file processing, real-time data
streaming, and HTTP request handling, where the tasks are brief
and don't require long-running processes.
In Lambda, the runtime environment is more abstracted, and there
is less control over the underlying infrastructure. Lambda's
stateless nature means that each function invocation is
independent, and any state or data that needs to persist between
invocations must be managed externally (for example, in databases or
storage services).
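For example, a function that counts its own invocations must keep the counter outside the execution environment. A minimal sketch, assuming a hypothetical DynamoDB table named visit-counts with an "id" partition key:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("visit-counts")  # hypothetical table

def lambda_handler(event, context):
    # State that must survive across invocations lives outside the execution
    # environment; here, an atomic counter in DynamoDB.
    resp = table.update_item(
        Key={"id": "global"},
        UpdateExpression="ADD visits :one",
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",
    )
    return {"visits": int(resp["Attributes"]["visits"])}
```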
- Scaling
Fargate scales by adjusting the number of running
containers based on the desired state defined in your container orchestration service
(Amazon ECS). This scaling can be done manually or automatically through Amazon ECS Service Auto Scaling, which is built on Application Auto Scaling.
In Fargate, each task runs in its own isolated environment, and scaling involves
launching additional containers or stopping them based on load. For web and other
long-running services, the Amazon ECS service scheduler can launch up to 500 tasks
in less than a minute per service.
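A sketch of configuring this through the Application Auto Scaling API with boto3, assuming hypothetical cluster and service names:

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

RESOURCE_ID = "service/my-cluster/my-service"  # hypothetical cluster/service

# Register the ECS service's desired task count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId=RESOURCE_ID,
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=2,
    MaxCapacity=50,
)

# Target-tracking policy: add or remove tasks to hold average CPU near 60%.
autoscaling.put_scaling_policy(
    PolicyName="cpu-target-tracking",
    ServiceNamespace="ecs",
    ResourceId=RESOURCE_ID,
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
    },
)
```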
For Lambda, concurrency is the number of in-flight requests that your AWS Lambda function
is handling at the same time. This differs from concurrency in Fargate, where each
Fargate task can handle concurrent requests as long as there are available compute and
network resources. For each concurrent request, Lambda provisions a separate instance of
your execution environment. As your functions receive more requests, Lambda automatically
handles scaling the number of execution environments until you reach your account's
concurrency limit. By default, Lambda provides your account with a total concurrency limit
of 1,000 concurrent executions across all functions in an AWS Region, and you can
request a quota increase if needed.
For each Lambda function in a Region, the concurrency scaling rate is 1,000 execution
environment instances every 10 seconds, up to the maximum account concurrency. If incoming
requests outpace this scaling rate, the additional requests are throttled.
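If one function must not consume the whole account pool, you can carve out a reserved share for it. A minimal sketch with boto3, using a hypothetical function name:

```python
import boto3

lambda_client = boto3.client("lambda")

# Cap this function at 100 concurrent executions, reserving that share of
# the account's regional concurrency pool for it (and denying it to others).
lambda_client.put_function_concurrency(
    FunctionName="my-function",  # hypothetical
    ReservedConcurrentExecutions=100,
)
```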
- Cold start and cold-start mitigation
Lambda can experience cold starts, which occur when a
function is invoked after being idle for some time. During a cold start, the Lambda service
needs to initialize a new execution environment, including loading the runtime,
dependencies, and the function code. This process can introduce latency, particularly for
languages with longer initialization times (for example, Java or C#). Cold starts can
impact the performance of applications, especially those requiring low-latency responses.
To mitigate cold starts in Lambda, several strategies can be
employed:
- Minimize function size: Reducing the size of your function package and its dependencies can decrease the time needed for initialization.
- Increase memory allocation: Higher memory allocations increase CPU capacity, potentially reducing initialization time.
- Keep functions warm: Periodically invoking your Lambda functions (for example, on a schedule with Amazon EventBridge, formerly CloudWatch Events) can keep them active and reduce the likelihood of cold starts.
- Lambda SnapStart: Use Lambda SnapStart for Java functions to reduce startup time.
- Provisioned concurrency: This feature keeps a specified number of function instances warm and ready to serve requests, reducing cold start latency. However, it increases costs because you pay for the provisioned instances even when they're not actively handling requests. A configuration sketch follows this list.
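As a sketch of that last option: provisioned concurrency is configured per function version or alias. The function name and alias below are hypothetical.

```python
import boto3

lambda_client = boto3.client("lambda")

# Keep 25 execution environments initialized and warm for the "live" alias.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-function",  # hypothetical
    Qualifier="live",            # provisioned concurrency targets a version or alias
    ProvisionedConcurrentExecutions=25,
)
```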
Fargate is not generally impacted by cold starts in the
same way as Lambda. The time it takes to start a Fargate task is directly correlated with
the time it takes to pull the container images defined in the task from the image registry.
Fargate also supports lazy loading of container images that have been indexed with
Seekable OCI (SOCI), which reduces the time taken to launch Amazon ECS tasks on Fargate.
Fargate runs containers that remain active for as long as needed, meaning they're always
ready to handle requests. However, if you need to start new containers in response to
scaling events, there might be some delay while the containers are initialized, but this
is typically less significant than Lambda cold starts.
- Memory and CPU options
Fargate provides granular control
over both memory and CPU resources for your containerized
applications. When you launch a task in Fargate, you specify the
exact CPU and memory requirements based on the needs of your
application. The CPU and memory allocations are independent
(within the supported combinations), allowing you to choose pairings
that best suit your workload. For instance, you can select CPU values
ranging from 0.25 vCPU to 16 vCPUs and memory from 0.5 GB to 120 GB
per task, depending on your configuration.
This flexibility is ideal for running applications that require
specific performance characteristics, such as memory-intensive
databases or CPU-bound computation tasks. Fargate allows you to
optimize your resource allocation to balance cost and performance
effectively.
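As a sketch, registering a Fargate task definition with an explicit 1 vCPU / 4 GB combination might look like the following; the family name, role ARN, and image URI are hypothetical.

```python
import boto3

ecs = boto3.client("ecs")

# 1 vCPU / 4 GB is one of the valid Fargate CPU-memory combinations.
ecs.register_task_definition(
    family="my-app",  # hypothetical
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="1024",      # CPU units: 1024 = 1 vCPU
    memory="4096",   # MiB
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # hypothetical
    containerDefinitions=[
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
        }
    ],
)
```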
In Lambda, memory and CPU are
linked, with the CPU automatically allocated in proportion to the
amount of memory you select. You can choose memory allocations
between 128 MB and 10 GB, in 1 MB increments. The CPU scales with
the memory, up to 6 vCPU, meaning higher memory settings result in
more CPU power, but you don't have direct control over the CPU
allocation itself.
This model is designed for simplicity, allowing developers to
quickly adjust memory settings without needing to manage CPU
configurations. However, it might be less flexible for workloads
that require a specific balance between CPU and memory resources.
Lambda's model is suitable for tasks where you want straightforward
scaling based on memory needs, but it might not be optimal for
applications with complex or highly specific resource demands.
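Adjusting a function's memory (and, implicitly, its CPU) is a single configuration call. A minimal sketch, assuming a hypothetical function name:

```python
import boto3

lambda_client = boto3.client("lambda")

# Raising memory also raises CPU proportionally; there is no separate CPU knob.
lambda_client.update_function_configuration(
    FunctionName="my-function",  # hypothetical
    MemorySize=1024,             # MB; valid range is 128-10,240 in 1 MB steps
)
```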
- Networking
When you deploy tasks in Fargate, they run in an
Amazon Virtual Private Cloud (Amazon VPC), giving you full control over the networking
environment. This includes configuring security groups, network access control lists
(ACLs), and route tables. Each Fargate task gets its own network interface with a
dedicated private IP address, and can be assigned a public IP address if needed.
Fargate supports advanced networking features such as load balancing
(using Elastic Load Balancing), VPC peering, and direct access
to other AWS services within the VPC. You can also use AWS PrivateLink for secure,
private connectivity to supported AWS services without traversing the internet.
By default, Lambda functions run in a Lambda-managed
network environment without direct control over network interfaces or IP addresses.
However, Lambda can be attached to a customer-managed VPC using AWS Hyperplane, enabling you to
control access to resources inside your VPC.
When Lambda functions are attached to a customer-managed VPC, they inherit the VPC's
security groups and subnet configurations, allowing them to interact securely with other
AWS services (like RDS databases) within the same VPC.
The Lambda service uses a Network Function Virtualization platform to provide NAT
capabilities from the Lambda VPC to customer VPCs. This configures the required elastic
network interfaces (ENIs) at the point where Lambda functions are created or updated. It
also enables ENIs from your account to be shared across multiple execution environments,
which allows Lambda to make more efficient use of a limited network resource when functions
scale.
Since ENIs are an exhaustible resource and there is a soft limit of 250 ENIs per Region,
you should monitor elastic network interface usage if you are configuring Lambda functions
for VPC access. Lambda functions in the same AZ and same security group can share ENIs.
Generally, if you increase concurrency limits in Lambda, you should evaluate if you need an
elastic network interface increase. If the limit is reached, this causes invocations of
VPC-enabled Lambda functions to be throttled.
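Attaching a function to a customer-managed VPC is done through its VpcConfig. A minimal sketch, with hypothetical subnet and security group IDs:

```python
import boto3

lambda_client = boto3.client("lambda")

# Attach the function to two private subnets; it inherits the security
# group's rules for traffic inside the VPC.
lambda_client.update_function_configuration(
    FunctionName="my-function",  # hypothetical
    VpcConfig={
        "SubnetIds": ["subnet-0abc1234", "subnet-0def5678"],  # hypothetical
        "SecurityGroupIds": ["sg-0123456789abcdef0"],         # hypothetical
    },
)
```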
- Pricing model
Fargate pricing is based on the resources allocated to
your containers, specifically the vCPU and memory you select for each task. You are billed
per second, with a one-minute minimum charge, for the CPU and memory that your containers
use. The costs are directly tied to the resources your application consumes, meaning you
pay for what you provision, regardless of whether the application is actively processing
requests. Fargate is well-suited for predictable workloads where you need specific
resource configurations and can optimize costs by adjusting the allocated resources.
There might also be charges for related services, such as data
transfer, storage, and networking (for example, VPC features and Elastic Load Balancing).
Lambda has a different pricing
structure that is event-driven and pay-per-execution. You are
charged based on the number of requests your functions receive and
the duration of each execution, measured in milliseconds. Lambda
also factors in the amount of memory you allocate to your
function, with costs scaling based on the memory used and the
execution time. The pricing model includes a free tier, offering 1
million free requests and 400,000 GB-seconds of compute time per
month, which makes Lambda particularly cost-effective for
low-volume, sporadic workloads.
The Lambda pricing model is ideal for applications with
unpredictable or bursty traffic patterns, as you only pay for
actual function invocations and execution time, without the need
to provision or pay for idle capacity.
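To make the trade-off concrete, here is a back-of-the-envelope sketch using illustrative us-east-1 rates; check the current AWS pricing pages before relying on these numbers, and note that the Lambda free tier is ignored here.

```python
# Illustrative rates (USD), subject to change; verify against AWS pricing pages.
LAMBDA_PER_REQUEST = 0.20 / 1_000_000  # per request
LAMBDA_PER_GB_SECOND = 0.0000166667    # per GB-second
FARGATE_PER_VCPU_HOUR = 0.04048
FARGATE_PER_GB_HOUR = 0.004445

def lambda_monthly_cost(requests: int, avg_ms: float, memory_gb: float) -> float:
    gb_seconds = requests * (avg_ms / 1000.0) * memory_gb
    return requests * LAMBDA_PER_REQUEST + gb_seconds * LAMBDA_PER_GB_SECOND

def fargate_monthly_cost(vcpu: float, memory_gb: float, hours: float = 730) -> float:
    # Fargate bills for provisioned resources whether or not they serve traffic.
    return hours * (vcpu * FARGATE_PER_VCPU_HOUR + memory_gb * FARGATE_PER_GB_HOUR)

# 2M requests/month at 200 ms and 512 MB, vs. one always-on 0.5 vCPU / 1 GB task.
print(f"{lambda_monthly_cost(2_000_000, 200, 0.5):.2f}")  # ~3.73
print(f"{fargate_monthly_cost(0.5, 1.0):.2f}")            # ~18.02
```

Under these assumptions the sporadic workload is far cheaper on Lambda, while a workload busy enough to keep a task saturated around the clock tips the comparison toward Fargate.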