Review Managed Service for Apache Flink application resources

This section describes the system resources that your application uses. Understanding how Managed Service for Apache Flink provisions and uses resources will help you design, create, and maintain a performant and stable Managed Service for Apache Flink application.

Managed Service for Apache Flink application resources

Managed Service for Apache Flink is an AWS service that creates an environment for hosting your Apache Flink application. The service provisions resources in units called Kinesis Processing Units (KPUs).

One KPU represents the following system resources:

One CPU core
4 GB of memory, of which 1 GB is native memory and 3 GB is heap memory
50 GB of disk space

KPUs run applications in distinct execution units called tasks and subtasks. You can think of a subtask as the equivalent of a thread.

The number of KPUs available to an application is equal to the application's Parallelism setting, divided by the application's ParallelismPerKPU setting.
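As a rough illustration of this relationship, the following sketch computes the KPU count from the two settings and multiplies out the per-KPU resources listed earlier. The class and method names (KpuEstimate, requiredKpus) are hypothetical, and rounding up when the settings don't divide evenly is an assumption that this section doesn't state.

```java
// Minimal sketch: estimate KPUs from Parallelism and ParallelismPerKPU.
// KpuEstimate and requiredKpus are hypothetical names used only for illustration.
final class KpuEstimate {
    private KpuEstimate() {}

    /**
     * Returns Parallelism divided by ParallelismPerKPU.
     * Assumption: the result is rounded up when the values don't divide evenly.
     */
    static int requiredKpus(int parallelism, int parallelismPerKpu) {
        if (parallelism < 1 || parallelismPerKpu < 1) {
            throw new IllegalArgumentException("both settings must be at least 1");
        }
        // Integer ceiling division.
        return (parallelism + parallelismPerKpu - 1) / parallelismPerKpu;
    }

    public static void main(String[] args) {
        int kpus = requiredKpus(8, 2);
        // Each KPU represents one CPU core, 4 GB of memory, and 50 GB of disk space.
        System.out.println("KPUs: " + kpus);
        System.out.println("CPU cores: " + kpus);
        System.out.println("Memory (GB): " + kpus * 4);
        System.out.println("Disk (GB): " + kpus * 50);
    }
}
```

With Parallelism set to 8 and ParallelismPerKPU set to 2, this sketch yields 4 KPUs, which corresponds to 4 CPU cores, 16 GB of memory, and 200 GB of disk space.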

For more information about application parallelism, see Implement application scaling.

Apache Flink application resources

The Apache Flink environment allocates resources for your application using units called task slots. When Managed Service for Apache Flink allocates resources for your application, it assigns one or more Apache Flink task slots to a single KPU. The number of slots assigned to a single KPU is equal to your application's ParallelismPerKPU setting. For more information about task slots, see Job Scheduling in the Apache Flink Documentation.

Operator parallelism

You can set the maximum number of subtasks that an operator can use. This value is called Operator Parallelism. By default, the parallelism of each operator in your application is equal to the application's parallelism. This means that by default, each operator in your application can use all of the available subtasks in the application if needed.

You can set the parallelism of the operators in your application using the setParallelism method. Using this method, you can control the number of subtasks each operator can use at one time.

For more information about operators, see Operators in the Apache Flink Documentation.

Operator chaining

Normally, each operator uses a separate subtask to execute, but if several operators always execute in sequence, the runtime can assign them all to the same task. This process is called Operator Chaining.

Several sequential operators can be chained into a single task if they all operate on the same data. The following are some of the criteria needed for this to be true:

The operators do 1-to-1 simple forwarding.
The operators all have the same operator parallelism.

When your application chains operators into a single subtask, it conserves system resources, because the service doesn't need to perform network operations or allocate a separate subtask for each operator. To determine whether your application is using operator chaining, look at the job graph in the Managed Service for Apache Flink console. Each vertex in the job graph represents one or more operators, and chained operators appear as a single vertex.
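To make the two criteria above concrete, the following sketch groups a linear pipeline of operators into chains. It is a simplified model and not the Apache Flink runtime's actual chaining logic, which checks more conditions than the two listed here. The names ChainingSketch, Op, and forwardInput are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of operator chaining for a linear pipeline.
// Not Flink's real algorithm: it checks only the two criteria listed above.
final class ChainingSketch {
    private ChainingSketch() {}

    /** Hypothetical description of one operator in the pipeline. */
    static final class Op {
        final String name;
        final int parallelism;
        // True if this operator receives its input by 1-to-1 simple forwarding.
        final boolean forwardInput;

        Op(String name, int parallelism, boolean forwardInput) {
            this.name = name;
            this.parallelism = parallelism;
            this.forwardInput = forwardInput;
        }
    }

    /** Groups consecutive operators into chains; each chain maps to one vertex. */
    static List<List<String>> chains(List<Op> pipeline) {
        List<List<String>> result = new ArrayList<>();
        Op previous = null;
        for (Op op : pipeline) {
            boolean canChain = previous != null
                    && op.forwardInput
                    && op.parallelism == previous.parallelism;
            if (!canChain) {
                result.add(new ArrayList<>()); // start a new chain
            }
            result.get(result.size() - 1).add(op.name);
            previous = op;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Op> pipeline = List.of(
                new Op("source", 4, false),
                new Op("map", 4, true),
                new Op("filter", 4, true),
                new Op("window", 4, false), // input is redistributed, not forwarded
                new Op("sink", 2, true));   // different operator parallelism
        System.out.println(chains(pipeline));
    }
}
```

In this model, source, map, and filter form one chain, while window and sink each start a new chain: window because its input isn't forwarded 1-to-1, and sink because its operator parallelism differs from that of the preceding operator.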
