Amazon Managed Service for Apache Flink was previously known as Amazon Kinesis Data Analytics for Apache Flink.
Data skew
A Flink application is executed on a cluster in a distributed fashion. To scale out to multiple nodes,
Flink uses the concept of keyed streams, which essentially means that the events of a stream are partitioned according to a specific key, e.g., customer id, and Flink can then process different partitions on different nodes.
Many of the Flink operators are then evaluated based on these partitions, e.g., Keyed Windows
Choosing a partition key often depends on the business logic. At the same time, many of the best practices for, e.g., DynamoDB
ensuring a high cardinality of partition keys
avoiding skew in the event volume between partitions
You can identify skew in the partitions by comparing the records received/sent of subtasks (i.e., instances of the same operator)
in the Flink dashboard. In addition, Managed Service for Apache Flink monitoring can be configured to expose metrics for numRecordsIn/Out
and numRecordsInPerSecond/OutPerSecond
on a subtask level as well.