Enforce tagging of Amazon EMR clusters at launch
Created by Priyanka Chaudhary (AWS)
Environment: Production | Technologies: Analytics; Security, identity, compliance | AWS services: Amazon EMR; AWS Lambda; Amazon CloudWatch Events |
Summary
This pattern provides a security control that ensures that Amazon EMR clusters are tagged when they are created.
Amazon EMR is an Amazon Web Services (AWS) service for processing and analyzing vast amounts of data. Amazon EMR offers an expandable, low-configuration service as an easier alternative to running in-house cluster computing. You can use tagging to categorize AWS resources in different ways, such as by purpose, owner, or environment. For example, you can tag your Amazon EMR clusters by assigning custom metadata to each cluster. A tag consists of a key and a value that you define. We recommend that you create a consistent set of tags to meet your organization's requirements. When you add a tag to an Amazon EMR cluster, the tag is also propagated to each active Amazon Elastic Compute Cloud (Amazon EC2) instance that is associated with the cluster. Similarly, when you remove a tag from an Amazon EMR cluster, that tag is removed from each associated, active EC2 instance as well.
The detective control monitors API calls and initiates an Amazon CloudWatch Events event for the RunJobFlow, AddTags, RemoveTags, and CreateTags APIs. The event calls AWS Lambda, which runs a Python script. The Python function gets the Amazon EMR cluster ID from the JSON input from the event and performs the following checks:
Check if the Amazon EMR cluster is configured with tag names that you specify.
If not, send an Amazon Simple Notification Service (Amazon SNS) notification to the user with the relevant information: the Amazon EMR cluster name, violation details, AWS Region, AWS account, and the Amazon Resource Name (ARN) of the Lambda function that sent the notification.
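The checks above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in EMRTagValidation.zip. The names `REQUIRED_TAGS`, `extract_cluster_id`, `missing_tags`, and `check_cluster` are hypothetical, and the sketch assumes that the event carries the cluster ID in `responseElements.jobFlowId` for RunJobFlow and in `requestParameters.resourceId` for the tagging calls. The Amazon EMR and Amazon SNS clients are passed in as arguments so that the logic can be exercised without AWS access.

```python
# Hypothetical list of mandatory tag keys; a real deployment would likely
# read these from configuration, such as a Lambda environment variable.
REQUIRED_TAGS = ["Owner", "Environment"]


def extract_cluster_id(event):
    """Return the EMR cluster ID from the event, or None if it is absent.

    Assumption: RunJobFlow reports the new cluster in
    responseElements.jobFlowId, and the tagging calls carry the cluster
    in requestParameters.resourceId.
    """
    detail = event.get("detail", {})
    if detail.get("eventName") == "RunJobFlow":
        return (detail.get("responseElements") or {}).get("jobFlowId")
    return (detail.get("requestParameters") or {}).get("resourceId")


def missing_tags(cluster_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are not present on the cluster."""
    present = {tag["Key"] for tag in cluster_tags}
    return [key for key in required if key not in present]


def check_cluster(event, emr_client, sns_client, topic_arn, function_arn):
    """Publish a violation notice if the cluster lacks any required tag.

    Returns the list of missing tag keys (empty when compliant).
    """
    cluster_id = extract_cluster_id(event)
    if cluster_id is None:
        return []

    cluster = emr_client.describe_cluster(ClusterId=cluster_id)["Cluster"]
    missing = missing_tags(cluster.get("Tags", []))
    if missing:
        message = "\n".join([
            "Cluster name: {}".format(cluster.get("Name", cluster_id)),
            "Violation: missing mandatory tags: {}".format(", ".join(missing)),
            "Region: {}".format(event.get("region", "unknown")),
            "Account: {}".format(event.get("account", "unknown")),
            "Source Lambda ARN: {}".format(function_arn),
        ])
        sns_client.publish(
            TopicArn=topic_arn,
            Subject="EMR cluster tag violation",
            Message=message,
        )
    return missing
```

In a Lambda handler, the clients would typically be created with `boto3.client("emr")` and `boto3.client("sns")`, and `function_arn` could be taken from `context.invoked_function_arn`.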
Prerequisites and limitations
Prerequisites
An active AWS account
An Amazon Simple Storage Service (Amazon S3) bucket to which you can upload the provided Lambda code. If you don't have one, you can create an S3 bucket for this purpose, as described in the Epics section.
An active email address where you would like to receive violation notifications.
A list of mandatory tags you want to check for.
Limitations
This security control is regional. You must deploy it in each AWS Region that you want to monitor.
Product versions
Amazon EMR release 4.8.0 and later.
Architecture
Workflow architecture
Automation and scale
If you are using AWS Organizations, you can use AWS CloudFormation StackSets to deploy this template in multiple accounts that you want to monitor.
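As a rough sketch of a StackSets deployment, the following function uses the CloudFormation `create_stack_set` and `create_stack_instances` operations with service-managed permissions. The function name `deploy_stack_set`, the default stack set name, and the capabilities list are assumptions made for illustration. The template's own parameters are not shown here and would need to be supplied through the `Parameters` argument of `create_stack_set`. Because the control is regional, every AWS Region to monitor is listed explicitly.

```python
def deploy_stack_set(cfn_client, template_url, org_unit_ids, regions,
                     stack_set_name="emr-tag-validation"):
    """Create a service-managed stack set and deploy it to the given OUs.

    cfn_client is a CloudFormation client, for example
    boto3.client("cloudformation") in the Organizations management account.
    The stack set name is a hypothetical default.
    """
    cfn_client.create_stack_set(
        StackSetName=stack_set_name,
        TemplateURL=template_url,
        # Assumption: the template creates IAM resources for the Lambda
        # function, so IAM capabilities must be acknowledged.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
        PermissionModel="SERVICE_MANAGED",
        AutoDeployment={
            "Enabled": True,
            "RetainStacksOnAccountRemoval": False,
        },
    )
    # One stack instance is created per account and Region combination.
    response = cfn_client.create_stack_instances(
        StackSetName=stack_set_name,
        DeploymentTargets={"OrganizationalUnitIds": list(org_unit_ids)},
        Regions=list(regions),
    )
    return response.get("OperationId")
```

The returned operation ID can be used to track the progress of the deployment.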
Tools
AWS services
AWS CloudFormation – AWS CloudFormation helps you model and set up your AWS resources, provision them quickly and consistently, and manage them throughout their lifecycle. You can use a template to describe your resources and their dependencies, and launch and configure them together as a stack, instead of managing resources individually. You can manage and provision stacks across multiple AWS accounts and AWS Regions.
Amazon CloudWatch Events – Amazon CloudWatch Events delivers a near real-time stream of system events that describe changes in AWS resources.
Amazon EMR – Amazon EMR is a web service that simplifies running big data frameworks and processing vast amounts of data efficiently.
AWS Lambda – AWS Lambda is a compute service that supports running code without provisioning or managing servers. Lambda runs your code only when needed and scales automatically, from a few requests per day to thousands per second.
Amazon S3 – Amazon Simple Storage Service (Amazon S3) is an object storage service. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web.
Amazon SNS – Amazon Simple Notification Service (Amazon SNS) coordinates and manages the delivery or sending of messages between publishers and clients, including web servers and email addresses. Subscribers receive all messages published to the topics to which they subscribe, and all subscribers to a topic receive the same messages.
Code
This pattern includes the following attachments:
EMRTagValidation.zip – The Lambda code for the security control.
EMRTagValidation.yml – The CloudFormation template that sets up the event and Lambda function.
Epics
Task | Description | Skills required |
---|---|---|
Define the S3 bucket. | On the Amazon S3 console | Cloud architect |
Upload the Lambda code. | Upload the Lambda code .zip file provided in the Attachments section to the S3 bucket. | Cloud architect |
Task | Description | Skills required |
---|---|---|
Launch the AWS CloudFormation template. | Open the AWS CloudFormation console | Cloud architect |
Complete the parameters in the template. | When you launch the template, you'll be prompted for the following information: | Cloud architect |
Task | Description | Skills required |
---|---|---|
Confirm the subscription. | When the CloudFormation template deploys successfully, it sends a subscription email to the email address you provided. You must confirm this email subscription to start receiving violation notifications. | Cloud architect |
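To verify that the subscription was confirmed, one option is to list the topic's subscriptions and look for entries that are still pending. The sketch below assumes you know the ARN of the SNS topic that the stack created. The function name `pending_subscriptions` is hypothetical. It relies on Amazon SNS reporting the literal string `PendingConfirmation` as the subscription ARN until the recipient confirms.

```python
def pending_subscriptions(sns_client, topic_arn):
    """Return the endpoints on the topic that have not confirmed yet.

    sns_client is an SNS client, for example boto3.client("sns").
    """
    pending = []
    kwargs = {"TopicArn": topic_arn}
    while True:
        page = sns_client.list_subscriptions_by_topic(**kwargs)
        for sub in page.get("Subscriptions", []):
            # Unconfirmed subscriptions carry this placeholder instead of an ARN.
            if sub.get("SubscriptionArn") == "PendingConfirmation":
                pending.append(sub.get("Endpoint"))
        token = page.get("NextToken")
        if not token:
            return pending
        # Continue with the next page of results.
        kwargs["NextToken"] = token
```

An empty list means that every subscriber on the topic has confirmed and will receive violation notifications.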
Related resources
Attachments
To access additional content that is associated with this document, unzip the following file: attachment.zip