Ensure encryption for Amazon EMR data at rest is enabled at launch
Created by Priyanka Chaudhary (AWS)
Environment: Production | Technologies: Security, identity, compliance; Analytics | Workload: Open-source
AWS services: Amazon EMR; Amazon SNS; AWS KMS; AWS CloudFormation; AWS Lambda; Amazon S3
Summary
This pattern provides a security control for monitoring the encryption of Amazon EMR clusters on Amazon Web Services (AWS).
Data encryption helps prevent unauthorized users from reading data on a cluster and associated data storage systems. This includes data that may be intercepted as it travels the network, known as data in transit, and data that is saved to persistent media, known as data at rest. Data at rest in Amazon Simple Storage Service (Amazon S3) can be encrypted in two ways:
Server-side encryption with Amazon S3–managed keys (SSE-S3)
Server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS), set up with policies that are suitable for Amazon EMR.
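As an illustration, the two options correspond to the `EncryptionMode` value inside an Amazon EMR security configuration. The following Python sketch builds either JSON document. The helper name `s3_at_rest_config` and the KMS key ARN are placeholders chosen for this example and are not part of the attached code.

```python
import json


def s3_at_rest_config(mode, kms_key_arn=None):
    """Build an EMR security configuration dict that turns on S3 encryption at rest."""
    if mode not in ("SSE-S3", "SSE-KMS"):
        raise ValueError(f"unsupported encryption mode: {mode}")
    s3_config = {"EncryptionMode": mode}
    if mode == "SSE-KMS":
        if not kms_key_arn:
            raise ValueError("SSE-KMS requires a KMS key ARN")
        s3_config["AwsKmsKey"] = kms_key_arn
    return {
        "EncryptionConfiguration": {
            "EnableInTransitEncryption": False,
            "EnableAtRestEncryption": True,
            "AtRestEncryptionConfiguration": {
                "S3EncryptionConfiguration": s3_config,
            },
        }
    }


if __name__ == "__main__":
    # Placeholder ARN for illustration only.
    example_arn = "arn:aws:kms:us-east-1:123456789012:key/example-key-id"
    print(json.dumps(s3_at_rest_config("SSE-S3"), indent=2))
    print(json.dumps(s3_at_rest_config("SSE-KMS", example_arn), indent=2))
```

The resulting JSON is the kind of document that a security configuration stores, and it is the `EnableAtRestEncryption` flag in it that this control inspects.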
This security control monitors API calls and initiates an Amazon CloudWatch Events event when RunJobFlow is called. The event invokes an AWS Lambda function, which runs a Python script. The function retrieves the EMR cluster ID from the event JSON input and determines whether there is a security violation by performing the following checks:
Check if an EMR cluster is associated with an Amazon EMR specific security configuration.
If an Amazon EMR specific security configuration is associated with the EMR cluster, check if Encryption-at-Rest is turned on.
If Encryption-at-Rest is not turned on, send an Amazon Simple Notification Service (Amazon SNS) notification that includes the EMR cluster name, violation details, AWS Region, AWS account, and the Lambda Amazon Resource Name (ARN) that this notification is sourced from.
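The checks above can be sketched roughly as follows. This is not the code from the attachment. It is a minimal illustration that assumes the event carries the cluster ID at `detail.responseElements.jobFlowId` and that a cluster with no security configuration also counts as a violation. The function names (`get_cluster_id`, `find_violation`, `build_message`, `handle`) are hypothetical. The clients are passed in as parameters so that the logic can be exercised without AWS access.

```python
import json


def get_cluster_id(event):
    """Pull the cluster ID out of a RunJobFlow event (assumed CloudTrail shape)."""
    return event["detail"]["responseElements"]["jobFlowId"]


def find_violation(emr_client, cluster_id):
    """Return (cluster_name, violation_text); violation_text is None when compliant."""
    cluster = emr_client.describe_cluster(ClusterId=cluster_id)["Cluster"]
    name = cluster.get("Name", cluster_id)

    # Check 1: is a security configuration attached to the cluster?
    config_name = cluster.get("SecurityConfiguration")
    if not config_name:
        # Assumption: no security configuration at all is reported as a violation.
        return name, "No security configuration is attached to the cluster."

    # Check 2: does that configuration turn on encryption at rest?
    response = emr_client.describe_security_configuration(Name=config_name)
    config = json.loads(response["SecurityConfiguration"])  # returned as a JSON string
    encryption = config.get("EncryptionConfiguration", {})
    if not encryption.get("EnableAtRestEncryption", False):
        return name, f"Security configuration '{config_name}' does not enable encryption at rest."
    return name, None


def build_message(cluster_name, violation, function_arn):
    """Format the notification; Region and account are read from the Lambda ARN."""
    # ARN layout: arn:aws:lambda:<region>:<account-id>:function:<name>
    parts = function_arn.split(":")
    return "\n".join([
        f"EMR cluster: {cluster_name}",
        f"Violation: {violation}",
        f"Region: {parts[3]}",
        f"Account: {parts[4]}",
        f"Source Lambda ARN: {function_arn}",
    ])


def handle(event, context, emr_client, sns_client, topic_arn):
    """Run both checks and publish to SNS only when a violation is found."""
    cluster_name, violation = find_violation(emr_client, get_cluster_id(event))
    if violation is None:
        return None
    message = build_message(cluster_name, violation, context.invoked_function_arn)
    sns_client.publish(
        TopicArn=topic_arn,
        Subject="EMR encryption-at-rest violation",
        Message=message,
    )
    return message
```

In a deployed function, a thin `lambda_handler(event, context)` wrapper would create the clients with `boto3.client("emr")` and `boto3.client("sns")` and pass them to `handle` together with the topic ARN.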
Prerequisites and limitations
Prerequisites
An active AWS account
An S3 bucket for the Lambda code .zip file
An email address where you want to receive the violation notification
Amazon EMR logging turned on so that all the API logs can be retrieved
Limitations
This detective control is regional and must be deployed in the AWS Regions you intend to monitor.
Product versions
Amazon EMR release 4.8.0 and later
Architecture
Target technology stack
Amazon EMR
Amazon CloudWatch Events event
Lambda function
Amazon SNS
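To make the CloudWatch Events piece of this stack more concrete, the sketch below shows an event pattern that would select RunJobFlow calls recorded by AWS CloudTrail. The pattern actually used by the attached template may differ. The `matches` helper is a simplified, hypothetical stand-in for the service's matching logic (exact values only) and is included so that the pattern can be tried locally.

```python
import json

# Pattern that a CloudWatch Events rule could use to select RunJobFlow API calls.
RUN_JOB_FLOW_PATTERN = {
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["elasticmapreduce.amazonaws.com"],
        "eventName": ["RunJobFlow"],
    },
}


def matches(pattern, event):
    """Simplified matcher: every pattern key must match one of the listed values."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    print(json.dumps(RUN_JOB_FLOW_PATTERN, indent=2))
```

Fields that the pattern does not mention, such as the Region or the response elements, are ignored during matching, so the same rule applies to every cluster launched in the Region where it is deployed.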
Target architecture
Automation and scale
If you are using AWS Organizations, you can use AWS CloudFormation StackSets to deploy this template in multiple accounts that you want to monitor.
Tools
Tools
AWS CloudFormation is a service that helps you model and set up AWS resources using infrastructure as code.
Amazon CloudWatch Events delivers a near real-time stream of system events that describe changes in AWS resources.
Amazon EMR is a managed cluster platform that simplifies running big data frameworks.
AWS Lambda supports running code without provisioning or managing servers.
Amazon S3 is a highly scalable object storage service that can be used for a wide range of storage solutions, including websites, mobile applications, backups, and data lakes.
Amazon SNS coordinates and manages the delivery or sending of messages between publishers and clients, including web servers and email addresses. Subscribers receive all messages published to the topics to which they subscribe, and all subscribers to a topic receive the same messages.
Code
The EMREncryptionAtRest.zip and EMREncryptionAtRest.yml files for this project are available as an attachment.
Epics
Task | Description | Skills required |
---|---|---|
Define the S3 bucket. | On the Amazon S3 console, choose or create an S3 bucket with a unique name that does not contain leading slashes. An S3 bucket name is globally unique, and the namespace is shared by all AWS accounts. Your S3 bucket needs to be in the same Region as the Amazon EMR cluster that is being evaluated. | Cloud Architect |
Task | Description | Skills required |
---|---|---|
Upload the Lambda code to the S3 bucket. | Upload the Lambda code .zip file that's provided in the "Attachments" section to the defined S3 bucket. | Cloud Architect |
Task | Description | Skills required |
---|---|---|
Deploy the AWS CloudFormation template. | On the AWS CloudFormation console, in the same Region as your S3 bucket, deploy the AWS CloudFormation template that's provided as an attachment to this pattern. In the next epic, provide the values for the parameters. For more information about deploying AWS CloudFormation templates, see the “Related resources” section. | Cloud Architect |
Task | Description | Skills required |
---|---|---|
Name the S3 bucket. | Enter the name of the S3 bucket that you created in the first epic. | Cloud Architect |
Provide the Amazon S3 key. | Provide the location of the Lambda code .zip file in your S3 bucket, without leading slashes (for example, <directory>/<file-name>.zip). | Cloud Architect |
Provide an email address. | Provide an active email address to receive Amazon SNS notifications. | Cloud Architect |
Define the logging level. | Define the logging level and frequency for your Lambda function. “Info” designates detailed informational messages on the application’s progress. “Error” designates error events that could still allow the application to continue running. “Warning” designates potentially harmful situations. | Cloud Architect |
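As a rough illustration of how a logging-level parameter can reach the function, the sketch below maps the three values to Python `logging` levels. It assumes the value is handed to the Lambda function through an environment variable named `LOGGING_LEVEL`. That variable name and the `configure_logger` helper are hypothetical, and the attached code may wire this differently.

```python
import logging
import os

# Map the parameter values described above to Python logging levels.
LEVELS = {
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
}


def configure_logger(level_name):
    """Return a logger set to the requested level, falling back to INFO."""
    logger = logging.getLogger("emr_encryption_check")
    logger.setLevel(LEVELS.get(level_name.strip().lower(), logging.INFO))
    return logger


# Assumption: the template exposes the chosen level as an environment variable.
logger = configure_logger(os.environ.get("LOGGING_LEVEL", "Info"))
```

With "Error" selected, calls such as `logger.info(...)` and `logger.warning(...)` are suppressed and only error-level messages are written to the function's logs.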
Task | Description | Skills required |
---|---|---|
Confirm the subscription. | When the template successfully deploys, it sends a subscription email message to the email address provided. You must confirm this email subscription to receive violation notifications. | Cloud Architect |
Related resources
Attachments
To access additional content that is associated with this document, unzip the following file: attachment.zip