AWS services in this solution - Scalable Analytics Using Apache Druid on AWS

AWS services in this solution

| AWS service | Description |
| --- | --- |
| Amazon S3 | Core. The solution provisions three S3 buckets: a deep storage bucket to store the segments; an installation bucket to store the installation files needed by the solution; and an access logging bucket to store the access logs from the ALB and the S3 buckets. |
| Amazon Elastic Compute Cloud (Amazon EC2) | Core. The solution provisions EC2 instances to run Apache Druid and Apache ZooKeeper. |
| Amazon Aurora | Core. The solution provisions an Aurora PostgreSQL cluster to serve as the metadata storage. |
| Elastic Load Balancing | Core. An Application Load Balancer (ALB) distributes incoming traffic among the Druid query nodes. |
| AWS Secrets Manager | Core. Secrets store the master user credentials for the Aurora DB cluster and the credentials of the users “admin” and “druid_system”. |
| AWS Key Management Service (AWS KMS) | Core. KMS keys encrypt the data in the S3 buckets, the Aurora cluster, the SNS topic, and EFS. |
| Amazon Elastic Block Store (Amazon EBS) | Core. EBS volumes serve as the segment cache for historical nodes. |
| Amazon CloudWatch | Supporting. The solution uses CloudWatch for logs, metrics, alarms, and a dashboard. |
| Amazon Simple Notification Service (Amazon SNS) | Supporting. Topics receive CloudWatch alarm notifications and Auto Scaling group scaling event notifications. |
| AWS WAF | Supporting. Protects the Druid web console and API endpoints from common application-layer exploits that can affect availability or consume excessive resources. |
| AWS Systems Manager | Supporting. Provides application-level resource monitoring and visualization of resource operations and cost data. |
| Amazon EventBridge | Supporting. The solution creates an EventBridge rule to receive events from the Auto Scaling group. |
| Amazon Elastic Kubernetes Service (Amazon EKS) | Optional. For an EKS deployment, the solution provisions an EKS cluster to run the Apache Druid workload. |
| Amazon Elastic File System (Amazon EFS) | Optional. For an EKS Fargate deployment, the solution creates an EFS file system to provide storage to the Fargate workloads. |
| Amazon Route 53 | Optional. The solution can integrate with Route 53 to manage the domain used to access the Druid cluster. |
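
To make the role of the core storage services more concrete, the following minimal sketch renders the kind of Apache Druid common runtime properties that point a cluster at S3 deep storage, a PostgreSQL metadata store, and ZooKeeper. The property keys are standard Druid settings. The function name, bucket name, endpoint, host names, base key, database name, and port 5432 are hypothetical placeholders, and the solution's actual generated configuration may differ. Credentials are deliberately omitted, since the solution keeps them in Secrets Manager.

```python
def druid_common_properties(deep_storage_bucket, aurora_endpoint, zookeeper_hosts,
                            base_key="druid/segments", database="druid"):
    """Render an illustrative common.runtime.properties fragment as text.

    All argument values are assumptions supplied by the caller; nothing here
    is read from the deployed solution.
    """
    props = {
        # Extensions needed for S3 deep storage and PostgreSQL metadata storage.
        "druid.extensions.loadList":
            '["druid-s3-extensions", "postgresql-metadata-storage"]',
        # Deep storage: segments are written to the S3 bucket.
        "druid.storage.type": "s3",
        "druid.storage.bucket": deep_storage_bucket,
        "druid.storage.baseKey": base_key,
        # Metadata storage: Aurora PostgreSQL (5432 is the PostgreSQL default port).
        "druid.metadata.storage.type": "postgresql",
        "druid.metadata.storage.connector.connectURI":
            f"jdbc:postgresql://{aurora_endpoint}:5432/{database}",
        # Coordination: ZooKeeper ensemble running on EC2.
        "druid.zk.service.host": ",".join(zookeeper_hosts),
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Hypothetical resource names, for illustration only.
    print(druid_common_properties(
        deep_storage_bucket="example-druid-deep-storage",
        aurora_endpoint="example-cluster.cluster-abc123.us-east-1.rds.amazonaws.com",
        zookeeper_hosts=["zk-1.example.internal", "zk-2.example.internal"],
    ))
```

Building the fragment from a dictionary keeps each service's settings grouped together, which makes it easy to see which AWS resource backs which part of the Druid configuration.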