Deploy a Cassandra cluster on Amazon EC2 with private static IPs to avoid rebalancing
Created by Dipin Jain (AWS)
Summary
The private IP of an Amazon Elastic Compute Cloud (Amazon EC2) instance is retained throughout its lifecycle. However, the private IP might change when the instance is replaced after a planned or unplanned outage; for example, during an Amazon Machine Image (AMI) upgrade. In some scenarios, retaining a private static IP can improve the performance and recovery time of workloads. For example, using a static IP for an Apache Cassandra seed node prevents the cluster from incurring a rebalancing overhead.
This pattern describes how to attach a secondary elastic network interface to EC2 instances to keep the IP static during rehosting. The pattern focuses on Cassandra clusters, but you can use this implementation for any architecture that benefits from private static IPs.
Prerequisites and limitations
Prerequisites
An active Amazon Web Services (AWS) account
Product versions
DataStax version 5.11.1
Operating system: Ubuntu 16.04.6 LTS
Architecture
Source architecture
The source could be a Cassandra cluster on an on-premises virtual machine (VM) or on EC2 instances in the AWS Cloud. The following diagram illustrates the second scenario. This example includes four cluster nodes: three seed nodes and one management node. In the source architecture, each node has a single network interface attached.
Target architecture
The destination cluster is hosted on EC2 instances with a secondary elastic network interface attached to each node, as illustrated in the following diagram.
Automation and scale
You can also automate attaching a second elastic network interface to instances in an EC2 Auto Scaling group, as described in an AWS Knowledge Center video.
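As a rough illustration of what such automation might call, the following Python sketch creates a network interface with a fixed private IP and attaches it to an instance as the secondary interface. It is not taken from the video. It assumes a boto3 EC2 client is passed in as `ec2`, and the function names `create_static_eni` and `attach_as_secondary` are hypothetical helpers introduced here. The network interface must be in the same Availability Zone as the instance.

```python
"""Sketch: create a secondary network interface with a fixed private IP and attach it."""


def create_static_eni(ec2, subnet_id, private_ip, security_group_ids):
    """Create a network interface that holds `private_ip` for as long as it exists.

    `ec2` is assumed to be a boto3 EC2 client, for example boto3.client("ec2").
    """
    response = ec2.create_network_interface(
        SubnetId=subnet_id,
        PrivateIpAddress=private_ip,
        Groups=list(security_group_ids),
        Description=f"Cassandra static IP {private_ip}",
    )
    return response["NetworkInterface"]["NetworkInterfaceId"]


def attach_as_secondary(ec2, eni_id, instance_id, device_index=1):
    """Attach the interface at device index 1 (index 0 is the primary interface)."""
    response = ec2.attach_network_interface(
        NetworkInterfaceId=eni_id,
        InstanceId=instance_id,
        DeviceIndex=device_index,
    )
    return response["AttachmentId"]
```

With boto3 installed and credentials configured, a call might look like `attach_as_secondary(ec2, create_static_eni(ec2, subnet_id, "10.0.1.10", [security_group_id]), instance_id)`, where the IDs and the IP address are placeholders for your own values.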
Epics
Task | Description | Skills required |
---|---|---|
Launch EC2 nodes to host a Cassandra cluster. | On the Amazon EC2 console, launch four EC2 instances: three seed nodes for the Cassandra cluster and one management node. | Cloud engineer |
Confirm node communications. | Make sure that the four nodes can communicate with one another over the database and cluster management ports. | Network engineer |
Install DSE OpsCenter on the management node. | Install DSE OpsCenter 6.1 from the Debian package on the management node. For instructions, see the DataStax documentation. | DBA |
Create a secondary network interface. | Cassandra generates a universally unique identifier (UUID) for each node based on the IP address of the EC2 instance for that node. This UUID is used for distributing virtual nodes (vnodes) on the ring. When Cassandra is deployed on EC2 instances, IP addresses are assigned automatically to the instances as they are created. In the event of a planned or unplanned outage, the replacement EC2 instance gets a new IP address, the data distribution changes, and the entire ring has to be rebalanced, which is undesirable. To preserve the assigned IP address, use a secondary elastic network interface with a fixed IP address. For more information about creating a network interface, see the Amazon EC2 documentation. | Cloud engineer |
Attach the secondary network interface to cluster nodes. | For more information about attaching a network interface, see the Amazon EC2 documentation. | Cloud engineer |
Add routes in Amazon EC2 to address asymmetric routing. | When you attach the second network interface, traffic will very likely be routed asymmetrically: replies to requests that arrive on the secondary interface can leave through the primary interface. To avoid this, add routes for the new network interfaces. For an in-depth explanation and remediation of asymmetric routing, see the AWS Knowledge Center video. | Network engineer |
Update DNS entries to point to the secondary network interface IP. | Point the fully qualified domain name (FQDN) of the node to the IP of the secondary network interface. | Network engineer |
Install and configure the Cassandra cluster by using DSE OpsCenter. | When the cluster nodes are ready with the secondary network interfaces, you can install and configure the Cassandra cluster. | DBA |
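
The routes for the secondary network interface can be sketched as a small Python helper that prints the `ip` commands to run on a node. This is a minimal sketch under stated assumptions, not the procedure from the video: the device name `eth1`, the routing table number `1000`, and the function name `policy_routing_commands` are illustrative choices, and the gateway is derived from the fact that the VPC router uses the first address after the subnet's network address. The commands do not persist across reboots unless you add them to the node's network configuration.

```python
"""Sketch: build policy-routing commands for a secondary network interface."""
import ipaddress


def policy_routing_commands(eni_ip, subnet_cidr, device="eth1", table=1000):
    """Return `ip` commands that send replies from `eni_ip` back out through `device`.

    `device` and `table` are assumptions; check the interface name on the node
    (for example with `ip addr`) before running the commands.
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    address = ipaddress.ip_address(eni_ip)
    if address not in subnet:
        raise ValueError(f"{eni_ip} is not in subnet {subnet_cidr}")

    # The VPC router sits at the subnet's network address plus one.
    gateway = subnet.network_address + 1

    return [
        # Default route for the dedicated table goes out through the secondary interface.
        f"ip route add default via {gateway} dev {device} table {table}",
        # Host route for the secondary IP itself in the same table.
        f"ip route add {address} dev {device} table {table}",
        # Traffic sourced from the secondary IP uses the dedicated table.
        f"ip rule add from {address} lookup {table}",
    ]
```

For example, `print("\n".join(policy_routing_commands("10.0.1.10", "10.0.1.0/24")))` prints three commands that can be reviewed and then run with root privileges on the node. The IP address and CIDR range here are placeholders.
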
Task | Description | Skills required |
---|---|---|
Create an AMI for the cluster seed node. | Make a backup of the nodes so you can restore them with database binaries in case of node failure. For instructions, see Create an AMI in the Amazon EC2 documentation. | Backup administrator |
Recover from node failure. | Replace the failed node with a new EC2 instance launched from the AMI, and attach the secondary network interface of the failed node. | Backup administrator |
Verify that the Cassandra cluster is healthy. | When the replacement node is up, verify cluster health in DSE OpsCenter. | DBA |
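
The backup and recovery tasks above can be sketched in Python as follows. This is a hedged illustration rather than a definitive runbook: it assumes a boto3 EC2 client is passed in as `ec2`, that the failed node's secondary network interface still exists, and that the function names `backup_seed_node` and `replace_failed_node` are hypothetical helpers introduced here. Because the replacement instance receives the same secondary network interface, it comes back with the same private IP.

```python
"""Sketch: back up a seed node as an AMI and replace a failed node, reusing its interface."""


def backup_seed_node(ec2, instance_id, image_name):
    """Create an AMI from a seed node. NoReboot=True skips the reboot before imaging."""
    response = ec2.create_image(InstanceId=instance_id, Name=image_name, NoReboot=True)
    return response["ImageId"]


def replace_failed_node(ec2, ami_id, instance_type, subnet_id, security_group_ids,
                        eni_id, old_attachment_id=None):
    """Launch a replacement from the AMI and attach the failed node's interface."""
    if old_attachment_id:
        # The failed instance may still hold the interface; force-detach it first.
        ec2.detach_network_interface(AttachmentId=old_attachment_id, Force=True)
        ec2.get_waiter("network_interface_available").wait(NetworkInterfaceIds=[eni_id])

    response = ec2.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        SubnetId=subnet_id,
        SecurityGroupIds=list(security_group_ids),
        MinCount=1,
        MaxCount=1,
    )
    instance_id = response["Instances"][0]["InstanceId"]

    # Wait until the instance is running before attaching the interface.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    # Device index 1 makes it the secondary interface, carrying the static IP.
    ec2.attach_network_interface(
        NetworkInterfaceId=eni_id,
        InstanceId=instance_id,
        DeviceIndex=1,
    )
    return instance_id
```

After the replacement instance is up, the routes for the secondary network interface still need to be in place on the node before Cassandra rejoins the ring.
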
Related resources
Installing DSE OpsCenter 6.1 from the Debian package (DataStax documentation)
How to make a secondary network interface work in an Ubuntu EC2 instance (AWS Knowledge Center video)
Best Practices for Running Apache Cassandra on Amazon EC2 (AWS blog post)