Cluster configuration - SAP HANA on AWS

Cluster configuration

System logging

SUSE recommends using the rsyslogd daemon for logging in the SUSE cluster. As the root user, install the rsyslog package on all cluster nodes, then enable and start logd, a subsystem that logs additional information coming from the STONITH agent:

prihana:~ # zypper install rsyslog
prihana:~ # systemctl enable logd
prihana:~ # systemctl start logd

Corosync configuration

The cluster service (Pacemaker) should be in a stopped state when performing cluster configuration. Check the status and stop the Pacemaker service if it is running.

  • This is the command to check the Pacemaker status:

    prihana:~ # systemctl status pacemaker
  • This is the command to stop Pacemaker:

    prihana:~ # systemctl stop pacemaker
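The check and the stop can be combined into one step. The following is a minimal sketch, not part of the original procedure; the function name stop_pacemaker_if_running is hypothetical, and it assumes it is run as root on each cluster node.

```shell
# Hypothetical helper: stop Pacemaker only when systemd reports it as active.
stop_pacemaker_if_running() {
    # "systemctl is-active --quiet" prints nothing and returns 0 only for an active unit.
    if systemctl is-active --quiet pacemaker; then
        systemctl stop pacemaker
    else
        echo "pacemaker is not running; nothing to stop"
    fi
}
```

Call stop_pacemaker_if_running on each node before you edit the Corosync configuration.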

Create encryption keys

Run the following command to create a secret key which is used to encrypt all the cluster communication:

prihana:~ # corosync-keygen

A new key file called “authkey” is created at location /etc/corosync/. Copy this file to the same location on the second cluster node with the same permissions and ownership.
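One way to copy the key is with scp. This is a sketch under assumptions that the original text does not state: root can reach the second node over SSH, and the second node is named sechana as in the other examples. The function name copy_authkey is hypothetical.

```shell
# Hypothetical helper: copy the Corosync key to the same path on a peer node.
# Assumes root SSH access from this node to the peer.
copy_authkey() {
    peer="$1"
    # -p preserves the mode bits and timestamps of the source file.
    scp -p /etc/corosync/authkey "root@${peer}:/etc/corosync/authkey"
}
```

For example, run copy_authkey sechana on the first node, then compare the output of ls -l /etc/corosync/authkey on both nodes to confirm that the permissions and ownership match.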

Create secondary IP addresses for a redundant cluster ring

For SUSE clusters, we recommend defining a redundant communication channel (a second ring) in corosync which the cluster nodes can use to communicate in case of disruptions.

To create a redundant communication channel, you must add a secondary IP address on both nodes. These IP addresses are used only in the cluster configuration. They provide the same fault tolerance as a secondary Elastic Network Interface (ENI). For more information, see Assign a secondary private IPv4 address.
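As an illustration, a secondary private IPv4 address can be assigned with the AWS CLI. This sketch is not taken from the original procedure: the function name assign_secondary_ip is hypothetical, the network interface ID in the usage note is a placeholder, and the operating system may also need to be configured to use the new address, which the sketch does not cover.

```shell
# Hypothetical helper: assign one secondary private IPv4 address to a network interface.
# $1 = network interface ID of the node's primary ENI, $2 = secondary IP address.
assign_secondary_ip() {
    eni_id="$1"
    ip_addr="$2"
    aws ec2 assign-private-ip-addresses \
        --network-interface-id "$eni_id" \
        --private-ip-addresses "$ip_addr"
}
```

For example, assign_secondary_ip eni-0123456789abcdef0 11.0.1.75 would request the ring1 address that the sample configuration uses for node 1. Repeat the call with the interface ID and address of the second node.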

Review instance settings that conflict with cluster actions

To ensure that restarts are predictable, we recommend that you disable simplified automatic recovery and do not configure Amazon CloudWatch action-based recovery for instances that are part of a Pacemaker cluster. Use the following command to disable simplified automatic recovery.

aws ec2 modify-instance-maintenance-options --instance-id i-0abcdef1234567890 --auto-recovery disabled

You must ensure that stop protection is disabled for Amazon EC2 instances that are part of a Pacemaker cluster. Use the following command to disable stop protection.

aws ec2 modify-instance-attribute --instance-id i-1234567890abcdef0 --no-disable-api-stop
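To confirm that both settings took effect, you can query the instance afterward. This is a sketch that is not part of the original procedure; the function name show_recovery_settings is hypothetical, and the --query paths assume the field names that the AWS CLI returns for these two calls.

```shell
# Hypothetical helper: print the auto recovery mode and the stop protection flag
# for one instance. $1 = instance ID.
show_recovery_settings() {
    instance_id="$1"
    # Auto recovery setting from the instance's maintenance options.
    aws ec2 describe-instances --instance-ids "$instance_id" \
        --query 'Reservations[].Instances[].MaintenanceOptions.AutoRecovery' \
        --output text
    # Stop protection flag (the disableApiStop attribute).
    aws ec2 describe-instance-attribute --instance-id "$instance_id" \
        --attribute disableApiStop \
        --query 'DisableApiStop.Value' --output text
}
```

For a cluster node, the first value should report auto recovery as disabled and the second should show that stop protection is off.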

Create the Corosync configuration file

All cluster nodes are required to have a local configuration file “/etc/corosync/corosync.conf”, as shown in the following example.

prihana:/etc/corosync # cat corosync.conf
# Please read the corosync.conf.5 manual page
totem {
  version: 2
  token: 30000
  consensus: 36000
  token_retransmits_before_loss_const: 6
  crypto_cipher: none
  crypto_hash: none
  clear_node_high_bit: yes
  rrp_mode: passive
  interface {
    ringnumber: 0
    bindnetaddr: 11.0.1.132
    mcastport: 5405
    ttl: 1
  }
  transport: udpu
}
logging {
  fileline: off
  to_logfile: yes
  to_syslog: yes
  logfile: /var/log/cluster/corosync.log
  debug: off
  timestamp: on
  logger_subsys {
    subsys: QUORUM
    debug: off
  }
}
nodelist {
  node {
    ring0_addr: 11.0.1.132
    ring1_addr: 11.0.1.75
    nodeid: 1
  }
  node {
    ring0_addr: 11.0.2.139
    ring1_addr: 11.0.2.35
    nodeid: 2
  }
}
quorum {
  # Enable and configure quorum subsystem (default: off)
  # see also corosync.conf.5 and votequorum.5
  provider: corosync_votequorum
  expected_votes: 2
  two_node: 1
}

Replace the values for the following variables with those for your environment:

  • bindnetaddr — IP address of the node where the file is being configured.

  • ring0_addr — Primary IP address of cluster node 1.

  • ring1_addr — Secondary IP address of cluster node 1.

  • ring0_addr — Primary IP address of cluster node 2.

  • ring1_addr — Secondary IP address of cluster node 2.

Also update the values of crypto_cipher and crypto_hash according to your encryption requirements.
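After editing the file, it can help to list the ring addresses it contains and compare them with the IP addresses assigned to each node. The following is a minimal sketch; the function name list_ring_addrs is hypothetical, and it only extracts the ringN_addr entries without validating the rest of the file.

```shell
# Hypothetical helper: print every ringN_addr entry in a corosync.conf file.
# $1 = path to the file (defaults to /etc/corosync/corosync.conf).
list_ring_addrs() {
    conf="${1:-/etc/corosync/corosync.conf}"
    # Match keys such as "ring0_addr:" and print the key (without the colon) and its value.
    awk '$1 ~ /^ring[0-9]+_addr:$/ { sub(/:$/, "", $1); print $1, $2 }' "$conf"
}
```

Running list_ring_addrs on both nodes and comparing the output is a quick way to confirm that the two local copies of the file agree on the node list.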

Update the hacluster password

Change the password of the hacluster user on both nodes, as shown in the following example:

prihana:~ # passwd hacluster
sechana:~ # passwd hacluster

Start the cluster

Start the cluster on both the primary and secondary nodes and check the status.

  • This is the command to check the Pacemaker status:

    prihana:~ # systemctl status pacemaker
  • This is the command to start Pacemaker:

    prihana:~ # systemctl start pacemaker

After the cluster service (Pacemaker) is started, check the cluster status with the crm_mon command as shown in the following example. You will see both nodes online and a full list of resources.

prihana:~ # crm_mon -r
Stack: corosync
Current DC: prihana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
Last updated: Wed Nov 11 16:20:40 2020
Last change: Wed Nov 11 16:20:21 2020 by root via crm_attribute on sechana
2 nodes configured
0 resources configured
Online: [ prihana sechana ]
Full list of resources:
No resources
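The "Online:" line of this output can also be checked from a script. This is a sketch, assuming the output format shown in the example above; the function name nodes_online is hypothetical. It reads crm_mon output on standard input and succeeds only if every node named as an argument appears on the "Online:" line.

```shell
# Hypothetical helper: succeed only if all given nodes are listed as online.
# Reads crm_mon output on stdin; node names are passed as arguments.
nodes_online() {
    # Extract the names between "Online: [" and "]".
    online=$(sed -n 's/^[ *]*Online: \[ \(.*\) \].*$/\1/p')
    for node in "$@"; do
        case " $online " in
            *" $node "*) ;;
            *) echo "node $node is not online" >&2; return 1 ;;
        esac
    done
}
```

For example, crm_mon -1 | nodes_online prihana sechana runs crm_mon once and returns a non-zero status if either node is missing from the list.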

You can find the ring status and the associated IP address of the cluster with the corosync-cfgtool command as shown in the following example:

prihana:~ # corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 11.0.1.132
        status  = ring 0 active with no faults
RING ID 1
        id      = 11.0.1.75
        status  = ring 1 active with no faults
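The ring status can be checked in a script in the same way. The following sketch assumes the output format shown in the example above, with one "status =" line per ring; other Corosync versions may print a different format. The function name rings_healthy is hypothetical.

```shell
# Hypothetical helper: succeed only if at least one ring is reported and
# every ring status line says "active with no faults".
# Reads "corosync-cfgtool -s" output on stdin.
rings_healthy() {
    awk '
        /status[ \t]*=/ {
            rings++
            if ($0 !~ /active with no faults/) faulty++
        }
        # Exit status 1 when no ring was found or any ring is faulty, else 0.
        END { exit (rings == 0 || faulty > 0) }
    '
}
```

For example, corosync-cfgtool -s | rings_healthy returns a non-zero status when either ring reports a fault.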