Cluster configuration
System logging
SUSE recommends using the rsyslogd daemon for logging in the SUSE cluster. Install the rsyslog package as a root user on all cluster nodes. logd is a subsystem to log additional information coming from the STONITH agent:
prihana:~ # zypper install rsyslog
prihana:~ # systemctl enable logd
prihana:~ # systemctl start logd
Corosync configuration
The cluster service (Pacemaker) should be in a stopped state when performing cluster configuration. Check the status and stop the Pacemaker service if it is running.
- This is the command to check the Pacemaker status:
  prihana:~ # systemctl status pacemaker
- This is the command to stop Pacemaker:
  prihana:~ # systemctl stop pacemaker
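The check-then-stop sequence can be combined into one guarded step. The following is a minimal sketch, assuming systemd manages the service. The function name stop_if_active is a hypothetical helper and is not part of Pacemaker or the SUSE tooling.

```shell
# Hypothetical helper: stop a systemd unit only when it is currently active.
stop_if_active() {
    unit="$1"
    if systemctl is-active --quiet "$unit"; then
        systemctl stop "$unit"
    else
        echo "$unit is not running; nothing to stop"
    fi
}

# Usage (run as root on each node before editing the cluster configuration):
#   stop_if_active pacemaker
```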
Create encryption keys
Run the following command to create a secret key which is used to encrypt all the cluster communication:
prihana:~ # corosync-keygen
A new key file called “authkey” is created at location /etc/corosync/. Copy this file to the same location on the second cluster node with the same permissions and ownership.
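One way to confirm that the copy matches the original is to compare a short fingerprint of the file on both nodes. The sketch below assumes GNU coreutils (sha256sum and stat -c) and root SSH access from prihana to sechana. The helper name key_fingerprint is hypothetical.

```shell
# Hypothetical helper: print the SHA-256 digest, permission bits, and
# owner:group of a file on one line, so two nodes can be compared.
key_fingerprint() {
    file="$1"
    printf '%s %s\n' \
        "$(sha256sum < "$file" | cut -d' ' -f1)" \
        "$(stat -c '%a %U:%G' "$file")"
}

# Usage sketch (assumption: root can SSH from prihana to sechana):
#   scp -p /etc/corosync/authkey root@sechana:/etc/corosync/authkey
#   key_fingerprint /etc/corosync/authkey    # run on both nodes and compare
```

scp -p preserves the file mode and timestamps. Ownership on the target follows the remote login user, which is root in this sketch.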
Create secondary IP addresses for a redundant cluster ring
For SUSE clusters, we recommend defining a redundant communication channel (a second ring) in corosync which the cluster nodes can use to communicate in case of disruptions.
To create a redundant communication channel, you must add a secondary IP address on both the nodes. These IPs are only used in cluster configurations. They provide the same fault tolerance as a secondary Elastic Network Interface (ENI). For more information, see Assign a secondary private IPv4 address.
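If you prefer the AWS CLI to the console, the secondary address can be assigned with the assign-private-ip-addresses command. The following is a sketch only: the wrapper name add_secondary_ip and the network interface ID are placeholders, and the address shown is the ring1_addr value of node 1 from the example configuration in this guide.

```shell
# Hypothetical wrapper around the AWS CLI: assign one specific secondary
# private IPv4 address to a network interface.
add_secondary_ip() {
    eni_id="$1"     # placeholder, for example eni-0123456789abcdef0
    ip_addr="$2"    # for example 11.0.1.75 (ring1_addr of node 1)
    aws ec2 assign-private-ip-addresses \
        --network-interface-id "$eni_id" \
        --private-ip-addresses "$ip_addr"
}

# Usage (repeat for the network interface of the second node):
#   add_secondary_ip eni-0123456789abcdef0 11.0.1.75
```

Whether the operating system picks up the new address automatically depends on the network configuration of the image, so verify it on the node after the assignment.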
Review instance settings that conflict with cluster actions
To ensure that restarts are predictable, we recommend disabling simplified automatic recovery and not configuring Amazon CloudWatch action based recovery for instances that are part of a pacemaker cluster. Use the following command to disable simplified automatic recovery.
aws ec2 modify-instance-maintenance-options --instance-id i-0abcdef1234567890 --auto-recovery disabled
You must ensure that stop protection is disabled for Amazon EC2 instances that are part of a pacemaker cluster. Use the following command to disable stop protection.
aws ec2 modify-instance-attribute --instance-id i-1234567890abcdef0 --no-disable-api-stop
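Both settings apply to every instance in the cluster, so it can be convenient to run the two commands together for each instance ID. This is a minimal sketch; the function name prepare_cluster_instance is hypothetical, and the instance IDs are the placeholders used in the commands above.

```shell
# Hypothetical helper: apply both settings from this section to one instance.
# The second command runs only if the first one succeeds.
prepare_cluster_instance() {
    instance_id="$1"
    aws ec2 modify-instance-maintenance-options \
        --instance-id "$instance_id" --auto-recovery disabled &&
    aws ec2 modify-instance-attribute \
        --instance-id "$instance_id" --no-disable-api-stop
}

# Usage:
#   for id in i-0abcdef1234567890 i-1234567890abcdef0; do
#       prepare_cluster_instance "$id"
#   done
```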
Create the Corosync configuration file
All cluster nodes are required to have a local configuration file “/etc/corosync/corosync.conf”, as shown in the following example.
prihana:/etc/corosync # cat corosync.conf
# Please read the corosync.conf.5 manual page
totem {
  version: 2
  token: 30000
  consensus: 36000
  token_retransmits_before_loss_const: 6
  crypto_cipher: none
  crypto_hash: none
  clear_node_high_bit: yes
  rrp_mode: passive
  interface {
    ringnumber: 0
    bindnetaddr: 11.0.1.132
    mcastport: 5405
    ttl: 1
  }
  transport: udpu
}
logging {
  fileline: off
  to_logfile: yes
  to_syslog: yes
  logfile: /var/log/cluster/corosync.log
  debug: off
  timestamp: on
  logger_subsys {
    subsys: QUORUM
    debug: off
  }
}
nodelist {
  node {
    ring0_addr: 11.0.1.132
    ring1_addr: 11.0.1.75
    nodeid: 1
  }
  node {
    ring0_addr: 11.0.2.139
    ring1_addr: 11.0.2.35
    nodeid: 2
  }
}
quorum {
  # Enable and configure quorum subsystem (default: off)
  # see also corosync.conf.5 and votequorum.5
  provider: corosync_votequorum
  expected_votes: 2
  two_node: 1
}
Replace the values for the following variables with those for your environment:
- bindnetaddr: IP address of the node where the file is being configured.
- ring0_addr: Primary IP address of cluster node 1.
- ring1_addr: Secondary IP address of cluster node 1.
- ring0_addr: Primary IP address of cluster node 2.
- ring1_addr: Secondary IP address of cluster node 2.
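To make the mapping between addresses and nodelist entries explicit, the following sketch prints one node entry from a primary address, a secondary address, and a node ID. The helper name render_node is hypothetical, and the addresses in the usage lines are the example values from this section.

```shell
# Hypothetical helper: print one nodelist entry for corosync.conf.
render_node() {
    ring0="$1"      # primary IP address of the node
    ring1="$2"      # secondary IP address of the node
    node_id="$3"    # unique node ID
    printf '  node {\n'
    printf '    ring0_addr: %s\n' "$ring0"
    printf '    ring1_addr: %s\n' "$ring1"
    printf '    nodeid: %s\n' "$node_id"
    printf '  }\n'
}

# Usage with the example addresses:
#   render_node 11.0.1.132 11.0.1.75 1
#   render_node 11.0.2.139 11.0.2.35 2
```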
Also update the values of crypto_cipher and crypto_hash as per your encryption requirements.
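As an illustration of changing the two crypto settings in place, the sketch below rewrites both lines with sed. It assumes GNU sed (for the -i option), and the helper name set_totem_crypto is hypothetical. The values aes256 and sha256 are examples only; check the corosync.conf.5 manual page for the values your installed version supports, and use the same values on every node.

```shell
# Hypothetical helper: set crypto_cipher and crypto_hash in a corosync.conf
# file, keeping the original indentation of each line.
set_totem_crypto() {
    conf="$1"; cipher="$2"; hash="$3"
    sed -i \
        -e "s/^\([[:space:]]*crypto_cipher:\).*/\1 $cipher/" \
        -e "s/^\([[:space:]]*crypto_hash:\).*/\1 $hash/" \
        "$conf"
}

# Usage (example values, run on each node while Pacemaker is stopped):
#   set_totem_crypto /etc/corosync/corosync.conf aes256 sha256
```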
Update the hacluster password
Change the password of the hacluster user on both nodes, as shown in the following example:
prihana:~ # passwd hacluster
sechana:~ # passwd hacluster
Start the cluster
Start the cluster on both the primary and secondary nodes and check the status.
- This is the command to check the Pacemaker status:
  prihana:~ # systemctl status pacemaker
- This is the command to start Pacemaker:
  prihana:~ # systemctl start pacemaker
After the cluster service (Pacemaker) is started, check the cluster status with the crm_mon command as shown in the following example. You will see both nodes online and a full list of resources.
prihana:~ # crm_mon -r
Stack: corosync
Current DC: prihana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
Last updated: Wed Nov 11 16:20:40 2020
Last change: Wed Nov 11 16:20:21 2020 by root via crm_attribute on sechana
2 nodes configured
0 resources configured
Online: [ prihana sechana ]
Full list of resources:
No resources
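For a scripted check, the node names can be extracted from the Online line of the crm_mon output. This is a rough sketch that assumes the output format shown in the example above. The helper name online_nodes is hypothetical.

```shell
# Hypothetical helper: read crm_mon output on stdin and print the node
# names from the "Online: [ ... ]" line, one per line.
online_nodes() {
    sed -n 's/^Online: \[ \(.*\) ]$/\1/p' | tr ' ' '\n'
}

# Usage: crm_mon -1 prints the status once and exits.
#   crm_mon -1 | online_nodes
```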
You can find the ring status and the associated IP address of the cluster with the corosync-cfgtool command as shown in the following example:
prihana:~ # corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
  id = 11.0.1.132
  status = ring 0 active with no faults
RING ID 1
  id = 11.0.1.75
  status = ring 1 active with no faults
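The ring status can also be checked in a script by scanning the status lines. The sketch below assumes the output format shown in the example above, where a healthy ring reports "active with no faults". The helper name rings_healthy is hypothetical.

```shell
# Hypothetical helper: read "corosync-cfgtool -s" output on stdin and succeed
# only if at least one status line exists and every one reports no faults.
rings_healthy() {
    awk '
        /status[[:space:]]*=/ { total++; if ($0 ~ /active with no faults/) ok++ }
        END { exit !(total > 0 && ok == total) }
    '
}

# Usage:
#   corosync-cfgtool -s | rings_healthy && echo "all rings healthy"
```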