Reboot SAP HANA on node 2

Description — Simulate a crash of the primary node (on node 2) running the primary SAP HANA database.

Run node — Primary SAP HANA database node (on node 2)

Run steps:

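Crash the node running the primary SAP HANA database (node 2) by running the following commands as root. The first command confirms that sechana currently holds the master role; the second forces an immediate reboot through the SysRq interface: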


[root@sechana ~]# pcs status
Cluster name: rhelhanaha
Stack: corosync
Current DC: sechana (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
Last updated: Tue Nov 10 17:54:13 2020
Last change: Tue Nov 10 17:53:48 2020 by root via crm_attribute on prihana
2 nodes configured
6 resources configured
Online: [ prihana sechana ]
Full list of resources:
 clusterfence   (stonith:fence_aws):    Started prihana
 Clone Set: SAPHanaTopology_HDB_00-clone [SAPHanaTopology_HDB_00]
     Started: [ prihana sechana ]
 Master/Slave Set: SAPHana_HDB_00-master [SAPHana_HDB_00]
     Masters: [ sechana ]
     Slaves: [ prihana ]
 hana-oip       (ocf::heartbeat:aws-vpc-move-ip):       Started sechana
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@sechana ~]# echo 'b' > /proc/sysrq-trigger

Note

To simulate a system crash, you must first ensure that /proc/sys/kernel/sysrq is set to 1.
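
A minimal sketch of that check, assuming a Bash shell on the node. The `SYSRQ_FILE` variable and the `sysrq_enabled` helper are illustrative names, not part of the original procedure:

```shell
# SYSRQ_FILE is an illustrative override for dry runs;
# the default is the real kernel interface.
SYSRQ_FILE="${SYSRQ_FILE:-/proc/sys/kernel/sysrq}"

# Succeeds only when the file contains exactly 1 (all SysRq functions enabled).
sysrq_enabled() {
    [ "$(cat "$SYSRQ_FILE" 2>/dev/null)" = "1" ]
}

if sysrq_enabled; then
    echo "sysrq is enabled"
else
    echo "sysrq is not set to 1; as root, run: echo 1 > $SYSRQ_FILE"
fi
```

A value written with echo applies only to the running kernel and is lost at the next boot.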

Expected result:

The cluster detects the failed node (node 2), declares it “UNCLEAN”, and sets the secondary node (node 1) to status “partition WITHOUT quorum”.

The cluster fences node 2 and promotes the secondary SAP HANA database (on node 1) to take over as primary.


[root@prihana ~]# pcs status
Cluster name: rhelhanaha
Stack: corosync
Current DC: prihana (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
Last updated: Tue Nov 10 18:22:00 2020
Last change: Tue Nov 10 18:21:49 2020 by root via crm_attribute on prihana

2 nodes configured
6 resources configured

Online: [ prihana ]
OFFLINE: [ sechana ]

Full list of resources:

 clusterfence   (stonith:fence_aws):    Started prihana
 Clone Set: SAPHanaTopology_HDB_00-clone [SAPHanaTopology_HDB_00]
     Started: [ prihana ]
     Stopped: [ sechana ]
 Master/Slave Set: SAPHana_HDB_00-master [SAPHana_HDB_00]
     Masters: [ prihana ]
     Stopped: [ sechana ]
 hana-oip       (ocf::heartbeat:aws-vpc-move-ip):       Started prihana

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@prihana ~]#

The overlay IP address is migrated to the new primary (on node 1).
Because AUTOMATED_REGISTER is set to true, the cluster restarts the failed SAP HANA database and registers it against the new primary when the EC2 instance is back up.
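
One way to confirm this outcome from the surviving node is to read the relevant fields out of pcs status. The sketch below assumes the output layout shown in the transcripts in this section; `current_master` and `resource_node` are illustrative helper names, not pcs subcommands.

```shell
# Print the node listed under "Masters:" in pcs status output read from stdin.
current_master() {
    awk '$1 == "Masters:" { print $3; exit }'
}

# Print the node on which the named resource is reported as Started.
resource_node() {
    awk -v res="$1" '$1 == res && $(NF-1) == "Started" { print $NF; exit }'
}

# Only query the cluster when pcs is available on this host.
if command -v pcs >/dev/null 2>&1; then
    status="$(pcs status)"
    echo "master:     $(printf '%s\n' "$status" | current_master)"
    echo "overlay IP: $(printf '%s\n' "$status" | resource_node hana-oip)"
fi
```

After a successful failover in this test, both values should name prihana.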

Recovery procedure:

Start node 2 (EC2 instance) using AWS Management Console or AWS CLI tools.
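
With the AWS CLI, the start can be sketched as follows, assuming the CLI is configured with credentials that are allowed to start the instance. The `start_node` helper and the instance ID in the comment are hypothetical; substitute the ID of node 2.

```shell
# Start the stopped EC2 instance, then block until it reports "running".
# $1 is the EC2 instance ID of node 2 (sechana).
start_node() {
    aws ec2 start-instances --instance-ids "$1" &&
        aws ec2 wait instance-running --instance-ids "$1"
}

# Example invocation (hypothetical instance ID):
#   start_node i-0123456789abcdef0
```

Once the instance is running, pcs status on node 1 should list sechana as online again.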
