Reboot SAP HANA on node 2 - SAP HANA on AWS

Reboot SAP HANA on node 2

Description — Simulate a crash of the primary node (on node 2) running the primary SAP HANA database.

Run node — Primary SAP HANA database node (on node 2)

Run steps:

  • Crash the node running primary SAP HANA (on node 2) using the following command as root:

    shutdown
    [root@sechana ~]# pcs status
    Cluster name: rhelhanaha
    Stack: corosync
    Current DC: sechana (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
    Last updated: Tue Nov 10 17:54:13 2020
    Last change: Tue Nov 10 17:53:48 2020 by root via crm_attribute on prihana

    2 nodes configured
    6 resources configured

    Online: [ prihana sechana ]

    Full list of resources:

     clusterfence (stonith:fence_aws): Started prihana
     Clone Set: SAPHanaTopology_HDB_00-clone [SAPHanaTopology_HDB_00]
         Started: [ prihana sechana ]
     Master/Slave Set: SAPHana_HDB_00-master [SAPHana_HDB_00]
         Masters: [ sechana ]
         Slaves: [ prihana ]
     hana-oip (ocf::heartbeat:aws-vpc-move-ip): Started sechana

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    [root@sechana ~]# echo 'b' > /proc/sysrq-trigger

Note

To simulate a system crash, you must first ensure that /proc/sys/kernel/sysrq is set to 1.
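The precondition in this note can be checked before running the test. The following is a minimal sketch, not part of the original procedure: sysrq_enabled is a hypothetical helper name, and the optional file argument exists only so the check can be tried against a copy of the control file.

```shell
# Hypothetical helper: succeeds only if the sysrq control file contains exactly 1.
# $1 (optional): path to read; defaults to the kernel's control file.
sysrq_enabled() {
    file="${1:-/proc/sys/kernel/sysrq}"
    value="$(cat "$file" 2>/dev/null)" || return 2
    [ "$value" = "1" ]
}

# Example (as root on node 2): enable sysrq for the running system if needed.
#   sysrq_enabled || echo 1 > /proc/sys/kernel/sysrq
```

A value written this way does not survive a reboot. If the setting should persist, one common approach is a kernel.sysrq = 1 entry in a file under /etc/sysctl.d/.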

Expected result:

  • The cluster detects the failed node (node 2), declares it “UNCLEAN”, and sets the secondary node (node 1) to status “partition WITHOUT quorum”.

  • The cluster fences node 2 and promotes the secondary SAP HANA database (on node 1) to take over as primary.

    [root@prihana ~]# pcs status
    Cluster name: rhelhanaha
    Stack: corosync
    Current DC: prihana (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
    Last updated: Tue Nov 10 18:22:00 2020
    Last change: Tue Nov 10 18:21:49 2020 by root via crm_attribute on prihana

    2 nodes configured
    6 resources configured

    Online: [ prihana ]
    OFFLINE: [ sechana ]

    Full list of resources:

     clusterfence (stonith:fence_aws): Started prihana
     Clone Set: SAPHanaTopology_HDB_00-clone [SAPHanaTopology_HDB_00]
         Started: [ prihana ]
         Stopped: [ sechana ]
     Master/Slave Set: SAPHana_HDB_00-master [SAPHana_HDB_00]
         Masters: [ prihana ]
         Stopped: [ sechana ]
     hana-oip (ocf::heartbeat:aws-vpc-move-ip): Started prihana

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    [root@prihana ~]#

  • The overlay IP address is migrated to the new primary (on node 1).

  • Because AUTOMATED_REGISTER is set to true, the cluster restarts the failed SAP HANA database and registers it against the new primary when the EC2 instance is back up.
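The takeover described above can also be confirmed from a script instead of by reading the status output by eye. The sketch below assumes the "Masters: [ node ]" line format shown in the pcs status output on this page (other Pacemaker versions may label the role differently); current_masters and takeover_done are hypothetical helper names.

```shell
# Print the node listed on the "Masters:" line of `pcs status` output read from stdin.
current_masters() {
    awk -F'[][]' '/Masters:/ { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }'
}

# Succeed if the node given as $1 currently holds the master (primary) role.
takeover_done() {
    [ "$(current_masters)" = "$1" ]
}

# Example (as root on the surviving node):
#   pcs status | takeover_done prihana && echo "takeover complete"
```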

Recovery procedure:

  • Start node 2 (EC2 instance) using the AWS Management Console or the AWS CLI.
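With the AWS CLI, the recovery step might look like the sketch below. The instance ID is a placeholder to be replaced with the ID of node 2; start_node and the DRY_RUN switch are hypothetical conveniences for previewing the commands, and the CLI is assumed to be configured with credentials that allow starting the instance.

```shell
# Start the EC2 instance given as $1 and wait until it reports the "running" state.
# With DRY_RUN=1 the commands are printed instead of executed.
start_node() {
    instance_id="$1"
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    $run aws ec2 start-instances --instance-ids "$instance_id" &&
        $run aws ec2 wait instance-running --instance-ids "$instance_id"
}

# Example (placeholder instance ID):
#   start_node i-0123456789abcdef0
```

Once the instance is running, pcs status on either node should show node 2 back online.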