Reboot SAP HANA on node 2

Description — Simulate a crash of the primary site node (on node 2) running the primary SAP HANA database.

Run node — Primary SAP HANA database node (on node 2)

Run steps:

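Crash the primary database system (on node 2) by running the following echo command as root. The preceding crm status output confirms that node 2 (sechana) currently holds the master role: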


sechana:~ # crm status
Stack: corosync
Current DC: sechana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
Last updated: Thu Nov 12 12:16:57 2020
Last change: Thu Nov 12 12:16:41 2020 by root via crm_attribute on sechana

2 nodes configured
6 resources configured

Online: [ prihana sechana ]

Full list of resources:

 res_AWS_STONITH        (stonith:external/ec2): Started prihana
 res_AWS_IP     (ocf::suse:aws-vpc-move-ip):    Started sechana
 Clone Set: cln_SAPHanaTopology_HDB_HDB00 [rsc_SAPHanaTopology_HDB_HDB00]
     Started: [ prihana sechana ]
 Master/Slave Set: msl_SAPHana_HDB_HDB00 [rsc_SAPHana_HDB_HDB00]
     Masters: [ sechana ]
     Slaves: [ prihana ]

sechana:~ # echo 'b' > /proc/sysrq-trigger

Note

To simulate a system crash, you must first ensure that /proc/sys/kernel/sysrq is set to 1.
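
The check described in this note can be sketched as a small shell function. This is a minimal sketch, not part of any SAP, SUSE, or AWS tooling: the function name sysrq_enabled is hypothetical, and the optional file argument exists only so that the check can be tried against a test file.

```shell
# Hypothetical helper: succeeds only if the sysrq setting is exactly 1.
# Reads /proc/sys/kernel/sysrq unless another file path is given as $1.
sysrq_enabled() {
    [ "$(cat "${1:-/proc/sys/kernel/sysrq}" 2>/dev/null)" = "1" ]
}
```

As root on node 2, running `sysrq_enabled || echo 1 > /proc/sys/kernel/sysrq` enables the setting when it is not already 1. A value written this way lasts only until the next reboot.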

Expected result:

The cluster detects the failed node (node 2), declares it “UNCLEAN”, and sets the secondary node (node 1) to status “partition WITHOUT quorum”.

The cluster fences node 2 and promotes the secondary SAP HANA database (on node 1) to take over as primary.


prihana:~ # crm status
Stack: corosync
Current DC: prihana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
Last updated: Thu Nov 12 12:28:51 2020
Last change: Thu Nov 12 12:28:31 2020 by root via crm_attribute on prihana

2 nodes configured
6 resources configured

Online: [ prihana ]
OFFLINE: [ sechana ]

Full list of resources:

 res_AWS_STONITH        (stonith:external/ec2): Started prihana
 res_AWS_IP     (ocf::suse:aws-vpc-move-ip):    Started prihana
 Clone Set: cln_SAPHanaTopology_HDB_HDB00 [rsc_SAPHanaTopology_HDB_HDB00]
     Started: [ prihana ]
     Stopped: [ sechana ]
 Master/Slave Set: msl_SAPHana_HDB_HDB00 [rsc_SAPHana_HDB_HDB00]
     Masters: [ prihana ]
     Stopped: [ sechana ]

The overlay IP address is migrated to the new primary (on node 1).
With the AUTOMATIC_REGISTER parameter set to "true", the cluster restarts the failed SAP HANA database and automatically registers it against the new primary.
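
To confirm the takeover without reading the full status output, the master node can be extracted from crm status. This is a minimal sketch: the function name hana_masters is hypothetical, and the pattern assumes the "Masters: [ ... ]" line format shown in the output above, which other Pacemaker versions may label differently.

```shell
# Hypothetical helper: print the node name(s) from the "Masters:" line
# of crm status output supplied on standard input.
hana_masters() {
    sed -n 's/^[[:space:]]*Masters: \[ \(.*\) ][[:space:]]*$/\1/p'
}
```

On node 1, `crm status | hana_masters` should print prihana once the takeover has completed.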

Recovery procedure:

Start node 2 (the EC2 instance) using the AWS Management Console or the AWS CLI, then start Pacemaker on it if the service is not enabled to start at boot.
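
The AWS CLI part of this recovery step might look like the following. This is a minimal sketch: the function name start_node and the DRY_RUN variable are hypothetical, and the instance ID passed to the function is a placeholder for the ID of node 2. It assumes the AWS CLI is installed and configured with permission to start the instance.

```shell
# Hypothetical wrapper: start the stopped EC2 instance and wait until it runs.
# $1 is the EC2 instance ID of node 2 (placeholder, supply your own).
# If DRY_RUN is set to a non-empty value, the commands are printed, not executed.
start_node() {
    instance_id="$1"
    run="${DRY_RUN:+echo}"
    $run aws ec2 start-instances --instance-ids "$instance_id"
    $run aws ec2 wait instance-running --instance-ids "$instance_id"
}
```

Running `DRY_RUN=1 start_node <instance-id>` prints the two commands for review. After the instance is running, `systemctl is-enabled pacemaker` on node 2 shows whether the service starts at boot. If it does not, start it as root with `systemctl start pacemaker`.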
