Stop the SAP HANA database on the secondary node

Description — Stop the primary SAP HANA database (running on node 2) during normal cluster operation.

Run node — Primary SAP HANA database node (node 2)

Run steps:

  • Stop the SAP HANA database gracefully as <sid>adm on node 2.

    [root@sechana ~]# su - hdbadm
    hdbadm@sechana:/usr/sap/HDB/HDB00> HDB stop
    hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
    Stopping instance using: /usr/sap/HDB/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400

    12.11.2020 11:45:21
    Stop
    OK
    Waiting for stopped instance using: /usr/sap/HDB/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2

    12.11.2020 11:45:53
    WaitforStopped
    OK
    hdbdaemon is stopped.
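
    Optionally, before checking the cluster reaction, you can confirm that the instance is fully stopped under the same <sid>adm user. This is an optional check, assuming the instance number 00 used throughout this test; all processes should report a stopped status (for example, GRAY, Stopped):

    hdbadm@sechana:/usr/sap/HDB/HDB00> sapcontrol -nr 00 -function GetProcessList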

Expected result:

  • The cluster detects the stopped primary SAP HANA database (on node 2) and promotes the secondary SAP HANA database (on node 1) to take over as primary.

    [root@sechana ~]# pcs status
    Cluster name: rhelhanaha
    Stack: corosync
    Current DC: sechana (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
    Last updated: Tue Nov 10 18:04:01 2020
    Last change: Tue Nov 10 18:04:00 2020 by root via crm_attribute on prihana

    2 nodes configured
    6 resources configured

    Online: [ prihana sechana ]

    Full list of resources:

     clusterfence   (stonith:fence_aws):    Started prihana
     Clone Set: SAPHanaTopology_HDB_00-clone [SAPHanaTopology_HDB_00]
         Started: [ prihana sechana ]
     Master/Slave Set: SAPHana_HDB_00-master [SAPHana_HDB_00]
         SAPHana_HDB_00 (ocf::heartbeat:SAPHana):       Promoting prihana
         Slaves: [ sechana ]
     hana-oip       (ocf::heartbeat:aws-vpc-move-ip):   Started prihana

    Failed Actions:
    * SAPHana_HDB_00_monitor_59000 on sechana 'master (failed)' (9): call=41, status=complete, exitreason='',
        last-rc-change='Tue Nov 10 18:03:49 2020', queued=0ms, exec=0ms

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    [root@sechana ~]#
  • The overlay IP address is migrated to the new primary (on node 1). An optional AWS-side check of the route table is sketched after this list.

    [root@prihana ~]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
        link/ether 0a:38:1c:ce:b4:3d brd ff:ff:ff:ff:ff:ff
        inet xx.xx.xx.xx/24 brd 11.0.1.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet xx.xx.xx.xx/32 scope global eth0:1
           valid_lft forever preferred_lft forever
        inet 192.168.10.16/32 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::838:1cff:fece:b43d/64 scope link
           valid_lft forever preferred_lft forever
  • With AUTOMATED_REGISTER set to true, the cluster restarts the failed SAP HANA database and registers it against the new primary.

    Check the status of the secondary using the following command:

    hdbadm@sechana:/usr/sap/HDB/HDB00> sapcontrol -nr 00 -function GetProcessList

    10.11.2020 18:08:47
    GetProcessList
    OK
    name, description, dispstatus, textstatus, starttime, elapsedtime, pid
    hdbdaemon, HDB Daemon, GREEN, Running, 2020 11 10 18:05:44, 0:03:03, 6601
    hdbcompileserver, HDB Compileserver, GREEN, Running, 2020 11 10 18:05:48, 0:02:59, 6725
    hdbindexserver, HDB Indexserver-HDB, GREEN, Running, 2020 11 10 18:05:49, 0:02:58, 6828
    hdbnameserver, HDB Nameserver, GREEN, Running, 2020 11 10 18:05:44, 0:03:03, 6619
    hdbpreprocessor, HDB Preprocessor, GREEN, Running, 2020 11 10 18:05:48, 0:02:59, 6730
    hdbwebdispatcher, HDB Web Dispatcher, GREEN, Running, 2020 11 10 18:05:58, 0:02:49, 7797
    hdbxsengine, HDB XSEngine-HDB, GREEN, Running, 2020 11 10 18:05:49, 0:02:58, 6831
    hdbadm@sechana:/usr/sap/HDB/HDB00>
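
    To confirm that node 2 was re-registered as the new secondary, you can also check the system replication state as <sid>adm on node 2. This is an optional check using the standard hdbnsutil tool; the output should show the site registered in secondary mode against the new primary:

    hdbadm@sechana:/usr/sap/HDB/HDB00> hdbnsutil -sr_state

    The overlay IP migration can also be verified from the AWS side. The following is a minimal sketch, assuming the overlay IP 192.168.10.16 shown above and an AWS CLI profile with permission to describe route tables; the matching route should now point to the network interface of the new primary (node 1):

    aws ec2 describe-route-tables \
        --filters "Name=route.destination-cidr-block,Values=192.168.10.16/32" \
        --query "RouteTables[].Routes[?DestinationCidrBlock=='192.168.10.16/32']" \
        --output table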

Recovery procedure:

  • Clean up the cluster “failed actions” on node 2 as root using the following command:

    [root@sechana ~]# pcs resource cleanup SAPHana_HDB_00 --node sechana
  • After the resource cleanup, verify that the “failed actions” no longer appear in the cluster status.

    [root@sechana ~]# pcs status
    Cluster name: rhelhanaha
    Stack: corosync
    Current DC: sechana (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
    Last updated: Tue Nov 10 18:13:35 2020
    Last change: Tue Nov 10 18:12:51 2020 by hacluster via crmd on sechana

    2 nodes configured
    6 resources configured

    Online: [ prihana sechana ]

    Full list of resources:

     clusterfence   (stonith:fence_aws):    Started prihana
     Clone Set: SAPHanaTopology_HDB_00-clone [SAPHanaTopology_HDB_00]
         Started: [ prihana sechana ]
     Master/Slave Set: SAPHana_HDB_00-master [SAPHana_HDB_00]
         Masters: [ prihana ]
         Slaves: [ sechana ]
     hana-oip       (ocf::heartbeat:aws-vpc-move-ip):   Started prihana

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    [root@sechana ~]#
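
To combine the recovery and its verification in one step, you can run the cleanup and immediately re-check the cluster status. This is a minimal sketch, assuming the resource and node names used throughout this test (SAPHana_HDB_00 and sechana); run it as root on node 2:

    # Clean up the failed monitor action, give the cluster a moment to re-probe
    # the resource, then confirm that no failed actions remain in the status output.
    pcs resource cleanup SAPHana_HDB_00 --node sechana
    sleep 30
    pcs status | grep -A 3 "Failed Actions" || echo "No failed actions remaining"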