
SAP System High Availability Environment Maintenance Guide

Version Management

Version   Revision Date   Changes                                      Effective Date
1.0       2019/4/15
1.1       2019/7/30       1. Updated the failure count description     2019/7/30
                          2. Updated the start/stop order notes

Overview of SAP High Availability Environment Maintenance

This document applies to scenarios in which maintenance operations must be performed on SAP application or SAP HANA ECS instances deployed in a SUSE HAE 12 cluster, such as upgrading or downgrading ECS instance specifications, upgrading the SAP application or database, routine maintenance of the primary or secondary node, and recovering from an unexpected failover. It describes the preparation and follow-up steps for these scenarios.

For an SAP system managed by SUSE HAE, performing maintenance tasks on a cluster node may require stopping the resources running on that node, moving them elsewhere, or shutting down or restarting the node. You may also need to temporarily take over control of cluster resources.

The scenarios below use SAP HANA high availability as an example; maintenance operations for SAP application high availability are similar.

Note

This document is not a substitute for the standard SUSE and SAP installation and administration documentation. For more guidance on maintaining a high availability environment, refer to the official SUSE and SAP documentation.

For the SUSE HAE administration documentation, refer to:

For the SAP HANA HSR configuration guide, refer to:

Common SAP HANA High Availability Maintenance Scenarios

The architecture of SUSE HAE is shown in the following figure.

SUSE Pacemaker provides several options for different maintenance needs:

Setting the cluster to maintenance mode

The global cluster property maintenance-mode puts all resources into maintenance state at once. The cluster stops monitoring them.

Setting a node to maintenance mode

Puts all resources running on the specified node into maintenance state at once. The cluster stops monitoring them.

Setting a node to standby mode

A node in standby mode can no longer run resources. All resources running on the node are moved away, or stopped if no other node is eligible to run them. In addition, all monitoring operations on the node are stopped (except those with role="Stopped").

Use this option if you need to stop a cluster node while the services it runs continue to be provided on another node.

Setting a resource to maintenance mode

When a resource is in this mode, no monitoring operations are triggered for it. Use this option if you need to manually adjust the service managed by the resource and do not want the cluster to run any monitoring operations on it in the meantime.

Setting a resource to unmanaged mode

The is-managed attribute temporarily "releases" a resource from management by the cluster stack, so that you can manually adjust the service it manages. The cluster nevertheless keeps monitoring the resource and reports any failures. If you want the cluster to stop monitoring the resource as well, use the per-resource maintenance mode instead.
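In crmsh, the five options above correspond roughly to the following commands. This is a quick-reference sketch, not part of the original procedure: the resource and node names are the ones used in the examples later in this document, and `crm resource unmanage` is one way of setting is-managed=false.

```shell
# Cluster-wide maintenance mode (all resources stop being monitored)
crm configure property maintenance-mode=true

# Put all resources on one node into maintenance mode
crm node maintenance saphana-01

# Put a node into standby mode (its resources are moved away or stopped)
crm node standby saphana-01

# Per-resource maintenance mode (no monitor operations for this resource)
crm resource maintenance rsc_SAPHana_HDB true

# Unmanaged mode: the cluster stops managing, but keeps monitoring, the resource
crm resource unmanage rsc_SAPHana_HDB
```

Each command has a counterpart to undo it (maintenance-mode=false, crm node ready, crm node online, crm resource maintenance ... false, crm resource manage), as shown in the scenarios below.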

1. Recovering from a primary node failure

Note

When the primary node fails, HAE triggers a switchover: the former secondary node (Node B) is promoted to primary, but the former primary node (Node A) still holds the primary role. Therefore, after Node A has been repaired and before the Pacemaker service is started on it, you must manually reconfigure HANA HSR and register Node A as the secondary.

Remarks

In this example, the initial primary node is saphana-01 and the secondary node is saphana-02.

1.1 Normal status of SUSE HAE

Log on to either node and run the crm status command to check the normal status of HAE:

# crm status
Stack: corosync
Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 14:33:22 2019
Last change: Mon Apr 15 14:33:19 2019 by root via crm_attribute on saphana-01

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-01
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-01 ]
     Slaves: [ saphana-02 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

1.2 After the primary node fails, HAE automatically promotes the secondary node to primary

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 14:40:43 2019
Last change: Mon Apr 15 14:40:41 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-02 ]
OFFLINE: [ saphana-01 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Stopped: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-02 ]
     Stopped: [ saphana-01 ]

1.3 After the former primary node is repaired, re-register HSR

Warning

Before reconfiguring HSR, always confirm which node is primary and which is secondary. A misconfiguration may cause data to be overwritten or even lost.

Log on to the former primary node as the SAP HANA instance user and configure HSR:

h01adm@saphana-01:/usr/sap/H01/HDB00> hdbnsutil -sr_register --remoteHost=saphana-02 --remoteInstance=00 --replicationMode=syncmem --name=saphana-01 --operationMode=logreplay
adding site ...
checking for inactive nameserver ...
nameserver saphana-01:30001 not responding.
collecting information ...
updating local ini files ...
done.

1.4 Check the SBD status

If the status of a node slot is not "clear", set it to "clear":

# sbd -d /dev/vdc list
0       saphana-01      reset   saphana-02
1       saphana-02      reset   saphana-01
# sbd -d /dev/vdc message saphana-01 clear
# sbd -d /dev/vdc message saphana-02 clear

# sbd -d /dev/vdc list
0       saphana-01      clear   saphana-01
1       saphana-02      clear   saphana-01
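The examples above use /dev/vdc as the SBD device. If you are unsure of the device path on a given system, it can be read from the SBD configuration before issuing any clear messages (assuming the standard SUSE location /etc/sysconfig/sbd):

```shell
# Print the SBD device(s) configured for this cluster;
# the sbd -d commands above must target this path
grep '^SBD_DEVICE' /etc/sysconfig/sbd
```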

1.5 Start the Pacemaker service; HAE automatically starts the SAP HANA service

# systemctl start pacemaker

The former secondary node is now the new primary node. The current HAE status is as follows:

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:10:58 2019
Last change: Mon Apr 15 15:09:56 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

1.6 Check the SAP HANA HSR status

1.6.1 Check with the Python script shipped with SAP HANA

Log on to the primary node as the SAP HANA instance user and make sure the "Replication Status" of all SAP HANA processes is "ACTIVE":

saphana-02:~ # su - h01adm
h01adm@saphana-02:/usr/sap/H01/HDB00> cdpy
h01adm@saphana-02:/usr/sap/H01/HDB00/exe/python_support> python systemReplicationStatus.py 
| Database | Host       | Port  | Service Name | Volume ID | Site ID | Site Name  | Secondary  | Secondary | Secondary | Secondary  | Secondary     | Replication | Replication | Replication    | 
|          |            |       |              |           |         |            | Host       | Port      | Site ID   | Site Name  | Active Status | Mode        | Status      | Status Details | 
| -------- | ---------- | ----- | ------------ | --------- | ------- | ---------- | ---------- | --------- | --------- | ---------- | ------------- | ----------- | ----------- | -------------- | 
| SYSTEMDB | saphana-02 | 30001 | nameserver   |         1 |       2 | saphana-02 | saphana-01 |     30001 |         1 | saphana-01 | YES           | SYNCMEM     | ACTIVE      |                | 
| H01      | saphana-02 | 30007 | xsengine     |         3 |       2 | saphana-02 | saphana-01 |     30007 |         1 | saphana-01 | YES           | SYNCMEM     | ACTIVE      |                | 
| H01      | saphana-02 | 30003 | indexserver  |         2 |       2 | saphana-02 | saphana-01 |     30003 |         1 | saphana-01 | YES           | SYNCMEM     | ACTIVE      |                |

status system replication site "1": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 2
site name: saphana-02

1.6.2 Check the replication status with the SAPHanaSR tool provided by SUSE; make sure the sync_state of the secondary node is SOK:

saphana-02:~ # SAPHanaSR-showAttr
Global cib-time                 
--------------------------------
global Mon Apr 15 15:17:12 2019 


Hosts      clone_state lpa_h01_lpt node_state op_mode   remoteHost roles                            site       srmode  standby sync_state version                vhost      
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
saphana-01 DEMOTED     30          online     logreplay saphana-02 4:S:master1:master:worker:master saphana-01 syncmem         SOK        2.00.020.00.1500920972 saphana-01 
saphana-02 PROMOTED    1555312632  online     logreplay saphana-01 4:P:master1:master:worker:master saphana-02 syncmem off     PRIM       2.00.020.00.1500920972 saphana-02

1.7 Reset the failure count (optional)

If a resource fails, it is restarted automatically, but each failure increases the resource's failure count. If migration-threshold is set for the resource, the node is no longer allowed to run the resource once the failure count reaches the threshold, so the failure count has to be cleaned up manually.

The command to clean up the failure count is:

# crm resource cleanup [resource name] [node]

For example, after the rsc_SAPHana_HDB resource on node saphana-01 has been repaired, clean up the monitoring alarm with:

crm resource cleanup rsc_SAPHana_HDB saphana-01
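To verify the effect of the cleanup, the failure count can be inspected before and after with crmsh (same resource and node names as in the example above):

```shell
# Show the current failure count of rsc_SAPHana_HDB on saphana-01
crm resource failcount rsc_SAPHana_HDB show saphana-01

# Clean it up, then confirm the count is back to 0
crm resource cleanup rsc_SAPHana_HDB saphana-01
crm resource failcount rsc_SAPHana_HDB show saphana-01
```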

2. Recovering from a secondary node failure

Note

When the secondary node fails, the primary node is not affected and no switchover is triggered. After the secondary node recovers, start the Pacemaker service; SAP HANA is then started automatically and the primary/secondary roles remain unchanged, with no manual intervention required.

Remarks

In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.

2.1 Normal status of SUSE HAE

Log on to either node and run the crm status command to check the normal status of HAE:

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

2.2 After the secondary node recovers, check SBD first, then restart Pacemaker

# systemctl start pacemaker

HSR keeps the original primary/secondary relationship. The current HAE status is as follows:

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:43:28 2019
Last change: Mon Apr 15 15:43:25 2019 by root via crm_attribute on saphana-01

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

2.3 Check the SAP HANA HSR status

2.4 Reset the failure count (optional)

3. Planned downtime of both the primary and secondary nodes

Note

Set the cluster to maintenance mode, then shut down the secondary node and the primary node in that order.

Remarks

In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.

3.1 Normal status of SUSE HAE

Log on to either node and run the crm status command to check the normal status of HAE:

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

3.2 Set the cluster and the master/slave resource sets to maintenance mode

Log on to the primary node and set the cluster to maintenance mode:

# crm configure property maintenance-mode=true

Set the master/slave resource sets to maintenance mode. In this example, the master/slave resource sets are rsc_SAPHana_HDB and rsc_SAPHanaTopology_HDB:

# crm resource maintenance rsc_SAPHana_HDB true
Performing update of 'maintenance' on 'msl_SAPHana_HDB', the parent of 'rsc_SAPHana_HDB'
Set 'msl_SAPHana_HDB' option: id=msl_SAPHana_HDB-meta_attributes-maintenance name=maintenance=true

# crm resource maintenance rsc_SAPHanaTopology_HDB true
Performing update of 'maintenance' on 'cln_SAPHanaTopology_HDB', the parent of 'rsc_SAPHanaTopology_HDB'
Set 'cln_SAPHanaTopology_HDB' option: id=cln_SAPHanaTopology_HDB-meta_attributes-maintenance name=maintenance=true

3.3 The current HAE status is as follows

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 16:02:13 2019
Last change: Mon Apr 15 16:02:11 2019 by root via crm_resource on saphana-02

2 nodes configured
6 resources configured

              *** Resource management is DISABLED ***
  The cluster will not attempt to start, stop or recover services

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02 (unmanaged)
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02 (unmanaged)
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (unmanaged)
     rsc_SAPHana_HDB    (ocf::suse:SAPHana):    Slave saphana-01 (unmanaged)
     rsc_SAPHana_HDB    (ocf::suse:SAPHana):    Master saphana-02 (unmanaged)
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB] (unmanaged)
     rsc_SAPHanaTopology_HDB    (ocf::suse:SAPHanaTopology):    Started saphana-01 (unmanaged)
     rsc_SAPHanaTopology_HDB    (ocf::suse:SAPHanaTopology):    Started saphana-02 (unmanaged)

3.4 Stop the SAP HANA services on the secondary and primary nodes, then shut down the ECS instances for maintenance

Log on to both nodes as the SAP HANA instance user. Stop the SAP HANA service on the secondary node first, then on the primary node:

saphana-01:~ # su - h01adm
h01adm@saphana-01:/usr/sap/H01/HDB00> HDB stop
hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400

15.04.2019 16:46:42
Stop
OK
Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2


15.04.2019 16:46:54
WaitforStopped
OK
hdbdaemon is stopped.

saphana-02:~ # su - h01adm
h01adm@saphana-02:/usr/sap/H01/HDB00> HDB stop
hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400

15.04.2019 16:47:05
Stop
OK
Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2


15.04.2019 16:47:35
WaitforStopped
OK
hdbdaemon is stopped.

3.5 Start the primary and secondary SAP HANA ECS instances and return the cluster and resource sets to normal mode

Log on to the primary node and then the secondary node, and start the Pacemaker service on each:

# systemctl start pacemaker

Return the cluster and resource sets to normal mode:

# crm configure property maintenance-mode=false
# crm resource maintenance rsc_SAPHana_HDB false
Performing update of 'maintenance' on 'msl_SAPHana_HDB', the parent of 'rsc_SAPHana_HDB'
Set 'msl_SAPHana_HDB' option: id=msl_SAPHana_HDB-meta_attributes-maintenance name=maintenance=false
# crm resource maintenance rsc_SAPHanaTopology_HDB false
Performing update of 'maintenance' on 'cln_SAPHanaTopology_HDB', the parent of 'rsc_SAPHanaTopology_HDB'
Set 'cln_SAPHanaTopology_HDB' option: id=cln_SAPHanaTopology_HDB-meta_attributes-maintenance name=maintenance=false

The SUSE HAE cluster automatically starts the SAP HANA services on the primary and secondary nodes and keeps the original roles unchanged.

3.6 The current HAE status is as follows

# crm status
Stack: corosync
Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 16:56:49 2019
Last change: Mon Apr 15 16:56:43 2019 by root via crm_attribute on saphana-01

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

3.7 Check the SAP HANA HSR status

3.8 Reset the failure count (optional)

4. Planned downtime of the primary node

Note

The primary node will be set to standby mode, which triggers a cluster switchover.

Remarks

In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.

4.1 Normal status of SUSE HAE

Log on to either node and run the crm status command to check the normal status of HAE:

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

4.2 Set the primary node to standby mode

In this example, the primary node is saphana-02:

# crm node standby saphana-02

The cluster stops SAP HANA on node saphana-02 and promotes SAP HANA on node saphana-01 to primary.

4.3 The current HAE status is as follows

# crm status
Stack: corosync
Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 17:07:56 2019
Last change: Mon Apr 15 17:07:38 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Node saphana-02: standby
Online: [ saphana-01 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-01
 Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable)
     Masters: [ saphana-01 ]
     Stopped: [ saphana-02 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 ]
     Stopped: [ saphana-02 ]

4.4 Shut down the ECS instance and perform the maintenance tasks

4.5 Start the maintained node and re-register HSR

Log on to the maintained node as the SAP HANA instance user and register HSR:

# hdbnsutil -sr_register --remoteHost=saphana-01 --remoteInstance=00 --replicationMode=syncmem --name=saphana-02 --operationMode=logreplay

4.6 Start the Pacemaker service and bring the standby node back online

# systemctl start pacemaker
# crm node online saphana-02

The SUSE HAE cluster automatically starts the SAP HANA service on the secondary node.

4.7 The current HAE status is as follows

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 18:02:33 2019
Last change: Mon Apr 15 18:01:31 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-01
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-01 ]
     Slaves: [ saphana-02 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

4.8 Check the SAP HANA HSR status

4.9 Reset the failure count (optional)

5. Planned downtime of the secondary node

Note

Set the secondary node to maintenance mode.

Remarks

In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.

5.1 Normal status of SUSE HAE

Log on to either node and run the crm status command to check the normal status of HAE:

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

5.2 Set the secondary node to maintenance mode

# crm node maintenance saphana-01

After the setting takes effect, the HAE status is as follows:

Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 18:18:10 2019
Last change: Mon Apr 15 18:17:49 2019 by root via crm_attribute on saphana-01

2 nodes configured
6 resources configured

Node saphana-01: maintenance
Online: [ saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     rsc_SAPHana_HDB    (ocf::suse:SAPHana):    Slave saphana-01 (unmanaged)
     Masters: [ saphana-02 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     rsc_SAPHanaTopology_HDB    (ocf::suse:SAPHanaTopology):    Started saphana-01 (unmanaged)
     Started: [ saphana-02 ]

5.3 Stop the SAP HANA service on the secondary node, then shut down the ECS instance for maintenance

Log on to the secondary node as the SAP HANA instance user and stop the SAP HANA service:

saphana-01:~ # su - h01adm
h01adm@saphana-01:/usr/sap/H01/HDB00> HDB stop
hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400

15.04.2019 16:47:05
Stop
OK
Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2


15.04.2019 16:47:35
WaitforStopped
OK
hdbdaemon is stopped.

5.4 Start the secondary SAP HANA ECS instance and return the node to normal mode

Log on to the secondary node and start the Pacemaker service:

# systemctl start pacemaker

Return the secondary node to normal mode:

saphana-02:~ # crm node ready saphana-01

The SUSE HAE cluster automatically starts the SAP HANA service on the secondary node and keeps the original primary/secondary roles unchanged.

5.5 The current HAE status is as follows

# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 18:02:33 2019
Last change: Mon Apr 15 18:01:31 2019 by root via crm_attribute on saphana-02

2 nodes configured
6 resources configured

Online: [ saphana-01 saphana-02 ]

Full list of resources:

rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2):       Started saphana-02
 Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
     Masters: [ saphana-02 ]
     Slaves: [ saphana-01 ]
 Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
     Started: [ saphana-01 saphana-02 ]

5.6 Check the SAP HANA HSR status

5.7 Reset the failure count (optional)
