How do I migrate from the SBD fence solution to the Fence aliyun solution?
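Problem description
You want to migrate an SAP high-availability environment deployed on Alibaba Cloud from the SBD fence solution to the Fence aliyun solution.
Applies to
- SAP high-availability environments (SAP HANA, SAP ASCS/SCS) deployed on Alibaba Cloud ECS instances
- SAP ASCS/SCS high-availability environments in which the ERS instance is installed on the local host and the service addresses are managed by the high-availability virtual IP product
Limitations and notes
- Before you use this migration procedure, make sure that your current SAP high-availability environment (SAP HANA, SAP ASCS/SCS) is running normally.
- This procedure does not apply to SAP ASCS/SCS high-availability environments that have no ERS instance installed.
- The operating system must be SLES for SAP 12 SP4 or later.
- This migration requires business downtime. Plan a downtime window in advance.
- We strongly recommend that you create snapshots of the ECS system disk and data disks before you make any change. For details, see the documentation on creating a snapshot for a single cloud disk or for multiple cloud disks.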
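Solution
Scenario 1: SAP HANA high-availability environment
The procedure for an SAP HANA high-availability environment is as follows:
- Log on to the primary node of the cluster and run the following command to check the status of all resources.
Note: Unless otherwise stated, each step needs to be performed on only one node of the cluster.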
crm_mon -r
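The system displays output similar to the following. This example has two ECS instances, hana001 and hana002, and both the cluster and its managed resources are in a normal state.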
Stack: corosync
2 nodes configured
6 resources configured
Online: [ hana001 hana002 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started hana001
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable)
Masters: [ hana001 ]
Slaves: [ hana002 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
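Started: [ hana001 hana002 ]
- Run the following command to find the current SBD block device name.
cat /etc/sysconfig/sbd | grep SBD_DEVICE
The command returns output similar to the following. In this example, the SBD block device is /dev/vdf.
# SBD_DEVICE specifies the devices to use for exchanging sbd messages
SBD_DEVICE="/dev/vdf"
Log on to the Alibaba Cloud console, choose "Storage & Snapshots" -> "Shared Block Storage", click the ID of the shared block storage instance to view its details, and confirm that the device name attached to the ECS instance matches the device name found above.
Confirm that the device name shown in the console, with the letter x removed, matches the result of the query above.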
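The comparison in this step can be scripted. The following is a minimal sketch, not part of the original procedure. The function names are hypothetical, and it assumes that a single device is configured and that the console shows the name in the form /dev/xvdf for the device /dev/vdf.

```shell
# Print the value of SBD_DEVICE from an sbd sysconfig file.
# $1: path to the file (defaults to /etc/sysconfig/sbd, the file used in this step).
sbd_device() {
    sed -n 's/^SBD_DEVICE="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "${1:-/etc/sysconfig/sbd}"
}

# Drop the "x" from a console-style device name.
# ASSUMPTION: the console displays /dev/xvdf for the OS device /dev/vdf.
console_to_os_device() {
    printf '%s\n' "$1" | sed 's|^/dev/xvd|/dev/vd|'
}

# Succeed only when the console name (x removed) equals the configured SBD device.
# $1: device name copied from the console; $2: optional path to the sbd config file.
sbd_device_matches() {
    [ "$(console_to_os_device "$1")" = "$(sbd_device "$2")" ]
}

# Example (run on a cluster node):
#   sbd_device_matches /dev/xvdf && echo "device names match"
```

If the function reports a mismatch, recheck the console details manually before you continue.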
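- Run the following command to query the high-availability virtual IP settings (the rsc_vip resource in this example).
crm configure show | grep -E "primitive rsc_vip|params ip"
The command returns output similar to the following.
primitive rsc_vip IPaddr2 \
params ip=192.168.10.101
Replace the parameter names in the command according to your environment.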
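To record the virtual IP address before the resources are deleted, the address can be extracted from the configuration text. This is a minimal sketch with a hypothetical function name. It assumes the "params ip=" form shown in the output above.

```shell
# Read `crm configure show` text on stdin and print the address of every
# "params ip=<address>" entry, one per line.
vip_addresses() {
    sed -n 's/.*params ip=\([0-9.]\{1,\}\).*/\1/p'
}

# Example (run on a cluster node):
#   crm configure show | vip_addresses
```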
- Complete all of the configuration described in section "5.3.2 Solution 2: fence_aliyun" of SAP HANA High Availability Deployment within a Single Zone.
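- Run the following command to put the cluster into maintenance mode.
crm configure property maintenance-mode=true
If a maintenance attribute is already set in the cluster, prompts similar to the following appear. Enter y.
'maintenance' attribute already exists in rsc_sbd. Remove it (y/n)? y
'is-managed' conflicts with 'maintenance' in cln_SAPHanaTopology_HDB. Remove it (y/n)? y
- After the setting succeeds, run the following command to confirm that all resources are in the unmanaged state.
crm_mon -r
The command returns output similar to the following.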
2 nodes configured
6 resources configured
*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services
Online: [ hana001 hana002 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started hana001 (unmanaged)
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001 (unmanaged)
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable) (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave hana002 (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Master hana001 (unmanaged)
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB] (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana002 (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana001 (unmanaged)
Note: If any resource is still not unmanaged, set it to unmanaged manually.
Syntax:
crm resource maintenance [resource name] true
For example, suppose the SAP HANA resource was not set to unmanaged, as in the following output:
2 nodes configured
6 resources configured
*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services
Online: [ hana001 hana002 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started hana001 (unmanaged)
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001 (unmanaged)
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable) (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave hana002
rsc_SAPHana_HDB (ocf::suse:SAPHana): Master hana001
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB] (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana002 (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana001 (unmanaged)
Run the following command to complete the setting:
crm resource maintenance rsc_SAPHana_HDB true
Confirm again that all resources are now in the unmanaged state.
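Scanning the status output by eye is error-prone when many resources are listed. The following minimal sketch lists resources that are still managed. The function name is hypothetical, and it assumes the per-resource line format shown in the output above.

```shell
# Read crm_mon status text on stdin and print the IDs of resources whose
# status line does not carry the "(unmanaged)" marker.
# Only lines that name a resource agent, such as "(ocf::..." or "(stonith:...", are inspected.
still_managed() {
    grep -E '\((ocf|stonith):' | grep -vF '(unmanaged)' | awk '{ print $1 }' | sort -u
}

# Example (run on a cluster node; -1 makes crm_mon print once and exit):
#   crm_mon -r -1 | still_managed
```

An empty result suggests that every listed resource is unmanaged. Each ID that is printed is a candidate for the crm resource maintenance command described in this step.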
- Stop all resources.
Syntax:
crm resource stop ID1 ID2 ...
Command used in this example:
crm resource stop rsc_sbd rsc_vip rsc_SAPHana_HDB rsc_SAPHanaTopology_HDB
Replace the resource IDs with those of your environment.
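To find the resource IDs of your own environment, the primitive definitions in the cluster configuration can be listed. This is a minimal sketch with hypothetical function names. It assumes that each primitive starts with a "primitive <ID> ..." line in the crm configure show output. It only prints the stop command so that it can be reviewed before it is run.

```shell
# Read `crm configure show` text on stdin and print one primitive ID per line.
primitive_ids() {
    awk '$1 == "primitive" { print $2 }'
}

# Read `crm configure show` text on stdin and print (not execute) the matching stop command.
print_stop_command() {
    ids=$(primitive_ids | tr '\n' ' ')
    printf 'crm resource stop %s\n' "${ids% }"
}

# Example (run on a cluster node):
#   crm configure show | print_stop_command
```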
- Delete all resources.
Syntax:
crm configure delete ID1 ID2 ...
Command used in this example:
crm configure delete rsc_sbd rsc_vip rsc_SAPHana_HDB rsc_SAPHanaTopology_HDB
- Restart the pacemaker service on each of the two nodes.
systemctl restart pacemaker
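- Exit cluster maintenance mode.
crm configure property maintenance-mode=false
- After the resources are cleared, confirm that the cluster contains only the two nodes and that the resource count is 0.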
crm_mon -r
Stack: corosync
Current DC: hana001 (version 2.0.1+20190417.13d370ca9-3.24.1-2.0.1+20190417.13d370ca9) - partition with quorum
Last updated: Thu Feb 24 11:57:13 2022
Last change: Thu Feb 24 11:57:09 2022 by root via cibadmin on hana001
2 nodes configured
0 resources configured
Online: [ hana001 hana002 ]
No resources
- Refer to section 11.2 of SAP HANA High Availability Deployment within a Single Zone to complete the fence agent script configuration.
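- Run the following command to verify the cluster configuration.
crm_mon -r
The command returns output similar to the following.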
Stack: corosync
Current DC: hana001 (version 2.0.1+20190417.13d370ca9-3.24.1-2.0.1+20190417.13d370ca9) - partition with quorum
Last updated: Thu Feb 24 17:51:44 2022
Last change: Thu Feb 24 17:51:41 2022 by root via crm_attribute on hana001
2 nodes configured
7 resources configured
Online: [ hana001 hana002 ]
Full list of resources:
res_ALIYUN_STONITH_1 (stonith:fence_aliyun): Started hana002
res_ALIYUN_STONITH_2 (stonith:fence_aliyun): Started hana001
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable)
Masters: [ hana001 ]
Slaves: [ hana002 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ hana001 hana002 ]
Note: Confirm that the primary and secondary node roles in the cluster are as expected.
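The fence-related part of this verification can be sketched as a small check. The function name is hypothetical. It assumes two fence_aliyun resources, as in the example output, and it does not replace a review of the primary and secondary roles.

```shell
# Read crm_mon status text on stdin.
# Succeed when no external/sbd stonith resource remains and exactly
# $1 (default 2, as in the example) fence_aliyun resources are Started.
fence_migrated() {
    want=${1:-2}
    status=$(cat)
    if printf '%s\n' "$status" | grep -q 'stonith:external/sbd'; then
        return 1
    fi
    started=$(printf '%s\n' "$status" | grep -c 'stonith:fence_aliyun.*Started')
    [ "$started" -eq "$want" ]
}

# Example (run on a cluster node):
#   crm_mon -r -1 | fence_migrated 2 && echo "fence_aliyun resources are running"
```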
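- Run failover tests to validate the high-availability environment. For details, see the official SUSE documentation or the SAP system high-availability environment maintenance guide.
- Run the following command to disable the SBD service.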
systemctl disable sbd
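- Release the shared storage product.
Log on to the Alibaba Cloud console, choose "Storage & Snapshots" -> "Shared Block Storage", find the shared block storage instance used in this procedure, and detach it from the ECS instances.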
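Scenario 2: SAP ASCS/SCS high-availability environment
The procedure for an SAP S/4HANA ASCS high-availability environment is as follows:
- Log on to the primary node of the cluster and run the following command to check the status of all resources.
Note: Unless otherwise stated, each step needs to be performed on only one node of the cluster.
crm_mon -r
The system displays output similar to the following. In this example, the ASCS high-availability environment is installed on two ECS instances, SAPAPP01 and SAPAPP02, and both the cluster and its managed resources are in a normal state.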
Stack: corosync
2 nodes configured
5 resource instances configured
Online: [ SAPAPP01 SAPAPP02 ]
Full list of resources:
stonith-sbd (stonith:external/sbd): Started SAPAPP01
Resource Group: grp_S4A_ASCS00
rsc_ip_S4A_ASCS00 (ocf::heartbeat:IPaddr2): Started SAPAPP01
rsc_sap_S4A_ASCS00 (ocf::heartbeat:SAPInstance): Started SAPAPP01
Resource Group: grp_S4A_ERS10
rsc_ip_S4A_ERS10 (ocf::heartbeat:IPaddr2): Started SAPAPP02
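rsc_sap_S4A_ERS10 (ocf::heartbeat:SAPInstance): Started SAPAPP02
- Run the following command to find the current SBD block device name.
cat /etc/sysconfig/sbd | grep SBD_DEVICE
The command returns output similar to the following. In this example, the SBD block device is /dev/vdc.
# SBD_DEVICE specifies the devices to use for exchanging sbd messages
SBD_DEVICE="/dev/vdc"
Log on to the Alibaba Cloud console, choose "Storage & Snapshots" -> "Shared Block Storage", click the ID of the shared block storage instance to view its details, and confirm that the device name attached to the ECS instance matches the device name found above.
Confirm that the device name shown in the console, with the letter x removed, matches the result of the query above.
- Complete all of the configuration described in section "4.4 Solution 2: Implementing the fence function with Fence_aliyun" of SAP S/4HANA High Availability Deployment within a Single Zone.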
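- Run the following command to put the cluster into maintenance mode.
crm configure property maintenance-mode=true
If a maintenance attribute is already set in the cluster, a prompt similar to the following appears. Enter y.
'maintenance' attribute already exists in rsc_sap_S4A_ERS10. Remove it (y/n)?
- After the setting succeeds, run the following command to confirm that all resources are in the unmanaged state.
crm_mon -r
The command returns output similar to the following.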
2 nodes configured
5 resource instances configured
*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services
Online: [ SAPAPP01 SAPAPP02 ]
Full list of resources:
stonith-sbd (stonith:external/sbd): Started SAPAPP01 (unmanaged)
Resource Group: grp_S4A_ASCS00
rsc_ip_S4A_ASCS00 (ocf::heartbeat:IPaddr2): Started SAPAPP01 (unmanaged)
rsc_sap_S4A_ASCS00 (ocf::heartbeat:SAPInstance): Started SAPAPP01 (unmanaged)
Resource Group: grp_S4A_ERS10
rsc_ip_S4A_ERS10 (ocf::heartbeat:IPaddr2): Started SAPAPP02 (unmanaged)
rsc_sap_S4A_ERS10 (ocf::heartbeat:SAPInstance): Started SAPAPP02 (unmanaged)
Note: If any resource is still not unmanaged, set it to unmanaged manually.
Syntax:
crm resource maintenance [resource name] true
Taking the rsc_ip_S4A_ASCS00 resource as an example, run the following command to complete the setting:
crm resource maintenance rsc_ip_S4A_ASCS00 true
Confirm again that all resources are now in the unmanaged state.
- Stop all resources.
Syntax:
crm resource stop ID1 ID2 ...
Command used in this example:
crm resource stop stonith-sbd rsc_ip_S4A_ERS10 rsc_sap_S4A_ERS10 rsc_ip_S4A_ASCS00 rsc_sap_S4A_ASCS00
Replace the resource IDs with those of your environment.
- Delete all resources.
Syntax:
crm configure delete ID1 ID2 ...
Command used in this example:
crm configure delete stonith-sbd rsc_ip_S4A_ERS10 rsc_sap_S4A_ERS10 rsc_ip_S4A_ASCS00 rsc_sap_S4A_ASCS00
- Restart the pacemaker service on each of the two nodes.
systemctl restart pacemaker
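- Exit cluster maintenance mode.
crm configure property maintenance-mode=false
- After the resources are cleared, confirm that the cluster contains only the two nodes and that the resource count is 0.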
crm_mon -r
2 nodes configured
0 resource instances configured
Online: [ SAPAPP01 SAPAPP02 ]
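No resources
- Refer to section "7.5.4 Solution 2: Implementing the fence function with Fence_aliyun" of SAP S/4HANA High Availability Deployment within a Single Zone to complete the fence agent script configuration.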
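The empty-cluster check that precedes the fence agent configuration can also be scripted. This is a minimal sketch with hypothetical function names. It assumes the summary line reads either "N resources configured" or "N resource instances configured", as in the outputs in this document.

```shell
# Read crm_mon status text on stdin and print the configured resource count.
configured_resources() {
    sed -n -E 's/^ *([0-9]+) resources?( instances)? configured.*/\1/p'
}

# Succeed only when the status text reports exactly 0 configured resources.
cluster_is_empty() {
    [ "$(configured_resources)" = "0" ]
}

# Example (run on a cluster node):
#   crm_mon -r -1 | cluster_is_empty && echo "no resources left"
```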
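- Run the following command to verify the cluster configuration.
crm_mon -r
The command returns output similar to the following.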
Stack: corosync
2 nodes configured
6 resources configured
Online: [ SAPAPP01 SAPAPP02 ]
Full list of resources:
res_ALIYUN_STONITH_1 (stonith:fence_aliyun): Started SAPAPP02
res_ALIYUN_STONITH_2 (stonith:fence_aliyun): Started SAPAPP01
Resource Group: grp_S4A_ASCS00
rsc_ip_S4A_ASCS00 (ocf::heartbeat:IPaddr2): Started SAPAPP01
rsc_sap_S4A_ASCS00 (ocf::heartbeat:SAPInstance): Started SAPAPP01
Resource Group: grp_S4A_ERS10
rsc_ip_S4A_ERS10 (ocf::heartbeat:IPaddr2): Started SAPAPP02
rsc_sap_S4A_ERS10 (ocf::heartbeat:SAPInstance): Started SAPAPP02
Note: Confirm that the primary and secondary node roles in the cluster are as expected.
- Run failover tests to validate the high-availability environment. For details, see the official SUSE documentation or the SAP system high-availability environment maintenance guide.
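One way to check the node roles in the verification output is to confirm that the ASCS and ERS instances run on different nodes, as they do in the example. The sketch below uses hypothetical function names. Treating "different nodes" as the expected placement is an assumption drawn from the example output. Adjust it if your design differs.

```shell
# Read crm_mon status text on stdin and print the node on which resource $1 is Started.
node_of() {
    awk -v id="$1" '$1 == id && $(NF-1) == "Started" { print $NF }'
}

# Read crm_mon status text on stdin.
# $1: ASCS resource ID; $2: ERS resource ID.
# Succeed when both are Started and run on different nodes.
ascs_ers_separated() {
    status=$(cat)
    ascs=$(printf '%s\n' "$status" | node_of "$1")
    ers=$(printf '%s\n' "$status" | node_of "$2")
    [ -n "$ascs" ] && [ -n "$ers" ] && [ "$ascs" != "$ers" ]
}

# Example (run on a cluster node):
#   crm_mon -r -1 | ascs_ers_separated rsc_sap_S4A_ASCS00 rsc_sap_S4A_ERS10
```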
- Run the following command to disable the SBD service.
systemctl disable sbd
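- Release the shared storage product.
Log on to the Alibaba Cloud console, choose "Storage & Snapshots" -> "Shared Block Storage", find the shared block storage instance used in this procedure, and detach it from the ECS instances.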