
SAP HANA intra-zone/cross-zone high availability deployment (based on Fence agent)

Last updated: 2019-11-25 10:41:13

A general solution for SAP HANA high availability deployment

Version history

Version | Revision date | Change type | Effective date
1.0 | | Initial release | 2019/11/15

Overview

This document describes how to deploy a highly available SAP HANA environment on Alibaba Cloud, within one zone or across zones, using the fence agent and resource agent of SUSE Linux Enterprise Server.

This document is based on SUSE Linux Enterprise Server for SAP Applications 12 SP4 and applies equally to later versions.

Compared with the SBD fence device and the high-availability virtual IP used in traditional SAP high-availability solutions, the fence agent and resource agents (RAs) call Alibaba Cloud OpenAPI to schedule and manage cloud resources flexibly, supporting SAP high-availability deployments within one zone or across zones. Combined with PrivateZone, the API calls can be kept entirely on the internal network, meeting enterprise requirements for intranet-only access to core SAP applications.

fence_aliyun is a fence agent developed for the Alibaba Cloud environment; it isolates failed nodes in an SAP high-availability environment.

aliyun-vpc-move-ip is a resource agent (RA) developed for the Alibaba Cloud environment; it manages the floating IP (overlay IP) in an SAP high-availability environment.

SUSE Linux Enterprise Server for SAP Applications 12 SP4 and later natively integrate fence_aliyun and aliyun-vpc-move-ip, so they can be used directly for SAP high-availability deployments on Alibaba Cloud.

SAP HANA HA architecture

In this example, SAP HANA is deployed in two different zones (C and G) of the Beijing region, with high availability provided by HANA System Replication (HSR) plus SUSE HAE to ensure business continuity.


Preparations

SAP installation media

  1. Use SAP Download Manager to download the media to a jump host.

  2. Upload the media to an OSS bucket (see OSS quick start).

  3. Use an OSS tool to mount or download the files to the ECS instances (see OSS tools overview).

VPC network planning

Network planning

Network | Location | Purpose | Assigned CIDR
Business network | China East 2 Zone C | Business/SR | 192.168.10.0/24
Business network | China East 2 Zone C | HA | 192.168.11.0/24
Business network | China East 2 Zone G | Business/SR | 192.168.30.0/24
Business network | China East 2 Zone G | HA | 192.168.31.0/24

Hostname | Role | Business IP | Heartbeat IP | HA virtual IP
hana-master | HANA primary node | 192.168.10.1 | 192.168.11.1 | 192.168.100.100
hana-slave | HANA secondary node | 192.168.30.1 | 192.168.31.1 | 192.168.100.100

When an HA failover is triggered, SUSE Pacemaker calls Alibaba Cloud OpenAPI through the resource agent to move the overlay IP.

The HA virtual IP therefore must not be an address in any existing subnet of the VPC where SAP HANA resides; it can be an address that does not actually exist. In this example, 192.168.100.100 is such a non-existent virtual address.
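This subnet constraint can be checked mechanically with Python's ipaddress module. The CIDRs below are the vSwitch subnets from the planning table; the helper itself is only an illustration, not part of any agent:

```python
import ipaddress

# vSwitch CIDRs from the network plan above
vswitch_cidrs = [
    "192.168.10.0/24", "192.168.11.0/24",
    "192.168.30.0/24", "192.168.31.0/24",
]

def is_valid_havip(ip, cidrs):
    """The HA virtual IP must not fall inside any existing vSwitch subnet."""
    addr = ipaddress.ip_address(ip)
    return not any(addr in ipaddress.ip_network(c) for c in cidrs)

print(is_valid_havip("192.168.100.100", vswitch_cidrs))  # True: usable as HAVIP
print(is_valid_havip("192.168.10.50", vswitch_cidrs))    # False: inside a vSwitch subnet
```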

File system planning

In this example, the HANA file systems use LVM striping, and /usr/sap uses a separate cloud disk. The file systems are laid out as follows:

Type | Size | File system | VG | LVM striping | Mount point
Data disk | 800G | XFS | hanavg | Yes | /hana/data
Data disk | 400G | XFS | hanavg | Yes | /hana/log
Data disk | 300G | XFS | hanavg | Yes | /hana/shared
Data disk | 50G | XFS | sapvg | No | /usr/sap

Creating the VPC

A Virtual Private Cloud (VPC) is an isolated network environment built on Alibaba Cloud; different VPCs are completely isolated from each other logically. A VPC is your own private network in the cloud, over which you have full control: you can choose its IP address range and configure route tables, gateways, and so on. For details, see the product documentation.

Create the VPC and the SAP HANA business and heartbeat subnets according to the plan. This example creates a VPC with the CIDR block 192.168.0.0/16 and the corresponding subnets shown below:


Creating the SAP HANA ECS instances

ECS purchase page

Go to the purchase page at https://www.aliyun.com/product/ecs, select the instance type, enter the configuration parameters, and click Confirm Order.

Select a billing method

Select a billing method: subscription or pay-as-you-go.

Select a region and zone

Select the region and zone. By default a zone is assigned at random; you can choose the one that suits you. For how to choose, see Regions and zones.
This example requires one instance of the same specification in each of China East 2 Zone C and Zone G.

Select an instance type

The Alibaba Cloud instance types currently certified for SAP HANA are:

Instance type | vCPU | Memory (GiB) | Architecture
ecs.r5.8xlarge | 32 | 256 | Skylake
ecs.r5.16xlarge | 64 | 512 | Skylake
ecs.se1.14xlarge | 56 | 480 | Broadwell
ecs.re4.20xlarge | 80 | 960 | Broadwell
ecs.re4.40xlarge | 160 | 1920 | Broadwell
ecs.re4e.40xlarge | 160 | 3840 | Broadwell

For details, see SAP Certified IaaS Platforms.

This example uses ecs.r5.8xlarge as the test instance.

Select an image

You can choose a public image, a custom image, a shared image, or an image from the marketplace.
For SAP HANA, choose the image type and version according to your actual needs.

This example uses the "SUSE 12 SP4 for SAP" image.

Configure storage

System disk: required; holds the operating system. Specify the cloud disk type and capacity.
Data disks: optional. If you create cloud disks as data disks at this point, you must select the disk type, capacity, and quantity, and decide whether to encrypt them. You can create empty disks or create disks from snapshots. Up to 16 cloud disks can be added as data disks.
Size the data disks according to the requirements of the SAP HANA instance.

In this example, /hana/data, /hana/log, and /hana/shared use three SSD cloud disks of equal capacity, striped with LVM to meet SAP HANA's performance requirements; the file system is XFS.


For SAP HANA storage requirements, see SAP HANA TDI - Storage Requirements.

Select the network type

Click Next: Networking to complete the network and security group settings:

1. Select the network type
Select the VPC and vSwitch according to the plan.

2. Set the public bandwidth
If your SAP HANA instance temporarily needs Internet access, select "Assign public IPv4 address" and configure a suitable bandwidth. If long-term Internet access is needed, a NAT gateway is recommended.

This example assigns 5 Mbit/s of Internet bandwidth to each ECS instance:


Select a security group

If you have not created a security group yet, create one manually to manage inbound and outbound network access for the ECS instances. Adapt the default security group rules to your company's fine-grained management needs by adding custom security group rules.

NIC configuration

Note: Do not add the second elastic network interface yet; add it after the ECS instance has been created.

Complete the system configuration and grouping settings to finish the ECS purchase.

Configuring elastic network interfaces

An elastic network interface (ENI) is a virtual NIC that can be attached to a VPC-type ECS instance. With ENIs you can build high-availability clusters, implement low-cost failover, and manage your network at a fine granularity. ENIs are supported in all regions. For details, see Elastic Network Interfaces.

Creating the ENIs

In this example, two ENIs are created for the heartbeat network according to the plan. Log on to the ECS console, choose Network & Security > ENI in the left navigation pane, select the region, and click Create ENI.

After the ENIs are created, bind each one to the corresponding ECS instance. After binding, they look like this:


After creating the ENIs in the console, you still need to log on to the operating system to configure the NICs. The operating system in this example is SUSE Linux; the configuration command is:
yast2 network

Configure the IP address, netmask, and other settings of the new ENI according to the plan:

Do not change the primary NIC's IP address directly in the operating system; doing so makes the ECS instance unreachable. If you must change the primary NIC's IP, see Modify the private IP address. If you have already changed and saved the primary NIC's IP address, you can restore the default settings through Console > ECS > Connect: set the primary NIC back to DHCP and restart the ECS instance.


After configuring the secondary node's heartbeat NIC in the same way, the result is as follows:

  # primary node
  hana-master:~ # ifconfig -a
  eth0 Link encap:Ethernet HWaddr 00:16:3E:16:D5:78
  inet addr:192.168.10.1 Bcast:192.168.10.255 Mask:255.255.255.0
  UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
  RX packets:10410623 errors:0 dropped:0 overruns:0 frame:0
  TX packets:2351934 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:15407579259 (14693.8 Mb) TX bytes:237438291 (226.4 Mb)
  eth1 Link encap:Ethernet HWaddr 00:16:3E:16:B3:CF
  inet addr:192.168.11.1 Bcast:192.168.11.255 Mask:255.255.255.0
  UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:0 (0.0 b) TX bytes:126 (126.0 b)
  lo Link encap:Local Loopback
  inet addr:127.0.0.1 Mask:255.0.0.0
  UP LOOPBACK RUNNING MTU:65536 Metric:1
  RX packets:3584482 errors:0 dropped:0 overruns:0 frame:0
  TX packets:3584482 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:8501733500 (8107.8 Mb) TX bytes:8501733500 (8107.8 Mb)
  # secondary node
  hana-slave:~ # ifconfig -a
  eth0 Link encap:Ethernet HWaddr 00:16:3E:16:D1:7B
  inet addr:192.168.30.1 Bcast:192.168.30.255 Mask:255.255.255.0
  UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
  RX packets:10305987 errors:0 dropped:0 overruns:0 frame:0
  TX packets:1281821 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:15387761017 (14674.9 Mb) TX bytes:176376293 (168.2 Mb)
  eth1 Link encap:Ethernet HWaddr 00:16:3E:0C:27:58
  inet addr:192.168.31.1 Bcast:192.168.31.255 Mask:255.255.255.0
  UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
  RX packets:1 errors:0 dropped:0 overruns:0 frame:0
  TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:42 (42.0 b) TX bytes:168 (168.0 b)
  lo Link encap:Local Loopback
  inet addr:127.0.0.1 Mask:255.0.0.0
  UP LOOPBACK RUNNING MTU:65536 Metric:1
  RX packets:3454980 errors:0 dropped:0 overruns:0 frame:0
  TX packets:3454980 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:8244241992 (7862.3 Mb) TX bytes:8244241992 (7862.3 Mb)

SAP HANA ECS configuration

Maintaining host names

Maintain host name resolution on both SAP HANA ECS instances of the HA cluster.
The /etc/hosts file in this example contains:

  127.0.0.1 localhost
  192.168.10.1 hana-master
  192.168.11.1 hana-01
  192.168.30.1 hana-slave
  192.168.31.1 hana-02

Configuring SSH mutual trust between the ECS instances

The two SAP HANA ECS instances of the HA cluster need SSH mutual trust, configured as follows.

Configure the authentication public key

Run the following commands on the SAP HANA primary node:

  hana-master:~ # ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/root/.ssh/id_rsa):
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_rsa.
  Your public key has been saved in /root/.ssh/id_rsa.pub.
  The key fingerprint is:
  SHA256:Glxb95CX3CYmyvIWP1+dGNjTAKyOA6OQVsGawLB1Mwc root@hana-master
  The key's randomart image is:
  +---[RSA 2048]----+
  |+ o.E.. .. |
  |.+ + + ..o o |
  |o = . o =.* o|
  | * + . = oo*oo |
  |. . . = S +. +.. |
  | . = + o + o|
  | . . o o. .o|
  | . o . |
  | . |
  +----[SHA256]-----+
  hana-master:~ # ssh-copy-id -i /root/.ssh/id_rsa.pub root@hana-slave
  /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
  The authenticity of host 'hana-slave (192.168.30.1)' can't be established.
  ECDSA key fingerprint is SHA256:FkjnE833pcHvtcDTFfOLDYzblmAp1wvBE5cT9xk69Po.
  Are you sure you want to continue connecting (yes/no)? yes
  /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
  /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
  Password:
  Number of key(s) added: 1
  Now try logging into the machine, with: "ssh 'root@hana-slave'"
  and check to make sure that only the key(s) you wanted were added.

Run the following commands on the SAP HANA secondary node:

  hana-slave:~ # ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/root/.ssh/id_rsa):
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_rsa.
  Your public key has been saved in /root/.ssh/id_rsa.pub.
  The key fingerprint is:
  SHA256:lYh/w8U+vRET/E5wCAgZx990sinegTAmHsech5CIF4M root@hana-slave
  The key's randomart image is:
  +---[RSA 2048]----+
  | ooo+X.+.o.. |
  | E oo=o%o. *.o|
  | ....=o=o+oO |
  | ..o o+.=oo|
  | S +.ooo+ |
  | . ....o.|
  | . |
  | |
  | |
  +----[SHA256]-----+
  hana-slave:~ # ssh-copy-id -i /root/.ssh/id_rsa.pub root@hana-master
  /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
  The authenticity of host 'hana-master (192.168.10.1)' can't be established.
  ECDSA key fingerprint is SHA256:zi+gKx4IFe6Ea12thsdVW9L3J93ZwFymo0+YOLjLJ18.
  Are you sure you want to continue connecting (yes/no)? yes
  /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
  /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
  Password:
  Number of key(s) added: 1
  Now try logging into the machine, with: "ssh 'root@hana-master'"
  and check to make sure that only the key(s) you wanted were added.

Verify the configuration

From each node, log on to the other node over SSH. If no password is required, mutual trust has been established.

  hana-master:~ # ssh hana-slave
  Last login: Wed Nov 13 10:16:35 2019 from 106.11.34.9
  hana-slave:~ # ssh hana-master
  Last login: Wed Nov 13 10:16:34 2019 from 106.11.34.9

ECS Metrics Collector for SAP monitoring agent

The ECS Metrics Collector is a monitoring agent that collects the virtual machine configuration and underlying physical resource usage information that SAP systems in the cloud need, for later performance statistics and problem analysis.

The Metrics Collector must be installed on every SAP application server and database. For deployment, see the ECS Metrics Collector for SAP deployment guide.

Partitioning the HANA file systems

Following the file system plan above, use LVM to manage and configure the cloud disks.

For an introduction to LVM, see the LVM HOWTO.

  • Create the PVs and VGs
  # pvcreate /dev/vdb /dev/vdc /dev/vdd /dev/vdg
  Physical volume "/dev/vdb" successfully created
  Physical volume "/dev/vdc" successfully created
  Physical volume "/dev/vdd" successfully created
  Physical volume "/dev/vdg" successfully created
  # vgcreate hanavg /dev/vdb /dev/vdc /dev/vdd
  Volume group "hanavg" successfully created
  # vgcreate sapvg /dev/vdg
  Volume group "sapvg" successfully created
  • Create the LVs (striping the three 500G SSD cloud disks)
  # lvcreate -l 100%FREE -n usrsaplv sapvg
  Logical volume "usrsaplv" created.
  # lvcreate -L 800G -n datalv -i 3 -I 64 hanavg
  Rounding size (204800 extents) up to stripe boundary size (204801 extents).
  Logical volume "datalv" created.
  # lvcreate -L 400G -n loglv -i 3 -I 64 hanavg
  Rounding size (102400 extents) up to stripe boundary size (102402 extents).
  Logical volume "loglv" created.
  # lvcreate -l 100%FREE -n sharedlv -i 3 -I 64 hanavg
  Rounding size (38395 extents) down to stripe boundary size (38394 extents)
  Logical volume "sharedlv" created.
  • Create the mount points and format the file systems
  # mkdir -p /usr/sap /hana/data /hana/log /hana/shared
  # mkfs.xfs /dev/sapvg/usrsaplv
  # mkfs.xfs /dev/hanavg/datalv
  # mkfs.xfs /dev/hanavg/loglv
  # mkfs.xfs /dev/hanavg/sharedlv
  • Mount the file systems and enable mounting at boot
  # vim /etc/fstab
  Add the following entries:
  /dev/mapper/hanavg-datalv /hana/data xfs defaults 0 0
  /dev/mapper/hanavg-loglv /hana/log xfs defaults 0 0
  /dev/mapper/hanavg-sharedlv /hana/shared xfs defaults 0 0
  /dev/mapper/sapvg-usrsaplv /usr/sap xfs defaults 0 0
  /dev/vdf swap swap defaults 0 0
  # mount -a
  # df -h
  Filesystem Size Used Avail Use% Mounted on
  devtmpfs 32G 0 32G 0% /dev
  tmpfs 48G 55M 48G 1% /dev/shm
  tmpfs 32G 768K 32G 1% /run
  tmpfs 32G 0 32G 0% /sys/fs/cgroup
  /dev/vda1 99G 30G 64G 32% /
  tmpfs 6.3G 16K 6.3G 1% /run/user/0
  /dev/mapper/hanavg-datalv 800G 34M 800G 1% /hana/data
  /dev/mapper/sapvg-usrsaplv 50G 33M 50G 1% /usr/sap
  /dev/mapper/hanavg-loglv 400G 33M 400G 1% /hana/log
  /dev/mapper/hanavg-sharedlv 300G 33M 300G 1% /hana/shared
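The "Rounding size ... to stripe boundary" messages come from lvcreate rounding the extent count to a multiple of the stripe count. A small sketch of that arithmetic for the fixed-size LVs, assuming the default 4 MiB physical extent size:

```python
import math

PE_MIB = 4  # default LVM physical extent size (assumption)

def extents_for(size_gib, stripes):
    """Extents lvcreate allocates for `-L <size>G -i <stripes>`:
    the requested extent count, rounded up to a multiple of the stripe count."""
    extents = size_gib * 1024 // PE_MIB
    return math.ceil(extents / stripes) * stripes

print(extents_for(800, 3))  # 204801, matching the datalv rounding message
print(extents_for(400, 3))  # 102402, matching the loglv rounding message
```

(`-l 100%FREE` rounds down instead, which is why sharedlv goes from 38395 to 38394 extents.)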

fence_aliyun and aliyun-vpc-move-ip installation and configuration

Preparing the environment

1. Install and check Python and pip. The fence agent supports Python 2.x, so make sure the Python and pip versions are 2.x:

  # Check the Python version
  python -V
  Python 2.7.13
  # Check the version of the pip package manager
  pip -V
  pip 19.3.1 from /usr/lib/python2.7/site-packages/pip (python 2.7)
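If you script this check, you can parse the version string rather than eyeball it. The helper below is hypothetical (not part of any SDK or the fence agent) and simply extracts the major version from `python -V` style output:

```python
def parse_major(version_output):
    """Extract the major version from output like 'Python 2.7.13' -> 2.
    Hypothetical helper for a setup script, shown for illustration."""
    return int(version_output.split()[1].split(".")[0])

print(parse_major("Python 2.7.13"))  # 2: the interpreter the fence agent expects here
```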

2. Install the Aliyun Python SDK

  pip install aliyun-python-sdk-core # Alibaba Cloud SDK core library
  pip install aliyun-python-sdk-ecs # Alibaba Cloud ECS management library
  # Check after installation
  pip list | grep aliyun-python
  aliyun-python-sdk-core 2.13.10
  aliyun-python-sdk-ecs 4.17.6

3. Install the latest version of the Aliyun CLI command-line tool (download page).

3.1 The latest version is currently 3.0.29. Download, install, and configure it as follows:

  # Download the aliyun cli
  wget https://github.com/aliyun/aliyun-cli/releases/download/v3.0.29/aliyun-cli-linux-3.0.29-amd64.tgz
  # Extract
  tar -xvf aliyun-cli-linux-3.0.29-amd64.tgz
  cp aliyun /usr/local/bin

3.2 Configure a RAM role

Using the Aliyun CLI with a RAM role reduces the security risk of a leaked AccessKey and also meets enterprise needs for fine-grained asset management.

3.2.1 Log on to the Alibaba Cloud console, choose RAM > RAM Roles, and click Create RAM Role.

3.2.2 Create a custom policy that allows only limited operations on the required ECS instances and VPC. Replace the instance IDs and the VPC ID with your own; this example uses:

  {
    "Version": "1",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "ecs:CreateInstance",
          "ecs:StopInstance",
          "ecs:RebootInstance",
          "ecs:DescribeInstances"
        ],
        "Resource": [
          "acs:ecs:cn-beijing::i-2ze2ujq5zpxxmaemlyfn",
          "acs:ecs:cn-beijing::i-2zefdluqx20n43jos4vj"
        ],
        "Condition": {}
      },
      {
        "Effect": "Allow",
        "Action": [
          "vpc:CreateRouteEntry",
          "vpc:DeleteRouteEntry",
          "vpc:DescribeRouteTables"
        ],
        "Resource": [
          "acs:vpc:cn-beijing::vpc-2zepuzul7un96diuhhxhz"
        ],
        "Condition": {}
      }
    ]
  }
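Before attaching the policy, it is worth confirming that it parses as valid JSON and grants the actions the two agents actually call: the fence agent needs to stop, reboot, and describe the instances, and the resource agent needs to manipulate route entries. A quick local check of the example policy:

```python
import json

policy = json.loads("""
{ "Version": "1",
  "Statement": [
    {"Effect": "Allow",
     "Action": ["ecs:CreateInstance", "ecs:StopInstance",
                "ecs:RebootInstance", "ecs:DescribeInstances"],
     "Resource": ["acs:ecs:cn-beijing::i-2ze2ujq5zpxxmaemlyfn",
                  "acs:ecs:cn-beijing::i-2zefdluqx20n43jos4vj"],
     "Condition": {}},
    {"Effect": "Allow",
     "Action": ["vpc:CreateRouteEntry", "vpc:DeleteRouteEntry",
                "vpc:DescribeRouteTables"],
     "Resource": ["acs:vpc:cn-beijing::vpc-2zepuzul7un96diuhhxhz"],
     "Condition": {}}
  ]}
""")

# Collect every granted action and verify both agents' needs are covered.
actions = {a for s in policy["Statement"] for a in s["Action"]}
assert {"ecs:StopInstance", "ecs:RebootInstance", "ecs:DescribeInstances"} <= actions
assert {"vpc:CreateRouteEntry", "vpc:DeleteRouteEntry"} <= actions
print("policy grants the required ECS and VPC actions")
```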

3.2.3 Attach the policy you just created to the SAP-HA-ROLE role.


3.2.4 Grant the RAM role to the SAP HANA ECS instances

In the console, choose ECS > More > Instance Settings > Grant/Revoke RAM Role, and grant the newly created SAP-HA-ROLE role.

3.3 Configure the Aliyun CLI

  # aliyun configure --profile ecsRamRoleProfile --mode EcsRamRole
  Configuring profile 'ecsRamRoleProfile' in 'EcsRamRole' authenticate mode...
  Ecs Ram Role []: SAP-HA-ROLE
  Default Region Id []: cn-beijing
  Default Output Format [json]: json (Only support json)
  Default Language [zh|en] en:
  Saving profile[ecsRamRoleProfile] ...Done.
  Configure Done!!!
  ..............888888888888888888888 ........=8888888888888888888D=..............
  ...........88888888888888888888888 ..........D8888888888888888888888I...........
  .........,8888888888888ZI: ...........................=Z88D8888888888D..........
  .........+88888888 ..........................................88888888D..........
  .........+88888888 .......Welcome to use Alibaba Cloud.......O8888888D..........
  .........+88888888 ............. ************* ..............O8888888D..........
  .........+88888888 .... Command Line Interface(Reloaded) ....O8888888D..........
  .........+88888888...........................................88888888D..........
  ..........D888888888888DO+. ..........................?ND888888888888D..........
  ...........O8888888888888888888888...........D8888888888888888888888=...........
  ............ .:D8888888888888888888.........78888888888888888888O ..............

Installing and configuring fence_aliyun

1. Download the latest fence_aliyun:

  curl https://raw.githubusercontent.com/ClusterLabs/fence-agents/master/agents/aliyun/fence_aliyun.py > /usr/sbin/fence_aliyun
  chmod 755 /usr/sbin/fence_aliyun
  chown root:root /usr/sbin/fence_aliyun

2. Adapt it to your environment:

  # Set the interpreter to python
  sed -i "1s|@PYTHON@|$(which python)|" /usr/sbin/fence_aliyun
  # Set the fence agent lib directory
  sed -i "s|@FENCEAGENTSLIBDIR@|/usr/share/fence|" /usr/sbin/fence_aliyun

3. Verify the installation:

  # stonith_admin -I |grep fence_aliyun
  100 devices found
  fence_aliyun

Installing and configuring aliyun-vpc-move-ip

1. Download the latest aliyun-vpc-move-ip:

  mkdir -p /usr/lib/ocf/resource.d/aliyun
  curl https://raw.githubusercontent.com/ClusterLabs/resource-agents/master/heartbeat/aliyun-vpc-move-ip > /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
  chmod 755 /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
  chown root:root /usr/lib/ocf/resource.d/aliyun/vpc-move-ip

2. Verify the installation:

  # ll /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
  -rwxr-xr-x 1 root root 9983 Nov 14 19:44 /usr/lib/ocf/resource.d/aliyun/vpc-move-ip

Configuring PrivateZone

Calling Alibaba Cloud OpenAPI normally requires Internet access. fence_aliyun and aliyun-vpc-move-ip, driven by SUSE Pacemaker, call the ECS and VPC OpenAPI to fence failed nodes and manage the floating IP; combined with PrivateZone, the ECS and VPC OpenAPI can be reached entirely over the internal network.

Log on to the Alibaba Cloud console, choose Alibaba Cloud DNS > PrivateZone, and create zones for ECS and VPC as in the following example:

1. Create zones named vpc.cn-beijing.aliyuncs.com and ecs.cn-beijing.aliyuncs.com (do not select "Recursive resolution proxy for subdomains").

2. Click Resolution Settings for each zone and add the CNAME records.

The CNAME for vpc.cn-beijing.aliyuncs.com:

The CNAME for ecs.cn-beijing.aliyuncs.com:

3. Associate the two zones with the VPC they serve.

4. Verify the configuration. Make sure neither ECS instance has an EIP attached or a NAT gateway configured, so that only the internal network is available. If ping returns normally, as below, the PrivateZone configuration is working:

  # ping vpc.cn-beijing.aliyuncs.com
  PING popunify-vpc.cn-beijing.aliyuncs.com (100.100.80.162) 56(84) bytes of data.
  64 bytes from 100.100.80.162: icmp_seq=1 ttl=102 time=0.065 ms
  64 bytes from 100.100.80.162: icmp_seq=2 ttl=102 time=0.087 ms
  64 bytes from 100.100.80.162: icmp_seq=3 ttl=102 time=0.106 ms
  64 bytes from 100.100.80.162: icmp_seq=4 ttl=102 time=0.107 ms
  --- popunify-vpc.cn-beijing.aliyuncs.com ping statistics ---
  4 packets transmitted, 4 received, 0% packet loss, time 3058ms
  rtt min/avg/max/mdev = 0.065/0.091/0.107/0.018 ms
  # ping ecs.cn-beijing.aliyuncs.com
  PING popunify-vpc.cn-beijing.aliyuncs.com (100.100.80.162) 56(84) bytes of data.
  64 bytes from 100.100.80.162: icmp_seq=1 ttl=102 time=0.065 ms
  64 bytes from 100.100.80.162: icmp_seq=2 ttl=102 time=0.093 ms
  64 bytes from 100.100.80.162: icmp_seq=3 ttl=102 time=0.129 ms
  64 bytes from 100.100.80.162: icmp_seq=4 ttl=102 time=0.102 ms
  --- popunify-vpc.cn-beijing.aliyuncs.com ping statistics ---
  4 packets transmitted, 4 received, 0% packet loss, time 3059ms
  rtt min/avg/max/mdev = 0.065/0.097/0.129/0.023 ms

5. The regions currently supported for the VPC and ECS endpoints, and the corresponding domain names, are as follows:

Region | Region ID | CNAME record | Endpoint (ECS zone name) | Endpoint (VPC zone name)
China North 2 (Beijing) | cn-beijing | popunify-vpc.cn-beijing.aliyuncs.com | ecs.cn-beijing.aliyuncs.com | vpc.cn-beijing.aliyuncs.com
China East 1 (Hangzhou) | cn-hangzhou | popunify-vpc.cn-hangzhou.aliyuncs.com | ecs.cn-hangzhou.aliyuncs.com | vpc.cn-hangzhou.aliyuncs.com
China East 2 (Shanghai) | cn-shanghai | popunify-vpc.cn-shanghai.aliyuncs.com | ecs.cn-shanghai.aliyuncs.com | vpc.cn-shanghai.aliyuncs.com
China South 1 (Shenzhen) | cn-shenzhen | popunify-vpc.cn-shenzhen.aliyuncs.com | ecs.cn-shenzhen.aliyuncs.com | vpc.cn-shenzhen.aliyuncs.com
China North 5 (Hohhot) | cn-huhehaote | popunify-vpc.cn-huhehaote.aliyuncs.com | ecs.cn-huhehaote.aliyuncs.com | vpc.cn-huhehaote.aliyuncs.com
China North 3 (Zhangjiakou) | cn-zhangjiakou | popunify-vpc.cn-zhangjiakou.aliyuncs.com | ecs.cn-zhangjiakou.aliyuncs.com | vpc.cn-zhangjiakou.aliyuncs.com
Germany (Frankfurt) | eu-central-1 | popunify-vpc.eu-central-1.aliyuncs.com | ecs.eu-central-1.aliyuncs.com | vpc.eu-central-1.aliyuncs.com
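The zone names and the CNAME target in the table follow a single pattern per region ID. A small helper (hypothetical, for illustration only, and assumed to hold only for the regions listed above) that derives them:

```python
def private_endpoints(region_id):
    """PrivateZone names and CNAME target for a region, following the
    pattern in the table above."""
    return {
        "ecs_zone": "ecs.{}.aliyuncs.com".format(region_id),
        "vpc_zone": "vpc.{}.aliyuncs.com".format(region_id),
        "cname": "popunify-vpc.{}.aliyuncs.com".format(region_id),
    }

print(private_endpoints("cn-beijing")["ecs_zone"])  # ecs.cn-beijing.aliyuncs.com
```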

Installing the HANA database

Note: The HANA primary and secondary nodes must have the same System ID and Instance ID. In this example, the HANA System ID is H01 and the Instance ID is 00.

For SAP HANA installation and configuration, see SAP HANA Platform.

Configuring HANA System Replication

For configuring SAP HANA System Replication, see How To Perform System Replication for SAP HANA.

SLES cluster HA installation and configuration

Installing the SUSE HAE software

For the SUSE HAE operation manual, see SUSE Linux Enterprise High Availability Extension 12.

On both the primary and secondary nodes, check whether the HAE and SAPHanaSR components are already installed.

This example uses a SUSE CSP (Cloud Service Provider) image, which comes preconfigured with the Alibaba Cloud SUSE SMT server, so the components can be checked and installed directly. With a custom or other image, you must first purchase a SUSE subscription and then either register with the official SUSE SMT server or configure the zypper repositories manually.

Configuring HAE and integrating HANA resource management requires the following components:

patterns-ha-ha_sles, which includes:

  • kernel, pacemaker, sbd, crm_mon, hawk2, corosync, fence-agents, etc.

patterns-sles-sap_server, which includes:

  • sapconf, uuidd, sap-locale, python, etc.

patterns-sle-gnome-basic, which includes:

  • the GNOME desktop, etc.

The command to install the components is:

  saphana-01:~ # zypper in patterns-sle-gnome-basic patterns-ha-ha_sles SAPHanaSR sap_suse_cluster_connector

Configuring the cluster

Generate the cluster configuration file

This example uses VNC to open a graphical session and configures Corosync on the HANA primary node:

  # yast2 cluster

Configure the communication channel

  • For Channel, select the heartbeat subnet; for Redundant Channel, select the business subnet.
  • Add the member addresses in the correct order (heartbeat address first, then business address).
  • Expected Votes: 2
  • Transport: Unicast


Configure security

Select "Enable Security Auth" and click Generate Auth Key File.


Configure Csync2

  • Add the sync hosts.
  • Click Add Suggested Files.
  • Click Generate Pre-Shared-Keys.
  • Click Turn csync2 ON.


Leave Configure conntrackd at its defaults and go straight to the next step.


Configure the service

  • Make sure the cluster service is not set to start automatically at boot.


After saving the configuration and exiting, copy the Corosync configuration files to the SAP HANA secondary node:

  # scp -pr /etc/corosync/authkey /etc/corosync/corosync.conf root@hana-slave:/etc/corosync/

Start the cluster
Run the following command on both nodes:

  # systemctl start pacemaker

Check the cluster status

Both nodes are now online; the resources will be configured in the integration step below.

  # crm_mon -r
  Stack: corosync
  Current DC: hana-slave (version 1.1.19+20181105.ccd6b5b10-3.13.1-1.1.19+20181105.ccd6b5b10) - partition with quorum
  Last updated: Thu Nov 14 14:47:00 2019
  Last change: Thu Nov 14 13:40:57 2019 by hacluster via crmd on hana-slave
  2 nodes configured
  0 resources configured
  Online: [ hana-master hana-slave ]
  No resources

Start the web-based graphical configuration

(1) Activate the Hawk2 service on both ECS instances:

  # passwd hacluster
  New password:
  Retype new password:
  passwd: password updated successfully
  # systemctl restart hawk.service

(2) Access Hawk2

Go to https://<HANA ECS IP address>:7630 and log on with the user name hacluster and its password.


Integrating SAP HANA with SUSE HAE

Configuring the SAP HANA resources with SAPHanaSR

On either cluster node, create a script file and replace the three parameters HANA SID, Instance Number, and HAVIP in the script.

In this example, the HANA SID is H01, the Instance Number is 00, and the HAVIP is 192.168.100.100; the script file is named HANA_HA_script.sh.

  primitive res_ALIYUN_STONITH_1 stonith:fence_aliyun \
  op monitor interval=120 timeout=60 \
  params plug=i-2ze2ujq5zpxxmaemlyfn ram_role=SAP-HA-ROLE region=cn-beijing \
  meta target-role=Started
  primitive res_ALIYUN_STONITH_2 stonith:fence_aliyun \
  op monitor interval=120 timeout=60 \
  params plug=i-2zefdluqx20n43jos4vj ram_role=SAP-HA-ROLE region=cn-beijing \
  meta target-role=Started
  ### SAPHanaTopology is a resource agent that monitors and analyzes the HANA landscape and communicates the status between the two nodes ###
  primitive rsc_SAPHanaTopology_HDB ocf:suse:SAPHanaTopology \
  operations $id=rsc_SAPHanaTopology_HDB-operations \
  op monitor interval=10 timeout=600 \
  op start interval=0 timeout=600 \
  op stop interval=0 timeout=300 \
  params SID=H01 InstanceNumber=00
  ### This defines the SAPHana resource in the cluster together with the HAVIP ###
  primitive rsc_SAPHana_HDB ocf:suse:SAPHana \
  operations $id=rsc_SAPHana_HDB-operations \
  op start interval=0 timeout=3600 \
  op stop interval=0 timeout=3600 \
  op promote interval=0 timeout=3600 \
  op monitor interval=60 role=Master timeout=700 \
  op monitor interval=61 role=Slave timeout=700 \
  params SID=H01 InstanceNumber=00 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
  ### Overlay IP resource setting ###
  primitive rsc_vip ocf:aliyun:vpc-move-ip \
  op monitor interval=60 \
  meta target-role=Started \
  params address=192.168.100.100 routing_table=vtb-2zequdzq4luddui4voe6x endpoint=vpc.cn-beijing.aliyuncs.com interface=eth0
  ms msl_SAPHana_HDB rsc_SAPHana_HDB \
  meta is-managed=true notify=true clone-max=2 clone-node-max=1 target-role=Started interleave=true maintenance=false
  clone cln_SAPHanaTopology_HDB rsc_SAPHanaTopology_HDB \
  meta is-managed=true clone-node-max=1 target-role=Started interleave=true maintenance=false
  colocation col_saphana_ip_HDB 2000: rsc_vip:Started msl_SAPHana_HDB:Master
  location loc_hana-master_stonith_not_on_hana-master res_ALIYUN_STONITH_1 -inf: hana-master
  ### STONITH 1 must not run on the primary node because it controls the primary node ###
  location loc_hana-slave_stonith_not_on_hana-slave res_ALIYUN_STONITH_2 -inf: hana-slave
  order ord_SAPHana_HDB Optional: cln_SAPHanaTopology_HDB msl_SAPHana_HDB
  property cib-bootstrap-options: \
  have-watchdog=false \
  cluster-infrastructure=corosync \
  cluster-name=cluster \
  stonith-enabled=true \
  stonith-action=off \
  stonith-timeout=150s \
  no-quorum-policy=ignore
  rsc_defaults rsc-options: \
  migration-threshold=5000 \
  resource-stickiness=1000
  op_defaults op-options: \
  timeout=600

Run the following command as the root user:

  crm configure load update HANA_HA_script.sh

Verifying the cluster status

Log on to the Hawk2 web console at https://<IP address>:7630.

Check the cluster's Status and Dashboard pages.


You can also log on to either node and check the current cluster status with crmsh:

  # crm_mon -r
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-01
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-01
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
  Masters: [ saphana-01 ]
  Slaves: [ saphana-02 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
  Started: [ saphana-01 saphana-02 ]

SAP HANA HA testing and maintenance