Installing and Configuring Heartbeat on Linux

傷城~ 2022-08-10 04:47

Heartbeat is a widely used, open-source high-availability clustering system for Linux. Its two main HA components are the heartbeat (cluster messaging) service and resource takeover. This article briefly describes how to install heartbeat 2.1.4 on Linux and how to configure its three key configuration files.

For the concepts behind the heartbeat cluster components, see the companion article: HeartBeat 集群组件概述 (overview of the HeartBeat cluster components).

I. Installing heartbeat

  1. ### Prepare the installation files
  2. ### Heartbeat V2 is no longer maintained; the final V2 release is 2.1.4.
  3. ### For installation on Linux 6 the packages can be downloaded from the following link:
  4. ### For the Linux 5 series they can be downloaded here: https://dl.fedoraproject.org/pub/epel/5/x86_64/repoview/letter_h.group.html
  5. # rpm -Uvh PyXML-0.8.4-19.el6.x86_64.rpm
  6. # rpm -Uvh perl-MailTools-2.04-4.el6.noarch.rpm
  7. # rpm -Uvh perl-TimeDate-1.16-11.1.el6.noarch.rpm
  8. # rpm -Uvh libnet-1.1.6-7.el6.x86_64.rpm
  9. # rpm -Uvh ipvsadm-1.26-2.el6.x86_64.rpm
  10. # rpm -Uvh lm_sensors-libs-3.1.1-17.el6.x86_64.rpm
  11. # rpm -Uvh net-snmp-libs.x86_64.rpm
  12. # rpm -Uvh heartbeat-pils-2.1.4-12.el6.x86_64.rpm
  13. # rpm -Uvh heartbeat-stonith-2.1.4-12.el6.x86_64.rpm
  14. # rpm -Uvh heartbeat-2.1.4-12.el6.x86_64.rpm
  15. ### The following two packages are optional: one is the Heartbeat development package, the other (ldirectord) is for LVS
  16. # rpm -Uvh heartbeat-devel-2.1.4-12.el6.x86_64.rpm
  17. # rpm -Uvh heartbeat-ldirectord-2.1.4-12.el6.x86_64.rpm
  18. ### Verify the installed packages
  19. # rpm -qa |grep -i heartbeat
  20. heartbeat-2.1.4-12.el6.x86_64
  21. heartbeat-pils-2.1.4-12.el6.x86_64
  22. heartbeat-stonith-2.1.4-12.el6.x86_64
  23. heartbeat-ldirectord-2.1.4-12.el6.x86_64
  24. heartbeat-devel-2.1.4-12.el6.x86_64
  25. # Copy the sample configuration files to /etc/ha.d and adjust them as needed
  26. # cp /usr/share/doc/heartbeat-2.1.4/ha.cf /etc/ha.d/
  27. # cp /usr/share/doc/heartbeat-2.1.4/haresources /etc/ha.d/
  28. # cp /usr/share/doc/heartbeat-2.1.4/authkeys /etc/ha.d/
  29. #
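
Once the three configuration files described in the next section are in place on every node, the service is normally managed through its init script. A minimal sketch, assuming the stock EL6 service/chkconfig tools and the heartbeat init script installed by the RPM:

  # chkconfig heartbeat on      # start heartbeat automatically at boot
  # service heartbeat start     # start the daemon (run on each node)
  # service heartbeat status    # verify that it is running
  # tail -f /var/log/ha-log     # watch the log file configured in ha.cf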

II. Configuring heartbeat

The heartbeat configuration consists mainly of three files: ha.cf, authkeys and haresources. Each is described below.

1. ha.cf

  1. ha.cf is heartbeat's main configuration file. It covers roughly the following:
  2. log output level and location for debug and normal messages;
  3. heartbeat interval (keepalive), warning time (warntime), dead time (deadtime), initial dead time (initdead), etc.;
  4. heartbeat communication media: IP addresses, UDP port, serial device, baud rate, etc.;
  5. node names, fencing (STONITH) method, etc.
  6. Annotated sample file (a minimal working example is sketched after this listing):
  7. [root@orasrv1 ha.d]# more ha.cf
  8. #
  9. # There are lots of options in this file. All you have to have is a set
  10. # of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
  11. # and a value for "auto_failback".
  12. # Required settings: the node list {node ...}, one of {serial, bcast, mcast or ucast}, and a value for auto_failback
  13. #
  14. # ATTENTION: As the configuration file is read line by line,
  15. # THE ORDER OF DIRECTIVE MATTERS!
  16. # The configuration file is read line by line, and the order of the directives affects the result.
  17. #
  18. # In particular, make sure that the udpport, serial baud rate
  19. # etc. are set before the heartbeat media are defined!
  20. # debug and log file directives go into effect when they
  21. # are encountered.
  22. #
  23. # Make sure udpport, serial baud rate, etc. are defined before the heartbeat media,
  24. # i.e. define the port number before defining the NICs, serial ports and other heartbeat interfaces.
  25. #
  26. # All will be fine if you keep them ordered as in this example.
  27. # If you keep the order used in this example, the configuration will work correctly.
  28. #
  29. # Note on logging:
  30. # If all of debugfile, logfile and logfacility are not defined,
  31. # logging is the same as use_logd yes. In other case, they are
  32. # respectively effective. if detering the logging to syslog,
  33. # logfacility must be "none".
  34. # Notes on logging:
  35. # If none of debugfile, logfile and logfacility is defined, logging behaves as if use_logd yes were set.
  36. # Otherwise they take effect individually. To keep logs out of syslog, logfacility must be set to "none".
  37. #
  38. # File to write debug messages to
  39. # File to which debug messages are written
  40. #debugfile /var/log/ha-debug
  41. #
  42. #
  43. # File to write other messages to
  44. #
  45. # Specify a dedicated log file
  46. logfile /var/log/ha-log
  47. #
  48. #
  49. # Facility to use for syslog()/logger
  50. # Facility used for syslog()/logger; normally not recommended together with logfile
  51. #logfacility local0
  52. #
  53. #
  54. # A note on specifying "how long" times below...
  55. #
  56. # The default time unit is seconds
  57. # 10 means ten seconds
  58. #
  59. # You can also specify them in milliseconds
  60. # 1500ms means 1.5 seconds
  61. #
  62. #
  63. # keepalive: how long between heartbeats?
  64. # Interval between heartbeat packets
  65. #keepalive 2
  66. #
  67. # deadtime: how long-to-declare-host-dead?
  68. #
  69. # If you set this too low you will get the problematic
  70. # split-brain (or cluster partition) problem.
  71. # See the FAQ for how to use warntime to tune deadtime.
  72. # If this value is set too low you will run into split-brain (cluster partition) problems.
  73. # Time after which a node that has stopped sending heartbeats is declared dead
  74. #deadtime 30
  75. #
  76. # warntime: how long before issuing "late heartbeat" warning?
  77. # See the FAQ for how to use warntime to tune deadtime.
  78. #
  79. #
  80. # Time after which a "late heartbeat" warning is issued
  81. #warntime 10
  82. #
  83. #
  84. # Very first dead time (initdead)
  85. #
  86. # On some machines/OSes, etc. the network takes a while to come up
  87. # and start working right after you've been rebooted. As a result
  88. # we have a separate dead time for when things first come up.
  89. # It should be at least twice the normal dead time.
  90. # On some machines/OSes the network takes a while to come up and work correctly after a (re)boot.
  91. # A separate dead time is therefore used for initial start-up; it should be at least twice the normal dead time.
  92. #
  93. # Initial dead time
  94. #initdead 120
  95. #
  96. #
  97. # What UDP port to use for bcast/ucast communication?
  98. #
  99. # UDP port configuration
  100. #udpport 694
  101. #
  102. # Baud rate for serial ports...
  103. #
  104. # Baud rate configuration
  105. #baud 19200
  106. #
  107. # serial serialportname ...
  108. # Serial device name
  109. #serial /dev/ttyS0 # Linux
  110. #serial /dev/cuaa0 # FreeBSD
  111. #serial /dev/cuad0 # FreeBSD 6.x
  112. #serial /dev/cua/a # Solaris
  113. #
  114. #
  115. # What interfaces to broadcast heartbeats over?
  116. #
  117. # Network interface(s) over which broadcast heartbeats are sent
  118. #bcast eth0 # Linux
  119. #bcast eth1 eth2 # Linux
  120. #bcast le0 # Solaris
  121. #bcast le1 le2 # Solaris
  122. #
  123. # Set up a multicast heartbeat medium
  124. # mcast [dev] [mcast group] [port] [ttl] [loop]
  125. #
  126. # [dev] device to send/rcv heartbeats on
  127. # [mcast group] multicast group to join (class D multicast address
  128. # 224.0.0.0 - 239.255.255.255)
  129. # [port] udp port to sendto/rcvfrom (set this value to the
  130. # same value as "udpport" above)
  131. # [ttl] the ttl value for outbound heartbeats. this effects
  132. # how far the multicast packet will propagate. (0-255)
  133. # Must be greater than zero.
  134. # [loop] toggles loopback for outbound multicast heartbeats.
  135. # if enabled, an outbound packet will be looped back and
  136. # received by the interface it was sent on. (0 or 1)
  137. # Set this value to zero.
  138. #
  139. # Multicast configuration
  140. #mcast eth0 225.0.0.1 694 1 0
  141. #
  142. # Set up a unicast / udp heartbeat medium
  143. # ucast [dev] [peer-ip-addr]
  144. #
  145. # [dev] device to send/rcv heartbeats on
  146. # [peer-ip-addr] IP address of peer to send packets to
  147. #
  148. #
  149. #ucast eth0 192.168.1.2
  150. #
  151. # Broadcast, unicast and multicast each have pros and cons.
  152. # Unicast is mostly used in two-node setups, but then the two nodes cannot share an identical configuration file, because the peer IP addresses differ.
  153. #
  154. #
  155. # About boolean values...
  156. #
  157. # Values treated as true (case-insensitive):
  158. # Any of the following case-insensitive values will work for true:
  159. # true, on, yes, y, 1
  160. # Values treated as false (case-insensitive):
  161. # Any of the following case-insensitive values will work for false:
  162. # false, off, no, n, 0
  163. #
  164. #
  165. #
  166. #
  167. # auto_failback: determines whether a resource will
  168. # automatically fail back to its "primary" node, or remain
  169. # on whatever node is serving it until that node fails, or
  170. # an administrator intervenes.
  171. # Determines whether a resource automatically fails back to its "primary" node,
  172. # or keeps running on whichever node is serving it until that node fails or an administrator intervenes.
  173. #
  174. # The possible values for auto_failback are:
  175. # on - enable automatic failbacks
  176. # off - disable automatic failbacks
  177. # legacy - enable automatic failbacks in systems
  178. # where all nodes do not yet support
  179. # the auto_failback option.
  180. #
  181. # auto_failback "on" and "off" are backwards compatible with the old
  182. # "nice_failback on" setting.
  183. #
  184. # See the FAQ for information on how to convert
  185. # from "legacy" to "on" without a flash cut.
  186. # (i.e., using a "rolling upgrade" process)
  187. #
  188. # The default value for auto_failback is "legacy", which
  189. # will issue a warning at startup. So, make sure you put
  190. # an auto_failback directive in your ha.cf file.
  191. # (note: auto_failback can be any boolean or "legacy")
  192. #
  193. # Automatic failback setting
  194. auto_failback on
  195. #
  196. #
  197. # Basic STONITH support
  198. # Using this directive assumes that there is one stonith
  199. # device in the cluster. Parameters to this device are
  200. # read from a configuration file. The format of this line is:
  201. #
  202. # stonith <stonith_type> <configfile>
  203. #
  204. # NOTE: it is up to you to maintain this file on each node in the
  205. # cluster!
  206. #
  207. # Basic STONITH support (single stonith device)
  208. #stonith baytech /etc/ha.d/conf/stonith.baytech
  209. #
  210. # STONITH support
  211. # You can configure multiple stonith devices using this directive.
  212. # The format of the line is:
  213. # stonith_host <hostfrom> <stonith_type> <params...>
  214. # <hostfrom> is the machine the stonith device is attached
  215. # to or * to mean it is accessible from any host.
  216. # <stonith_type> is the type of stonith device (a list of
  217. # supported drives is in /usr/lib/stonith.)
  218. # <params...> are driver specific parameters. To see the
  219. # format for a particular device, run:
  220. # stonith -l -t <stonith_type>
  221. #
  222. #
  223. # Note that if you put your stonith device access information in
  224. # here, and you make this file publically readable, you're asking
  225. # for a denial of service attack ;-)
  226. #
  227. # To get a list of supported stonith devices, run
  228. # stonith -L
  229. # For detailed information on which stonith devices are supported
  230. # and their detailed configuration options, run this command:
  231. # stonith -h
  232. #
  233. #stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
  234. #stonith_host ken3 rps10 /dev/ttyS1 kathy 0
  235. #stonith_host kathy rps10 /dev/ttyS1 ken3 0
  236. #
  237. # Watchdog is the watchdog timer. If our own heart doesn't beat for
  238. # a minute, then our machine will reboot.
  239. # NOTE: If you are using the software watchdog, you very likely
  240. # wish to load the module with the parameter "nowayout=0" or
  241. # compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
  242. # an orderly shutdown of heartbeat will trigger a reboot, which is
  243. # very likely NOT what you want.
  244. #
  245. #watchdog计时器的配置
  246. #watchdog /dev/watchdog
  247. #
  248. # Tell what machines are in the cluster
  249. # node nodename ... -- must match uname -n
  250. #
  251. # Node names: important, each must match the output of uname -n on that node
  252. #node ken3
  253. #node kathy
  254. #
  255. # Less common options...
  256. #
  257. # Treats 10.10.10.254 as a pseudo-cluster-member
  258. # Used together with ipfail below...
  259. # note: don't use a cluster node as ping node
  260. # Treats 10.10.10.254 as a pseudo cluster member, used together with ipfail below.
  261. # Note: do not use a cluster node as a ping node; the gateway is usually a good choice.
  262. # The ping node helps decide quorum when the cluster membership is re-formed.
  263. #
  264. #ping 10.10.10.254
  265. #
  266. # Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
  267. # called group1. If either 10.10.10.254 or 10.10.10.253 are up
  268. # then group1 is up
  269. # Used together with ipfail below...
  270. # As above, except that the group is considered up if either of the two IPs answers ping
  271. #
  272. #ping_group group1 10.10.10.254 10.10.10.253
  273. #
  274. # HBA ping directive for Fiber Channel
  275. # Treats fc-card-name as pseudo-cluster-member
  276. # used with ipfail below ...
  277. #
  278. # You can obtain HBAAPI from http://hbaapi.sourceforge.net. You need
  279. # to get the library specific to your HBA directly from the vendor
  280. # To install HBAAPI stuff, all You need to do is to compile the common
  281. # part you obtained from the sourceforge. This will produce libHBAAPI.so
  282. # which you need to copy to /usr/lib. You need also copy hbaapi.h to
  283. # /usr/include.
  284. #
  285. # The fc-card-name is the name obtained from the hbaapitest program
  286. # that is part of the hbaapi package. Running hbaapitest will produce
  287. # a verbose output. One of the first line is similar to:
  288. # Adapter number 0 is named: qlogic-qla2200-0
  289. # Here fc-card-name is qlogic-qla2200-0.
  290. #
  291. #hbaping fc-card-name
  292. #
  293. #
  294. # Processes started and stopped with heartbeat. Restarted unless
  295. # they exit with rc=100
  296. # Processes listed here are started and stopped together with heartbeat.
  297. # They are monitored and restarted automatically if they die, unless they exit with rc=100.
  298. # The ipfail process detects and handles network failures; it works together with the ping node(s) defined by the ping directive.
  299. #
  300. #respawn userid /path/name/to/run
  301. #respawn hacluster /usr/lib/heartbeat/ipfail
  302. #
  303. # Access control for client api
  304. # default is no access
  305. #
  306. #apiauth client-name gid=gidlist uid=uidlist
  307. #apiauth ipfail gid=haclient uid=hacluster
  308. ######################################
  309. #
  310. # Unusual options.
  311. #
  312. ######################################
  313. #
  314. # hopfudge maximum hop count minus number of nodes in config
  315. #hopfudge 1
  316. #
  317. # deadping - dead time for ping nodes
  318. #deadping 30
  319. #
  320. # hbgenmethod - Heartbeat generation number creation method
  321. # Normally these are stored on disk and incremented as needed.
  322. #hbgenmethod time
  323. #
  324. # realtime - enable/disable realtime execution (high priority, etc.)
  325. # defaults to on
  326. #realtime off
  327. #
  328. # debug - set debug level
  329. # defaults to zero
  330. #debug 1
  331. #
  332. # API Authentication - replaces the fifo-permissions-based system of the past
  333. #
  334. # You can put a uid list and/or a gid list.
  335. # If you put both, then a process is authorized if it qualifies under either
  336. # the uid list, or under the gid list.
  337. #
  338. # The groupname "default" has special meaning. If it is specified, then
  339. # this will be used for authorizing groupless clients, and any client groups
  340. # not otherwise specified.
  341. #
  342. # There is a subtle exception to this. "default" will never be used in the
  343. # following cases (actual default auth directives noted in brackets)
  344. # ipfail (uid=HA_CCMUSER)
  345. # ccm (uid=HA_CCMUSER)
  346. # ping (gid=HA_APIGROUP)
  347. # cl_status (gid=HA_APIGROUP)
  348. #
  349. # This is done to avoid creating a gaping security hole and matches the most
  350. # likely desired configuration.
  351. # This avoids opening a gaping security hole and matches the configuration most installations want.
  352. #
  353. #apiauth ipfail uid=hacluster
  354. #apiauth ccm uid=hacluster
  355. #apiauth cms uid=hacluster
  356. #apiauth ping gid=haclient uid=alanr,root
  357. #apiauth default gid=haclient
  358. # message format in the wire, it can be classic or netstring,
  359. # default: classic
  360. #msgfmt classic/netstring
  361. # Do we use logging daemon?
  362. # If logging daemon is used, logfile/debugfile/logfacility in this file
  363. # are not meaningful any longer. You should check the config file for logging
  364. # daemon (the default is /etc/logd.cf)
  365. # more information can be found at http://www.linux-ha.org/ha_2ecf_2fUseLogdDirective
  366. # Setting use_logd to "yes" is recommended
  367. #
  368. # use_logd yes/no
  369. #
  370. # the interval we reconnect to logging daemon if the previous connection failed
  371. # default: 60 seconds
  372. #conn_logd_time 60
  373. #
  374. #
  375. # Configure compression module
  376. # It could be zlib or bz2, depending on whether u have the corresponding
  377. # library in the system.
  378. #compression bz2
  379. #
  380. # Configure compression threshold
  381. # This value determines the threshold to compress a message,
  382. # e.g. if the threshold is 1, then any message with size greater than 1 KB
  383. # will be compressed, the default is 2 (KB)
  384. #compression_threshold 2
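
Pulling the directives above together, a minimal two-node ha.cf might look like the sketch below. The node names node1/node2, the bcast interface and the ping address are placeholders for illustration; the node names must match uname -n on each host, and on 64-bit systems the ipfail path may be /usr/lib64/heartbeat/ipfail instead:

  logfile /var/log/ha-log
  keepalive 2
  deadtime 30
  warntime 10
  initdead 120
  udpport 694
  bcast eth0
  auto_failback on
  ping 10.10.10.254
  respawn hacluster /usr/lib/heartbeat/ipfail
  apiauth ipfail gid=haclient uid=hacluster
  node node1
  node node2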

2. authkeys (authentication configuration)

  1. This file configures heartbeat's authentication. Three methods are available: crc, md5 and sha1.
  2. Security increases in that order, and so does the amount of system resources consumed.
  3. crc offers the lowest security and is only suitable for physically secure networks; sha1 provides the strongest authentication and uses the most resources.
  4. The authkeys file must have mode 600 (i.e. -rw-------): chmod 600 authkeys
  5. The configuration format is:
  6. auth <number>
  7. <number> <method> [key]
  8. For example:
  9. auth 1
  10. 1 sha1 key-for-sha1
  11. The key value key-for-sha1 can be any string; the number must be the same on both lines (see the key-generation sketch after the sample file below).
  12. auth 2
  13. 2 crc
  14. The crc method does not require a key.
  15. Annotated sample file:
  16. [root@orasrv1 ha.d]# more authkeys
  17. #
  18. # Authentication file. Must be mode 600
  19. #
  20. # Must have exactly one auth directive at the front.
  21. # auth send authentication using this method-id
  22. #
  23. # Then, list the method and key that go with that method-id
  24. #
  25. # Available methods: crc, sha1, md5. Crc doesn't need/want a key.
  26. #
  27. # You normally only have one authentication method-id listed in this file
  28. #
  29. # Put more than one to make a smooth transition when changing auth
  30. # methods and/or keys.
  31. #
  32. #
  33. # sha1 is believed to be the "best", md5 next best.
  34. #
  35. # crc adds no security, except from packet corruption.
  36. # Use only on physically secure networks.
  37. #
  38. #auth 1
  39. #1 crc
  40. #2 sha1 HI!
  41. #3 md5 Hello!
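
Rather than typing a key by hand, the sha1 key can be generated from random data. A minimal sketch (the dd/sha1sum pipeline and the node2 hostname are illustrative, not something the heartbeat package requires; the resulting file should be identical on the nodes):

  # ( echo "auth 1"; echo -n "1 sha1 "; dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{print $1}' ) > /etc/ha.d/authkeys
  # chmod 600 /etc/ha.d/authkeys
  # scp /etc/ha.d/authkeys node2:/etc/ha.d/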

3. haresources (resource configuration)

  1. The haresources file specifies the cluster node, the cluster (service) IP address, netmask, broadcast address and the services to be started.
  2. The configuration format is:
  3. node-name network-config resource-group
  4. node-name: the cluster node name; it must match one of the hostnames set by the node directives in ha.cf.
  5. network-config: the network settings, including the cluster IP, netmask, broadcast address, etc.
  6. resource-group: the cluster services managed by heartbeat, i.e. the services heartbeat starts and stops.
  7. Every service taken over by heartbeat must be wrapped in a script that can be started and stopped via start/stop and placed in /etc/init.d/
  8. or /etc/ha.d/resource.d/; heartbeat locates the script by name in these directories and runs it to start or stop the resource.
  9. Example:
  10. node1 IPaddr::192.168.21.10/24/eth0 Filesystem::/dev/sdb2::/webdata::ext3 httpd tomcat
  11. node1:
  12. the node name
  13. IPaddr::192.168.21.10/24/eth0
  14. IPaddr is a script shipped with heartbeat, located in /etc/ha.d/resource.d.
  15. Heartbeat runs /etc/ha.d/resource.d/IPaddr 192.168.21.10/24 start,
  16. which brings up a virtual address 192.168.21.10 with netmask 255.255.255.0.
  17. This IP is the address heartbeat exposes to clients, and eth0 is the interface it is configured on.
  18. Filesystem::/dev/sdb2::/webdata::ext3
  19. Filesystem is a script shipped with heartbeat, located in /etc/ha.d/resource.d.
  20. It mounts the shared disk partition, equivalent to running mount -t ext3 /dev/sdb2 /webdata on the command line.
  21. httpd tomcat
  22. Starts the httpd and tomcat services, in that order.
  23. Note: with multiple network interfaces on different subnets, the VIP is normally bound as an alias on the interface whose address is in the same subnet as the VIP.
  24. For example: eth0: 172.16.100.6, eth1: 192.168.0.6, VIP: 172.16.100.5
  25. The VIP is bound to eth0 because the two addresses are in the same subnet; the interface selection is done by /usr/lib64/heartbeat/findif.
  26. Annotated sample file (a sketch for testing the resource scripts by hand follows this listing):
  27. [root@orasrv1 ha.d]# more haresources
  28. #
  29. # This is a list of resources that move from machine to machine as
  30. # nodes go down and come up in the cluster. Do not include
  31. # "administrative" or fixed IP addresses in this file.
  32. # As nodes go down and come up, the resources listed here move from one node to another.
  33. # Do not include administrative or fixed IP addresses already configured on the servers in this file.
  34. # <VERY IMPORTANT NOTE>
  35. # The haresources files MUST BE IDENTICAL on all nodes of the cluster.
  36. # The haresources file must be identical on all cluster nodes.
  37. #
  38. # The node names listed in front of the resource group information
  39. # is the name of the preferred node to run the service. It is
  40. # not necessarily the name of the current machine. If you are running
  41. # auto_failback ON (or legacy), then these services will be started
  42. # up on the preferred nodes - any time they're up.
  43. #
  44. # The node name listed in front of the resource group is the preferred node for the service; it need not be the current machine. With auto_failback on
  45. # (or legacy), these services are started on the preferred node whenever that node is up.
  46. #
  47. # If you are running with auto_failback OFF, then the node information
  48. # will be used in the case of a simultaneous start-up, or when using
  49. # the hb_standby {foreign,local} command.
  50. # With auto_failback off, the node information is used on a simultaneous start-up, or when the hb_standby {foreign,local} command is used.
  51. #
  52. # BUT FOR ALL OF THESE CASES, the haresources files MUST BE IDENTICAL.
  53. # If your files are different then almost certainly something
  54. # won't work right.
  55. # But in all of these cases the haresources files must be identical; if they differ, something will almost certainly not work correctly.
  56. # </VERY IMPORTANT NOTE>
  57. #
  58. #
  59. # We refer to this file when we're coming up, and when a machine is being
  60. # taken over after going down.
  61. # This file is consulted when a node comes up and when a failed node is taken over.
  62. #
  63. # You need to make this right for your installation, then install it in
  64. # /etc/ha.d
  65. # Adapt it to your installation and place it in /etc/ha.d.
  66. #
  67. # Each logical line in the file constitutes a "resource group".
  68. # A resource group is a list of resources which move together from
  69. # one node to another - in the order listed. It is assumed that there
  70. # is no relationship between different resource groups. These
  71. # resource in a resource group are started left-to-right, and stopped
  72. # right-to-left. Long lists of resources can be continued from line
  73. # to line by ending the lines with backslashes ("\").
  74. #
  75. # Each logical line in this file constitutes a "resource group".
  76. # A resource group is an ordered list of resources that move together from one node to another.
  77. # Different resource groups are assumed to be unrelated; resources in a group are started left to right and stopped right to left.
  78. # Long resource lists can be continued across lines by ending a line with a backslash ("\").
  79. #
  80. # These resources in this file are either IP addresses, or the name
  81. # of scripts to run to "start" or "stop" the given resource.
  82. # The resources in this file are either IP addresses or the names of scripts run to "start" or "stop" the given resource.
  83. #
  84. # The format is like this:
  85. # Example:
  86. #
  87. #node-name resource1 resource2 ... resourceN
  88. #
  89. #
  90. # If the resource name contains an :: in the middle of it, the
  91. # part after the :: is passed to the resource script as an argument.
  92. # Multiple arguments are separated by the :: delimiter
  93. # If a resource name contains ::, the part after :: is passed to the resource script as an argument; multiple arguments are separated by ::.
  94. #
  95. # In the case of IP addresses, the resource script name IPaddr is
  96. # implied.
  97. # For IP addresses, the resource script name IPaddr is implied.
  98. #
  99. # For example, the IP address 135.9.8.7 could also be represented
  100. # as IPaddr::135.9.8.7
  101. # For example, the IP address 135.9.8.7 can also be written as IPaddr::135.9.8.7
  102. #
  103. # THIS IS IMPORTANT!! vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  104. #
  105. # The given IP address is directed to an interface which has a route
  106. # to the given address. This means you have to have a net route
  107. # set up outside of the High-Availability structure. We don't set it
  108. # up here -- we key off of it.
  109. # The given IP address is assigned to an interface that already has a route to that address,
  110. # which means the network route must be set up outside of the high-availability configuration.
  111. #
  112. # The broadcast address for the IP alias that is created to support
  113. # an IP address defaults to the highest address on the subnet.
  114. # The broadcast address of the created IP alias defaults to the highest address of the subnet.
  115. #
  116. # The netmask for the IP alias that is created defaults to the same
  117. # netmask as the route that it selected in the step above.
  118. # The netmask of the created IP alias defaults to the netmask of the route selected in the previous step.
  119. #
  120. # The base interface for the IPalias that is created defaults to the
  121. # same interface as the route that was selected in the step above.
  122. # i.e. the alias is added on the interface used by the route selected above.
  123. #
  124. # If you want to specify that this IP address is to be brought up
  125. # on a subnet with a netmask of 255.255.255.0, you would specify
  126. # this as IPaddr::135.9.8.7/24 .
  127. # The above shows how to specify the netmask.
  128. #
  129. # If you wished to tell it that the broadcast address for this subnet
  130. # was 135.9.8.210, then you would specify that this way:
  131. # IPaddr::135.9.8.7/24/135.9.8.210
  132. # The above shows how to specify the broadcast address.
  133. #
  134. # If you wished to tell it that the interface to add the address to
  135. # is eth0, then you would need to specify it this way:
  136. # IPaddr::135.9.8.7/24/eth0
  137. # If you want to specify that the interface to add the address to is eth0: IPaddr::135.9.8.7/24/eth0
  138. #
  139. # And this way to specify both the broadcast address and the
  140. # interface:
  141. # IPaddr::135.9.8.7/24/eth0/135.9.8.210
  142. # To specify both the broadcast address and the interface: IPaddr::135.9.8.7/24/eth0/135.9.8.210
  143. #
  144. # The IP addresses you list in this file are called "service" addresses,
  145. # since they're the publicly advertised addresses that clients
  146. # use to get at highly available services.
  147. # i.e. the service addresses are what clients connect to in order to reach the highly available services.
  148. #
  149. # For a hot/standby (non load-sharing) 2-node system with only
  150. # a single service address,
  151. # you will probably only put one system name and one IP address in here.
  152. # The name you give the address to is the name of the default "hot"
  153. # system.
  154. # For a hot/standby (non load-sharing) two-node system with a single service address, you will probably only put one system name and one IP address here.
  155. # The name you give the address to is the name of the default "hot" system.
  156. #
  157. # Where the nodename is the name of the node which "normally" owns the
  158. # resource. If this machine is up, it will always have the resource
  159. # it is shown as owning.
  160. # The node name is the node that "normally" owns the resource.
  161. # Whenever that machine is up, it will own the resources shown.
  162. #
  163. # The string you put in for nodename must match the uname -n name
  164. # of your machine. Depending on how you have it administered, it could
  165. # be a short name or a FQDN.
  166. # The node name must match the output of uname -n (short name or FQDN, depending on how the host is administered).
  167. #-------------------------------------------------------------------
  168. #
  169. # Simple case: One service address, default subnet and netmask
  170. # No servers that go up and down with the IP address
  171. # One service address, default subnet and netmask, no services started or stopped together with the IP address
  172. #
  173. #just.linux-ha.org 135.9.216.110
  174. #
  175. #-------------------------------------------------------------------
  176. #
  177. # Assuming the administrative addresses are on the same subnet...
  178. # A little more complex case: One service address, default subnet
  179. # and netmask, and you want to start and stop http when you get
  180. # the IP address...
  181. # Assuming the administrative addresses are on the same subnet...
  182. # A slightly more complex case: one service address, default subnet and netmask, and http is started and stopped together with the IP address.
  183. #
  184. #just.linux-ha.org 135.9.216.110 http
  185. #-------------------------------------------------------------------
  186. #
  187. # A little more complex case: Three service addresses, default subnet
  188. # and netmask, and you want to start and stop http when you get
  189. # the IP address...
  190. # A slightly more complex case: three service addresses, default subnet and netmask, and http is started and stopped together with the IP addresses.
  191. #
  192. #just.linux-ha.org 135.9.216.110 135.9.215.111 135.9.216.112 httpd
  193. #-------------------------------------------------------------------
  194. #
  195. # One service address, with the subnet, interface and bcast addr
  196. # explicitly defined.
  197. # One service address, with the subnet, interface and broadcast address explicitly defined
  198. #
  199. #just.linux-ha.org 135.9.216.3/28/eth0/135.9.216.12 httpd
  200. #
  201. #-------------------------------------------------------------------
  202. #
  203. # An example where a shared filesystem is to be used.
  204. # Note that multiple arguments are passed to this script using
  205. # the delimiter '::' to separate each argument.
  206. # An example where a shared filesystem is used.
  207. # Note that the multiple arguments separated by '::' are passed to the script.
  208. #
  209. #node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
  210. #
  211. # Regarding the node-names in this file:
  212. #
  213. # They must match the names of the nodes listed in ha.cf, which in turn
  214. # must match the `uname -n` of some node in the cluster. So they aren't
  215. # virtual in any sense of the word.
  216. #
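
Because every entry in a resource group is ultimately a script called with start/stop, the resources can be tested by hand before heartbeat manages them. A minimal sketch using the illustrative resources from this section (the IP, device, mount point and service names are the example values above; the script arguments mirror the :: fields of the haresources line):

  # /etc/ha.d/resource.d/IPaddr 192.168.21.10/24/eth0 start          # bring the service IP up
  # /etc/ha.d/resource.d/IPaddr 192.168.21.10/24/eth0 stop           # take it down again
  # /etc/ha.d/resource.d/Filesystem /dev/sdb2 /webdata ext3 start    # mount the shared filesystem
  # /etc/ha.d/resource.d/Filesystem /dev/sdb2 /webdata ext3 stop
  # /etc/init.d/httpd start; /etc/init.d/httpd stop                  # service scripts must accept start/stop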

III. Other related cluster configuration (details omitted)

a. Configure host-name resolution (/etc/hosts) on all nodes
b. Configure node equivalence (e.g. passwordless SSH authentication)
c. Configure the services to be made highly available (e.g. httpd, mysqld) and disable their automatic start at boot (sketched below)
d. If shared storage is needed, configure the corresponding storage system as well
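
A brief sketch of items a and c, assuming two hypothetical nodes node1/node2 with illustrative addresses and an httpd/mysqld service set; adjust the names and addresses to your environment (the host names must match uname -n):

  # echo "192.168.21.11   node1" >> /etc/hosts    # run on both nodes
  # echo "192.168.21.12   node2" >> /etc/hosts
  # chkconfig httpd off                           # heartbeat, not init, starts the HA-managed services
  # chkconfig mysqld off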
