
RHCS Configuration Guide



1: Configure host information

(1) IP address plan

pacsserver1: 192.168.0.91 (eth0), 10.10.10.91 (eth1), 192.168.0.96 (fence device)
pacsserver2: 192.168.0.92 (eth0), 10.10.10.92 (eth1), 192.168.0.97 (fence device)
VIP: 192.168.0.90

Note: the fence device IP addresses are set in the BIOS; see the official IBM server configuration documentation for details.

(2) Edit /etc/hosts to the following content, keeping the file identical on both hosts:

127.0.0.1      localhost.localdomain localhost
::1            localhost6.localdomain6 localhost6
192.168.0.91   pacsserver1
10.10.10.91    pacsserver1
192.168.0.92   pacsserver2
10.10.10.92    pacsserver2
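After editing the hosts file, connectivity on both networks can be spot-checked from pacsserver1 (run the mirror-image commands from pacsserver2):

[root@pacsserver1 ~]# ping -c 2 192.168.0.92
[root@pacsserver1 ~]# ping -c 2 10.10.10.92
[root@pacsserver1 ~]# ping -c 2 pacsserver2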

2: Create the quorum disk

Create a partition on the shared storage to serve as the qdisk; a 10 MB partition is recommended.

[root@pacsserver1 ~]# fdisk /dev/sdb

The number of cylinders for this disk is set to 1044.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1044, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-1044, default 1044): +10M
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

[root@pacsserver1 ~]# mkqdisk -c /dev/sdb1 -l ha_qdisk
mkqdisk v0.6.0
Writing new quorum disk label 'ha_qdisk' to /dev/sdb1.
WARNING: About to destroy all data on /dev/sdb1; proceed [N/y] ? y
Initializing status block for node 1...
Initializing status block for node 2...
Initializing status block for node 3...
Initializing status block for node 4...
Initializing status block for node 5...
Initializing status block for node 6...
Initializing status block for node 7...
Initializing status block for node 8...
Initializing status block for node 9...
Initializing status block for node 10...
Initializing status block for node 11...
Initializing status block for node 12...
Initializing status block for node 13...
Initializing status block for node 14...
Initializing status block for node 15...
Initializing status block for node 16...

[root@pacsserver1 ~]# mkqdisk -L
mkqdisk v0.6.0
/dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:1:0-part1:
/dev/disk/by-uuid/8f7b3faa-d745-4bcf-8ddf-b39b6459f18c:
/dev/sdb1:
        Magic:                eb7a62c2
        Label:                ha_qdisk
        Created:              Tue Aug  9 06:06:44 2011
        Host:                 pacsserver1
        Kernel Sector Size:   512
        Recorded Sector Size: 512

Note: if the device is not visible on pacsserver2, reboot that node.
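A reboot can often be avoided by asking the kernel to re-read the partition table instead (a sketch; partprobe ships with the parted package on RHEL 5):

[root@pacsserver2 ~]# partprobe /dev/sdb
[root@pacsserver2 ~]# mkqdisk -L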

3: RHCS cluster configuration

Log in to one of the Linux servers and run the system-config-cluster command in a terminal to start the graphical configuration tool, shown below:

Click Create New Configuration to create the configuration file.

Set the cluster name to PACS_CLUSTER; the other settings are as shown above. Click OK when finished.

Under Cluster Nodes, click Add a Cluster Node to create node pacsserver1 with its vote set to 1. Add node pacsserver2 the same way.

Click Fence Devices to add fence devices to the cluster, as shown below:

Configure the settings as shown above to add the fence device pacsserver1imm; create the pacsserver2imm fence device the same way.
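Before relying on fencing, the IMM fence devices can be exercised by hand. A minimal sketch, assuming the IMMs answer IPMI on the fence addresses from part 1 and that USERID/PASSW0RD stand in for the real IMM credentials:

[root@pacsserver1 ~]# fence_ipmilan -a 192.168.0.97 -l USERID -p PASSW0RD -o status
[root@pacsserver2 ~]# fence_ipmilan -a 192.168.0.96 -l USERID -p PASSW0RD -o status

Each node checks the other node's fence device, since that is the path used during a real fence operation.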

Next, attach the fence devices to the hosts.

Select pacsserver1, add a new fence level, and then add the fence device to that fence level, as shown below:

Set up the same fence configuration for pacsserver2 in the same way.

Next, create the failover domain shown below. Name it pacs domain and add nodes pacsserver1 and pacsserver2 to it. The configuration is as follows:

Add the cluster resources: the IP address (192.168.0.90) and the database script.

Add the cluster service: select Services, click Create a Service, and set the service name to pacs service.

Add the cluster resources to the service: click Add a Shared Resource to this service and configure it as shown below.

The figure above shows the final configuration screen. Save the configuration file to the /etc/cluster directory.
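For reference, the saved file should look roughly like the sketch below. This is an illustrative reconstruction from the settings above, not a verbatim copy of the generated file; the fence agent (fence_ipmilan), the IMM credentials (admin/password), and the quorumd timing values are assumptions to adapt locally:

[root@pacsserver1 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster name="PACS_CLUSTER" config_version="1">
  <!-- expected_votes = 2 node votes + 1 qdisk vote -->
  <cman expected_votes="3"/>
  <quorumd interval="1" tko="10" votes="1" label="ha_qdisk"/>
  <clusternodes>
    <clusternode name="pacsserver1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="pacsserver1imm"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="pacsserver2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="pacsserver2imm"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pacsserver1imm" agent="fence_ipmilan" ipaddr="192.168.0.96" login="admin" passwd="password"/>
    <fencedevice name="pacsserver2imm" agent="fence_ipmilan" ipaddr="192.168.0.97" login="admin" passwd="password"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="pacs domain">
        <failoverdomainnode name="pacsserver1" priority="1"/>
        <failoverdomainnode name="pacsserver2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="192.168.0.90" monitor_link="1"/>
      <script name="oracle10g" file="/etc/init.d/oracle10g"/>
    </resources>
    <service name="pacs service" domain="pacs domain" autostart="1">
      <ip ref="192.168.0.90"/>
      <script ref="oracle10g"/>
    </service>
  </rm>
</cluster>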

4: Testing

1: Stop the database service on the active host to test failover (pacs service is running on pacsserver2).
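A minimal sketch of triggering the fault (the init-script path is taken from the log entries below):

[root@pacsserver2 ~]# /etc/init.d/oracle10g stop

The log on pacsserver2 then shows: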

Aug 9 14:12:15 pacsserver2 clurgmgrd: [22285]: script:oracle10g: status of /etc/init.d/oracle10g failed (returned 1)

Aug 9 14:12:15 pacsserver2 clurgmgrd[22285]: status on script "oracle10g" returned 1 (generic error)

Aug 9 14:12:15 pacsserver2 clurgmgrd[22285]: Stopping service service:pacs service
Aug 9 14:12:26 pacsserver2 clurgmgrd[22285]: Service service:pacs service is recovering

Aug 9 14:12:46 pacsserver2 clurgmgrd[22285]: Service service:pacs service is now running on member 1

The log on pacsserver1 shows:

Aug 9 14:12:14 pacsserver1 kernel: kjournald starting.  Commit interval 5 seconds
Aug 9 14:12:14 pacsserver1 kernel: EXT3 FS on sdb2, internal journal
Aug 9 14:12:14 pacsserver1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Aug 9 14:12:15 pacsserver1 kernel: kjournald starting.  Commit interval 5 seconds
Aug 9 14:12:15 pacsserver1 kernel: EXT3 FS on sdc1, internal journal
Aug 9 14:12:15 pacsserver1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Aug 9 14:12:33 pacsserver1 clurgmgrd[22727]: Service service:pacs service started

The management GUI now shows:

The failover succeeded; pacsserver1 has taken over the service.

2: Stop the rgmanager and cman services on the active host to test failover (the service is now running on pacsserver1).
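A sketch of the commands run on pacsserver1, in the same order as part 5 (rgmanager first, then cman):

[root@pacsserver1 ~]# service rgmanager stop
[root@pacsserver1 ~]# service cman stop

The log on pacsserver1 shows: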

Aug 9 14:26:35 pacsserver1 rgmanager: [32111]: Shutting down Cluster Service Manager...

Aug 9 14:26:35 pacsserver1 clurgmgrd[22727]: Shutting down

Aug 9 14:26:35 pacsserver1 clurgmgrd[22727]: Stopping service service:pacs service
Aug 9 14:27:20 pacsserver1 clurgmgrd[22727]: Service service:pacs service is stopped
Aug 9 14:27:20 pacsserver1 clurgmgrd[22727]: Shutdown complete, exiting

Aug 9 14:27:20 pacsserver1 rgmanager: [32111]: Cluster Service Manager is stopped.
Aug 9 14:27:53 pacsserver1 ccsd[22623]: Stopping ccsd, SIGTERM received.

Aug 9 14:27:54 pacsserver1 openais[22629]: [SERV ] AIS Executive exiting (reason: CMAN requested shutdown).

Aug 9 14:27:54 pacsserver1 fenced[22654]: cluster is down, exiting

Aug 9 14:27:54 pacsserver1 gfs_controld[22666]: cluster is down, exiting
Aug 9 14:27:54 pacsserver1 dlm_controld[22660]: cluster is down, exiting
Aug 9 14:27:54 pacsserver1 kernel: dlm: closing connection to node 2
Aug 9 14:27:54 pacsserver1 kernel: dlm: closing connection to node 1

The log on pacsserver2 shows:

Aug 9 14:27:33 pacsserver2 clurgmgrd[22285]: Member 1 shutting down

Aug 9 14:27:38 pacsserver2 clurgmgrd[22285]: Starting stopped service service:pacs service

Aug 9 14:27:39 pacsserver2 kernel: kjournald starting.  Commit interval 5 seconds
Aug 9 14:27:39 pacsserver2 kernel: EXT3 FS on sdb2, internal journal
Aug 9 14:27:39 pacsserver2 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Aug 9 14:27:39 pacsserver2 kernel: kjournald starting.  Commit interval 5 seconds
Aug 9 14:27:39 pacsserver2 kernel: EXT3 FS on sdc1, internal journal
Aug 9 14:27:39 pacsserver2 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Aug 9 14:27:43 pacsserver2 last message repeated 3 times
Aug 9 14:27:47 pacsserver2 scim-bridge: Panel client has not yet been prepared
Aug 9 14:27:47 pacsserver2 last message repeated 281 times
Aug 9 14:27:58 pacsserver2 clurgmgrd[22285]: Service service:pacs service started

Log in to pacsserver2 and check the GUI:

The failover succeeded; pacsserver2 has taken over the service.

3: Unplug the public network cable of the host running pacs service to test failover (the service is now running on pacsserver2).

The pacsserver1 log shows pacsserver2 being rebooted by its fence device, after which pacsserver1 takes over the service:

Aug 9 14:40:15 pacsserver1 fenced[397]: pacsserver2 not a cluster member after 0 sec post_fail_delay

Aug 9 14:40:16 pacsserver1 kernel: bnx2: eth1 NIC Copper Link is Down

Aug 9 14:40:19 pacsserver1 kernel: bnx2: eth1 NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON

Aug 9 14:40:23 pacsserver1 fenced[397]: fence "pacsserver2" success
Aug 9 14:40:24 pacsserver1 kernel: bnx2: eth1 NIC Copper Link is Down

Aug 9 14:40:25 pacsserver1 clurgmgrd[443]: Taking over service service:pacs service from down member pacsserver2

Aug 9 14:40:26 pacsserver1 kernel: kjournald starting.  Commit interval 5 seconds
Aug 9 14:40:26 pacsserver1 kernel: EXT3 FS on sdb2, internal journal

Aug 9 14:40:26 pacsserver1 kernel: EXT3-fs: mounted filesystem with ordered data mode.

Aug 9 14:40:27 pacsserver1 kernel: bnx2: eth1 NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON

Aug 9 14:40:28 pacsserver1 qdiskd[343]: Assuming master role

Aug 9 14:40:29 pacsserver1 qdiskd[343]: Writing eviction notice for node 2
Aug 9 14:40:30 pacsserver1 qdiskd[343]: Node 2 evicted

Aug 9 14:40:45 pacsserver1 clurgmgrd[443]: Service service:pacs service started

Log in to pacsserver1 and check the GUI:

The failover succeeded; pacsserver1 has taken over the service.

4: Power off the active host to test failover (pacs service is now running on pacsserver1).

The log on pacsserver2 shows:

Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ] Members Left:

Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ]      r(0) ip(192.168.0.91)
Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ] Members Joined:
Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ] CLM CONFIGURATION CHANGE
Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ] New Configuration:
Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ]      r(0) ip(192.168.0.92)
Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ] Members Left:
Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ] Members Joined:

Aug 9 14:59:07 pacsserver2 openais[5926]: [SYNC ] This node is within the primary component and will provide service.

Aug 9 14:59:07 pacsserver2 openais[5926]: [TOTEM] entering OPERATIONAL state.

Aug 9 14:59:07 pacsserver2 openais[5926]: [CLM ] got nodejoin message 192.168.0.92
Aug 9 14:59:07 pacsserver2 openais[5926]: [CPG ] got joinlist message from node 2
Aug 9 14:59:10 pacsserver2 kernel: bnx2: eth1 NIC Copper Link is Down

Aug 9 14:59:12 pacsserver2 kernel: bnx2: eth1 NIC Copper Link is Up, 100 Mbps full duplex, receive & transmit flow control ON

Aug 9 14:59:18 pacsserver2 clurgmgrd[5988]: Service service:pacs service started

Check the GUI:

The failover succeeded; pacsserver2 has taken over the service.

After the tests succeed, enable the qdiskd, cman, and rgmanager services in the system runlevels:

chkconfig --level 35 qdiskd on
chkconfig --level 35 cman on
chkconfig --level 35 rgmanager on
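The result can be double-checked on each node:

[root@pacsserver1 ~]# chkconfig --list | grep -E 'qdiskd|cman|rgmanager'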

5: RHCS administration

1: Start the cluster software services

These steps are performed on both pacsserver1 and pacsserver2, logged in as root.

In the desktop session, right-click and choose Open Terminal, then run the following commands:

1) service qdiskd start      (run on pacsserver1)
2) service qdiskd start      (run on pacsserver2)

These two commands can be run at the same time. Once they complete, continue with:

3) service cman start        (run on pacsserver1)
4) service cman start        (run on pacsserver2)

These two can also be run at the same time. Once they complete, continue with:

5) service rgmanager start   (run on pacsserver1)
6) service rgmanager start   (run on pacsserver2)
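Once rgmanager is running on both nodes, overall cluster health can be confirmed with clustat. The output below is an illustrative sketch of what a healthy cluster should report, not verbatim output:

[root@pacsserver1 ~]# clustat
Member Status: Quorate

 Member Name                  ID   Status
 pacsserver1                  1    Online, Local, rgmanager
 pacsserver2                  2    Online, rgmanager
 /dev/sdb1                    0    Online, Quorum Disk

 Service Name                 Owner (Last)    State
 service:pacs service         pacsserver1     started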

2: Stop the cluster software services

In the desktop session, right-click and choose Open Terminal, then run the following commands:

1) service rgmanager stop    (run on pacsserver1)
2) service rgmanager stop    (run on pacsserver2)

Once these complete, continue with:

3) service cman stop         (run on pacsserver1)
4) service cman stop         (run on pacsserver2)

Once these complete, continue with:

5) service qdiskd stop       (run on pacsserver1)
6) service qdiskd stop       (run on pacsserver2)

Remarks:

In this environment the disks sdb2 and sdc1 must be mounted (sdd1 is a spare and is not mounted yet). These two devices are mounted by mount commands added to the oracle10g script (a sketch of such a script follows the tune2fs commands below). To keep a forced filesystem check from stalling a mount during failover, disable the periodic checks with the following commands:

tune2fs -i 0 /dev/sdb2
tune2fs -c 0 /dev/sdb2
tune2fs -i 0 /dev/sdc1
tune2fs -c 0 /dev/sdc1
tune2fs -i 0 /dev/sdd1
tune2fs -c 0 /dev/sdd1
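Since the shared filesystems are mounted from inside the oracle10g script, that script must implement start, stop, and status the way rgmanager expects: exit 0 on success, non-zero on failure. Below is a minimal sketch only; the mount points /pacsdata and /pacsapp and the oracle account are hypothetical placeholders, and the real script must match the local Oracle installation:

#!/bin/bash
# /etc/init.d/oracle10g - rgmanager script resource (sketch)
# start: mount the shared disks, then start Oracle; stop: reverse order;
# status: a non-zero exit tells rgmanager the service has failed.
case "$1" in
start)
    mount /dev/sdb2 /pacsdata        # shared data disk (hypothetical mount point)
    mount /dev/sdc1 /pacsapp         # shared application disk (hypothetical mount point)
    su - oracle -c "lsnrctl start"
    su - oracle -c "echo startup | sqlplus / as sysdba"
    ;;
stop)
    su - oracle -c "echo shutdown immediate | sqlplus / as sysdba"
    su - oracle -c "lsnrctl stop"
    umount /pacsapp
    umount /pacsdata
    ;;
status)
    # Report failure unless the Oracle pmon background process is running.
    pgrep -f ora_pmon_ > /dev/null || exit 1
    ;;
esac
exit 0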
