Test cases for cluster components in Ubuntu 10.04
Each test is enumerated. If you follow these steps, you shouldn't have problems. Note that each step is marked with [ALL] or [ONE]. If it's marked with [ALL], repeat it on each server in your cluster. If it's marked with [ONE], pick one server and do that step only on that server.
1. [ALL] Add testing PPA
Add this PPA to your /etc/apt/sources.list:
deb http://ppa.launchpad.net/ivoks/ppa/ubuntu lucid main
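If you prefer not to edit the file by hand, the line can be appended from the shell. The following is only a sketch: the helper name add_apt_source is made up for this example, and the apt-get update call afterwards is an addition not spelled out in the step above (apt needs it to see the new repository).

```shell
# Append a line to an apt sources file only if it is not already there.
# add_apt_source is a hypothetical helper name used for illustration.
# Usage: add_apt_source "<deb line>" [file]
add_apt_source() {
    line=$1
    file=${2:-/etc/apt/sources.list}
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (run as root), followed by a package index refresh:
#   add_apt_source "deb http://ppa.launchpad.net/ivoks/ppa/ubuntu lucid main"
#   apt-get update
```

Running the helper twice leaves a single copy of the line, so it is safe to repeat on a server that was already prepared.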
2. [ALL] install pacemaker
sudo apt-get install pacemaker
Edit /etc/default/corosync and enable corosync by setting START=yes.
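The same change can be scripted, which helps when the step has to be repeated on every server. This is a minimal sketch; enable_corosync is a hypothetical helper name, and it assumes the file holds at most one line beginning with START=.

```shell
# Set START=yes in a defaults file: rewrite an existing START= line,
# or append one if the file has none.
# enable_corosync is a hypothetical helper name used for illustration.
enable_corosync() {
    file=${1:-/etc/default/corosync}
    if grep -q '^START=' "$file"; then
        tmp=$(mktemp)
        sed 's/^START=.*/START=yes/' "$file" > "$tmp"
        # Write back through the original file to keep its owner and mode
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf 'START=yes\n' >> "$file"
    fi
}

# Example (run as root):
#   enable_corosync
```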
3. [ONE] generate corosync authkey
sudo corosync-keygen
(This can take a while if there is not enough entropy. To speed it up, download an Ubuntu ISO image on the same machine while the key is being generated, or type on the keyboard to generate entropy.)
Copy /etc/corosync/authkey to all servers that will form this cluster (make sure it is owned by root:root and has 400 permissions).
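One possible way to distribute the key and confirm its permissions is sketched below. The scp and ssh lines assume root SSH access between the servers, which this page does not set up, and lucidcluster2 stands in for each of your other servers. has_mode_400 is a hypothetical helper that checks the mode only; ownership still has to be set with chown.

```shell
# Return success if the given file has exactly mode 400 (-r--------).
# has_mode_400 is a hypothetical helper name used for illustration.
has_mode_400() {
    # ls -l prints the permission string in the first 10 characters
    [ "$(ls -ld -- "$1" | cut -c1-10)" = "-r--------" ]
}

# Example, run as root on the server that generated the key
# (assumes root SSH access; repeat for every other server):
#   scp -p /etc/corosync/authkey root@lucidcluster2:/etc/corosync/authkey
#   ssh root@lucidcluster2 'chown root:root /etc/corosync/authkey && chmod 400 /etc/corosync/authkey'
# Then, on each server:
#   has_mode_400 /etc/corosync/authkey || echo "authkey permissions are too open"
```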
4. [ALL] configure corosync
In /etc/corosync/corosync.conf, replace bindnetaddr (127.0.0.1 by default) with the network address of the subnet your server is on. For a typical /24 network (netmask 255.255.255.0), that means replacing the last octet of your IP with 0. For example, if your IP is 192.168.1.101, you would put 192.168.1.0.
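If your netmask is not 255.255.255.0, the network address is the bitwise AND of the IP address and the netmask. A small sketch of that arithmetic follows; network_address is a hypothetical helper name, and it assumes well-formed dotted-quad input.

```shell
# Print the network address for an IPv4 address and a dotted netmask
# by AND-ing the two octet by octet.
# network_address is a hypothetical helper name used for illustration.
network_address() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both values on dots: $1..$4 are IP octets, $5..$8 mask octets
    set -- $ip $mask
    IFS=$old_ifs
    printf '%d.%d.%d.%d\n' "$(($1 & $5))" "$(($2 & $6))" "$(($3 & $7))" "$(($4 & $8))"
}

# Example:
#   network_address 192.168.1.101 255.255.255.0   # prints 192.168.1.0
```

The printed value is what goes into bindnetaddr.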
5. [ALL] start corosync
sudo /etc/init.d/corosync start
Now your cluster is configured and ready to monitor, stop and start your services on all your cluster servers.
6. [ALL] install services that will fail over between servers
In this example, I'm installing apache2 and vsftpd. You may install any other service...
sudo apt-get install apache2 vsftpd
Disable their init scripts:
sudo update-rc.d -f apache2 remove
sudo update-rc.d -f vsftpd remove
7. [ONE] add some services
In this example, I'll create failover for the apache2 and vsftpd services. I'll also add two additional IPs and tie apache2 to one of them, while vsftpd will be grouped with the other.
sudo crm configure edit
If you get an empty file, close it, wait a few seconds (10-20), and try again. You should get something like this:
node lucidcluster1
node lucidcluster2
node lucidcluster3
property $id="cib-bootstrap-options" \
        dc-version="1.0.6-fdba003eafa6af1b8d81b017aa535a949606ca0d" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2"
Add the following lines below the 'node' declarations. Replace X.X.X.X and X.X.X.Y with the addresses that will fail over; do not put the IPs of your servers there. Do not save and exit after adding these lines:
primitive apache2 lsb:apache2 op monitor interval="5s"
primitive vsftpd lsb:vsftpd op monitor interval="5s"
primitive ip1 ocf:heartbeat:IPaddr2 params ip="X.X.X.X" nic="eth0"
primitive ip2 ocf:heartbeat:IPaddr2 params ip="X.X.X.Y" nic="eth0"
group group1 ip1 apache2
group group2 ip2 vsftpd
order apache_after_ip inf: ip1:start apache2:start
order vsftpd_after_ip inf: ip2:start vsftpd:start
Now that you've put some services into the configuration, you should also define how many servers are needed for a quorum and which stonith devices will be used. For this test, we won't use stonith devices.
Under property, set expected-quorum-votes and add stonith-enabled, so that it looks like this (don't forget the trailing '\'!). Replace 'X' with the number of servers needed for quorum. If N is the number of servers in the cluster, X should be less than or equal to N-1, and should not be 1 unless the cluster has only two servers:
property $id="cib-bootstrap-options" \
        dc-version="1.0.6-fdba003eafa6af1b8d81b017aa535a949606ca0d" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="X" \
        stonith-enabled="false"
Save and quit.
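The rule for choosing X in this step can be written down as a small check. This sketch only encodes the rule as stated on this page (X at most N-1, and X equal to 1 only in a two-server cluster). valid_quorum_votes is a hypothetical helper name, not part of pacemaker.

```shell
# Return success if X is an acceptable expected-quorum-votes value
# for a cluster of N servers, following the rule given on this page.
# valid_quorum_votes is a hypothetical helper name used for illustration.
# Usage: valid_quorum_votes X N
valid_quorum_votes() {
    x=$1
    n=$2
    # X must be at least 1 and no larger than N-1
    if [ "$x" -lt 1 ] || [ "$x" -gt $((n - 1)) ]; then
        return 1
    fi
    # X=1 is only acceptable in a two-server cluster
    if [ "$x" -eq 1 ] && [ "$n" -ne 2 ]; then
        return 1
    fi
    return 0
}

# Example:
#   valid_quorum_votes 2 3 && echo "ok for a three-server cluster"
```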
8. [ALL] monitor and stress test
On each server, start crm_mon (sudo crm_mon) and monitor how services are grouped and started. Then, one by one, reboot or shut down servers, leaving at least one running.
First test with normal shutdown, then with pulling the AC plug (destroying domains in KVM).
In all these cases, once the servers are back up, they should show as Online after some time (monitor server status in crm_mon). Services should migrate between them without problems.
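For repeated test runs, the Online check can be done without watching the screen by using crm_mon's one-shot mode (crm_mon -1). The sketch below assumes the node summary appears on a line of the form "Online: [ name1 name2 ]" starting in the first column. If your crm_mon version formats it differently, the pattern needs adjusting. count_online_nodes is a hypothetical helper name.

```shell
# Read crm_mon one-shot output on stdin and print how many nodes are
# listed on the "Online: [ ... ]" line (0 if there is no such line).
# Assumes that line format; count_online_nodes is a hypothetical helper.
count_online_nodes() {
    # Fields on the line are: "Online:", "[", the node names, and "]"
    awk '/^Online: \[/ { n += NF - 3 } END { print n + 0 }'
}

# Example:
#   sudo crm_mon -1 | count_online_nodes
```

Comparing the printed number with the number of servers in the cluster after each reboot tells you whether all of them have rejoined.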