Natty

Machine Overview

Machine      Hostname   IP Address
Machine 1    natty1     192.168.122.100
Machine 2    natty2     192.168.122.101

Pacemaker/Corosync

[both] Install Pacemaker and Corosync

sudo apt-get install pacemaker corosync

[natty1] Generate Corosync keys

For the cluster nodes to communicate with each other, they need a key so that Corosync can authenticate the packets sent between them. This key is generated once and then copied to the other nodes in the cluster:

sudo corosync-keygen

After executing the command, you need to generate entropy so that enough random bits are available to create the key. This can be achieved in many ways, such as pressing keys, moving the mouse, downloading files from the Internet, or installing/upgrading packages on your system.
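
For example, if corosync-keygen stalls waiting for entropy, disk activity can be generated from a second terminal. This is just one possible way to do it; any I/O-heavy command will work:

sudo find / -type f -exec cat {} \; > /dev/null 2>&1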

Once the key generation has finished, you need to copy the key to the other nodes. From natty1 do as follows:

sudo scp /etc/corosync/authkey <user>@<cluster-node-ip>:/home/<user>

In natty2 do as follows:

sudo mv ~/authkey /etc/corosync
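
It is also worth making sure the key on natty2 ends up with the same ownership and permissions that corosync-keygen gives it on natty1 (owned by root, readable only by root):

sudo chown root:root /etc/corosync/authkey
sudo chmod 400 /etc/corosync/authkey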

[both] Configure Corosync

Now we need to configure Corosync so that the nodes can hear the cluster messages sent on the network. For this we edit the bindnetaddr field in /etc/corosync/corosync.conf and set it to the network address of the subnet in use (the host address with the host bits zeroed out; for 192.168.122.100/24 this is 192.168.122.0). In our case, it is as follows:

[...]
        interface {
                # The following values need to be set based on your environment 
                ringnumber: 0
                bindnetaddr: 192.168.122.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
[...]

[both] Starting Corosync

Before being able to start corosync, we need to edit /etc/default/corosync and enable it by changing START=no to:

START=yes

Once this is done, we can start corosync as follows:

sudo service corosync start
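
As an optional sanity check that corosync is up and bound to the expected address, the corosync-cfgtool utility shipped with corosync can print the ring status:

sudo corosync-cfgtool -s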

[natty1] Verifying Cluster Nodes

A few seconds after corosync has been started, the nodes should have joined the cluster. To verify this we do as follows:

sudo crm_mon

And we will see output similar to the following once the cluster nodes have joined the cluster:

============
Last updated: Fri Mar 18 16:27:48 2011
Stack: openais
Current DC: natty1 - partition with quorum
Version: 1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ natty1 natty2 ]

DRBD

Prepare Partitions

The first step in setting up DRBD is to prepare the partitions/disks to be used as DRBD devices. We assume that we have a disk (vdb) on which we will create a partition table and a 1 GB primary partition for the DRBD device.

[natty1] Create the partitions

First, we create the partition table as follows:

sudo fdisk /dev/vdb

And we select the following options:

o -> Create a new empty DOS partition table
w -> Write the newly created partition table and exit

Once this is done, we create a 1 GB primary partition:

sudo fdisk /dev/vdb

And we use the following options:

n -> New partition
p -> Primary
1 -> Partition number 1
1 -> First cylinder (where the partition starts)
+1024M -> Size of the partition (1 GB = 1024 MB)
w -> Write the changes and exit
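
If a non-interactive approach is preferred, roughly the same layout can be created with parted instead of fdisk. This is only a sketch of an equivalent; adjust the end position if your disk or desired size differs:

sudo parted -s /dev/vdb mklabel msdos
sudo parted -s /dev/vdb mkpart primary 1MiB 1025MiB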

[natty2] Copy partition table to natty2

First, on natty1 we issue the following:

sudo sfdisk -d /dev/vdb

This will generate an output such as:

# partition table of /dev/vdb
unit: sectors

/dev/vdb1 : start=       63, size=  1033137, Id=83
/dev/vdb2 : start=        0, size=        0, Id= 0
/dev/vdb3 : start=        0, size=        0, Id= 0
/dev/vdb4 : start=        0, size=        0, Id= 0

We need to replay the above dump on natty2 to create the same partition table there. On natty2, we start the command (note that sfdisk is now writing the table, so the -d flag is not used):

sudo sfdisk /dev/vdb <<EOF

then we paste the dumped output, and we finish it with EOF. This looks like:

sudo sfdisk /dev/vdb <<EOF
> # partition table of /dev/vdb
> unit: sectors
>
> /dev/vdb1 : start=       63, size=  1033137, Id=83
> /dev/vdb2 : start=        0, size=        0, Id= 0
> /dev/vdb3 : start=        0, size=        0, Id= 0
> /dev/vdb4 : start=        0, size=        0, Id= 0
>
> EOF

NOTE: In case the above doesn't work for any reason, we can create the partition on natty2 the same way the partition was created on natty1.
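
Alternatively, assuming working SSH access from natty1 to natty2 and passwordless sudo on natty2 (neither of which is part of the setup above), the dump can be piped across in a single step:

sudo sfdisk -d /dev/vdb | ssh <user>@192.168.122.101 "sudo sfdisk /dev/vdb"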

Installing and Configuring DRBD Resources

[both] Installing DRBD

First, we need to install the DRBD userland utilities; the DRBD kernel module is already part of the mainline kernel.

sudo apt-get install drbd8-utils

Then, we need to bring up the kernel module:

sudo modprobe drbd
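
We can optionally confirm that the module is loaded:

lsmod | grep drbd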

And add it to /etc/modules:

[...]
loop
lp
drbd # -> added
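
The same change can be made from the command line by appending the module name to the file:

echo drbd | sudo tee -a /etc/modules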

[both] Creating the DRBD resource

Now that DRBD is up, we need to define a resource. For this, we create /etc/drbd.d/nfsexport.res (the resource itself is named export) and copy the following into it:

resource export {
        device /dev/drbd0;
        disk /dev/vdb1;
        meta-disk internal;
        on natty1 {
                address 192.168.122.100:7788;
        }
        on natty2 {
                address 192.168.122.101:7788;
        }
        syncer {
                rate 10M;
        }
}
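
The resource definition has to be identical on both nodes. One way to keep it in sync, mirroring the authkey copy earlier (and using the same <user> placeholder), is to edit it on natty1 and copy it over:

sudo scp /etc/drbd.d/nfsexport.res <user>@192.168.122.101:/home/<user>

and then, on natty2:

sudo mv ~/nfsexport.res /etc/drbd.d/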

The configuration file is explained as follows:

  • resource export: Specifies the name that identifies the resource.

  • device /dev/drbd0: Specifies the name of the DRBD block device to be used.

  • disk /dev/vdb1: Specifies that /dev/vdb1 is the backing disk for the device above.

  • on natty1 | on natty2: Note that these are the hostnames of the nodes; each section contains the IP address and port on which that DRBD node communicates.

Once the configuration file is saved, we can test its correctness as follows:

sudo drbdadm dump export

Now that we have defined the resource, we need to create its metadata as follows:

sudo drbdadm create-md export

And then we bring it up:

sudo drbdadm up export

Once this command has been issued on both nodes, we can check that the two DRBD nodes are communicating; the data will show as Inconsistent because no initial synchronization has been performed yet. For this we do the following:

sudo drbd-overview 

And the result will be similar to:

  0:export  Connected Secondary/Secondary Inconsistent/Inconsistent C r-----

[natty1] Initial DRBD Synchronization

Now, to skip the initial full synchronization (only if both backing devices are identical, e.g. freshly created and empty), we do as follows:

sudo drbdadm -- --clear-bitmap new-current-uuid export

And drbd-overview should then show something similar to:

  0:export  Connected Secondary/Secondary UpToDate/UpToDate C r-----

In case the above command didn't work or wasn't used, we can instead force a full initial synchronization from natty1:

sudo drbdadm -- --overwrite-data-of-peer primary export

Now, we have to make sure that natty1 is Primary so that we can create the filesystem on it. For this we do as follows:

sudo drbdadm primary export
sudo mkfs -t ext3 /dev/drbd0
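
The mount point that Pacemaker will use below must exist on both nodes. Assuming the /mnt/nfsexport directory used in the resource configuration that follows, create it on natty1 and natty2:

sudo mkdir -p /mnt/nfsexport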

Master/Slave Integration with Pacemaker

The DRBD init script no longer needs to bring up the configured resources, since Pacemaker will now manage them. So we simply remove the init script symlinks as follows:

sudo update-rc.d -f drbd remove

[natty1] Configuring the resources

Now we can configure the resources for Pacemaker. The resources to configure are those needed to bring up the DRBD resource and to mount the DRBD device's filesystem. Additional parameters are required to handle DRBD's master/slave states, as well as to decide when and where a node becomes Master. To do so, we run:

sudo crm configure

And we copy the following:

primitive res_drbd_export ocf:linbit:drbd params drbd_resource="export" # Configures the DRBD resource, specifying its name
primitive res_fs ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/mnt/nfsexport" fstype="ext3" # Mounts the filesystem, specifying the DRBD device and mount point
ms ms_drbd_export res_drbd_export meta notify="true" # Master/Slave set with notifications enabled
colocation c_export_on_drbd inf: res_fs ms_drbd_export:Master # Tells the cluster to mount the filesystem on the node where DRBD has become Master
order o_drbd_before_nfs inf: ms_drbd_export:promote res_fs:start # Tells the cluster to promote DRBD first, and only then start the Filesystem resource
property stonith-enabled=false
property no-quorum-policy=ignore

Then, to finish, we do the following to commit the changes:

commit
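
After committing, we leave the crm shell before running the status check (in the crm shell, quit, exit, and bye all end the session):

quit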

Now, to verify the changes we do:

sudo crm status

which should show something similar to:

============
Last updated: Fri Mar 18 17:12:36 2011
Stack: openais
Current DC: natty2 - partition with quorum
Version: 1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ natty1 natty2 ]

 Master/Slave Set: ms_drbd_export
     Masters: [ natty1 ]
     Slaves: [ natty2 ]
 res_fs (ocf::heartbeat:Filesystem):    Started natty1

HA VIP

HA NFS

HA MySQL