Building a redundant iSCSI and NFS cluster with Debian - Part 3


Note : This page may contain outdated information and/or broken links; some of the formatting may be mangled due to the many different code-bases this site has been through in over 20 years; my opinions may have changed etc. etc.

This is part 3 of a series on building a redundant iSCSI and NFS SAN with Debian.

Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!


In the last two guides, we set up a DRBD resource and LVM volume group which we could manually migrate between the two cluster nodes. In this guide, we’ll set up the Heartbeat cluster software to handle automatic migration of services between the two nodes in our cluster (“failover”).

The version of Heartbeat included in Debian Etch is 1.x. It is a very simple system, limited to two-node clusters, which makes it ideal for straightforward failover of services between two nodes. The current 2.x branch is a lot more complicated, and has a new XML configuration format, although it can still be used with the original 1.x format files. Although it adds many useful features, it’s overkill for our needs at the moment - plus, sticking to 1.x avoids the need to install software not included in the current stable distribution.


Before we set up Heartbeat, we’ll need to ensure the communication channels the cluster will be using are configured. If you refer back to the original network diagram, you’ll see that we’re using two different interconnects: a serial cable, and a network connection across eth1. To recap: the reason for this is that the interconnects are vital to the functioning of the cluster. If one node cannot “see” the other, it will assume control of the resources. If the loss of contact were due to a faulty interconnect (or a network misconfiguration) rather than a genuine node failure, you would end up with a “split-brain” scenario in which both nodes try to gain control of the resources. At best, this would lead to service outages and confusion; at worst, you could be facing total data loss.

Hence, the two channels for cluster communication - the null-modem serial cable is a great “fallback” channel, which should always be available even if you do something like apply an erroneous firewall rule blocking the communication over eth1. If you have been following the instructions up until now, you should already be able to send data between the hosts over the serial connection, and ping each node from the other over their eth1 interfaces (we’ve already been using this interface for DRBD synchronisation). Assuming this all works, you’re good to proceed.
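If you want to double-check both channels before proceeding, a quick test might look like the following (the peer address is a placeholder - substitute the eth1 address of the other node from your part 1 network layout):

```shell
# On otter: listen on the serial port
cat < /dev/ttyS0

# On weasel, at the same time: send a test string down the cable,
# which should appear in otter's terminal
echo "serial link OK" > /dev/ttyS0

# From either node: ping the other node's eth1 address
ping -c 3 <peer-eth1-address>
```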


Simply install the package with apt-get on both nodes :

# apt-get install heartbeat
This will print a warning at the end ("Heartbeat not configured"), which you can safely ignore. You now need to set up authentication for both nodes - this is very simple, and just uses a shared secret key. Create /etc/ha.d/authkeys on both systems with the following content:
auth 1
1 sha1 secret
In this sample file, the auth 1 directive says to use key number 1 for signing outgoing packets. The 1 sha1... line describes how to sign the packets. Replace the word "secret" with the passphrase of your choice. As this is stored in plaintext, make sure that it is owned by root and has a restrictive set of permissions on it :
# chown root:root /etc/ha.d/authkeys
# chmod 600 /etc/ha.d/authkeys
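As an aside, if you want a strong passphrase rather than a dictionary word, one way to generate one (just a sketch - any hard-to-guess string will do) is:

```shell
# Produce a random 40-character hex string, suitable for pasting
# into /etc/ha.d/authkeys in place of "secret"
dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{print $1}'
```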
Make sure that copies of this file are identical across both nodes, and don't have any stray blank lines in them. Now, we need to set up the global cluster configuration file. Create the /etc/ha.d/ha.cf file on both nodes as follows :
# Interval in seconds between heartbeat packets
keepalive 1
# How long to wait in seconds before deciding node is dead
deadtime 10
# How long to wait in seconds before warning node is dead
warntime 5
# How long to wait in seconds before deciding node is dead
# When heartbeat is first started
initdead 60
# If using serial port for heartbeat
baud 9600
serial /dev/ttyS0
# If using network for heartbeat
udpport 694
# eth1 is our dedicated cluster link (see diagram in part 1)
bcast eth1
# Don't want to auto failback, let admin check and do it manually if needed
auto_failback off
# Nodes in our cluster
node otter
node weasel

We now need to tell Heartbeat about what resources we want it to manage. This is configured in the /etc/ha.d/haresources file. The format for this is again very simple - it just takes the form :
<hostname> resource[::arg1:arg2:arg3:........:argN]
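For example, a resource line using the IPaddr script might look like this (the address here is purely illustrative - we'll configure real cluster IP addresses in part 4):

```
weasel IPaddr::192.168.1.100/24/eth0
```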
Resources can either be one of the supplied scripts in /etc/ha.d/resource.d :
# ls /etc/ha.d/resource.d
AudibleAlarm  db2  Delay  drbddisk  Filesystem  ICP  IPaddr  IPaddr2  IPsrcaddr 
IPv6addr  LinuxSCSI  LVM  LVSSyncDaemonSwap  MailTo  OCF  portblock  SendArp  ServeRAID 
WAS  WinPopup  Xinetd
Or, they can be one of the init scripts in /etc/init.d - Heartbeat searches those two locations in that order. To start with, we'll want to move the DRBD resource we configured in part 2 between the two nodes. This can be accomplished via the "drbddisk" script, provided by the drbd0.7-utils package. The /etc/ha.d/haresources configuration file should therefore look like the following :
weasel drbddisk::r0
This says that the node "weasel" should be the preferred node for this service. The resource script is "drbddisk", which can be found under /etc/ha.d/resource.d, and we're passing it the argument "r0", which is our DRBD resource configured in part 2. To test this out, make the DRBD resource secondary by running the following on both nodes :
# drbdadm secondary r0
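Before starting Heartbeat, you can verify that neither node is still Primary by checking /proc/drbd, as elsewhere in this series:

```shell
# The "st:" field should show Secondary/Secondary on both nodes
cat /proc/drbd
```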
And then start the cluster on both nodes :
# /etc/init.d/heartbeat start
Starting High-Availability services:
Once they've started up, check the cluster status using the cl_status tool. First, let's check which nodes Heartbeat thinks are in the cluster :
# cl_status listnodes
Now, check both nodes are up :
# cl_status nodestatus weasel
# cl_status nodestatus otter
We can also use the cl_status tool to see which cluster links are available (which should be eth1 and /dev/ttyS0) :
# cl_status listhblinks otter
# cl_status hblinkstatus otter eth1
# cl_status hblinkstatus otter /dev/ttyS0
And we can also use it to check which resources each node currently holds - rscstatus reports one of none, local, foreign, all or transition :
[root@otter] # cl_status rscstatus
[root@weasel] # cl_status rscstatus
You should be able to check the output of /proc/drbd on both systems and see that r0 has been made Primary on weasel. To fail over to otter, simply restart the Heartbeat services on weasel :
# /etc/init.d/heartbeat restart
Stopping High-Availability services:
Waiting to allow resource takeover to complete:
Starting High-Availability services:
Now, check /proc/drbd and you should see that otter has become the Primary. You can confirm this with cl_status :
[root@otter] # cl_status rscstatus
[root@weasel] # cl_status rscstatus
If you want to try a more dramatic approach, try yanking the power out of otter. You should see output similar to the following appear in /var/log/ha-log on weasel :
heartbeat: 2009/02/03_15:06:29 info: Resources being acquired from otter.
heartbeat: 2009/02/03_15:06:29 info: acquire all HA resources (standby).
heartbeat: 2009/02/03_15:06:29 info: Acquiring resource group: weasel drbddisk::r0
heartbeat: 2009/02/03_15:06:29 info: Local Resource acquisition completed.
heartbeat: 2009/02/03_15:06:29 info: all HA resource acquisition completed (standby).
heartbeat: 2009/02/03_15:06:29 info: Standby resource acquisition done [all].
heartbeat: 2009/02/03_15:06:29 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/02/03_15:06:29 info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
heartbeat: 2009/02/03_15:06:39 WARN: node otter: is dead
heartbeat: 2009/02/03_15:06:39 info: Dead node otter gave up resources.
Play around with this a few times, and make sure you're familiar with your resource moving between systems. Once you're happy with this, we'll add our LVM volume group into the configuration. Edit the /etc/ha.d/haresources file, and modify it so that it looks like the following :
weasel drbddisk::r0 \
        LVM::<name-of-your-volume-group>
The backslash (\) character just tells Heartbeat that this should all be treated as one resource group - the same way a backslash indicates a line continuation in a shell script. Everything can go on a single line, but I find it easier to read when it's split up like this. Restart Heartbeat on each node in turn, and you should then be able to see the DRBD resource and the LVM volume group move between systems together. The next part will cover setting up an iSCSI target, and adding that into the cluster configuration along with a group of managed IP addresses.