
Building a redundant iSCSI and NFS cluster with Debian - Part 4
This is part 4 of a series on building a redundant iSCSI and NFS SAN with Debian.
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
In this part, we'll configure Heartbeat to manage IP address failover on the two storage interfaces. We'll also install and configure an iSCSI target to provide block-level storage to clients.
IP address failover
We want Heartbeat to manage the two IP addresses we will be providing iSCSI services over. Looking back at our original network plan, these are 192.168.1.1 and 192.168.2.1. They are on two separate subnets to ensure that packets go out of the correct interface when our clients connect to them using multipathing (which will be covered in the next part). There are other ways of accomplishing this (using ip routing), but this is the easiest.
Edit your /etc/ha.d/haresources configuration file on both nodes, so that it looks like the following :
weasel drbddisk::r0 \
LVM::storage \
IPaddr::192.168.1.1/24/eth2 \
IPaddr::192.168.2.1/24/eth3 \
arp_filter
You can see that it's using the built-in IPaddr script (in /etc/ha.d/resource.d) to bring up the IP addresses on the designated interfaces. The last line, arp_filter will call a script we'll now create. Put the following contents in the file /etc/init.d/arp_filter :
#!/bin/bash
# Enable arp_filter on every eth* interface so ARP replies follow the routing table.
for FILTER in /proc/sys/net/ipv4/conf/eth*/arp_filter; do
    echo "$FILTER was $(cat "$FILTER"), setting to 1..."
    echo 1 > "$FILTER"
done
And then make sure it is executable :
# chmod +x /etc/init.d/arp_filter
The reason we need this additional script is documented at http://linux-ip.net/html/ether-arp.html#ether-arp-flux-arpfilter. It causes the nodes to perform a route lookup to determine the interface through which to send ARP responses, instead of the default behavior, which is to reply from all Ethernet interfaces. This is needed because our cluster nodes are connected to several different networks.
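To confirm the setting has taken effect, you can read the values back. The following is a minimal sketch and not part of the original setup; the function name and its optional directory argument are assumptions added so it can be pointed at a test directory instead of /proc.

```shell
#!/bin/bash
# Print the arp_filter setting of every eth* interface.
# The optional directory argument is an assumption for testing;
# it defaults to the real /proc path used by the script above.
show_arp_filter() {
    local conf_dir="${1:-/proc/sys/net/ipv4/conf}" filter iface
    for filter in "$conf_dir"/eth*/arp_filter; do
        [ -r "$filter" ] || continue   # skip the unexpanded glob if nothing matches
        iface="$(basename "$(dirname "$filter")")"
        echo "$iface arp_filter=$(cat "$filter")"
    done
}

show_arp_filter
```

After Heartbeat has run the arp_filter resource, every line should end in "=1".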
Now, restart Heartbeat on the two nodes, and you should see the eth2 and eth3 interfaces come up with an IP address assigned to them on the active node. Try stopping and starting Heartbeat on each node in turn to observe the resources migrating between them. We can now move on to setting up the iSCSI server.
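A quick way to see which node currently holds the storage addresses is to look for them in the output of "ip addr". This is only a sketch: the helper name holds_address is made up for illustration, and it assumes the iproute "ip" tool is installed.

```shell
#!/bin/bash
# Succeeds if the address given as $1 appears in `ip -o -4 addr show`
# output supplied on stdin. The trailing "/" stops 192.168.1.1 from
# also matching 192.168.1.10.
holds_address() {
    grep -qF "inet $1/"
}

# Report on the two storage addresses from the network plan.
if command -v ip >/dev/null 2>&1; then
    for addr in 192.168.1.1 192.168.2.1; do
        if ip -o -4 addr show | holds_address "$addr"; then
            echo "$addr is active on this node"
        else
            echo "$addr is not active on this node"
        fi
    done
fi
```

Run it on both nodes before and after a failover; the addresses should only ever be active on one of them.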
iSCSI overview
A great overview of iSCSI is on Wikipedia : http://en.wikipedia.org/wiki/ISCSI. Essentially, it allows you to run SCSI commands over an IP network, which lets you create a low-cost SAN without having to invest in expensive Fibre Channel cabling and switches. The shared block devices we'll create appear to the clients as regular SCSI devices - you can partition, format and mount them the same as you would a regular directly attached device. iSCSI clients are called "initiators", and the server part is called a "target".
On Linux, there are at least four different targets :
- SCST (http://scst.sourceforge.net)
- STGT (http://stgt.berlios.de)
- IET (http://iscsitarget.sourceforge.net)
- LIO (http://linux-iscsi.org)
Out of these four, the STGT and IET targets seem to be the most commonly used. The STGT target in particular is worth investigation as it is included in Red Hat Enterprise Linux and derivatives. We'll be using the IET target, however. It seems to be one of the more popular iSCSI target implementations, builds cleanly on Debian, and critically allows the service to be stopped while initiators are logged in - which we need to do in a failover scenario.
Note 1 : Check the README.vmware file if you are going to use the target as a backing store for VMware!
Note 2 : As the IET target contains a kernel module, you will need to build and install it again each time you update or install a new kernel. This means you will have to double check it each time you run a system update!
First, we'll make sure we have the necessary tools to build the target :
# apt-get install build-essential linux-headers-`uname -r` libssl-dev
Now, download the target from http://sourceforge.net/project/showfiles.php?group_id=108475. The current version as of the time of writing is 0.4.17; adjust the version numbers below if necessary. Once downloaded, you'll need to unpack it and build it :
# tar xzvf iscsitarget-0.4.17.tar.gz
# cd iscsitarget-0.4.17
# make KERNELSRC=/usr/src/linux-headers-`uname -r`
# make install
Now, copy /etc/ietd.conf to /etc/ietd.default for reference, and repeat the above installation steps for the other node in the cluster.
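Since the kernel module has to be rebuilt for every new kernel (see Note 2 above), it can help to check whether a module file actually exists for a given kernel release. The sketch below assumes the module file is named iscsi_trgt, matching the name that appears in the kernel log, and that it lives somewhere under /lib/modules; the function name and the second argument are made up for illustration.

```shell
#!/bin/bash
# Succeeds if an iscsi_trgt module file exists for the given kernel release.
# The second argument (module root) is an assumption for testing and
# defaults to /lib/modules.
iet_module_present() {
    local kernel="$1" root="${2:-/lib/modules}"
    find "$root/$kernel" -name 'iscsi_trgt.ko*' 2>/dev/null | grep -q .
}

if iet_module_present "$(uname -r)"; then
    echo "iscsi_trgt is installed for $(uname -r)"
else
    echo "iscsi_trgt is missing for $(uname -r): rebuild the target"
fi
```

After a kernel upgrade, pass the new kernel's release string instead of `uname -r` to check it before you reboot.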
Creating iSCSI targets
Once it's installed, start the daemon on the current "master" node :
# /etc/init.d/iscsi-target start
Starting iSCSI enterprise target service: succeeded.
You should now have two empty files under /proc/net/iet, session and volume, and output similar to the following will show up in /var/log/messages :
Feb 9 14:09:40 otter kernel: iSCSI Enterprise Target Software - version 0.4.17
Feb 9 14:09:40 otter kernel: iscsi_trgt: Registered io type fileio
Feb 9 14:09:40 otter kernel: iscsi_trgt: Registered io type blockio
Feb 9 14:09:40 otter kernel: iscsi_trgt: Registered io type nullio
The first target we'll create will use our LVM volume created earlier (/dev/storage/test) as its backing store (the underlying storage). On the master node with the DRBD device and LVM volume active, run the following commands :
# ietadm --op new --tid=1 --params Name=iqn.2009-02.com.example:test
# ietadm --op new --tid=1 --lun=0 --params Type=fileio,Path=/dev/storage/test
The first command creates a new target, with an ID of 1 and a name of "iqn.2009-02.com.example:test". See http://en.wikipedia.org/wiki/ISCSI#Addressing for more details on the naming conventions of iSCSI targets.
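If you are choosing your own target names, a rough format check can catch typos before they end up in the configuration. This is a loose sketch of the "iqn." naming convention (year-month, reversed domain name, optional ":identifier"), not a full validator, and the function name is made up for illustration.

```shell
#!/bin/bash
# Succeeds if $1 looks like an iqn-style name: "iqn.", a year-month,
# a reversed domain name, and an optional ":identifier" suffix.
# This is a loose format check, not a complete validator.
is_iqn() {
    printf '%s\n' "$1" |
        grep -qE '^iqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:.+)?$'
}

is_iqn "iqn.2009-02.com.example:test" && echo "name looks valid"
```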
The second command adds a LUN (ID 0) to this target, assigns the LVM volume /dev/storage/test as the backing store, and tells the target to provide access to this device via the "fileio" method. Check the ietd.conf man page for details on the various options you can use - in particular, you may want to try benchmarking the fileio and blockio types, and the effect of write-back caching.
If you now check the contents of /proc/net/iet/volume, you'll see the target listed :
# cat /proc/net/iet/volume
tid:1 name:iqn.2009-02.com.example:test
lun:0 state:0 iotype:fileio iomode:wt path:/dev/storage/test
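When several targets are defined, it can be handy to pull just the target names out of this file. A minimal sketch, assuming the "tid:N name:..." layout shown above; the function name is made up for illustration.

```shell
#!/bin/bash
# Print the name of every target listed in an IET volume file.
# The file argument defaults to /proc/net/iet/volume.
list_targets() {
    local file="${1:-/proc/net/iet/volume}"
    [ -r "$file" ] || return 1
    # Target lines start with "tid:"; the second field is "name:<target name>".
    awk '/^tid:/ { sub(/^name:/, "", $2); print $2 }' "$file"
}

list_targets || echo "no IET volume file found; is the target running?"
```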
However, if you restart the target daemon, you'll see the target disappear. To make a permanent entry, edit /etc/ietd.conf and add the following :
Target iqn.2009-02.com.example:test
Lun 0 Path=/dev/storage/test,Type=fileio
Alias test
See the /etc/ietd.default file created earlier to see some of the other options you can set - although you can safely stick to the bare minimum defaults for the moment. Now, when you restart the daemon, you'll see your volumes being created at startup.
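The same ietd.conf has to exist on the other node too, so it is worth comparing the two copies whenever you change it. The following is a sketch only: it assumes you can ssh from one node to the other, and the PEER variable and function name are made up for illustration.

```shell
#!/bin/bash
# Succeeds if the two files have identical contents.
same_config() {
    cmp -s "$1" "$2"
}

# Hypothetical usage: fetch the peer's copy over ssh and compare it with
# the local one. PEER is an assumption; set it to the other node's hostname.
PEER="${PEER:-}"
if [ -n "$PEER" ]; then
    peer_copy="$(mktemp)"
    ssh "$PEER" cat /etc/ietd.conf > "$peer_copy"
    if same_config /etc/ietd.conf "$peer_copy"; then
        echo "ietd.conf matches on both nodes"
    else
        echo "ietd.conf differs from $PEER"
    fi
    rm -f "$peer_copy"
fi
```

With no PEER set the script does nothing; run it as, for example, PEER=weasel followed by the script name.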
Heartbeat integration
We'll now add this to our Heartbeat configuration. Make sure the iSCSI service is stopped and that /etc/ietd.conf is the same on both nodes, and then edit /etc/ha.d/haresources to manage the iscsi-target init script :
weasel drbddisk::r0 \
LVM::storage \
IPaddr::192.168.1.1/24/eth2 \
IPaddr::192.168.2.1/24/eth3 \
arp_filter \
iscsi-target
Restart heartbeat, and you should then see the iSCSI volumes moving across the nodes - you can check by looking at /proc/net/iet/volume.
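If the resources do not all move together, check the backslashes in haresources: the whole resource group is meant to be one logical line, and a missing continuation character splits it in two. The sketch below joins continued lines the same way and counts the result; the function name is made up for illustration, and the sed expression assumes there is no trailing whitespace after each backslash.

```shell
#!/bin/bash
# Join backslash-continued lines from stdin into single logical lines.
join_continuations() {
    sed -e ':a' -e '/\\$/N; s/\\\n//; ta'
}

# Count the logical, non-comment lines in haresources.
# A single resource group should give a count of 1.
if [ -r /etc/ha.d/haresources ]; then
    grep -v '^#' /etc/ha.d/haresources | join_continuations | grep -c .
fi
```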
Thursday, February 4. 2010 at 12:40 (Reply)
What happens to the iSCSI initiator when a failover happens? I can imagine there is quite a delay involved with this. First you have the time to fail over which, granted, can be reduced by correctly configuring ha.cf, but what I wonder about most is the ARP cache for the storage interfaces: the entries will point to the MAC addresses of the last active node. Once a failover has happened, how long will it take for the initiators to understand that the IP stack should now talk to a different MAC address to which the shared IP is linked?
If this is a shorter amount of time than the iSCSI initiator or higher-level file systems or applications can deal with, it's no biggie, but if not ...
Thursday, February 4. 2010 at 12:46 (Reply)
[1]=http://wiki.wireshark.org/Gratuitous_ARP
Tuesday, August 23. 2011 at 00:44 (Link) (Reply)
I have not configured these interfaces with any IPs on any of the servers through /etc/network/interfaces. Is this correct? If not, how would I configure them?
Thanks in advance. Great article!
Tuesday, August 23. 2011 at 23:07 (Link) (Reply)
I have two nodes, SAN02 and SAN03.
Everything works fine until I kill SAN02.
The initiator loses its connection to the underlying iSCSI disk when I do so. cat /proc/net/iet/volumes shows that it migrated to SAN03 correctly, but the client cannot connect to the drive. If I run /etc/init.d/iscsitarget restart on SAN03 while SAN02 is still down, the client reconnects to the drive. I am puzzled by this behaviour. I cannot manually do this each time something happens to one of the nodes. Any ideas?