Centreon review
It was always an interesting question to ask, because it gave you an insight into the kind of sysadmin tasks someone had been doing before, and it also served as a nice, relaxed "ice breaker" type question. For my money, aside from some tools like rsync and screen which I couldn't live without, a decent monitoring package would have to be top of my priorities. There are a bunch out there: some of them free; some of them commercial, but the one that would make it on to my USB stick would have to be Nagios.
It's open source, extremely well documented and widely implemented, and there are a ton of useful add-ons and plugins available for it. The only drawbacks I can find with it are its ugly web interface, the complexity involved in setting up a new system for monitoring, and the disconnect between availability and performance monitoring. If you have money to throw at the problem, then software like Uptime or Hyperic neatly deals with all of these issues, but these packages can be quite pricey if you have a large number of systems to manage and a tight budget.
So, you can imagine my excitement when I first discovered Centreon. It's essentially a monitoring platform that uses Nagios at its core. You could think of it as a fancy frontend to "stock" Nagios, but it's so much more than that: besides the attractive interface, it also bridges the gap between availability and performance monitoring, and makes Nagios administration a snap. Due to the reliance on Nagios though, I'd go so far as to say that before you experiment with Centreon, you really should have set up "stock" Nagios, and be familiar with the plugin architecture, NRPE and how alerts / escalations are managed. Ideally, you should have a stock Nagios installation that you can duplicate on Centreon/Nagios.
Continue reading "Centreon review"
Cacti iostat scripts now support FreeBSD
Thanks to the awesome work of Boogie Shafer, there is now a FreeBSD port of my iostat scripts and templates for Cacti. I have included the modified tarball that was sent to me; it is inside the archive as "cacti-iostat-1.x-boogie_freebsd_linux_changes.tar.gz".
FreeBSD users should unpack this archive and follow the instructions inside. I have not had time to go through and merge these changes into one unified distribution yet, but as people were asking for the FreeBSD port, here it is! The next release of these scripts should see the FreeBSD scripts and templates etc. merged in, much the same as the Solaris modifications by Marwan Shaher and Eric Schoeller.
Follow the link to the original post to find the download link.
Dell MD3000i
We've got it configured with dual controllers, 8x300GB and 7x146GB 15k SAS drives. Throughput is around GigE wire speed - 110MB/s for both reads and writes. I'm also seeing a respectable IOPS figure depending on workloads: during an iozone run, I could see it sustaining around 1.5k IOPS.
True, the management features fall a little short when compared to the usual Sun and HP storage kit I'm used to, but it does the job. My main gripes are :
- No built in graphing (seriously, Dell - WTF?), but you can do it from the CLI - see here.
- Can't resize or change the I/O profile of a virtual disk once it's set up. This is a real pain, so make sure you set things up correctly the first time! You can, however, change the RAID level of a disk group once it's been created.
- You need a Windows or RHEL box to run the administration GUI on - I'm sure you can probably hack a way to get the CLI running under Debian, but I haven't tried. You're probably straight out of luck if you want to run it on anything else like Solaris.
- Can't mix SAS and SATA in the same enclosure. The controllers do support SATA as well as SAS, although SATA drives don't show up as options in the Dell pricing configuration thingy. Our account manager advised us that although technically you can mix SAS and SATA in the same enclosure, they'd experienced a higher than average number of disk failures in that configuration, due to the vibration patterns created by disks spinning at different rates (15K SAS and 7.2K SATA). If you need to mix the two types, your only real option is to attach an MD1000 array to the back (you can add up to two of these) and have each chassis filled with just one type of drive.
Multipath support under RHEL/CentOS with multipath-tools (dm-multipath) works fine with some tweaking - it uses the RDAC modules, which leads to some oddness on CentOS 5.3. What tends to happen is that the first time device mapper picks up the paths, RDAC doesn't get a chance to initialise things properly (the scsi_dh_rdac module isn't loaded), so you end up with all sorts of SCSI errors showing up in your logs. After flushing your paths (multipath -F) and restarting multipathd, things are OK. This is apparently fixed in RHEL 5.4, so it should make its way out to CentOS from there. I'm unsure what the status is on other distros, though.
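For what it's worth, the recovery steps above could be wrapped up roughly like this. Treat it as a sketch rather than a tested fix: the function names and the DRY_RUN switch are my own additions for illustration, and the "service" call assumes a RHEL/CentOS 5 style init setup.

```shell
#!/bin/bash
# Sketch: recover from the first-discovery RDAC problem described above.
# DRY_RUN and the function names are illustrative assumptions, not part of any tool.

run() {
    # Print the command instead of executing it when DRY_RUN=1.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_rdac_paths() {
    run modprobe scsi_dh_rdac || return 1   # make sure the RDAC device handler is loaded
    run multipath -F || return 1            # flush the existing multipath maps
    run service multipathd restart          # restart the daemon so the paths are rediscovered
}

# Usage (as root): recover_rdac_paths
# Preview only:    DRY_RUN=1 recover_rdac_paths
```

Running it with DRY_RUN=1 first just prints the three commands, which is a cheap way to check what it would do before touching a live box.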
My multipath.conf contains the following :
devices {
    device {
        vendor "DELL"
        product "MD3000i"
        product_blacklist "Universal Xport"
        path_grouping_policy group_by_prio
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        path_checker rdac
        prio_callout "/sbin/mpath_prio_rdac /dev/%n"
        hardware_handler "1 rdac"
        failback immediate
    }
}
With that in place, the output of multipath -ll looks like this :
360026b90002ab6f40000056a4aa9e87b dm-12 DELL,MD3000i
[size=409G][features=0][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=200][active]
 \_ 21:0:0:1 sdi 8:128 [active][ready]
 \_ 22:0:0:1 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 20:0:0:1 sdg 8:96 [active][ghost]
 \_ 23:0:0:1 sdh 8:112 [active][ghost]
Update: It looks like the admin tool and SMcli are just shell script wrappers that run Java apps. I tried a quick'n'dirty hack of installing everything under RHEL, tarring up /opt/dell and /var/opt/SM and then transferring them over to a Debian Lenny host. All I had to change was the #!/bin/sh to #!/bin/bash at the top of the SMcli and SMclient wrappers, and they seem to work. I haven't put them through any serious testing though...
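The interpreter change can be done with a one-line sed rather than by hand. This is only a sketch: fix_shebang is a made-up helper name, it relies on GNU sed's -i option (which Debian has), and it only touches files whose first line is exactly "#!/bin/sh". I haven't shown where the wrappers live, so pass in whatever paths apply on your system.

```shell
#!/bin/bash
# Sketch: switch a wrapper script's interpreter line from /bin/sh to /bin/bash.
# fix_shebang is a hypothetical helper name; requires GNU sed for -i.

fix_shebang() {
    for f in "$@"; do
        # Only rewrite line 1, and only if it is exactly "#!/bin/sh".
        sed -i '1s|^#!/bin/sh$|#!/bin/bash|' "$f" || return 1
    done
}

# Usage: fix_shebang /path/to/SMcli /path/to/SMclient
```

Anchoring the match to line 1 means any later mention of /bin/sh inside the wrapper is left alone.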
Building a redundant iSCSI and NFS cluster with Debian - Part 5
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
In this part of the series, we'll configure an iSCSI client ("initiator"), connect it to the storage servers and set up multipathing. Note : As Debian Lenny has been released since this series of articles started, that's the version we'll use for the client.
If you refer back to part one to refresh your memory of the network layout, you can see that the storage client ("badger" in that diagram) should have 3 network interfaces :
- eth0 : 172.16.7.x for the management interface, this is what you'll use to SSH into it.
And two storage interfaces. As the storage servers ("targets") are using 192.168.x.1 and 2, I've given this client the following addresses :
- eth1: 192.168.1.10
- eth2: 192.168.2.10
Starting at .10 on each range keeps things clear - I've found it can help to have a policy of servers being in a range of, say, 1 to 10, and clients being above this. Before we continue, make sure that these interfaces are configured, and you can ping the storage server over both interfaces, e.g. try pinging 192.168.1.1 and 192.168.2.1.
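If you'd rather script that check than ping by hand, something along these lines would do. It's a minimal sketch: check_storage_paths and the PING_CMD variable are names I've made up for illustration, and the addresses in the usage line are simply the two storage server addresses mentioned above.

```shell
#!/bin/bash
# Sketch: confirm both storage paths answer before installing the initiator.
# PING_CMD and check_storage_paths are illustrative names.

PING_CMD=${PING_CMD:-ping}

check_storage_paths() {
    rc=0
    for target in "$@"; do
        # -c 1: send a single echo request and stop
        if $PING_CMD -c 1 "$target" >/dev/null 2>&1; then
            echo "OK   $target"
        else
            echo "FAIL $target"
            rc=1
        fi
    done
    return $rc
}

# Usage: check_storage_paths 192.168.1.1 192.168.2.1
```

The function returns non-zero if any path fails, so it can be used as a guard before going any further.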
Assuming the underlying networking is configured and working, the first thing we need to do is install open-iscsi (which is the "initiator" - the iSCSI client). This is done by a simple :
# aptitude install open-iscsi
You should see the package get installed, and the service started :
Setting up open-iscsi (2.0.870~rc3-0.4) ...
Starting iSCSI initiator service: iscsid.
Setting up iSCSI targets: iscsiadm: No records found!
At this point, we have all we need to start setting up some connections.
Continue reading "Building a redundant iSCSI and NFS cluster with Debian - Part 5"
Updated Cacti iostat package now supports Solaris
Just a quick update to my Cacti iostat monitoring scripts and templates - thanks to the work of Marwan Shaher and Eric Schoeller, the package now supports Solaris! The updated package is available here : cacti-iostat-1.4.tar.gz.
I have also updated the original blog post with the new package.
Oracle to buy Sun
Cracking dictionary passwords
I was talking with my wife a few days ago, and the subject of password security came up. Now, we all know that we're supposed to pick a secure password, use at least 8 characters and never to pick a word from the dictionary. But then she asked how long it would take to brute-force a password using a dictionary attack, and I had to admit I had no idea. I knew it would only be a matter of minutes, but wanted to give it a try.
So, for anyone who is interested, I knocked up a quick BASH script to compare an MD5 hashed password against the contents of /usr/share/dict/words, which on a Red Hat 5.3 system contains 479,623 words. The script is as follows :
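#!/bin/bash
# Usage: ./crack.sh <md5 hash>
# Note: echo appends a newline, so the target hash must have been generated the same way.
TARGET_HASH=$1
while read -r WORD; do
    # Hash each dictionary word (this forks md5sum and awk for every word, hence the slowness)
    WORD_HASH=$(echo "$WORD" | md5sum | awk '{print $1}')
    if [ "$WORD_HASH" == "$TARGET_HASH" ]; then
        echo "Found match!"
        echo "Password is : $WORD"
        exit 0
    fi
done < /usr/share/dict/words
# No dictionary word matched
exit 1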
Now, this was just a quick hack to satisfy my curiosity, and only something I threw together in a few seconds. Of particular relevance is the fact that it's a shell script, and uses a lot of forking to generate the MD5 hashes of the dictionary. If I wrote it in C, I'm sure it would be faster by an order of magnitude.
But anyway, on to the test - I created an MD5 hash for it to crack, and timed it :
# time ./crack.sh 3a783fb2aa3a2318499f0a60d7ef6078
Found match!
Password is : hedgehog
real 8m43.432s
user 1m48.410s
sys 8m27.030s
Not bad - just under 9 minutes. Obviously, that'd take longer if I used a word starting with "x" or "z"! I then realised it would be a lot faster if I generated a "compiled" version of the dictionary file with the MD5 hashes precomputed :
while read WORD; do echo "$WORD:$(echo $WORD | md5sum | awk '{print $1}')"; done < /usr/share/dict/words > md5.txt
Obviously, I could then generate compiled dictionary files for each hashing algorithm I wanted to crack (assuming that they are unsalted algorithms). This took around 30 minutes, but now I don't have to generate the hashes again; all I need to do is check against the second column of the file for a match. It also no longer matters whether the word lies near the start or end of the file, as it now takes about the same time to find a match :
# time grep ac23b37db0039dda62896bb21f312755 md5.txt | cut -d':' -f1
aardvark
real 0m0.019s
user 0m0.008s
sys 0m0.011s
# time grep 981fe627ab4906b677ce9d3e6eff499f md5.txt | cut -d':' -f1
zoology
real 0m0.019s
user 0m0.006s
sys 0m0.014s
So there you have it. It was an interesting way to spend a few minutes, and I now have an answer whenever someone asks "how long would it take to crack a password based on a dictionary word": assuming you have the compiled hash files, around 0.019 seconds.
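One caveat with the grep approach is that it matches the hash anywhere on the line, not just in the second column. A slightly stricter lookup could be sketched like this; lookup_hash is a hypothetical helper name, and it assumes the "word:hash" layout of the md5.txt file generated above.

```shell
#!/bin/bash
# Sketch: look up a hash in a "word:hash" file by exact match on the second field.
# lookup_hash is a hypothetical helper; the file layout is that of md5.txt above.

lookup_hash() {
    # $1 = hash to find, $2 = compiled dictionary file (default: md5.txt)
    # Appending "" forces a string comparison, so a hash that happens to look
    # like a number (e.g. digits with a single "e") is never compared numerically.
    awk -F: -v h="$1" '($2 "") == (h "") { print $1; found = 1; exit }
                       END { exit !found }' "${2:-md5.txt}"
}

# Usage: lookup_hash ac23b37db0039dda62896bb21f312755 md5.txt
```

It prints the matching word and exits non-zero when nothing matches, which makes it easier to use from other scripts than checking for empty grep output.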
OpenVPN on Windows XP and Vista
Just a quick post this time, as I thought this may help others in the same situation I found myself in recently. At work, we've been using OpenVPN which works a treat with Unix clients; Windows clients (Vista in particular) were more problematic, though.
None of our regular users have admin privileges (for obvious reasons), but this caused problems with the routing setup: users could use the GUI tool, but could not create the necessary routes required to direct traffic over the VPN. We experimented for a while with setting up persistent routes, but this didn't work for multiple users. I'd read all kinds of posts about running the executables as an Administrator, disabling Vista UAC, registry tweaks and other voodoo - either they didn't work, or they were unacceptable in our environment.
I then hit upon a simple workaround that also seems to work on Windows XP: Just add the user to the "Network Configuration Operators" group:
Administrative Tools -> Computer Management -> Local Users and Groups -> Groups -> Network Configuration Operators
Now, everything works right out of the box on Vista SP1 with the 2.1RC builds of OpenVPN (OpenVPN 2.1_rc15 was the version we tested). You have to install this as an Administrator, and you do have to be happy with giving your VPN users slightly elevated privileges - but at least it stops way short of having to give them administrator rights.
For reference, here's the client config file as well :
client
script-security 3 system
dev tun
proto udp
remote <openvpn server address> 1194
nobind
persist-key
persist-tun
ca ca.crt
cert <user.name>.crt
key <user.name>.key
cipher BF-CBC
comp-lzo
verb 3
mute 20
route-method exe
route-delay 2
Building a redundant iSCSI and NFS cluster with Debian - Part 4
This is part 4 of a series on building a redundant iSCSI and NFS SAN with Debian.
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
In this part, we'll configure Heartbeat to manage IP address failover on the two storage interfaces. We'll also install and configure an iSCSI target to provide block-level storage to clients.
Continue reading "Building a redundant iSCSI and NFS cluster with Debian - Part 4"
Building a redundant iSCSI and NFS cluster with Debian - Part 3
This is part 3 of a series on building a redundant iSCSI and NFS SAN with Debian.
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
Introduction
In the last two guides, we set up a DRBD resource and LVM volume group which we could manually migrate between the two cluster nodes. In this guide, we'll set up the Heartbeat cluster software to handle automatic migration of services between the two nodes in our cluster ("failover").
The version of Heartbeat included in Debian Etch is 1.x. It is a very simple system, and is limited to two node clusters, making it ideal for something simple such as failover for services between two nodes. The current 2.x branch is a lot more complicated, and has a new XML configuration format, although it can still be used with the original 1.x format files. Although it adds many useful features, it's overkill for our needs at the moment - plus, sticking to 1.x avoids the need to install software not included in the current stable distribution.
Linux, Solaris and FreeBSD iostat monitoring with Cacti
I've been looking for ages for a tool to parse the output from "iostat" on Linux, and graph it in Cacti. I found a few scripts and templates that did some of what I was looking for (disk I/O etc.), but nothing that gave me the full set of statistics such as queue length, utilisation, service time etc. I finally got round to writing my own set of templates and a data gathering script to provide this information, and it seems to work very well. So that others can benefit, I've posted the package archive and a brief description over on the Cacti forums (click Continue Reading for a download link to an updated version - the one on the Cacti forums has a bug which means it won't work with all versions of sysstat). Below are a couple of sample graphs to give you an idea of what it can do - there are also a few more samples posted in the Cacti forums thread :


Installation is a simple matter of creating a cron job to gather iostat data, extending your snmpd.conf to call the included iostat.pl script, and then importing the templates. Full instructions are included in the README within the archive (click the Continue Reading link to see them), but if you have any comments, suggestions or problems please let me know!
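To give a flavour of the first step only, here is a rough sketch of a gatherer that a cron job could call. To be clear, this is not the script from the archive: gather_iostat, the IOSTAT_CMD variable, the cache path and the 30 second interval are all assumptions for illustration, and the README remains the real reference.

```shell
#!/bin/bash
# Sketch: gather iostat data into a cache file that another script can read later.
# IOSTAT_CMD, the cache path and gather_iostat are illustrative assumptions;
# the README in the archive has the actual instructions.

IOSTAT_CMD=${IOSTAT_CMD:-iostat}

gather_iostat() {
    cache=${1:-/tmp/iostat.cache}
    tmp="$cache.tmp.$$"
    # -x extended stats, -d devices only, -k kilobytes; two reports, 30 seconds apart.
    # Write to a temporary file first so a reader never sees a half-written cache.
    if $IOSTAT_CMD -x -d -k 30 2 > "$tmp"; then
        mv "$tmp" "$cache"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (for example at the end of a script run from cron): gather_iostat /tmp/iostat.cache
```

The write-then-rename step matters because whatever reads the cache can run at any moment, and a rename on the same filesystem is atomic.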
Continue reading "Linux, Solaris and FreeBSD iostat monitoring with Cacti"
Building a redundant iSCSI and NFS cluster with Debian - Part 2
This is part 2 of a series on building a redundant iSCSI and NFS SAN with Debian.
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
Configuring DRBD
Following on from part one, where we covered the basic architecture and got DRBD installed, we'll proceed to configuring and then initialising the shared storage across both nodes. The configuration file for DRBD (/etc/drbd.conf) is very simple, and is the same on both hosts. The full configuration file is below - you can copy and paste this in; I'll go through each line afterwards and explain what it all means. Many of these sections and commands can be fine tuned - see the man pages on drbd.conf and drbdsetup for more details.
global {
}
resource r0 {
    protocol C;
    incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
    startup {
        wfc-timeout 0;
    }
    disk {
        on-io-error detach;
    }
    net {
        on-disconnect reconnect;
    }
    syncer {
        rate 30M;
    }
    on weasel {
        device /dev/drbd0;
        disk /dev/md3;
        address 10.0.0.2:7788;
        meta-disk internal;
    }
    on otter {
        device /dev/drbd0;
        disk /dev/md3;
        address 10.0.0.1:7788;
        meta-disk internal;
    }
}
The structure of this file should be pretty obvious - sections are surrounded by curly braces, and there are two main sections - a global one, in which nothing is defined, and a resource section, where a shared resource named "r0" is defined.
The global section only has a few options available to it - see the DRBD website for more information; though it's pretty safe to say you can ignore this part of the configuration file when you're getting started.
Continue reading "Building a redundant iSCSI and NFS cluster with Debian - Part 2"
Building a redundant iSCSI and NFS cluster with Debian - Part 1
It's been a while now since I last updated this blog with any decent material (The Poo Truck notwithstanding, as honestly, that's a classic) so I thought I'd dust off some of my notes on building a redundant iSCSI and NFS SAN using Debian Etch.
The following post takes the form of a "HOWTO" guide - I'll include all the relevant commands, configuration files and output produced so you can follow along. This is the first part of the series; I'll post the different sections in phases, each covering a different part of the setup. The plan is to cover all this in 5 (possibly 6) separate posts, with the following content :
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
So, this being part one, I'll start with a quick overview of what I'm trying to achieve here :
Cluster overview
The cluster will consist of 2 storage servers, providing iSCSI and NFS services to a number of clients, over floating IP addresses and from a replicated pool of storage. This storage will be used for file sharing (NFS), and block devices (iSCSI) - although you could add any kind of service on top of the cluster; an obvious option would be to provide SMB (Microsoft Windows file services), although I won't explore that particular avenue.
This will be replicated with DRBD, and managed using LVM2. I'll also be using multipathing to the storage, so that a component (NIC, switch, cable etc.) can fail in one channel but the storage will still be accessible. Failover and cluster management will be provided by the Linux-HA project.
The distribution I'm using is Debian Etch (4.0). Most of the configuration files and commands used will work on any distro, although file locations and the package management commands will obviously differ.
Network layout
The two storage nodes (which I'll call "otter" and "weasel") will have the following 4 network interfaces configured :
- eth0: 172.16.1.x - Management interface (the address we SSH into to manage the system)
- eth1: 10.0.0.x - This is for data replication and heartbeat between the two nodes, and will be via a cross-over cable connected directly between the two servers
- eth2 and eth3: 192.168.1.x - This is the storage network, clients will connect to this for their storage.
And the client (which I'll call "badger") will have the following 3 network interfaces configured :
- eth0 : 172.16.1.x - Management / public interface
- eth1 and eth2 : 192.168.x.1 - Storage network (where we access the iSCSI and NFS storage). These will use 192.168.1.1 and 192.168.2.1, both with a netmask of 255.255.255.0 to ensure that requests go to the correct interface when using multipathing (more on that later).
In a real-world scenario, these would be on physically different NICs, and would also be on separate switches - particularly the multipathed storage interfaces. Utilising the different private ranges makes it easier to see at a glance what is going on, and makes trouble-shooting a lot easier. It's also obviously a good idea to separate your storage network from the rest of your regular network traffic.
Of course, there is nothing stopping you from utilising virtual NICs and having each address on eth0:1, eth0:2 and so on. Obviously, GigE or higher would be required in a production network, but there's nothing stopping you from using 100Mb in a test/development environment. Just don't expect stellar performance!
There will also be a null-modem cable connected between the two serial ports on each storage node. This is to supplement the network heartbeat, and will help avoid the problem of "split-brain" that can occur in clusters. If there was a problem with the heartbeat network - the switch failing, for instance - both nodes would then see the other as failed, and try to assume the master role. Having a secondary heartbeat connection between the nodes will help avoid this problem - particularly as it is a "straight-through" connection, and does not rely on any intermediate devices such as a network switch.
At this point, a diagram might be in order - you'll have to excuse my "Dia" skills, which are somewhat lacking!

This diagram shows all the important connections between the various hosts, so hopefully this will make things a little clearer. You can see that with this architecture, we could lose one storage server or one switch, and we'd still have a valid path to the storage from our client.
Continue reading "Building a redundant iSCSI and NFS cluster with Debian - Part 1"
ZFS Replication
As I've been investigating ZFS for use on production systems, I've been making a great many notes, and jotting down little "cookbook recipes" for various tasks. One of the coolest systems I've created recently utilised the zfs send & receive commands, along with incremental snapshots, to create a replicated ZFS environment across two different systems. True, all this is present in the zfs manual page, but sometimes a quick demonstration makes things easier to understand and follow.
While this isn't true filesystem replication (you'd have to look at something like StorageTek AVS for that), it does provide periodic snapshots and incremental updates; these can be run every minute if you're driving this from cron, or at even more granular intervals if you write your own daemon. Either way, this suffices for disaster recovery and redundancy if you don't need up-to-the-second replication between systems.
I've typed up my notes in blog format so you can follow along with this example yourself; all you'll need is a Solaris system running ZFS. Read more for the full demonstration...
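As a taster, the basic snapshot / send / receive cycle can be sketched as a small shell function. This is an outline under stated assumptions rather than the full demonstration: zfs_replicate, the dataset names, the remote host and the DRY_RUN switch are all made up for illustration. Depending on whether the destination has been modified since the last snapshot, you may also need zfs receive -F.

```shell
#!/bin/bash
# Sketch: one full or incremental zfs send/receive between two hosts.
# Dataset names, the remote host and DRY_RUN are illustrative assumptions.

zfs_replicate() {
    # $1 = source dataset, $2 = previous snapshot name ("" for the first, full send),
    # $3 = new snapshot name, $4 = remote host, $5 = destination dataset
    src=$1 prev=$2 new=$3 host=$4 dst=$5
    if [ -n "$prev" ]; then
        send="zfs send -i $src@$prev $src@$new"   # incremental: only changes since $prev
    else
        send="zfs send $src@$new"                 # first run: full stream
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "zfs snapshot $src@$new"
        echo "$send | ssh $host zfs receive $dst"
    else
        zfs snapshot "$src@$new" && $send | ssh "$host" zfs receive "$dst"
    fi
}

# First run:        zfs_replicate tank/data "" snap1 backuphost backup/data
# Subsequent runs:  zfs_replicate tank/data snap1 snap2 backuphost backup/data
```

Driving the "subsequent runs" form from cron, with the snapshot names rotated each time, gives the periodic incremental updates described above.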
ZFS as a volume manager
The example used in this post is the creation of a mirrored zpool which is then used to create a block device, on top of which I'll create a UFS filesystem. The reasons for doing this are many and varied : you may have an application that needs UFS (particularly forcedirectio); you may need to create a block device for some reason but all your storage is currently tied up in zpools; or you just need a quick block device to use for testing.
Using ZFS as a volume manager also has its advantages over something like SVM (formerly "DiskSuite"). The management features are much improved (along with a browser-based GUI, if that's your thing) and you also gain access to ZFS features which operate at the volume manager layer and aren't dependent on the filesystem parts of ZFS. This includes features such as end-to-end error checking and recovery, along with snapshots.
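In outline, the sequence looks something like the sketch below. The pool name, disk names, volume size and mount point are placeholders, and zvol_ufs and DRY_RUN are names I've invented for illustration; note too that newfs will normally ask for confirmation before it builds the filesystem.

```shell
#!/bin/bash
# Sketch: mirrored zpool -> zvol -> UFS filesystem on Solaris.
# Pool name, disk names, size and mount point are illustrative assumptions.

run() {
    # Print the command instead of executing it when DRY_RUN=1.
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

zvol_ufs() {
    # $1 = pool, $2 = first disk, $3 = second disk,
    # $4 = volume name, $5 = volume size, $6 = mount point
    run zpool create "$1" mirror "$2" "$3" || return 1   # mirrored pool
    run zfs create -V "$5" "$1/$4" || return 1           # fixed-size volume (zvol)
    run newfs "/dev/zvol/rdsk/$1/$4" || return 1         # UFS on the raw zvol device
    run mount -F ufs "/dev/zvol/dsk/$1/$4" "$6"          # mount the block device
}

# Preview: DRY_RUN=1 zvol_ufs tank c1t0d0 c1t1d0 vol1 1g /mnt
```

The volume shows up under /dev/zvol/dsk and /dev/zvol/rdsk like any other disk device, which is what lets UFS sit on top of it.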
Read on for the full update...
Continue reading "ZFS as a volume manager"



















