Blastwave is dead
Earlier on today, the main Blastwave website got replaced by this message :
Blastwave is a registered trademark of Blastwave.org Inc. in the
United States and Canada. All assets of Blastwave.org Inc. are frozen
until further notice. All Solaris(tm) related open source software
work and services are cancelled. All websites, documents and binary
software packages that bear the mark Blastwave or Blastwave(tm) are no
longer available until further notice.
At the same time, mailing lists, shell logins and other services seem to have been shut down and/or removed from DNS. None of this came with any warning or notification to the maintainers, and I still don't know what's going on. I can't access any of the build servers, so it's fairly safe to assume that my build scripts, packages, documentation, and everything else I've been working on for the Solaris community over the last 5 years are gone too. As if that wasn't enough, there are also reports that someone has been attempting to sabotage various mirror sites. I don't know how to take that - but frankly, right now, I don't care. I'm out. I've had it with the political fighting and drama. Many maintainers had already left following the last spat - I simply don't have the will to get involved in it any more; the damage has already been done. If anyone is still using my Blastwave packages (PostgreSQL, Nessus, PHP4, and some others), I recommend you switch to something else, like Sun's own CoolStack or OpenSolaris.
There's plenty more I could say, but at this point I think it's perhaps better to simply leave it. It's a sad day for me: seeing years of work towards something that I believed in, and that helped a great many people, all go to ruin. It's even sadder for the Solaris community as a whole; this was a true grass-roots organisation - made up of like-minded Solaris users, admins, programmers and fans - who gave up countless hours of their own time to help others. I think the least we deserve is an explanation, but somehow I don't think one at this stage would make any difference anyway.
Update : People have been mailing me to say the main page is back up - true, but it's a case of "the lights are on, but no one's home". Check the thread in comp.unix.solaris.
Building a redundant iSCSI and NFS cluster with Debian - Part 2
This is part 2 of a series on building a redundant iSCSI and NFS SAN with Debian.
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
Configuring DRBD
Following on from part one, where we covered the basic architecture and got DRBD installed, we'll proceed to configuring and then initialising the shared storage across both nodes. The configuration file for DRBD (/etc/drbd.conf) is very simple, and is the same on both hosts. The full configuration file is below - you can copy and paste this in; I'll go through each line afterwards and explain what it all means. Many of these sections and commands can be fine-tuned - see the man pages for drbd.conf and drbdsetup for more details.
global {
}

resource r0 {
    protocol C;
    incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";

    startup {
        wfc-timeout 0;
    }

    disk {
        on-io-error detach;
    }

    net {
        on-disconnect reconnect;
    }

    syncer {
        rate 30M;
    }

    on weasel {
        device /dev/drbd0;
        disk /dev/md3;
        address 10.0.0.2:7788;
        meta-disk internal;
    }

    on otter {
        device /dev/drbd0;
        disk /dev/md3;
        address 10.0.0.1:7788;
        meta-disk internal;
    }
}
The structure of this file should be pretty obvious - sections are surrounded by curly braces, and there are two main sections - a global one, in which nothing is defined, and a resource section, where a shared resource named "r0" is defined.
The global section only has a few options available to it - see the DRBD website for more information; though it's pretty safe to say you can ignore this part of the configuration file when you're getting started.
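To make the nesting concrete, here is a minimal Python sketch that reads a brace-delimited file of this shape into nested dictionaries. It is only an illustration of the structure, not DRBD's own parser; the function name parse_blocks and the trimmed-down sample text are made up for this example.

```python
import re

# A shortened copy of the configuration above, used only as sample input.
SAMPLE = """
global {
}
resource r0 {
  protocol C;
  syncer {
    rate 30M;
  }
  on weasel {
    device /dev/drbd0;
    address 10.0.0.2:7788;
  }
  on otter {
    device /dev/drbd0;
    address 10.0.0.1:7788;
  }
}
"""


def parse_blocks(text):
    """Turn 'name { ... }' sections and 'key value;' lines into nested dicts."""
    # Tokens are quoted strings, the punctuation { } ; or bare words.
    tokens = re.findall(r'"[^"]*"|[{};]|[^\s{};]+', text)
    root = {}
    stack = [root]   # stack[-1] is the section currently being filled
    words = []       # words collected since the last piece of punctuation
    for tok in tokens:
        if tok == "{":
            section = {}
            stack[-1][" ".join(words)] = section  # e.g. "resource r0"
            stack.append(section)
            words = []
        elif tok == "}":
            stack.pop()
        elif tok == ";":
            if words:
                stack[-1][words[0]] = " ".join(words[1:])
            words = []
        else:
            words.append(tok)
    return root


config = parse_blocks(SAMPLE)
for host in ("weasel", "otter"):
    print(host, config["resource r0"]["on " + host]["address"])
```

The result has two top-level keys, "global" (empty) and "resource r0", with the per-host "on" sections nested inside the resource.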
Building a redundant iSCSI and NFS cluster with Debian - Part 1
It's been a while now since I last updated this blog with any decent material (The Poo Truck notwithstanding, as honestly, that's a classic) so I thought I'd dust off some of my notes on building a redundant iSCSI and NFS SAN using Debian Etch.
The following post takes the form of a "HOWTO" guide - I'll include all the relevant commands, configuration files and output produced so you can follow along. This is the first part of the series; I'll post the different sections in phases, each covering a different part of the setup. The plan is to cover all this in 5 (possibly 6) separate posts, with the following content :
Part 1 - Overview, network layout and DRBD installation
Part 2 - DRBD and LVM
Part 3 - Heartbeat and automated failover
Part 4 - iSCSI and IP failover
Part 5 - Multipathing and client configuration
Part 6 - Anything left over!
So, this being part one, I'll start with a quick overview of what I'm trying to achieve here :
Cluster overview
The cluster will consist of 2 storage servers, providing iSCSI and NFS services to a number of clients, over floating IP addresses and from a replicated pool of storage. This storage will be used for file sharing (NFS), and block devices (iSCSI) - although you could add any kind of service on top of the cluster; an obvious option would be to provide SMB (Microsoft Windows file services), although I won't explore that particular avenue.
The storage will be replicated with DRBD, and managed using LVM2. I'll also be using multipathing to the storage, so that a component (NIC, switch, cable etc.) can fail in one channel but the storage will still be accessible. Failover and cluster management will be provided by the Linux-HA project.
The distribution I'm using is Debian Etch (4.0). Most of the configuration files and commands used will work on any distro, although file locations and the package management commands will obviously differ.
Network layout
The two storage nodes (which I'll call "otter" and "weasel") will have the following 4 network interfaces configured :
- eth0: 172.16.1.x - Management interface (the address we SSH into to manage the system)
- eth1: 10.0.0.x - This is for data replication and heartbeat between the two nodes, and will be via a cross-over cable connected directly between the two servers
- eth2 and eth3: 192.168.1.x - This is the storage network, clients will connect to this for their storage.
And the client (which I'll call "badger") will have the following 3 network interfaces configured :
- eth0 : 172.16.1.x - Management / public interface
- eth1 and eth2 : 192.168.1.x - Storage network (where we access the iSCSI and NFS storage)
In a real-world scenario, these would be on physically different NICs, and would also be on separate switches - particularly the multipathed storage interfaces. Utilising the different private ranges makes it easier to see at a glance what is going on, and makes trouble-shooting a lot easier. It's also obviously a good idea to separate your storage network from the rest of your regular network traffic.
Of course, there is nothing stopping you from utilising virtual NICs and having each address on eth0:1, eth0:2 and so on. Obviously, GigE or higher would be required in a production network, but there's nothing stopping you from using 100Mb in a test/development environment. Just don't expect stellar performance!
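As a quick illustration of how the separate ranges help when trouble-shooting, the following Python sketch maps an address to the network it belongs to. It assumes each range is a /24, which is a guess based on the "x" in the last octet; the names NETWORKS and classify are invented for this example.

```python
import ipaddress

# Assumption: every range is a /24. The layout above only fixes the
# first three octets of each network.
NETWORKS = {
    "management": ipaddress.ip_network("172.16.1.0/24"),
    "replication": ipaddress.ip_network("10.0.0.0/24"),
    "storage": ipaddress.ip_network("192.168.1.0/24"),
}


def classify(address):
    """Return the name of the network an address sits on, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in NETWORKS.items():
        if ip in network:
            return name
    return None


# 10.0.0.1 is one of the replication addresses; the other two addresses
# are arbitrary examples.
for addr in ("10.0.0.1", "192.168.1.20", "172.16.1.5"):
    print(addr, "->", classify(addr))
```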
There will also be a null-modem cable connected between the two serial ports on each storage node. This is to supplement the network heartbeat, and will help avoid the problem of "split-brain" that can occur in clusters. If there was a problem with the heartbeat network - the switch failing, for instance - both nodes would then see the other as failed, and try to assume the master role. Having a secondary heartbeat connection between the nodes will help avoid this problem - particularly as it is a "straight-through" connection, and does not rely on any intermediate devices such as a network switch.
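The reasoning behind the second heartbeat path can be sketched in a few lines of Python. This is a toy model rather than how Heartbeat itself is implemented: the function name peer_is_dead, the link names and the 30 second timeout are all assumptions made for the example.

```python
def peer_is_dead(last_seen, now, deadtime=30.0):
    """Decide whether the peer should be treated as failed.

    last_seen maps a heartbeat link name to the time (in seconds) at which
    the last heartbeat arrived on that link. The peer only counts as dead
    when every link has been silent for longer than deadtime.
    """
    return all(now - seen > deadtime for seen in last_seen.values())


now = 1000.0

# Network heartbeat lost 2 minutes ago, but the serial link is still alive:
# the peer is up, so this node must not try to take over.
print(peer_is_dead({"eth1": now - 120, "serial": now - 1}, now))

# Both links silent: only now is a takeover justified.
print(peer_is_dead({"eth1": now - 120, "serial": now - 90}, now))
```

With only the eth1 entry present, the first case would have been judged a failure, which is exactly the split-brain situation described above.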
At this point, a diagram might be in order - you'll have to excuse my "Dia" skills, which are somewhat lacking!

This diagram shows all the important connections between the various hosts, so hopefully this will make things a little clearer. You can see that with this architecture, we could lose one storage server or one switch, and we'd still have a valid path to the storage from our client.
Poo
OK, I readily admit that this is really childish. I haven't updated this site for ages; I know there are far better things that I could be writing about. At the age of nearly 30, I really shouldn't be sniggering at naughty words like I'm back in primary school. But when I saw this unbelievably apt number plate on a sewage truck, I realised I had witnessed the stuff of playground legends. I present to you The Poo Truck in all its glory...

I'm married!

ZFS Replication
As I've been investigating ZFS for use on production systems, I've been making a great many notes, and jotting down little "cookbook recipes" for various tasks. One of the coolest systems I've created recently utilised the zfs send & receive commands, along with incremental snapshots, to create a replicated ZFS environment across two different systems. True, all this is present in the zfs manual page, but sometimes a quick demonstration makes things easier to understand and follow.
While this isn't true filesystem replication (you'd have to look at something like StorageTek AVS for that) it does provide periodic snapshots and incremental updates; these can be run every minute if you're driving this from cron - or at even shorter intervals if you write your own daemon. Nonetheless, this suffices for disaster recovery and redundancy if you don't need up-to-the-second replication between systems.
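As a rough sketch of what one replication cycle looks like, the Python helper below only builds the command lines and does not run anything. The dataset, snapshot and host names are placeholders, and using ssh as the transport between the two systems is an assumption; depending on the state of the target, the receiving side may need extra options.

```python
import shlex


def replication_commands(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Build the commands for one incremental replication cycle.

    Returns (snapshot_command, send_receive_pipeline) as strings.
    Nothing is executed here.
    """
    # Take a new snapshot of the source dataset.
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    # Send only the changes between the previous and the new snapshot.
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Assumption: the stream is piped to the other system over ssh.
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return shlex.join(snapshot), shlex.join(send) + " | " + shlex.join(receive)


# All names below are hypothetical.
snap_cmd, pipeline = replication_commands(
    "tank/data", "t1", "t2", "backuphost", "backup/data"
)
print(snap_cmd)
print(pipeline)
```

A cron job would call something like this once per interval, using the previous run's snapshot name as prev_snap.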
I've typed up my notes in blog format so you can follow along with this example yourself; all you'll need is a Solaris system running ZFS. Read more for the full demonstration...
ZFS as a volume manager
While browsing the ZFS man page recently, I made an interesting discovery: ZFS can export block devices from a zpool, which means you can separate "ZFS the volume manager" from "ZFS the filesystem". This may well be old news to many; however I haven't seen many references to this on the web, so thought I'd post a quick blog update.
The example used in this post is the creation of a mirrored zpool which is then used to create a block device, on top of which I'll create a UFS filesystem. The reasons for doing this are many and varied : you may have an application that needs UFS (particularly forcedirectio); you may need to create a block device for some reason but all your storage is currently tied up in zpools; or you just need a quick block device to use for testing.
Using ZFS as a volume manager also has its advantages over something like SVM (formerly "DiskSuite"). The management features are much improved (along with a browser-based GUI, if that's your thing) and you also gain access to ZFS features which operate at the volume manager layer and aren't dependent on the filesystem parts of ZFS. This includes features such as end-to-end error checking and recovery, along with snapshots.
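The sequence described above (mirrored zpool, then a block device, then UFS on top) can be outlined as follows. This Python sketch only prints the commands; the pool name, disk names, volume name, size and mount point are all placeholders, and the /dev/zvol paths assume the usual Solaris device layout.

```python
def zvol_ufs_commands(pool, disks, volume, size, mountpoint):
    """Return the commands to put a UFS filesystem on a ZFS volume.

    Nothing is executed; the list is meant to be read or pasted by hand.
    """
    vol = f"{pool}/{volume}"
    return [
        # 1. Create a mirrored pool from the given disks.
        f"zpool create {pool} mirror {' '.join(disks)}",
        # 2. Carve a fixed-size volume (block device) out of the pool.
        f"zfs create -V {size} {vol}",
        # 3. Build a UFS filesystem on the raw device of that volume.
        f"newfs /dev/zvol/rdsk/{vol}",
        # 4. Mount the block device like any other UFS filesystem.
        f"mount /dev/zvol/dsk/{vol} {mountpoint}",
    ]


# Hypothetical names throughout.
cmds = zvol_ufs_commands("tank", ["c1t0d0", "c1t1d0"], "ufsvol", "1g", "/mnt/ufs")
for cmd in cmds:
    print(cmd)
```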
Read on for the full update...
ZFS and caching for performance
I've recently been experimenting with ZFS in a production environment, and have discovered some very interesting performance characteristics. I have seen many benchmarks indicating that for general usage, ZFS should be at least as fast if not faster than UFS (directio notwithstanding), but nothing prepared me for what I discovered in my preliminary benchmarking.
To give a little background : I have been experiencing really bad throughput on our 3510-based SAN. The hosts are X4100s, with 12GB RAM, 2x dual-core 2.6GHz Opterons and Solaris 10 11/06. They are each connected to a 3510FC dual-controller array via a dual-port HBA and 2 Brocade SW200e switches, using MPxIO. All fabric is at 2Gb/s.
So far, pretty straightforward. I had been using iozone as my benchmarking tool (using a 512MB file as that's the average table size for our databases), and compared a wide range of systems and configurations, from an Ultra 20 with 7200RPM SATA drives, to the X4100's internal 10K RPM SAS disks as well as LUNs made available from the SAN in a variety of RAID levels.
There were some interesting results here, which I'll skip over for the moment (like the Ultra 20 beating the X4100 and SAN in read performance!) - the kicker came when I added ZFS into the mix as an experiment.
Digital Badger
I couldn't resist it. I can't remember how it came up in conversation, but today the immortal phrase "digital badger" was uttered at work. I overheard it and thought to myself, "Now, there's a cool domain name". I had a look, and sure enough - digitalbadger.net was available. So now it's mine, all mine! Stupid, but still oh so very cool. However, I do apologise if you stumbled upon this site actually looking for information pertaining to binary mustelids.
Apache mod_proxy balancing with PHP sticky sessions
I've been investigating the new improved mod_proxy in Apache 2.2.x for use in our new production environment, and in particular the built-in load balancing support. It was always possible to build a load-balanced proxy server with Apache before, using some mod_rewrite voodoo, but having a whole set of directives that do all the hard work for you is a great feature.
There is, however, a catch. It won't work out of the box with PHP sessions, or many other applications. I've since worked out a way around this which enables you to continue using all the great features mod_proxy_balancer offers and still bind requests to an originating server. All you need is a little mod_rewrite magic : Read on for more details...
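To illustrate the catch, here is a simplified Python model of sticky routing. It assumes the balancer looks for a route name after a dot in the session value and otherwise falls back to plain rotation; a stock PHP session ID carries no such suffix. This is a toy model, not Apache's actual code, and the class name Balancer and the backend names are invented.

```python
import itertools


class Balancer:
    """Toy model: honour a '.route' suffix if present, else rotate."""

    def __init__(self, routes):
        self.routes = list(routes)
        self._rotation = itertools.cycle(self.routes)

    def pick(self, session_value=None):
        # A value such as "abc123.web1" names the backend after the dot.
        if session_value and "." in session_value:
            route = session_value.split(".", 1)[1]
            if route in self.routes:
                return route
        # No usable route: fall back to simple rotation.
        return next(self._rotation)


balancer = Balancer(["web1", "web2"])

# A bare session ID gives the balancer nothing to stick to, so
# consecutive requests land on different backends.
print(balancer.pick("abc123"), balancer.pick("abc123"))

# With a route suffix, every request goes back to the same backend.
print(balancer.pick("abc123.web2"), balancer.pick("abc123.web2"))
```

In this model, the fix is anything that gets a route name in front of the balancer on each request.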
Sun V240 to X4100 : AMD vs SPARC
At work, we just migrated a database server from a Sun Fire V240 to a Sun X4100. This makes it the first AMD64 system we've put into production, and the performance advantage is staggering. I could post the benchmarks and various statistics, but I believe the following graphs paint a far more interesting and convincing argument for the price/performance benefit of Sun's AMD64 offerings...
Before (V240) CPU Utilisation

After (X4100) CPU Utilisation

All told, I'm impressed. The X4100 is ripping through queries at a phenomenal rate and is barely breaking a sweat. The V240, on the other hand, was clearly struggling and was maxing out at 100% load. Admittedly, it's not a like-for-like comparison, as it's pretty much impossible to do that across different systems and different architectures. But take a look at the price levels of these two systems - the V240 came in at around £7,500 for dual 1.5GHz UltraSPARC IIIi processors, whereas for £4,800 you can get the X4100 with dual dual-core AMD Opteron 285 processors clocked at 2.6GHz. Frankly, it's no contest. The only thing you don't get with the X4100 is another couple of disks, which is no big deal as we've hooked it up to our SAN. However, even if you want to go for the X4200, which has room inside for 4 internal disks, you'd still only end up paying £5,100.