I’ve been using and evaluating Citrix XenServer now for a while, and felt I should really post a review. I haven’t seen much detailed coverage of this product at the level I’m interested in, so what follows is my take on it from a Unix Sysadmin’s perspective. There won’t be any funky screenshots or graphics; instead, I tried to cover the sort of things I wanted to know about when I was looking at it as a candidate for our virtualization solution at work.
After all, implementing a new hypervisor is a big step, and a decision that you’ll likely be stuck with for a long time. If there’s anything else you’d like to know, just post in the comments section and I’ll do my best to answer.
As some background: I’ve been using the open source Xen hypervisor as a virtualization platform, alongside VMware for Windows hosts, for a good few years now at work. Part of the reason for picking Xen was that it was the standard on the systems I inherited, and also that it was free and well-supported on most Linux distributions at the time. To date, I have been using CentOS as a Dom0 – as it’s a free "clone" of Red Hat Enterprise Linux, it follows the same support schedules (up to 2014 for RHEL/CentOS 5.x) and is supported by pretty much every hardware vendor out there. It also has the libvirt tools built in, as well as up-to-date packages for storage infrastructure such as DRBD and open-iscsi. It’s well supported, and even though it is a conservative "stable" distro, point releases occur regularly with back-ported drivers and user-land updates.
With some work, you can roll your own management tools and scripts, and end up with a very flexible solution. However, it lacks some management ease of use, particularly for other systems administrators who may not be totally comfortable in a Linux environment. We also wanted to standardise on one virtualization platform if possible, and this all coincided nicely with a planned upgrade/migration off the VMware stack.
XenServer therefore presents a very attractive proposition: a well-known, widely tested and supported open source hypervisor, with a superior management stack. The basic product is free, although support and enterprise features are available for a price. The prices for the advanced features are very reasonable, all the more so when you compare them against VMware’s offerings. Also consider that the free product allows you to connect to a wide range of networked storage systems and includes live migration, something that the free ESXi doesn’t offer.
All of what follows covers the freely downloadable XenServer 5.6; both Dell and HP offer embedded versions for some of their servers, but running and managing these systems should be near enough identical apart from the installation steps.
Update: Just after writing this, the beta of "FP1" (an update to XenServer 5.6) was announced. Full details of what will be in this update are here in the release notes. It looks like there will be plenty of significant improvements across all areas (including MPP RDAC, scheduled backups, supported jumbo frames, on-line coalescing of snapshot space and various other things of particular interest to me). Bear in mind when reading this review that many of the little issues I have with XenServer may well be resolved in the upcoming version, and other areas may be totally overhauled. As soon as the final version is released I’ll post a full update…
Update 2: FP1 is indeed a big improvement. I’ve been using it in production now for a few months and should have an update soon, covering the new features such as the distributed switch, self-service portal etc.
Click the "Continue reading" link for the full review.
Installation and Drivers

Installation is fortunately a snap. It’s a very simple text-driven affair that gives you very little control over the process, which is exactly what you want. After setting various system parameters such as locale and networking details, you can install additional "supplemental packs". One of these is provided as an option from the Citrix download page along with XenServer – it provides various Linux guest-related tools and sets up a demo Debian Etch template. Dell have also released a supplemental pack which sets up OpenManage and related hardware monitoring tools, which is a nice touch as it saves you the hassle of having to set this all up manually post-install.
One notable omission from the installation process is software RAID – there is no facility whatsoever to set it up. True, it is possible to set it up yourself afterwards if you are familiar with mdraid and LVM, but it’s totally unsupported. You really do need to use a hardware RAID controller for your boot volumes, or boot from a SAN if that’s an option.
As the control domain (Dom0) is based on CentOS, hardware and driver support is therefore identical to Red Hat Enterprise Linux: in general, any recent server from any of the main vendors should present few difficulties, as long as it has a 64-bit CPU. In fact, if you are running PV Linux VMs, you don’t even need hardware virtualization support (such as Intel VT). The one exception that you really need to check carefully against the HCL is your SAN block storage.
If your array works out of the box with DM-Multipath (assuming you’re using multipathing; although you’d be mad not to in a production environment), then setup should be straightforward. If you are using something else like MPP-RDAC (such as on the Dell MD3000i or various Sun and IBM arrays), then you will have to customize your system a little more. I also experienced a problem with the supported Dell MD3220i array – XenServer tries to log into all available targets that an array presents, and as the MD3220i has 4 active ports per controller, XenServer has to be able to reach them all. Originally, I had intended to present only 2 ports per controller to each server, but this meant I had to rethink my storage network.
In short – you’ll need to check it all thoroughly before you go into production, but as it’s a free download, you should be able to run all the tests you need before shelling out: as mentioned above, the free version is not limited when it comes to storage networking.
Booting and Management
Once installed, the boot process is similarly locked down. EXTLINUX (not GRUB) boots straight into the system; you get a white screen with the Citrix logo and a progress bar, and see nothing else until you are presented with the management console. You can switch to alternate terminals as with any other Linux distribution, but you won’t see much. The text-mode management console provides you with a few basic functions, such as the ability to reconfigure networking and storage, start/stop/migrate VMs, backup/restore metadata, and perform basic diagnostics.
Dropping to the command line reveals a 32-bit Dom0 based on CentOS 5. In fact, the CentOS repositories are all ready-configured in /etc/yum.repos.d, although by default they are disabled. What this means is that you can install any software on your Dom0 as you would any "regular" CentOS host. Whilst this is generally A Bad Idea in practice (your Dom0 should be doing as little as possible, not to mention that if anything does go wrong you may be unsupported), it does mean that you can install management and monitoring utilities for your RAID controllers, system management agents such as Dell’s OpenManage or HP’s Insight Manager, as well as Nagios plugins and Cacti scripts (my own iostat templates work fine!). Having such a full-featured Dom0 is tremendously useful, and a real advantage over ESXi which lacks a proper console.
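As an illustration, enabling the bundled repository definitions is a one-line edit. This is a sketch assuming the stock CentOS 5 layout of /etc/yum.repos.d/CentOS-Base.repo; to keep it runnable anywhere it operates on a throwaway copy, so nothing on a real Dom0 is touched:

```shell
# Sketch: enable the [base] repo that XenServer ships disabled.
# On a real Dom0 you would edit /etc/yum.repos.d/CentOS-Base.repo in place;
# this demo works on a temporary copy instead.
repo=$(mktemp)
cat > "$repo" <<'EOF'
[base]
name=CentOS-5 - Base
enabled=0

[updates]
name=CentOS-5 - Updates
enabled=0
EOF

# Flip enabled=0 to enabled=1, but only within the [base] section
sed -i '/^\[base\]/,/^\[updates\]/ s/^enabled=0/enabled=1/' "$repo"

grep -A 2 '^\[base\]' "$repo"
```

With the repo enabled, a plain "yum install" of your monitoring agents then works much as on any regular CentOS 5 box.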
Pretty much every other aspect of XenServer is managed through the XenCenter console, or the "xe" command line tool, which is the same "xe" configuration tool found in the open source "Xen Cloud Platform". XenCenter is a .NET application and unfortunately only runs on Windows hosts, although an open-source clone written in Python is available. A nice touch to the xe CLI tool is that all the parameters auto-complete through the tab key, just like any other Linux command. This means you can type something like "xe vm-param-list", hit tab and the required "uuid=" parameter gets filled in, and then pressing tab again lists all the available VM UUIDs to pick from.
There is almost a 100% mapping between the functionality in XenCenter and the xe tool, although the xe tool does expose some additional capabilities, and is also required for things like configuring MPP-RDAC based storage. You can also use xe to make some advanced tweaks that are unsupported by Citrix, such as enabling jumbo frames for your network interfaces. I suppose the idea is that if you’re using the CLI, you know what you’re doing and don’t require hand-holding or protecting from your actions! Along with the auto-completion, this tool is backed up by good on-line help and the XenServer manual documents the "xe" way of doing things very thoroughly.
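For example, a hypothetical session to find a VM and inspect its parameters might look like this (the UUID and name-label below are placeholders of my own; on a real host each would tab-complete):

```shell
xe vm-list                                   # all VMs, with uuid / name-label / power-state
xe vm-list name-label="web01" params=uuid    # look up a single VM's uuid
xe vm-param-list uuid=6c4e2a...              # dump every parameter of that VM
```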
Speaking of networking, all network interfaces are by default managed by XenServer instead of the underlying CentOS system. You end up with a bridge created for each network card (xenbr0, xenbr1, and so on…) and although you can label them as being for management use (e.g. iSCSI traffic), it is still possible to add them to a VM. Using xe, you can set a NIC (apart from the management and VM data networks) to be "unmanaged", at which point XenServer forgets all about it and you can use the usual /etc/sysconfig/network-scripts/ifcfg-ethX to manage them. This may have advantages if you find the additional overhead of a bridged configuration too much, or require more fine-grained control over your network.
In summary, the XenCenter GUI management tool is pretty solid, does a good job and has all the tools you’ll need laid out in a logical fashion. In fact, it should be possible to use the GUI tool exclusively for most tasks, which is a great help for administrators who perhaps would prefer to steer clear of a Linux bash prompt. There is the xe tool there for those who prefer the CLI approach, or who want to perform advanced configuration or tuning tasks. Other than that, there’s not much else to say – both tools are reliable and have never presented me with any problems.
Storage and Pools
When you install XenServer, it takes up around 4GB of disk space, and the rest is assigned to a local Storage Repository (SR), which is essentially an LVM volume group. If you happen to have multiple devices or LUNs detected during installation, you’ll get to choose which ones you want to use. Once installation has completed, you have several options for adding storage. The XenCenter GUI supports adding storage from the following sources:
- NFS VHD – virtual disks are stored as flat ".vhd" files on an NFS share
- Software iSCSI (using the Open-iSCSI stack in CentOS)
- Hardware HBA – This allows you to connect to Fibre Channel, FCoE, SAS or other SANs, assuming that your HBA is on the supported HCL.
In addition, you can connect to a read-only CIFS or NFS "ISO library", where you can store installation CD images, rescue disks etc. If you upgrade to one of the premium versions of XenServer, you can also make use of StorageLink on supported arrays, which pushes operations such as provisioning and snapshots to the array controller.
Of course, you can also present storage to the CentOS-based Dom0 yourself, and as long as you can see a block device, you can create an LVM-based repository on it using the xe tool, with something similar to the following:
xe sr-create type=lvm content-type=user device-config:device=/dev/cciss/c0d1 \
  name-label="LOCAL SR"
If you enable multipathing in XenCenter, it will tune various Open-iSCSI parameters to enable faster failback, and set up dm-multipath. One thing that threw me initially is that /etc/multipath.conf is actually a symlink, rather than an actual configuration file. If you need to make any modifications, you need to change /etc/multipath-enabled.conf instead. Of course, if you are using an alternative multipath implementation such as MPP-RDAC, you will have to configure and manage this manually and XenCenter will not report its status.
If you have multiple XenServer hosts, you can join them together into a named resource pool, and then take full advantage of shared storage, as live migration of VMs between hosts becomes possible. If you join hosts to pools through the GUI, you’ll find that only homogeneous systems can be joined: you can’t mix and match different families of CPUs, for instance. However, if you join a pool using the xe tool, you can pass the "force=true" parameter, which will permit this. Obviously, you then need to be very careful, as live migration between mismatched hosts may well result in system crashes. It does however open up the possibility of running "cold migrations" (i.e. shutting down and then starting on a different system) of VMs.
When XenServer hosts are in a pool, one server is the pool master. All commands to the other hosts go through this master. If it’s not available, then you will not be able to start, stop or manage VMs running on the other servers. If the master is just down for a short time (rebooting etc.) then this is not a big issue. However, if it has properly died, you will need to promote one of the slaves to a master in the meantime, which is covered in the manual but basically boils down to picking a suitable slave and running the following commands on it:
xe pool-emergency-transition-to-master
xe pool-recover-slaves
The rest of the slaves will now be pointing to the new master. You can then re-install the old master and add it back to the pool as a slave. One thing that did catch me unawares (it’s mentioned in the manual, but is such a big "gotcha", I feel it’s worth repeating here) is that if you remove a host from a pool, it will be reset back to factory conditions. This includes wiping any local SRs on it. This means you need to move any VMs running on local SRs to something else prior to removing the host from the pool, or you will lose your data!
All SR types (apart from NFS) appear to use LVM as the underlying volume manager, but you don’t get much control over them. You can’t control block size, PE start or much else. I have also been unable to determine exactly how access to the volumes is arbitrated between hosts in a pool, as there appear to be no LVM host tags or cluster manager (such as CLVMd) running. However, you are prevented from attaching an SR that is in use by a pool to a non-pooled XenServer, and I have yet to experience any problems. The NFS SR uses flat ".vhd" files as virtual disks, and despite the lack of "true" multipathing, can make a cheap and effective shared storage solution, particularly when combined with interface bonding, which is supported natively through XenCenter.
Once a VM is in the shut-down state, it’s very easy to move its underlying storage to a different repository – you just right-click on it in the GUI, select "Move VM", and then select the target repository. Assuming you don’t have an exceedingly large volume of data to move, this makes a tiered approach to VMs possible and easy. If you have cheap NFS storage for non-critical VMs, iSCSI for more important ones, and even a top tier of FC/SAS/FCoE, you can move VMs between SRs as performance or reliability requirements change.
I found that the usual trick of using "kpartx" to access virtual disks on LVM volumes doesn’t seem to work (as there is additional VHD metadata before the start of the actual disk image), although there’s an easy workaround for PV Linux hosts on ext3. You can run "xe-edit-bootloader" to modify the grub configuration for the VM, which plugs the required Virtual Block Device (VBD) and mounts the VM’s root filesystem. If you are using "vi" as your editor, you can then hit Ctrl-Z and be left at a prompt where you can change into the mounted filesystem and run any maintenance as needed. Alternatively, you can create a "Rescue VM" for such tasks. I have one ready-configured which boots from a System Rescue CD ISO, and has a 100GB empty filesystem for copying temporary files to. I can then simply attach another VM’s disk to it in the GUI, boot, and mount it from within the rescue environment.
Snapshots are also not LVM-based snapshots, but utilise an altogether more complex (but more versatile) method. Full details of how this works are available in the Citrix knowledge bases – http://support.citrix.com/article/CTX122978. The linked PDF in that knowledge base entry is well worth a read, as it explains a lot about how XenServer uses virtual storage.
Essentially, when you create a snapshot on an iSCSI or LVM-based SR, the original disk image gets truncated to its allocated size (e.g. a 10GB image with 5GB used would get truncated to 5GB), and gets re-labelled so it becomes the snapshot image. A new image then gets created to hold all future writes, and an additional image gets created to hold any writes to the snapshot. This is why, when you view your storage, you may see snapshots showing up as "0% on disk" – they haven’t been written to, so are consuming no extra space. However, if you have a 100GB disk image with 40GB used, when you create a snapshot you will end up using 140GB. Even if you delete all snapshots for that VM, you will still use 140GB of disk space.
This is because XenServer cannot use "thin provisioning" on block devices (NFS stores and local "ext" SRs do not have this limitation), as it does not use a clustered filesystem, unlike VMware’s VMFS. Citrix do provide a "coalesce tool", which will re-combine the snapshots into one VDI again and free up used space. This tool is documented in another Citrix knowledge base article here: http://support.citrix.com/article/CTX123400, and I have heard from a Citrix support engineer that an online coalesce tool will be provided in the next update of XenServer due later this year, so VHDs can be re-combined without powering off or suspending them. It’s worth bearing these requirements in mind if you plan on using regular snapshots for your backup strategy.
Migrating existing systems
Fortunately, the vast majority of my VMs are paravirtualized Linux systems already running on open-source Xen. These are exceptionally easy to convert to XenServer – I just create a VM using the "Demo Linux" template, which sets up a Debian Etch system. After it has been created, I shut it down and use the "xe-edit-bootloader" trick to mount the filesystem. I can then run "rm -rf" on its contents, and copy the new VM’s root filesystem over in their place – using something like rsync or a tar archive (if using tar, remember to use --numeric-owner when unpacking!). I then make a quick edit of /boot/grub/menu.lst and /etc/fstab to point at the new block devices (/dev/xvda1 etc.), then change out of the mounted directory and quit the editor session. When the VM next starts, it’ll be running your new image. I’ve tried this approach on CentOS 5 and Debian Etch/Lenny/Squeeze hosts and all worked perfectly. Debian Squeeze doesn’t even need a -xen kernel, as pv_ops DomU support is now in the mainline kernel; you just need the "-bigmem" kernel for PAE support.
There is also a script on the Xen.org website that appears to do this automatically: http://www.xen.org/products/cloud_projects.html. I haven’t tried it yet, but it looks worth checking out if you have a lot of open-source Xen VMs to migrate.
You will need to install the Xen tools in your VMs for optimum performance and reporting capabilities. I found, though, that the installation script tries to replace your kernel with a Citrix one. If this is not acceptable in your environment (I prefer to stick with the kernel supplied by the distro), you can just install the xe-guest-utilities packages, which are provided as 32- and 64-bit RPMs or DEBs.
For the Windows systems running on ESX that needed migrating, we discovered that XenConvert managed them all. You just need to remove the VMware tools before starting the conversion process, and be prepared for a long wait. So far, Windows 2000 Server, Server 2003 and Server 2008 have all been converted successfully.
Backups

There are a number of backup options available to you. In addition to running host-based backups (using something like Bacula, Amanda, NetBackup, Legato Networker etc.), you can also back up your VM images. You could use something based on the script I wrote to automate this, assuming you have a centralised backup location with enough space. If this is an NFS SR, you can include the flat file contents of this in your regular backups. You can also run backup commands from within XenCenter, which will save a VM image or produce a "VM appliance" bundle to your local PC. Whilst not a good approach for regular backups, this can be useful for quick ad-hoc backups of systems. Finally, you can also back up VM metadata from the text-mode console, which will create a special volume on a given SR holding all the VM configuration data. This means that you can re-attach this SR at a later date, and recover all your machine images and configuration.
There is also a commercial tool that integrates with XenCenter available from PHD Virtual, although I haven’t investigated it.
Licensing

If you are using the free edition, all you need to do is re-register each of your systems with Citrix once a year in order to keep running; apparently, this is so that they can accurately gauge interest and allocate resources as needed. If you fail to do this, your VMs will keep running, but you will not be able to make any configuration changes or start any new VMs. I’ve found that the registration process is very simple and can all be done with a few clicks through XenCenter. You can view existing licenses and expiry dates, request new licenses and assign premium licenses (more on that in a moment) all through the same license manager. So far, every time I have requested a new license, it has been processed and emailed to me within a couple of hours, so as long as you leave yourself enough time to renew each year, this shouldn’t become a burden.
If you do end up purchasing one of the advanced editions, these include a perpetual license so you do not need to continually renew your systems. This requires the use of a Citrix licensing server, which is provided either as an application which runs under Windows, or as a Linux-based appliance VM which you can run on XenServer itself. So that there isn’t a "chicken and egg" situation, there is a 30-day grace period where XenServers will still continue to run with all the advanced features without the licensing server being available. After this, they will revert back to the basic edition so you will still be able to run your VMs.
The Future and Conclusion
With any big investment (time as well as money), it’s always prudent to consider the future, all the more so when it comes to something which will underpin your whole IT infrastructure. There have been some concerns raised as to the future of XenServer, but my personal take is that it’s just the usual "the sky is falling" rumour-mongering so sadly common on the Internet. Citrix seem to be doing well as a company, and XenServer has a solid heritage, going all the way back to when it was called "XenEnterprise" before Citrix bought it. With the release of 5.6 earlier this year, it would appear that Citrix are putting a lot of effort behind their product, and there have been a number of big client wins, although it doesn’t yet have the market share of VMware. However, a recent report suggests it’s growing market share faster than any of its competitors.
But what if the worst did happen? If XenServer development and support both halted and no permanent licenses were available (highly unlikely, but then a few years back I’d never have thought that Oracle would buy Sun and kill OpenSolaris), you do have several options open to you. You could always go back to open source Xen on the distribution of your choice, or migrate to the open source Xen Cloud Platform, which is pretty much XenServer minus the XenCenter GUI and support options. If you use OpenXenManager, this could be a near drop-in replacement.
You can also export your VMs as an OVF image, which you could then import on multiple platforms, including VirtualBox and VMware. In short, I was happy that I had enough options for an exit strategy if needed – of course, this was sufficient for my needs, but I’d recommend you do some research and experimentation of your own.
And so, to my conclusion: XenServer makes a logical upgrade if you are already running on an open source Xen system such as Red Hat or SuSE Linux, and represents a very low-risk choice if you are just starting out with virtualization or looking to migrate from an old, non-clustered VMware environment. It represents fantastic value for money, as with the free version you get a full-featured system with the all-important live migration enabled, so you can move VMs between physical hosts with zero downtime. If you spend a little more, the premium editions give you far more "bang for your buck" than the comparable VMware offerings. While it may lack a few high-end features (and the lack of granular control over storage parameters is frustrating), it will likely fulfil the requirements of many environments, and it won’t cost you anything to try it out!