When virtualised storage virtually stops: what can be done?

You move your servers to VMware or Hyper-V and everything is fine. Then you virtualise your storage and, without even changing your application, everything slows right down. Even if you know what is going on, you may not see any obvious way around the problem, particularly if you are on a tight budget (and who isn’t?).

For instance, if your storage has been thin provisioned to economise on disk usage, the last thing you want is to spread your data out across fully allocated disks just to get a performance boost; that would undo the capacity and energy savings that had reduced your operating costs. Nor, typically, would you want to embark on major systems changes at the same time as virtualising your storage.

What typically causes this slowdown is that the storage pool is now being accessed from, say, a dozen different virtual servers at once. While the overall volume of transactions may not have changed much, the randomness of record reading and writing will have multiplied. Writing in particular is slowed down through much greater drive contention and latency (longer seek times, greater head movement, lower effective data transfer rates). This will affect everything sharing the storage pool, from production applications to backups, snapshots and clones.
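To illustrate why interleaving matters, here is a minimal Python sketch of my own (a toy model, not taken from any vendor's tests). It assumes each virtual server writes sequentially within its own region of the disk and that the shared pool services the requests round-robin; the function names and the region size are hypothetical.

```python
def vm_stream(vm_index, length, region_size=1_000_000):
    """Sequential block addresses for one virtual server (assumed layout)."""
    start = vm_index * region_size
    return list(range(start, start + length))


def interleave(streams):
    """Round-robin merge of equal-length streams, as a shared pool might see them."""
    merged = []
    for group in zip(*streams):
        merged.extend(group)
    return merged


def total_seek_distance(addresses):
    """Sum of address jumps between consecutive requests (a crude proxy for head movement)."""
    return sum(abs(b - a) for a, b in zip(addresses, addresses[1:]))


# Same number of requests in both cases: 1,200 blocks.
single = vm_stream(0, 1200)                                  # one server, purely sequential
shared = interleave([vm_stream(i, 100) for i in range(12)])  # a dozen servers at once

print("one server  :", total_seek_distance(single))
print("twelve mixed:", total_seek_distance(shared))
```

The transaction volume is identical in the two cases, but the interleaved stream makes the (simulated) head jump between regions on almost every request, which is the multiplication of randomness described above.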

One company addressing this problem is Virsto (fairly obviously VIRtual STOrage), a recent VC-backed start-up. It has test results showing a threefold slowdown for a Hyper-V environment and sometimes as much as a ninefold slowdown for a snapshot in the more widely adopted VMware.

Virsto uses a conceptually simple way to obviate this problem. The technique is not new to database vendors but has not been seen in the wider storage world until recently. Every write request is serviced initially by appending the record to a journal log file (not visible in the namespace), which creates a 100% sequential data stream irrespective of the number of virtual servers running at the time. This is fast because it removes the randomness and, as soon as the record has been written there, the requesting application gets the OK to continue. Meanwhile, an asynchronous background process works through the log file to write the records into the storage pool. So two writes are needed, but contention is removed from the application's write path.
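The general journalling idea can be sketched in a few lines of Python. This is my own illustration of the technique as described, not Virsto's code: the class name, the in-memory deque standing in for the log file and the dictionary standing in for the storage pool are all assumptions, and a real product would of course persist the log before acknowledging.

```python
from collections import deque


class JournaledStore:
    """Toy write-ahead journal in front of a storage pool (illustrative only)."""

    def __init__(self):
        self.journal = deque()  # append-only log of (block_id, data) records
        self.pool = {}          # stands in for the shared storage pool

    def write(self, block_id, data):
        # The record is appended to the end of the log, so the log sees a
        # purely sequential stream whatever block the caller is targeting.
        self.journal.append((block_id, data))
        return "OK"             # the caller may continue immediately

    def destage(self, max_records=None):
        # Background step: replay the log, oldest first, into the pool.
        moved = 0
        while self.journal and (max_records is None or moved < max_records):
            block_id, data = self.journal.popleft()
            self.pool[block_id] = data
            moved += 1
        return moved


store = JournaledStore()
# Writes from three hypothetical virtual servers hitting scattered blocks.
for block_id, data in [(9001, "vm1-a"), (17, "vm2-a"), (40500, "vm3-a"), (9001, "vm1-b")]:
    store.write(block_id, data)

store.destage()                 # would normally run asynchronously
print(store.pool)
```

Replaying the log in arrival order means a later write to the same block overwrites an earlier one in the pool, so the pool ends up with the latest image.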

There are of course a few technical wrinkles to manage, such as handling a read request for a record already due for updating from the journal, so as to ensure the latest image is used. Also, to maximise performance, the log file could be placed on a separate hard drive, or an SSD for which random reads are ultra-fast, but that is for the user (or channel partner) to decide, because I am describing a purely software solution.
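One way to picture the read-request wrinkle is a read path that consults the journal before the pool. The Python below is a sketch under my own assumptions (an in-memory index of pending records, hypothetical class and method names); I have no visibility of how Virsto actually handles it.

```python
class ReadThroughJournal:
    """Toy journal whose reads return the newest image, destaged or not."""

    def __init__(self):
        self.journal = []   # append-only list of (block_id, data) records
        self.pending = {}   # block_id -> position of its newest journal record
        self.pool = {}      # stands in for the shared storage pool

    def write(self, block_id, data):
        self.journal.append((block_id, data))
        self.pending[block_id] = len(self.journal) - 1
        return "OK"

    def read(self, block_id):
        # If an update is still waiting in the journal, it is newer than
        # whatever the pool holds, so serve it from the journal.
        if block_id in self.pending:
            return self.journal[self.pending[block_id]][1]
        return self.pool.get(block_id)

    def destage(self):
        # Replay every record in order, then clear the journal and its index.
        for block_id, data in self.journal:
            self.pool[block_id] = data
        self.journal.clear()
        self.pending.clear()


store = ReadThroughJournal()
store.pool[7] = "old image"         # block already in the pool
store.write(7, "new image")         # update still sitting in the journal
print(store.read(7))                # served from the journal
store.destage()
print(store.read(7))                # now served from the pool
```

In this sketch the index is what keeps reads cheap: without it, every read would have to scan the log for a newer copy.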

Once installed, with the devices mounted through Virsto’s management GUI, the software operates transparently within the existing storage and backup environment. So operation remains as it was before in the virtualised server-storage environment, only faster. It is complementary to existing storage applications, not competitive or a replacement. So, for instance, thin provisioning can continue without causing performance degradation.

Virsto sees the virtual desktop market as having even greater potential. Instead of tens of virtual servers sharing the storage pool there may be hundreds of virtual desktops, so the company has developed two versions of the software which share the same processing engine.

The company released a Hyper-V version last year, and this now has around 40 users. Its VMware ESX vCenter version, now in alpha, is due for a Q4 release and, according to Virsto, results are showing performance slightly exceeding that for a fully allocated disk configuration without the software. With that release, users will also have a way of migrating between Hyper-V and VMware environments (reducing vendor lock-in). A Xen desktop version is on its way and due for release in 2012.

I can see some in the channel being ambivalent about this. It can address the needs of their users, especially SMEs, suffering performance problems, and keep those users' costs down by maintaining the existing infrastructure; but this could reduce the amount of new hardware they purchase in the short term. Virsto is also maintaining direct sales, not least because some enterprise customers want to try it; the solution spans all business sizes. OEM deals can also be expected a little further down the line.

It is incredible to me that this problem was not foreseen by the virtualisation vendors or, if it was, that they didn’t take steps of this kind to address it from the outset. I can also see no good reason (except for a few patents pending) why they should not develop their own solutions in the future. In the meantime, their loss is Virsto’s gain, with its easy way out for those suffering a virtual storage performance collapse.