Approaching heterogeneous storage optimisation

This is the second of two articles about storage optimisation. In the first I argued that an optimal solution would start by discovering, at a detailed level and in real time, all of your SAN infrastructure resources (servers, disks and so on) to create a consolidated view of the entire environment. The solution would then take that awareness, combine it with any specific policies or other restrictions that might be in place, and apply business intelligence principles to the problem of how you might optimise those resources. As a result of this analysis you would want the software to recommend potential solutions, allowing you to choose the one you prefer and, once that choice has been made, to implement the relevant changes for you automatically: first preparing the storage for migration and then performing the data migration itself.

There are four market sectors from which vendors might approach this issue: discovery, storage resource management, analytics and governance, and data migration. In this article I will consider how suppliers in each of these areas might approach these requirements and I will conclude (sneak preview here) that what is actually required is a combination of capabilities from all of these disciplines. However, that’s getting ahead of myself.

There are plenty of companies that have discovery capabilities and the larger IT hardware and software solutions vendors in particular—IBM (Tivoli), HP, EMC, CA, BMC, and so on—offer portfolios of products to help monitor and manage the whole enterprise infrastructure. Each vendor’s solution set provides a whole swathe of options and features, but needs skilled personnel to take full advantage of what it provides.

Now overlay virtualisation. This has become widely deployed for some servers and is gradually penetrating the storage environment. Virtualisation can simplify the big infrastructure picture (for example, by creating relatively few ‘virtual’ servers potentially spanning many physical ones of different types) but, unless and until major consolidation of the underlying physical infrastructure takes place, the physical picture remains as complex as before, and the mix of virtual and non-virtual is actually less efficient and complicates things further.

Every large organisation needs to keep on top of its infrastructure; yet I bet a pound to a penny that hardly any of them has a complete, reliable and up-to-date picture of what its infrastructure looks like. The above-named vendors, and quite a few more, offer (partially) automated “discovery” applications that can trawl the enterprise’s network and report back on the devices they find.

Yet there are weaknesses in this process. Each hardware vendor’s discovery tools will invariably collect excellent data on its own equipment (though not necessarily right down to switch and port level, as may be needed). However, enterprises do not have one vendor’s equipment wall to wall, and discovery of other vendors’ equipment can be incomplete or inaccurate and will require manual reconciliation, as discussed in the previous article in this series. Shop around and you may find a few specialist vendors who can discover the whole heterogeneous SAN infrastructure at a more granular level. However, that’s all those discovery-only vendors do: they just discover the information, leaving you to do any optimisation, planning and implementation. That process can take long enough that your discovery results will be out of date by the time you act on them.

So, leaving those vendors aside, what about storage resource management (SRM) companies like NetApp, Symantec, EMC, IBM and HDS? Historically, developments in storage technology have followed a different path, with the focus being on coping with the inexorable rise in the amount of data users need to store, back up and retrieve, a rise that increases running costs, capital costs for new equipment and pressure on equipment space. One response to this, though it is not a panacea, has been storage virtualisation.

Storage virtualisation uses an approach that centres on SANs, whereby all the physical storage is connected directly or indirectly (for example, via NAS) into a SAN (or linked SANs). Yet, once again, some of the hardware inevitably stays outside this set-up. Moreover, storage virtualisation is more about simplifying the environment than anything else and, in practice, it typically results in less efficient, rather than optimised, usage of resources. SRM similarly focuses on optimising performance for processing the most current and important data in a timely way, as opposed to making best use of resources. The ideal, so far nowhere achieved in practice, is to be able to optimise the consolidation of the storage and server infrastructure together, to provide best use of resources at the same time as providing operational efficiency.

So, that doesn’t get us a great deal further and it certainly doesn’t introduce the levels of automation required. But what about analytics and governance?

There are many companies focusing on governance, risk and compliance (GRC) and enterprise analytics. Some of these (for example, SAS) focus exclusively on the business domain rather than IT operations, but there are also a number of vendors, such as MetricStream, OpenPages and Oracle, that do work in IT operational environments. However, while these may be used to construct an optimisation plan, they a) rely on your having and maintaining an up-to-date and accurate view of the enterprise infrastructure and b) have no ability to implement or execute any plan that they may come up with.

Finally, consider data migration, which introduces another slew of specialist vendors including Informatica, SAP Business Objects, SAS and iWay. What they do dovetails into consolidating servers and storage, as facilitated by virtualisation, because the output from a data migration project will invariably be to a new virtual or physical location. Migration may also involve data transformation or database merging, for example, and a detailed planning process is needed (as is true for server and storage consolidation projects). However, there are, again, problems here. To begin with, using tools such as these for this sort of data migration is using a sledgehammer to crack a nut: they have been built to support much more complex migration issues where, for example, it is not just a matter of moving a database but of changing from, say, Oracle to DB2. If you happen to be using one of these products for some other purpose then you may be able to reuse it, but it will certainly be an expensive option and, in any case, it will only do the physical movement of the data, not tell you where to put it in the first place. Moreover, these sorts of data integration tools have not been designed to understand physical data structures (for example, how the data is partitioned on disk): they may be good with logical metadata but not with the physical metadata that is involved with storage and servers.

The bottom line is that you can license a heterogeneous discovery tool, use that to feed an appropriate analytics product and then use the output from that to start an appropriate data integration, consolidation or migration task or tasks, if you want to. Personally, I don’t envy you the task of knitting that all together and integrating it within the SRM environment.

What I think is really required, and what every enterprise IT manager and CIO longs to see (and, in truth, needs, but usually thinks is impossible to achieve), is a product that intersects the four areas discussed and provides a single end-to-end solution. It involves software that combines some elements from all four software sectors and adds a few extra dimensions. The key elements are the following:

An automated discovery process that, once set up within a particular enterprise, will produce a complete, accurate and up-to-date view of the heterogeneous enterprise infrastructure—obviating the need for manual reconciliation.
An analytical process that works from the discovery results, which can logically deduce and recommend where virtual consolidation or other remediation ought to occur to optimise the server and storage infrastructure in terms of performance and/or space usage, then…
If the IT manager approves the recommendations by “pressing the button,” the software carries them out automatically without further user intervention (since it already has the information to do this).
Thereafter it should be able to continue monitoring the whole infrastructure and, by option, constantly optimise it based on performance criteria (‘policies’) set by the user.
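
To make the sequence above concrete, here is a minimal sketch in Python of the discover, recommend, approve and implement loop. Everything in it is an assumption made for illustration: the names (Array, Volume, Move, discover, recommend, implement), the hard-coded inventory and the 80% utilisation policy are hypothetical and do not correspond to any vendor's product or API. The "analysis" is deliberately naive: it simply tries to empty the least-used array into the others without breaching the policy.

```python
from dataclasses import dataclass


@dataclass
class Array:
    name: str
    capacity_gb: int


@dataclass
class Volume:
    name: str
    array: str      # name of the array that currently holds the volume
    size_gb: int


@dataclass
class Move:
    volume: str
    source: str
    target: str


def discover():
    """Stand-in for automated discovery: returns a hard-coded inventory."""
    arrays = [Array("array-a", 1000), Array("array-b", 1000), Array("array-c", 1000)]
    volumes = [
        Volume("vol1", "array-a", 300),
        Volume("vol2", "array-b", 200),
        Volume("vol3", "array-c", 100),
    ]
    return arrays, volumes


def used_gb(array_name, volumes):
    """Total size of the volumes currently placed on one array."""
    return sum(v.size_gb for v in volumes if v.array == array_name)


def recommend(arrays, volumes, max_utilisation=0.8):
    """Propose moves that empty the least-used array, respecting the policy.

    Returns an empty list if consolidation is not possible within the policy.
    """
    usage = {a.name: used_gb(a.name, volumes) for a in arrays}
    limit = {a.name: a.capacity_gb * max_utilisation for a in arrays}
    occupied = [name for name, gb in usage.items() if gb > 0]
    if len(occupied) < 2:
        return []  # nothing to consolidate
    source = min(occupied, key=lambda name: usage[name])
    moves = []
    for vol in (v for v in volumes if v.array == source):
        targets = [
            name for name in usage
            if name != source and usage[name] + vol.size_gb <= limit[name]
        ]
        if not targets:
            return []  # policy would be breached, so recommend nothing
        # Prefer the fullest array that still fits, to pack data together.
        target = max(targets, key=lambda name: usage[name])
        usage[target] += vol.size_gb
        moves.append(Move(vol.name, source, target))
    return moves


def implement(moves, volumes, approved):
    """Apply the plan only after explicit approval; returns moves applied."""
    if not approved:
        return 0
    by_name = {v.name: v for v in volumes}
    for move in moves:
        by_name[move.volume].array = move.target
    return len(moves)


arrays, volumes = discover()
plan = recommend(arrays, volumes)
for move in plan:
    print(f"move {move.volume}: {move.source} -> {move.target}")
implement(plan, volumes, approved=True)  # the "press the button" step
```

In a real product each function would hide a great deal of work (heterogeneous discovery, policy evaluation and the physical data migration itself), and continuous monitoring would amount to re-running the loop as the inventory changes. The point of the sketch is only that recommendation and execution share one model of the infrastructure, which is why no manual hand-off is needed between them.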

So this then becomes a largely automated and much simpler process. It would enable the IT manager to manage the infrastructure much more easily and effectively, and the infrastructure would, in turn, be far more agile and responsive to requests involving business changes.

It is arguable that the requirements I have outlined are what SRM vendors should be providing; but the truth is that they are not doing so. Even if we just consider migration, there are few tools available from the SRM vendors and those that are available are not heterogeneous and only allow like-for-like transformations. This is clearly not sufficient. However, that doesn’t mean that a solution isn’t possible: most of the elements already exist, but have not hitherto been put together in the right way. Some vendors with software in more than one of the areas discussed may be closer to being able to achieve this (perhaps without realising it) than others. They will need their different products to integrate with one another and, in particular, they will need to overcome the major and nearly universal weakness: an inability to automate and streamline the entire process.

This is not a case of “it would be nice if it were possible”; every enterprise that I have come across needs it—yesterday—and it IS possible. So who is going to get on the case?