StorLife CAS – compliance and de-duping another way

Content Copyright © 2007 Bloor. All Rights Reserved.

Among the newer storage technologies, content addressable storage (CAS) is very much a standalone approach, with specific places where it can best be exploited. Low-cost SATA disk drives have changed the storage game, which is also good news for CAS.

Put CAS and SATA together, and you have the platform for software from Softco spin-off StorLife. It is aimed especially at compliance archiving, a sector once dominated by optical WORM drives because information written to them could not be overwritten.

To show the potential benefits, I need to explain CAS as implemented by StorLife. The concept is simple enough. As file content is received by the StorLife Server, the binary content of the object or file is used to calculate a unique digital ‘fingerprint’ using the industry-standard MD5/SHA-1 hashing algorithms, and this fingerprint determines where the information is stored. The fingerprint is also returned to the application, which later uses it to retrieve the information.
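A minimal sketch of the fingerprinting idea, assuming SHA-1 and a made-up directory layout; the function names and the path scheme are illustrative only and are not StorLife's actual implementation:

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-1 hex digest of the raw bytes (the 'fingerprint')."""
    return hashlib.sha1(content).hexdigest()


def storage_path(digest: str) -> str:
    """Derive a storage location from the digest.

    Hypothetical layout: the first two pairs of hex characters become
    directory levels, and the remainder becomes the file name.
    """
    return f"{digest[:2]}/{digest[2:4]}/{digest[4:]}"


# The application hands over content and gets the fingerprint back;
# it later presents the same fingerprint to retrieve the content.
key = fingerprint(b"quarterly report")
path = storage_path(key)
```

Because the location is derived purely from the bytes, identical content always maps to the same place, whichever application submits it.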

The first benefit is that, since the actual binary content determines storage and retrieval, content authenticity is guaranteed, as is often required for evidence of compliance. A completely different benefit is that this also guarantees single instance storage (SIS): if another copy of the same data comes in, the algorithm will arrive at the same answer, so the data can be tagged from multiple application sources but is held only once.
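Both benefits can be illustrated with a small in-memory sketch. The `ContentStore` class and its method names are hypothetical, and SHA-1 is assumed as the hash:

```python
import hashlib


class ContentStore:
    """Toy content-addressed store with single instance storage."""

    def __init__(self):
        self._blobs = {}  # fingerprint -> content bytes (held once)
        self._tags = {}   # fingerprint -> set of application names

    def put(self, content: bytes, app: str) -> str:
        key = hashlib.sha1(content).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self._blobs.setdefault(key, content)
        self._tags.setdefault(key, set()).add(app)
        return key

    def get(self, key: str) -> bytes:
        content = self._blobs[key]
        # Authenticity check: the stored bytes must still match the fingerprint.
        if hashlib.sha1(content).hexdigest() != key:
            raise ValueError("stored content no longer matches its fingerprint")
        return content

    def blob_count(self) -> int:
        return len(self._blobs)

    def tags(self, key: str) -> set:
        return set(self._tags[key])


store = ContentStore()
k1 = store.put(b"invoice 1001", "billing")
k2 = store.put(b"invoice 1001", "audit")              # same bytes, second application
k3 = store.put(b"invoice 1001 (amended)", "billing")  # changed bytes
```

Here `k1` and `k2` are the same key and only one copy of the invoice is held, tagged by both applications, while the amended version gets its own key and leaves the original untouched.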

In other words, CAS inherently carries out de-duplication (de-dupe) to save potentially huge amounts of storage space. When I wrote about de-duping a couple of weeks ago I made no mention of CAS. (Had I done so I would then have been faced with also explaining CAS, seriously complicating things.)

If one application were to change the data, the fingerprint would be recalculated for that application's copy and the changed binary would produce a different location, leaving the original untouched. Now consider that in conjunction with StorLife’s server virtualisation technology, which allows the pooling of physical storage from multiple network devices. These are then managed from a single central console, maximising hardware resource utilisation and simplifying management.

StorLife asserts that, as a result, not a single byte of storage is wasted, and that the software handles the addition of extra storage capacity.

I have not yet explained how this purely software product, used with magnetic disk, replaces WORM technology. First, StorLife offers content protection and security with FIPS-197 (AES) encryption. Second, as already explained, its CAS technology does not allow data on disk to be overwritten. Equally, it can provide ‘digital shredding’, the removal of expired content from the disk in accordance with compliance rules.
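The write-once and shredding behaviour can be sketched as follows. Everything here is an assumption made for illustration: the `RetentionStore` class, the use of integer day numbers for retention, and the rule that a repeated write keeps the longer retention period. Encryption is left out, and a real shredder would also have to overwrite the underlying disk blocks rather than just drop the entry:

```python
import hashlib


class RetentionStore:
    """Toy write-once store with retention-based 'digital shredding'."""

    def __init__(self):
        self._items = {}  # fingerprint -> (content, expiry_day)

    def put(self, content: bytes, expiry_day: int) -> str:
        key = hashlib.sha1(content).hexdigest()
        if key in self._items:
            # Content is never overwritten; only the retention can be extended
            # (assumed policy, not taken from the product).
            stored, old_expiry = self._items[key]
            self._items[key] = (stored, max(old_expiry, expiry_day))
        else:
            self._items[key] = (content, expiry_day)
        return key

    def delete(self, key: str, today: int) -> None:
        _, expiry_day = self._items[key]
        if today < expiry_day:
            raise PermissionError("content is still under retention")
        del self._items[key]

    def shred_expired(self, today: int) -> list:
        """Remove every item whose retention has ended; return their keys."""
        expired = [k for k, (_, exp) in self._items.items() if exp <= today]
        for k in expired:
            del self._items[k]
        return expired

    def __contains__(self, key: str) -> bool:
        return key in self._items


archive = RetentionStore()
record = archive.put(b"trade record", expiry_day=10)
```

With this sketch, calling `archive.delete(record, today=5)` raises `PermissionError`, while `archive.shred_expired(today=10)` removes the record once its retention period has ended.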

Apparently the US FCC has approved CAS for use in this way by NASDAQ, and the US Department of Defense has approved this method of digital shredding.

Swapping out WORM disks is partly a matter of device cost, and low-cost SATA drives have helped here. But it is also about ease and speed of access. Retrieving data from magnetic disk using a hash key is very fast, and software multi-threading speeds it up further, whereas retrieval from optical media is slow and writing to it even slower.

When you do things differently, there are other compatibility issues to consider. So, for instance, StorLife has its own optional but integrated retrieval function to provide advanced search features, high speed indexing, workflow and query capabilities. CAS’s unique storage space-saving capabilities are not compatible with, for instance, thin provisioning. It is an either-or, but both save lots of space.

Does CAS have a big future? I think so, as long as it is used where it is appropriate—as StorLife has understood.