Content Copyright © 2007 Bloor. All Rights Reserved.
If Microsoft Windows server software ever becomes totally resilient with fully automated failover capabilities, a fair few companies could suffer declining support revenues. I, for one, won’t hold my breath.
Conversely, many system faults and crashes have nothing to do with Windows. Sudden power outages, system and network hardware faults, and applications that corrupt data are some examples. So a company like Neverfail, whose success depends on keeping thousands of Windows systems up and running 24/7, has to offer much more than Windows operating system support.
I am happiest with straightforward explanations of ‘hows and whys’ and the resulting business benefits. If you are too, then my explanation of the Neverfail approach may be to your taste.
First, there is a hardware element. The user needs to set up a secondary server—which does not need to be identical to the primary server (so may use spare kit) but does need an identical operating system on it. This is connected to the existing network and one or more network interface cards (NICs) need to be installed in both servers to provide a dedicated channel connection between them.
What then happens is that the Neverfail software ‘clones’ the entire system to the secondary server and leaves it in a ‘passive’ state, not accessible from elsewhere. As soon as this is done, all data changes made on the primary system start replicating to the secondary without the user needing to be aware of it. The two systems then remain in sync as long as things are running smoothly. As described so far, this amounts to continuous data protection (CDP).
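To make the clone-then-replicate idea concrete, here is a toy sketch in Python. It is not Neverfail's code: the `Server` and `ReplicatedPair` names are invented for illustration, and a dictionary stands in for a whole system's data.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    data: dict = field(default_factory=dict)
    accessible: bool = True  # a 'passive' server is hidden from clients


class ReplicatedPair:
    """Hypothetical model of a primary with a passive, cloned secondary."""

    def __init__(self, primary: Server, secondary: Server):
        self.primary = primary
        self.secondary = secondary
        # Initial 'clone': copy everything, then put the secondary in passive state.
        secondary.data = dict(primary.data)
        secondary.accessible = False

    def write(self, key, value):
        # Every change on the primary is copied to the secondary straight away.
        self.primary.data[key] = value
        self.secondary.data[key] = value

    def in_sync(self) -> bool:
        return self.primary.data == self.secondary.data


pair = ReplicatedPair(Server("primary", {"orders": 10}), Server("secondary"))
pair.write("orders", 11)
pair.write("invoices", 3)
print(pair.in_sync())
```

The caller only ever talks to `write`; the copy to the secondary happens behind its back, which is the point of the description above.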
However, things get more interesting when something goes wrong: a primary system failure. Neverfail’s ‘Heartbeat’ software stops the replication, auto-notifies IT staff and triggers failover immediately and automatically. The ‘passive’ server becomes ‘active’ and all traffic is redirected to it, with end users probably not noticing that anything has missed a beat.
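For readers who like to see the mechanics, a heartbeat-driven failover can be sketched roughly as follows. This is a generic illustration rather than a description of how the Heartbeat product works internally: the `HeartbeatMonitor` class, its timeout and the notification text are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical watchdog: fail over when the primary goes quiet."""

    timeout: float                      # seconds of silence tolerated (assumed value)
    last_beat: float = 0.0
    active: str = "primary"
    replicating: bool = True
    notifications: list = field(default_factory=list)

    def beat(self, now: float) -> None:
        # Called each time the primary reports that it is alive.
        self.last_beat = now

    def check(self, now: float) -> str:
        silent_for = now - self.last_beat
        if self.active == "primary" and silent_for > self.timeout:
            self.replicating = False                    # stop the replication
            self.notifications.append(                  # tell IT staff
                f"primary silent for {silent_for:.1f}s, failing over"
            )
            self.active = "secondary"                   # passive becomes active
        return self.active


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat(now=10.0)
before = monitor.check(now=12.0)   # 2s of silence: still within the timeout
after = monitor.check(now=14.0)    # 4s of silence: failover
print(before, after)
```

Timestamps are passed in rather than read from the clock so that the behaviour is easy to follow and to test; a real monitor would poll on a timer.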
With a following wind, the IT staff should then be able to find and correct the fault behind the scenes on what had been the primary server. This can then be brought up in the passive state, at which point Neverfail’s software will bring the systems back into sync by transferring all transactions not previously copied. It is then an administrator decision if and when to switch back. Neat. In fact the GUI has a ‘switchover’ button allowing a manual switch even if both systems are working OK; this can be used to take a server off-line for routine maintenance or off-line application upgrades, for instance, and these are immediately cloned when it is back on-line.
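One common way to transfer only the transactions not previously copied is to number each change and replay everything past the last number the repaired server saw. The sketch below assumes that approach; the article does not say how Neverfail tracks this, and the `resync` function and log layout are hypothetical.

```python
def resync(log, replica, last_applied):
    """Replay log entries newer than last_applied onto replica.

    log is a list of (sequence_number, key, value) tuples.
    Returns the new high-water mark.
    """
    for seq, key, value in sorted(log):
        if seq > last_applied:
            replica[key] = value
            last_applied = seq
    return last_applied


# Changes recorded on the server that stayed up.
log = [(1, "orders", 10), (2, "orders", 11), (3, "invoices", 3), (4, "orders", 12)]

# The repaired server had seen entries 1 and 2 before it failed.
repaired = {"orders": 11}
high_water = resync(log, repaired, last_applied=2)
print(repaired, high_water)
```

Because only entries 3 and 4 are replayed, the catch-up cost depends on how long the server was down, not on the total size of the data.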
The same system can be used in a disaster recovery (DR) scenario. It can run in either a LAN or WAN environment, so the second system can be at a physically remote site. Obviously, performance would be degraded and highly variable if you were trying to run off that remote secondary system. Neverfail does provide some low bandwidth modules with compression to reduce data transmission volume and improve application response time over congested WAN links; the company told me that de-duplication may be added to improve this further.
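As a rough illustration of why compression helps on a thin WAN link, the snippet below squeezes a batch of repetitive change records with Python's standard `zlib` module. The payload is made up, and nothing here reflects the algorithm Neverfail's low bandwidth modules actually use.

```python
import zlib


def pack(changes: bytes) -> bytes:
    # Compress a batch of changes before sending it over the WAN.
    return zlib.compress(changes, 9)


def unpack(packed: bytes) -> bytes:
    # Restore the original bytes on the remote side.
    return zlib.decompress(packed)


# Invented example payload: replicated changes are often highly repetitive.
payload = b"UPDATE orders SET status='shipped' WHERE id=42;\n" * 200
packed = pack(payload)
print(len(payload), len(packed))
```

How much is saved depends entirely on the data; already-compressed files such as images gain little.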
However, that does leave two outstanding problems. One is what happens if, say, application or data corruption occurs but the system keeps running for a time; the corruption will have been replicated before it was detected. Not so good. However, if ‘snapshots’ of the data, applications and system registry are taken from time to time, then a roll-back can be made to a snapshot taken before the corruption occurred, after which this state is re-established on the primary server as part of re-synchronising.
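The roll-back logic amounts to picking the most recent snapshot taken before the corruption crept in. A minimal sketch, assuming timestamped snapshots held in memory (the `SnapshotStore` class is invented for this example):

```python
import copy


class SnapshotStore:
    """Hypothetical store of point-in-time copies of system state."""

    def __init__(self):
        self._snapshots = []  # list of (timestamp, state)

    def take(self, timestamp, state):
        # Deep copy so later changes to the live state cannot alter the snapshot.
        self._snapshots.append((timestamp, copy.deepcopy(state)))

    def rollback(self, corrupted_at):
        """Return (timestamp, state) of the latest snapshot before corrupted_at."""
        candidates = [(t, s) for t, s in self._snapshots if t < corrupted_at]
        if not candidates:
            raise LookupError("no snapshot predates the corruption")
        timestamp, state = max(candidates, key=lambda pair: pair[0])
        return timestamp, copy.deepcopy(state)


store = SnapshotStore()
store.take(100, {"orders": 10})
store.take(200, {"orders": 11})
store.take(300, {"orders": "garbage"})   # taken after corruption began at t=250

restored_at, restored = store.rollback(corrupted_at=250)
print(restored_at, restored)
```

Anything written between the chosen snapshot and the corruption is lost, which is the trade-off behind how often snapshots are taken.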
The second problem is the, hopefully very rare, occurrence of both systems being out at once. You simply have to weigh up the business risk of this occurring versus the cost of providing even more resilience. Neverfail provides a two- not three-system solution but, as the name suggests, this might never happen to your business.