An emerging network management headache: analysing ever-faster networks

Written By: Peter Williams
Published: 27th March, 2008
Content Copyright © 2008 Bloor. All Rights Reserved.

Talking about protocol analysers is not really my beat—except that they do provide useful information for troubleshooting faults to support network management (which is my beat). However, a major issue concerning network speeds and real-time analysers should be receiving more management attention.

In mission-critical environments, the ideal analysers are those which can capture all the data all the time by working in real-time; then, if a problem occurs, the network specialists can trawl for exactly what passed along the line at the moment the problem arose. Otherwise, they could be faced with trying to figure out, then recreate, the problem in order to analyse what happens, which is a very hit-and-miss affair.

However (in case anyone hadn't noticed), the more data that has to travel over networks, the faster the networks have had to become to cope. So, in the case of Ethernet, we have gone from 10Mb/sec to 100Mb/sec to 1Gb/sec to 10Gb/sec (1,000 times as fast as 10Mb/sec) in a few short years.

The problem for these ‘real-time’ protocol analysers, of which there are only a few, is partly that they have had to keep up. This has, for instance, meant upgrading from software-only to purpose-built plug-in hardware appliances. Yet, even if they do keep up, the amount of information they collect in a very short time is multiplied, leaving the network specialist with a tougher task trying to see the wood for the trees to pinpoint the problem.
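To put a rough number on "multiplied", here is a back-of-envelope sketch of how much data a capture-everything analyser would have to store per hour at each Ethernet speed. It assumes a fully saturated link in one direction and ignores capture metadata and any compression, so treat the figures as an upper bound for illustration rather than anything a vendor has published; the function names are my own.

```python
# Illustrative arithmetic only. Assumes a link saturated in one direction,
# with no capture metadata and no compression.
LINK_SPEEDS_BITS_PER_SEC = {
    "10Mb": 10 * 10**6,
    "100Mb": 100 * 10**6,
    "1Gb": 10**9,
    "10Gb": 10 * 10**9,
}


def bytes_captured(bits_per_sec: int, seconds: int) -> int:
    """Bytes that pass along the line in the given time (8 bits per byte)."""
    return bits_per_sec * seconds // 8


def gigabytes_per_hour(bits_per_sec: int) -> float:
    """Decimal gigabytes (10**9 bytes) captured in one hour."""
    return bytes_captured(bits_per_sec, 3600) / 10**9


if __name__ == "__main__":
    for name, speed in LINK_SPEEDS_BITS_PER_SEC.items():
        print(f"{name}: {gigabytes_per_hour(speed):,.1f} GB per hour")
```

On those assumptions, a busy 10Gb link produces on the order of 4.5 terabytes every hour, against 4.5 gigabytes at 10Mb. That is the haystack the specialist then has to search.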

The market-leading protocol analyser is called Sniffer (nowadays owned by Netscout after its recent purchase of Network General). Yet, even in its hardware-software appliance format, Sniffer cannot yet cope with 10Gb Ethernet in real-time. The product nearest to achieving this at present is from Network Instruments. To do this, its Observer software is supported by a dedicated capture card designed from scratch for throughput and by its GigaStor disk technology, which incorporates daisy-chained SATA RAID arrays; data is written to these in parallel so that storage keeps pace with the speed at which the data is received.
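Why write in parallel at all? A rough sketch of the arithmetic makes the point: a single drive cannot absorb a saturated 10Gb link, so the stream has to be spread across many. The per-drive write rate below is a hypothetical round figure chosen for illustration, not a GigaStor specification, and the function name is mine.

```python
import math

# Hypothetical sustained write rate per drive (60 MB/s). This is an
# assumption for illustration, not a figure from any vendor.
ASSUMED_DRIVE_WRITE_BYTES_PER_SEC = 60 * 10**6


def drives_needed(link_bits_per_sec: int,
                  drive_write_bytes_per_sec: int = ASSUMED_DRIVE_WRITE_BYTES_PER_SEC) -> int:
    """Minimum number of drives written in parallel to keep up with the link."""
    link_bytes_per_sec = link_bits_per_sec / 8
    return math.ceil(link_bytes_per_sec / drive_write_bytes_per_sec)


if __name__ == "__main__":
    for name, speed in [("100Mb", 100 * 10**6), ("1Gb", 10**9), ("10Gb", 10 * 10**9)]:
        print(f"{name}: at least {drives_needed(speed)} drive(s) in parallel")
```

Under that assumption one drive copes with 100Mb, three are needed at 1Gb and 21 at 10Gb, before allowing for any RAID redundancy overhead.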

Of this combination, Ian Cummins, Network Instruments' VP of EMEA, told me: "It is completely happy with 100Mb and 1Gb Ethernet, but 10Gb is a challenge."

However, the challenge he is referring to is not that it cannot keep up with the speed of data flow; it is probably the only real-time product on the market which can, Sniffer notwithstanding. It is that, as the speed of the network has multiplied, so the analysis has become more complex.

All such analysers use retrospective network analysis (RNA) software to trawl through the captured information to identify possible problem points. But suppose, at 100Mb, the analysis finds two potential causes of a glitch in a given time-span; at 1Gb this may multiply to 10 and at 10Gb perhaps 30, all needing deeper investigation. In other words, the faster the network the more difficult it is to pinpoint the problem when a fault occurs—and the fastest networks are typically those that run the most mission-critical tasks.

The longer-term management issue is that networks will inevitably get even faster and, even if the appliances can be upgraded to keep up, the complexity in pinpointing the problem will only get worse. Resolving problems will tend to take longer, just when it needs to take less time!

Nor is this the end of the story. This difficulty is multiplied when Voice over IP (VoIP) and data traffic are mixed, since the software has to be able to separate out the two different streams and, even after that is done, a fault caused by one may manifest as a problem in the other. In Network Instruments' own annual user survey on networking, it found the number of VoIP users had grown by five percentage points in a year (from 61% to 66%).

A further factor is that VoIP traffic is not verified to the same degree as other data, so adding VoIP tends to introduce more rogue packets, potentially multiplying the error count and making pinpointing problems even tougher.

So what's to be done? Cummins explained that Network Instruments is now working hard on the analysis for 10Gb. The main approach Network Instruments is taking is to provide an overview of the potential problem sources in less technical form, then provide a drill-down capability to get into the fine detail for each. Parameters can also be set that will screen out acceptable "errors" to assist in seeing the wood for the trees.
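The general idea of overview first, detail on demand, with tolerances to hide acceptable "errors", can be sketched in a few lines. This is my own minimal illustration, not Network Instruments' implementation: the `Event` type, the category names and the threshold values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds into the capture
    category: str     # hypothetical labels, e.g. "retransmission"
    detail: str


# Hypothetical tolerances: a category is hidden from the overview while its
# count stays at or below the listed value. Unlisted categories tolerate 0.
ACCEPTABLE_COUNTS = {"retransmission": 2}


def overview(events, acceptable):
    """Count events per category, dropping categories within tolerance."""
    groups = defaultdict(list)
    for event in events:
        groups[event.category].append(event)
    return {
        category: len(items)
        for category, items in groups.items()
        if len(items) > acceptable.get(category, 0)
    }


def drill_down(events, category):
    """Return the individual events behind one overview entry, in time order."""
    return sorted(
        (event for event in events if event.category == category),
        key=lambda event: event.timestamp,
    )


SAMPLE_EVENTS = [
    Event(3.2, "jitter", "VoIP stream A"),
    Event(0.5, "retransmission", "host 1"),
    Event(1.1, "crc_error", "port 7"),
    Event(2.0, "jitter", "VoIP stream A"),
    Event(4.4, "retransmission", "host 2"),
    Event(0.9, "jitter", "VoIP stream B"),
]

if __name__ == "__main__":
    print(overview(SAMPLE_EVENTS, ACCEPTABLE_COUNTS))
    for event in drill_down(SAMPLE_EVENTS, "jitter"):
        print(event)
```

With the sample data, the two retransmissions fall within tolerance and disappear from the overview, leaving only the CRC error and the jitter events to investigate. The weakness is the one raised in this article: the tolerances have to be chosen by someone, and the list left over still grows with network speed.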

This is a sound approach which other protocol analyser providers would do well to follow. Yet I doubt that, come the next hike in network speeds, this will be enough.
