Big Java apps need Java hardware

Content Copyright © 2007 Bloor. All Rights Reserved.

The Java architecture includes rich functionality that has made it attractive for the development of large enterprise application suites. However, its object-oriented (OO) design and characteristics mean that heavily used applications do not achieve great performance. But is that really Java’s fault? Not if you believe Azul Systems.

The actual performance problem, according to Peter Holditch, senior systems engineer for appliance-maker Azul Systems, stems primarily from the way conventional server systems are architected. Despite advances, the basic design goes back to the days when single-threaded Cobol TP systems were the norm.

“Even today, a system with 16 separate CPUs is considered quite big,” said Holditch, adding that memory management tended to be ‘single-threaded’ so that multi-threading limits were reached quickly.

Java, on the other hand, is inherently multi-threaded and, like other OO languages, uses garbage collection. The same performance problem therefore affects enterprise applications developed using Microsoft’s .NET, which is also OO-based. Holditch said that UNIX was also bad at multi-threading, with the first heavily used application tending to hog resources at the expense of the next one. Even using VMware to create virtual machines did not overcome this for high-end Java-based systems, he said.

Azul Systems’ founders recognised this problem more than five years ago. They then spent the first four years doing R&D in ‘stealth mode’ to come up with something to tackle it. Last year they went public with their Java appliance. What it does, in effect, is provide a Java offload engine: a system designed for Java and bolted on to the main system.

Azul’s appliance has a huge number of processor cores, in binary increments from 96 to 384 at present and doubling to 768 by summer this year, with memory in increments up to 768GB; importantly, all of this forms a shared pool of capacity. The appliance is connected to the main system by dual Gigabit Ethernet connections. Java bytecode applications need no amendment to run on the appliance through Azul’s own Java Virtual Machine (JVM). CPU and memory resources are allocated as needed for each thread, then freed up when the thread has finished with them.

According to Holditch, the first dramatic benefit is that there is no discernible performance degradation from adding more and more concurrent Java applications. So, for instance, a spike in usage at peak times does not show up as a sudden performance tail-off (short of the extreme case of allocating every last CPU). This is very useful for meeting service level agreement (SLA) performance guarantees.

Since resources are freed up for reuse as soon as they are finished with, memory utilisation can be much higher than with a ‘partitioned’ virtual system, which has to maintain spare capacity for every process. Depending on appliance capacity (appliances are 5U or 11U, with the new high-end box 14U), this can mean something like three to five times less hardware, and far less power consumption, than a conventional server running Java apps. Holditch said it could also typically achieve a 20x performance increase for a large number of concurrent processes.
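
The pooling argument can be illustrated with some simple arithmetic. The sketch below is not Azul’s software: the class name CapacitySketch, its two methods and the demand figures mentioned afterwards are all invented for illustration. It compares the capacity needed when each process has its own partition sized for its own peak with the capacity needed when all processes draw on one shared pool.

```java
import java.util.Arrays;

// Hypothetical illustration only; not based on any Azul code or measurements.
final class CapacitySketch {

    // Partitioned: every process gets a fixed slice big enough for its own peak,
    // so the total is the sum of the individual peaks.
    static long partitionedCapacity(long[][] demand) {
        long total = 0;
        for (long[] perProcess : demand) {
            total += Arrays.stream(perProcess).max().orElse(0);
        }
        return total;
    }

    // Pooled: the shared pool only has to cover the highest combined demand
    // seen at any single time step. Assumes every row has the same length.
    static long pooledCapacity(long[][] demand) {
        int steps = demand.length == 0 ? 0 : demand[0].length;
        long peak = 0;
        for (int t = 0; t < steps; t++) {
            long sum = 0;
            for (long[] perProcess : demand) {
                sum += perProcess[t];
            }
            peak = Math.max(peak, sum);
        }
        return peak;
    }
}
```

For three made-up processes whose memory demand over three time steps is {8, 2, 2}, {2, 8, 2} and {2, 2, 8} (in GB), partitionedCapacity returns 24 while pooledCapacity returns 12, because the peaks do not coincide. Real workloads will not divide this neatly, so the ratio is only indicative.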

That is a bold claim, but there are now a few major reference sites and plenty of test implementations to verify the picture. (Early adopters include BT, Credit Suisse First Boston (CSFB) and Wachovia.) This raises a question for the big enterprise server makers: do you have your designs right for the 21st century? Actually, it is not quite that clear-cut.

Azul achieves part of this throughput boost through other hardware and software design characteristics, which it can do precisely because it is a free-standing Java-only box. Among the extras are the removal of nested locks where data would not change and, more especially, hardware-assisted garbage collection, which avoids major system pauses. “This is a big deal,” said Holditch. “A [financial] application can stop for a whole minute on book closing for instance.” He explained that some Java applications had had to include fancy code to keep heap sizes small so as to reduce crippling garbage collection delays, but this also limited the amount of caching and so reduced general performance.
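
The article does not describe what that heap-limiting code looked like. As a purely illustrative sketch, one common way to cap how much memory a cache can occupy is a size-bounded, least-recently-used map built on the standard java.util.LinkedHashMap. The class name BoundedCache and its maxEntries parameter are hypothetical and are not taken from any application Holditch described.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a cache that never holds more than maxEntries items.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // LinkedHashMap calls this after each insertion; returning true
        // evicts the least recently used entry.
        return size() > maxEntries;
    }
}
```

A smaller maxEntries keeps fewer objects alive on the heap, which is what limits garbage collection work, but it also means more cache misses. That is the trade-off described above.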

The basic design means an almost horizontal performance graph regardless of the number of threads, compared with an increasingly steep upward curve for other systems. So the more threads there are, the better it looks.

So the obvious question is: what is the market potential for this type of system? Its home territory is large users of the BEA WebLogic, Red Hat JBoss and IBM WebSphere Java application servers. (Azul has approved status with BEA and Red Hat, with IBM approval thought to be fairly imminent.) Because of its compactness, it may also be of particular interest to datacentres anxious to save space and so put off the need to relocate to bigger premises.

In principle this appliance design is not limited to Java. Holditch hinted at a possible .NET version as a likely next development, logically in collaboration with Microsoft, since it has the same sort of performance problems (although not so many enterprise apps yet). “There’s nothing it couldn’t do,” said Holditch, who also mentioned SAP, optimised database engines (for instance Oracle), an SSL offload engine and perhaps an XML appliance for everyone to tap into.

Has Azul made its point? If it has, or does so soon, can we expect a new breed of systems to appear from the big hardware makers and chip providers? Probably not yet, as the volumes do not appear to be great enough to interest them and special software is needed for each type of usage. In the short term at least, this may be to Azul’s benefit, with the potential for some lucrative OEM agreements helping to fund its appliance variants.

Environment-related footnote: There is an environmental benefit to this type of appliance, but that has not so far been of interest to data centre managers. “Power consumption doesn’t come up,” said Holditch. “Instead it is the prospect of having to build a new data centre because they are physically running out of space, along with the amount of cooling per square foot.” So the big capital expenditure to relocate is front of mind, while the environmental impact, and the reduction in ongoing costs from using less energy, is being ignored.