### Microwulf: Cost Efficiency

Once you have measured a supercomputer's performance using HPL and know its price, you can measure its cost efficiency by computing its price/performance ratio. By computing the number of dollars you are paying per unit of floating point throughput (for example, per Gflop, a billion floating point operations per second), you can compare one supercomputer's cost efficiency against others.

With a price of just \$2470 and performance of 26.25 Gflops, Microwulf's price/performance ratio (PPR) is \$94.10/Gflop, or less than \$0.10/Mflop! This makes Microwulf the first general-purpose Beowulf cluster to break the \$100/Gflop (or \$0.10/Mflop) threshold for measured double-precision floating point performance.
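
The arithmetic behind this figure is a single division. As a minimal sketch (the function name `price_per_gflop` is ours, chosen only for illustration), it can be checked like this:

```python
def price_per_gflop(price_usd: float, gflops: float) -> float:
    """Return the price/performance ratio in dollars per Gflop."""
    if gflops <= 0:
        raise ValueError("performance must be positive")
    return price_usd / gflops


# Microwulf's figures as quoted above: $2470 and 26.25 Gflops (measured, HPL).
microwulf_ppr = price_per_gflop(2470, 26.25)

print(f"${microwulf_ppr:.2f}/Gflop")         # $94.10/Gflop
print(f"${microwulf_ppr / 1000:.4f}/Mflop")  # $0.0941/Mflop
```

Dividing by 1000 converts dollars per Gflop into dollars per Mflop, which is why \$100/Gflop and \$0.10/Mflop describe the same threshold.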

For comparison purposes:

• In 1976, the Cray-1 cost more than 8 million dollars and had a peak (theoretical maximum) performance of 250 Mflops, making its PPR more than \$32,000/Mflop. Since peak performance exceeds measured performance, its PPR using measured performance (estimated at 160 Mflops) would be much higher.
• In 1985, the Cray-2 cost more than 17 million dollars and had a peak performance of 3.9 Gflops, making its PPR more than \$4,350/Mflop (\$4,358,974/Gflop).
• In 1997, IBM's Deep Blue defeated world chess champion Garry Kasparov. Its price has been estimated at 5 million dollars, and it produced 11.38 Gflops of measured performance, making its PPR more than \$439,367/Gflop.
• In 2003, the U. of Kentucky's Beowulf cluster KASY0 cost \$39,454 to build, and produced 187.3 Gflops on the double-precision version of HPL, giving it a PPR of about \$210/Gflop.
• Also in 2003, the University of Illinois at Urbana-Champaign's National Center for Supercomputing Applications built the PS 2 Cluster for about \$50,000. No measured performance numbers are available, which isn't surprising, since the PS 2 has no hardware support for double-precision floating point operations. This cluster's theoretical peak performance is about 500 Gflops (single-precision); however, one study showed that double-precision operations on the PS 2 took over 17 times as long as single-precision operations. Even using the inflated single-precision peak performance value, its PPR is about \$100/Gflop; its PPR for measured double-precision performance is probably more than 17 times that.
• In 2004, Virginia Tech built System X, which cost 5.7 million dollars, and produced 12.25 Tflops of measured performance, giving it a PPR of about \$465/Gflop.
• In 2007, Sun's SPARC Enterprise M9000, with a base price of \$511,385, produced 1.03 Tflops of measured performance, making its PPR more than \$496/Gflop. (The base price is for the 32-CPU model; the benchmark was run on a 64-CPU model, which is presumably more expensive.)
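
The comparison above can be reproduced with a short script. This is only a sketch: the prices and performance figures are the ones quoted in this list (several are lower bounds or estimates, and the Cray and PS 2 entries use peak rather than measured performance), and the variable names are our own.

```python
# (system, price in USD, performance in Gflops), as quoted in the text.
systems = [
    ("Cray-1 (1976, peak)", 8_000_000, 0.25),          # 250 Mflops peak
    ("Cray-2 (1985, peak)", 17_000_000, 3.9),
    ("Deep Blue (1997)", 5_000_000, 11.38),
    ("KASY0 (2003)", 39_454, 187.3),
    ("PS 2 Cluster (2003, single-precision peak)", 50_000, 500.0),
    ("System X (2004)", 5_700_000, 12_250.0),          # 12.25 Tflops
    ("SPARC Enterprise M9000 (2007)", 511_385, 1_030.0),  # 1.03 Tflops
    ("Microwulf", 2_470, 26.25),
]

# Price/performance ratio in dollars per Gflop, cheapest first.
ranked = sorted(
    ((name, price / gflops) for name, price, gflops in systems),
    key=lambda entry: entry[1],
)

for name, ppr in ranked:
    print(f"{name:<45} ${ppr:>14,.2f}/Gflop")
```

With these inputs Microwulf sorts first and the Cray-1 last. The PS 2 Cluster lands close behind Microwulf only because its entry uses single-precision peak performance rather than measured double-precision performance.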

At \$94.10/Gflop, Microwulf is by far the most cost-efficient platform available today for high performance double-precision computation. While it may not provide Tflop performance, it provides more than twice the general-computation performance of Deep Blue. Microwulf thus offers significant computational power at a highly affordable price.
