A site devoted mostly to everything related to Information Technology under the sun - among other things.

Thursday, April 26, 2007

A $600 Super-computer

You can purchase a Sony PlayStation 3 (PS3) for $600, install Linux on it, and get yourself a 200+ Gflop/s (single-precision) Linux supercomputer.

There is an IBM tutorial for installing Linux on the PS3 and for building applications @ www-128.ibm.com/developerworks/power/library/pa-linuxps3-1

Under Linux on the PS3, six synergistic processing elements (SPEs) are available for computation. (A seventh runs in a special mode dedicated to parts of the OS and security, and an eighth is disabled to improve production yields.) Each SPE can run a different program, and the internal communication paths let programmers arrange the data flow in different ways, using parallel, pipelined, or streamed processing models.

A DMA engine moves data on and off the Cell. With DMA, the programmer has to orchestrate data movement and computation by hand, and getting close to the peak performance numbers takes a lot of hand-written assembly code.
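To make "manually orchestrate data movement and computation" a bit more concrete, here is a minimal sketch of double buffering, a common way to overlap transfers with computation. It is plain Python, not Cell code: the `fetch` helper stands in for a DMA transfer and is a hypothetical name, and because it is an ordinary copy, nothing actually overlaps here. Only the buffer-swapping logic is illustrated.

```python
def fetch(data, index, chunk_size):
    """Stand-in for a DMA transfer: copy one chunk out of 'main memory'."""
    start = index * chunk_size
    return list(data[start:start + chunk_size])


def process_stream(data, chunk_size, compute):
    """Process data chunk by chunk using two alternating buffers.

    While the chunk in one buffer is being computed on, the next chunk is
    requested into the other buffer. On real hardware that request would be
    asynchronous; in this sketch it simply completes before the compute step.
    """
    n_chunks = (len(data) + chunk_size - 1) // chunk_size
    results = []
    if n_chunks == 0:
        return results

    buffers = [None, None]
    buffers[0] = fetch(data, 0, chunk_size)  # prime the first buffer

    for i in range(n_chunks):
        current = i % 2
        other = 1 - current
        if i + 1 < n_chunks:
            # Request the next chunk before working on the current one.
            buffers[other] = fetch(data, i + 1, chunk_size)
        results.extend(compute(buffers[current]))
    return results


if __name__ == "__main__":
    doubled = process_stream(list(range(10)), 4, lambda chunk: [2 * x for x in chunk])
    print(doubled)
```

The point of the pattern is that the code must decide when each transfer starts and which buffer it lands in; nothing like a hardware cache does it for you.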

One can run C code on the SPEs, but performance degrades: tight loops written in C reach around 4 Gflop/s per SPE.

The PS3 offers limited memory: each SPE has only 256 KB of RAM for both program and data, so only tight loops can run on each SPE. Double-precision (64-bit) performance is also poor relative to single-precision (32-bit) performance.
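A quick back-of-the-envelope calculation shows how tight that 256 KB is. The sketch below assumes a hypothetical 64 KB of code and stack and four data buffers (two for input, two for output); both figures are illustrative choices, not numbers from any measurement.

```python
LOCAL_STORE_BYTES = 256 * 1024  # per-SPE memory, from the text


def max_values_per_buffer(code_bytes, n_buffers, bytes_per_value=4):
    """Return how many values fit in each buffer once the code is loaded.

    code_bytes and n_buffers are assumptions supplied by the caller;
    bytes_per_value is 4 for single precision and 8 for double precision.
    """
    free_bytes = LOCAL_STORE_BYTES - code_bytes
    if free_bytes <= 0:
        return 0
    return free_bytes // (n_buffers * bytes_per_value)


if __name__ == "__main__":
    code = 64 * 1024  # hypothetical program + stack footprint
    print("floats per buffer: ", max_values_per_buffer(code, 4, 4))
    print("doubles per buffer:", max_values_per_buffer(code, 4, 8))
```

Under those assumptions each buffer holds on the order of ten thousand single-precision values, and half as many in double precision.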

A PS3 Cell processor can deliver around 204 Gflop/s in single precision but only 15 Gflop/s in double precision.
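For the curious, here is roughly where a number like 204 Gflop/s comes from. The sketch assumes a 3.2 GHz clock, 4 single-precision lanes per SPE, and 2 flops per lane per cycle (a fused multiply-add); none of these inputs appear in the text above, so treat them as assumptions. The 4 Gflop/s figure for C code is the one quoted earlier.

```python
CLOCK_GHZ = 3.2          # assumed clock rate
SIMD_LANES = 4           # assumed single-precision lanes per SPE
FLOPS_PER_LANE = 2       # assumed: one fused multiply-add per cycle
C_GFLOPS_PER_SPE = 4.0   # "tight loops of C" estimate from the text


def peak_gflops(n_spes):
    """Theoretical single-precision peak for n_spes SPEs, in Gflop/s."""
    return n_spes * CLOCK_GHZ * SIMD_LANES * FLOPS_PER_LANE


def c_gflops(n_spes):
    """Estimated rate for plain C code on n_spes SPEs, in Gflop/s."""
    return n_spes * C_GFLOPS_PER_SPE


if __name__ == "__main__":
    print("peak, 8 SPEs:", round(peak_gflops(8), 1))
    print("peak, 6 SPEs:", round(peak_gflops(6), 1))
    print("C,    6 SPEs:", round(c_gflops(6), 1))
    print("C / peak (6 SPEs):", round(c_gflops(6) / peak_gflops(6), 3))
```

Under these assumptions the 204 Gflop/s figure corresponds to all eight SPEs; with the six that Linux can use, the same arithmetic gives a lower peak, and plain C lands at a small fraction of it.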

If we compare the PS3 to a 4-way (dual-socket, dual-core) 2.4 GHz Opteron with 1 GB of DDR2-667 RAM, using three measures:

single-precision floating-point performance (Gflop/s)
the ratio of RAM capacity to floating-point performance (GB per Gflop/s)
the ratio of RAM bandwidth to floating-point performance (GB/s per Gflop/s)

we get the following figures:




Figure 1: Peak single-precision floating-point rate comparison.

Figure 2: Assumed "good" C performance comparison.





One can see that the PS3 is highly unbalanced and favors single-precision floating-point performance. It is clear that the Cell B.E. architecture is highly specialized for certain types of applications but not others.
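The three measures listed earlier are easy to compute yourself. In the sketch below, only the 204 Gflop/s PS3 peak and the Opteron's 1 GB of RAM come from this post. Everything else is an assumption made for illustration: 256 MB of main memory and 25.6 GB/s of memory bandwidth for the PS3, and for the Opteron 4 single-precision flops per cycle per core and about 21.3 GB/s across two sockets of dual-channel DDR2-667.

```python
def balance(peak_gflops, ram_gb, bandwidth_gbs):
    """Return the three balance measures for one machine."""
    return {
        "peak_gflops": peak_gflops,
        "gb_per_gflops": ram_gb / peak_gflops,
        "gbs_per_gflops": bandwidth_gbs / peak_gflops,
    }


# PS3: peak from the text; RAM size and bandwidth are assumptions.
ps3 = balance(peak_gflops=204.0, ram_gb=0.25, bandwidth_gbs=25.6)

# Opteron: 1 GB RAM from the text; peak (4 cores x 2.4 GHz x 4 flops/cycle)
# and bandwidth are assumptions.
opteron = balance(peak_gflops=4 * 2.4 * 4, ram_gb=1.0, bandwidth_gbs=21.3)

if __name__ == "__main__":
    for name, machine in (("PS3", ps3), ("Opteron", opteron)):
        print(name)
        for key, value in machine.items():
            print(f"  {key}: {value:.4f}")
```

With these inputs the PS3 has far more peak floating-point throughput per byte of RAM and per byte/s of bandwidth than the Opteron, which is what "unbalanced" means here.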

More information is available below:

Cell Processors for Scientific Computing @ ww.cs.berkeley.edu/~samw/projects/cell/CF06.pdf
Cell Workshop Slides, LANL @ www.cs.utk.edu/~dongarra/cell2006/cell-slides/04-Ken-Koch.pdf
Graph Exploration Algorithms @ hpc.pnl.gov/people/fabrizio/papers/ipdps07-graphs.pdf
LANL Newsletter, Roadrunner News Announcement @ www.lanl.gov/news/newsletter/091106.pdf
Optimizing Sweep3D @ hpc.pnl.gov/people/fabrizio/papers/ipdps07-sweep3d.pdf
Roadrunner Benchmarks @ www.c3.lanl.gov/pal/software/roadrunner.html

About Me

I am a senior software developer working for General Motors Corporation. I am interested in intelligent computing and scientific computing. I am passionate about computers as enablers for human imagination. The contents of this site are not in any way, shape, or form endorsed, approved, or otherwise authorized by HP, its subsidiaries, or its officers and shareholders.
