Sunday, June 05, 2005

The Cellular Revolution

Okay, time for my first "real" post.

In recent years, cell phones have revolutionized the way we communicate, and indeed our society itself. I recently realized the extent to which I have personally become dependent on this relatively new technology when I went to Europe for about two weeks, and between not having consistent internet access and my cell phone not working, I had this continual, uncomfortable feeling of being totally disconnected. My point is that not all that long ago, cellular phones were a novelty (I still remember my dad's 20 lb. carphone that worked slightly more often than Michael Jackson's plastic surgeries), and that today, most people, even if they don't own one, can't imagine going back to the days of pagers, fax machines, and homing pigeons.

But this post isn't about cell phones. It's about another "cell" technology that I believe will have an even greater impact on society than the cell phone. Sony, Toshiba, and IBM (STI) have spent the last few years developing a new type of processor called the Cell Processor. STI has been pretty tight-lipped about the new architecture and there's a lot of disagreement and speculation about the details and implications, but what is agreed upon is this: the cell processor will be powerful, flexible, and amazingly, ridiculously, absurdly FAST. I've been reading quite a bit about this new technology (mostly here and here), and I'm convinced that it has the potential to displace the traditional "Wintel"-based PCs that now dominate the market. Of course, the first place you'll see these new processors is in the PS3, maybe as soon as early next year (manufacturing of the cell processor should begin at the end of this year).

Here's the idea. A traditional computer architecture has a CPU which is connected to main memory (a.k.a. RAM) and various I/O devices (such as disk drives, monitors, keyboards, mice, etc.). The CPU is where the action is, and CPUs have gotten pretty darn fast in recent years (3+ GHz for a high-end desktop). As we approach the boundaries of silicon, though, it's getting harder and harder to get more transistors on a chip, which means that it's getting harder and harder, and more and more expensive, to make faster processors. However, even though this is a significant problem, the real issue that is holding back processing speeds is that main memory and other I/O devices are by far the bottleneck in any system anyway. It doesn't matter how many operations the CPU can do per second if it can't get the instructions it needs to know what to do and the data it needs to do it on.

This problem has led to all kinds of fancy tricks in modern computer architectures. Perhaps most notable is the addition of "cache memory". The idea with a cache is that we put a small buffer of expensive but super-fast memory on chip, right next to the CPU, and we try really hard to make sure that whenever the CPU needs new data or instructions, it can find them in the cache (where they can be obtained in a cycle or two), instead of having to go out to main memory (where access times are maybe 10 times that). Of course, to do this, we have to "guess" at what information the CPU is going to request next, so we use all kinds of techniques such as temporal locality ("we used this recently, maybe we'll need it again soon"), spatial locality ("we used this byte of data, so we'll probably use the next one, too"), frequency of use ("we've used this a lot, so we'll probably use it again"), branch prediction ("last time we did X after doing Y, so maybe we'll do that again this time"), and all kinds of others.
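To make the locality idea a little more concrete, here is a toy cache simulation in Python. It is only a sketch: the class name `TinyCache`, the line size of 8 bytes, the 4-line capacity, and the least-recently-used eviction rule are all assumptions I picked for illustration, not a description of any real chip.

```python
from collections import OrderedDict

class TinyCache:
    """A toy cache (hypothetical) that fetches whole lines and evicts the least recently used."""

    def __init__(self, num_lines=4, line_size=8):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()  # line number -> True, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line = address // self.line_size  # neighbouring bytes share a line
        if line in self.lines:
            self.hits += 1
            self.lines.move_to_end(line)  # mark as recently used
        else:
            self.misses += 1  # would mean a slow trip to main memory
            self.lines[line] = True
            if len(self.lines) > self.num_lines:
                self.lines.popitem(last=False)  # evict the least recently used line

def hit_rate(addresses):
    """Run a sequence of byte addresses through a fresh cache and return hits / accesses."""
    cache = TinyCache()
    for address in addresses:
        cache.access(address)
    return cache.hits / (cache.hits + cache.misses)

sequential = list(range(64))             # walk memory one byte at a time
scattered = [i * 64 for i in range(64)]  # jump 64 bytes on every access

print(hit_rate(sequential))  # 0.875
print(hit_rate(scattered))   # 0.0
```

The sequential walk only misses once per 8-byte line, while the scattered pattern never touches the same line twice, so the cache never helps it.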

Another trick that chip designers and operating system designers use is called Virtual Memory or "Paging". In this system, the computer divides a program up into sections called "pages". Since most of the code in most programs is rarely used (they say that roughly 90% of a program's time is spent in 10% of the code), the computer can then load only those pages into memory which the program actually needs. The problem is that if the program suddenly needs some pages that aren't in memory, it has to go get them from the hard drive, which, in terms of processor speeds, takes FOREVVVER. That's usually what's happening when you hear your hard drive clicking loudly all of a sudden and your computer starts going really slow.
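Here is a rough Python sketch of why paging works fine for well-behaved programs and badly for others. The function name `count_page_faults`, the 4 available frames, and the least-recently-used replacement rule are assumptions for the sake of the example; real operating systems use more elaborate policies.

```python
from collections import OrderedDict

def count_page_faults(page_requests, frames):
    """Count how often a requested page is not in RAM, using LRU replacement."""
    in_memory = OrderedDict()  # page number -> True, oldest first
    faults = 0
    for page in page_requests:
        if page in in_memory:
            in_memory.move_to_end(page)  # page was used again, keep it around
        else:
            faults += 1  # in a real system this is a trip to the hard drive
            in_memory[page] = True
            if len(in_memory) > frames:
                in_memory.popitem(last=False)  # throw out the least recently used page
    return faults

hot_loop = [0, 1, 2] * 10    # program keeps reusing the same three pages
wandering = list(range(30))  # program touches a brand-new page every time

print(count_page_faults(hot_loop, frames=4))   # 3
print(count_page_faults(wandering, frames=4))  # 30
```

Both programs make 30 page requests, but the first one only goes to disk 3 times and the second goes every single time.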

As you can imagine, these things (and LOTS of other strange tricks) add a TON of complexity to a microprocessor. So much so that most of the real estate on a microprocessor chip is now going to things like cache (which usually has 2 "levels" now) and the logic to control all of the complex (convoluted?) tricks.

All of this stuff also makes it difficult to run a multi-processor computer. To get an idea of why this is, imagine if two microprocessors are working on a set of data, and one of them wants to read the value of a variable from memory. What if the other processor is currently working on that variable and has a newer version in cache? Similarly, if we want to write a piece of data to memory, we must first see if the other processor is using that data, and if so, we need to let it know that there's a new version available.
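The stale-data problem can be shown with a tiny Python model. This is a sketch under heavy assumptions: the `Core` class, its write-back behaviour, and the manual `flush` and `invalidate` steps are made up for illustration, and real hardware does this bookkeeping automatically with a coherence protocol.

```python
class Core:
    """A hypothetical core with a private cache and no automatic coherence."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory (a plain dict here)
        self.cache = {}       # this core's private copies

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.memory[name]  # fetch from main memory on a miss
        return self.cache[name]

    def write(self, name, value):
        self.cache[name] = value  # the new value lives only in this core's cache for now

    def flush(self, name):
        self.memory[name] = self.cache[name]  # push the cached value out to main memory

    def invalidate(self, name):
        self.cache.pop(name, None)  # forget the local copy so the next read refetches

memory = {"x": 1}
a, b = Core(memory), Core(memory)

b.read("x")          # b caches x = 1
a.write("x", 2)      # a updates x, but only in its own cache
stale = b.read("x")  # b still sees the old value

a.flush("x")         # a publishes the new value
b.invalidate("x")    # b is told its copy is out of date
fresh = b.read("x")  # now b sees the new value

print(stale, fresh)  # 1 2
```

Every one of those flush and invalidate steps costs time and wiring, which is the overhead the paragraph above is talking about.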

The cell processor abandons all of this. No cache, no virtual memory, no branch prediction, a shallow pipeline, little multitasking, no user management, just bare-bones processing power. To a traditional microprocessor designer, this sounds like madness. I mean, after all, caching and paging and pipelining and all that really do speed things up... a lot, usually. That's why they've been developed. So how can the cell processor get away without them and still be fast? The answer is by using a cell-based architecture, which I will explain now.

A cell processor is actually 9 processors in one. The first processor is basically an IBM PowerPC microprocessor. Its job is mostly to distribute smaller jobs (called software "cells") to the other 8 processing units, which STI has given the very stupid name of "Synergistic Processing Units" (SPUs). These SPUs are where the real power lies. Each one has its own memory and they are all connected by a super-high-bandwidth bus, which means that they can talk to each other really fast. The whole thing will run at something like 4.6 GHz, which is pretty impressive itself, but the real shocker comes when you look at a different statistic, called floating-point operations per second (FLOPS). A traditional CPU can handle about 6 billion floating-point operations per second (6 GigaFLOPS). Some high-end graphics processors can do around 50 GigaFLOPS, but these are very specialized for handling 3D graphics. A single cell processor can handle 250 GigaFLOPS! One writer compared a single cell processor to 5 overclocked dual-core Opterons (the Opteron is a top-end processor by AMD).
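For a sense of scale, here is the arithmetic in Python, reading the three figures quoted above as roughly 6, 50, and 250 GigaFLOPS. The `peak_gflops` helper and especially the value of 8 floating-point operations per SPU per cycle are my assumptions (think of a 4-wide multiply-add each cycle), not something STI has confirmed, so treat the last number as a back-of-the-envelope estimate.

```python
# Figures quoted in the post, in GigaFLOPS
cpu_gflops = 6
gpu_gflops = 50
cell_gflops = 250

def peak_gflops(units, clock_ghz, flops_per_cycle):
    """Theoretical peak: every unit doing flops_per_cycle operations on every clock tick."""
    return units * clock_ghz * flops_per_cycle

cell_vs_cpu = cell_gflops / cpu_gflops  # about 41.7 times a traditional CPU
cell_vs_gpu = cell_gflops / gpu_gflops  # 5.0 times a high-end graphics processor

# ASSUMPTION: 8 SPUs, the 4.6 GHz clock mentioned above, 8 operations per SPU per cycle
estimate = peak_gflops(units=8, clock_ghz=4.6, flops_per_cycle=8)

print(round(cell_vs_cpu, 1), cell_vs_gpu, round(estimate, 1))
```

Under those assumptions the estimate comes out around 294 GigaFLOPS, which is at least in the same ballpark as the 250 figure.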

Now here's the kicker:

The PS3 will have 4(!) cell processors.

Because that's the amazing thing about cell processors. They're already designed to be massively parallel, so if you want to increase performance, just add another one. The operating system will then take care of dividing your program up into software cells which can then be distributed to the hardware cells, which then process the data independently and return the results. Connect the various cells on a high-speed bus, or even across a high-speed network connection (wireless?) and suddenly they can start cooperating. Picture the cell processors in your HDTV and PS3 helping out your desktop computer when you're not playing Final Fantasy XII.
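The "split it up, farm it out, collect the results" pattern looks something like this in Python. This is only an analogy: the function names are hypothetical, a thread pool stands in for the hardware cells, and Python threads won't actually give you a speedup on this kind of math. The point is the shape of the program, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_cells(data, num_cells):
    """Cut the data into at most num_cells roughly equal, independent chunks."""
    size = max(1, -(-len(data) // num_cells))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]

def process_cell(chunk):
    """The work one 'software cell' does on its own, without talking to the others."""
    return sum(x * x for x in chunk)

def run_on_workers(data, num_workers):
    """Hand each chunk to a worker, then combine the partial results."""
    chunks = split_into_cells(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(process_cell, chunks))
    return sum(partials)

data = list(range(1000))
print(run_on_workers(data, num_workers=8))  # 332833500
```

Going from 8 workers to 16 only means changing one number, which is the "just add another one" idea in miniature.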

Some applications will benefit more from the cell architecture than others. Things like graphics and sound, signal processing, and certain scientific applications will be especially (ridiculously) helped. Some things (like servers, for example) won't be helped as much, because their programs are not easily vectorizable (I'm not entirely sure what that means myself).
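Roughly speaking, "vectorizable" seems to mean that the same operation can be applied to many independent pieces of data at once. A small Python sketch of the difference, with made-up function names and an assumed vector width of 4:

```python
def scale_all(values, factor):
    """Each output depends only on its own input, so the order of work doesn't matter."""
    return [v * factor for v in values]

def scale_in_lanes(values, factor, width=4):
    """Same result, computed in groups of `width`, the way vector hardware would."""
    out = []
    for i in range(0, len(values), width):
        lane_group = values[i:i + width]            # one "vector" of up to `width` items
        out.extend(v * factor for v in lane_group)  # hardware could do these all at once
    return out

def follow_chain(links, start, steps):
    """Where we go next depends on where we are now, so each step waits for the last."""
    position = start
    for _ in range(steps):
        position = links[position]
    return position

print(scale_all([1, 2, 3, 4, 5], 2))       # [2, 4, 6, 8, 10]
print(scale_in_lanes([1, 2, 3, 4, 5], 2))  # [2, 4, 6, 8, 10]
print(follow_chain([1, 2, 0], 0, 4))       # 1
```

Graphics and signal processing look a lot like `scale_all`. Code that chases from one piece of data to the next, like `follow_chain`, can't be chopped into lanes nearly as easily.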

If it's not obvious already, this kind of computing power will be hard or impossible to match in the near future with a traditional-style computer architecture. In fact, the cell processor will be so much more powerful that it will likely be easy to emulate, say, a Pentium 4 and run all of your favorite Microsoft-based applications (or operating systems). It is possible that in a few years, everyone will own a cell-based computer running Linux (or even OS X??). At the very least it will be interesting to see how companies like Intel, AMD, Apple, NVIDIA, and Microsoft deal with this new threat.

It'll be better than a picture-phone.

I can't wait.

1 comment:

Anonymous said...

great stuff homes..very interesting.
!~jwo