Learn More
This paper describes an integrated architecture, compiler, runtime, and operating system solution to exploiting heterogeneous parallelism. The architecture is a pipelined multi-threaded multiprocessor, enabling the execution of very fine (multiple operations within an instruction) to very coarse (multiple jobs) parallel activities. The compiler and runtime(More)
Writing parallel numerical programs is difficult. One problem is that the regular code and communication structure of typical parallel algorithms often becomes obfuscated when special cases at the boundaries of the computation are handled. Another problem is that parallel architectures can have vastly different communication structures, which makes it hard(More)
In parallel programming, the need to manage communication, load imbalance, and irregularities in the computation puts substantial demands on the programmer. Key properties of the architecture, such as the number of processors and the cost of communication, must be exploited to achieve good performance, but coding these properties directly into a program(More)
The development of Tera's MTA system was unusual. It respected the need for fast hardware and large shared memory, facilitating execution of the most demanding parallel application programs. But at the same time, it met the need for a clean machine model enabling calculated compiler optimizations and easy programming; and the need for novel architectural(More)
  • 1