The Raw microprocessor consumes 122 million transistors, executes 16 different load, store, integer, or floating-point instructions every cycle, controls 25 GB/s of I/O bandwidth, and has 2 MB of on-chip, distributed L1 SRAM memory, providing on-chip memory bandwidth of 43 GB/s. Is this the latest billion-dollar 3,000 man-year processor effort? In fact, Raw …
This paper evaluates the Raw microprocessor. Raw addresses the challenge of building a general-purpose architecture that performs well on a larger class of stream and embedded computing applications than existing microprocessors, while still running existing ILP-based sequential programs with reasonable performance in the face of increasing wire delays. Raw …
The drive for performance in the face of increasing wire delay blurs the line between microprocessors and multiprocessors. Microprocessor designs such as the Alpha 21464 have multi-cycle "network" latencies between ALUs [1], and come close to having multiple, parallel fetch units, much as a multiprocessor. A recent paper [2] identifies the existence of a …
Integer division, modulo, and remainder operations are expressive and useful operations. They are logical candidates to express many complex data accesses such as the wrap-around behavior in queues using ring buffers and array address calculations in data distribution and cache locality compiler optimizations. Experienced application programmers, however, …
  • Matthew Ian Frank, Jason Asanovic, David Miller, Atul Wentzlaf, Emmett Adya, Scott Witchel +40 others
  • 1996
A computer can never be too fast or too cheap. Computer systems pervade nearly every aspect of science, engineering, communications and commerce because they perform certain tasks at rates unachievable by any other kind of system built by humans. A computer system's throughput, however, is constrained by that system's ability to find concurrency. Given a …
122 million transistors; executes 16 different load, store, integer, or floating-point instructions every cycle; controls 25 Gbytes/s of input/output (I/O) bandwidth; and has 2 Mbytes of on-chip distributed L1 static RAM providing on-chip memory bandwidth of 57 Gbytes/s. Is this the latest billion-dollar, 3,000 man-year processor effort? In fact, it took …