A 167-processor computational platform, implemented in 65 nm CMOS, consists of an array of simple programmable processors capable of per-processor dynamic supply voltage and clock frequency scaling, three algorithm-specific processors, and three 16 KB shared memories. All processors and shared memories are clocked by local fully independent,…
A 167-processor 65 nm computational platform well suited for DSP, communication, and multimedia workloads contains 164 programmable processors with dynamic supply voltage and dynamic clock frequency circuits, three algorithm-specific processors, and three 16 KB shared memories, all clocked by independent oscillators and connected by configurable…
Modern applications increasingly require the computation of DSP workloads composed of a variety of numerically intensive DSP tasks. These workloads are found in communication, multimedia, embedded, and wireless applications, and often require very high levels of computation and high energy efficiency. The Asynchronous Array of Simple Processors (AsAP)…
as a JPEG encoder core and a fully compliant IEEE 802.11a/g wireless LAN baseband transmitter.
A new inter-processor communication architecture for chip multiprocessors is proposed that has a low area cost and flexible routing capability, and supports globally asynchronous locally synchronous (GALS) clocking styles. To achieve a low area cost, the proposed statically configurable asymmetric architecture assigns large buffer resources to only the…
This paper investigates implementation techniques for tile-based chip multiprocessors with Globally Asynchronous Locally Synchronous (GALS) clocking styles. These architectures can simplify the physical design flow since they allow focusing on a single processor when designing an entire chip. However, they also introduce challenges to maintain system…
This paper investigates the performance and power dissipation of globally asynchronous locally synchronous (GALS) multi-processor systems. We show that communication loops are a source of significant throughput degradation in communications links and that there is no degradation whatsoever under certain conditions for one-way links, and that it is possible…