D. Scott Wills

Learn More
The J-Machine is a fine-grain concurrent computer that provides low-overhead primitive mechanisms for communication, synchronization, and translation. Communication mechanisms are provided that permit a node to send a message to any other node in the machine in < 2ps. On message arrival, a task is created and dispatched in < ljis. A translation mechanism(More)
In the modern era of wire-dominated architectures, specific effort must be made to reduce needless communication within out-of-order pipelines while still maintaining binary compatibility. To ease pressure on highly-connected elements such as the issue logic and bypass network, we propose the dynamic detection and speculative execution of instruction(More)
ÐSingle-chip multiprocessors are an important research direction for future microprocessors. The stigma of this approach is that many important applications cannot be automatically parallelized. This paper presents a single-chip multiprocessor that engages aggressive speculation techniques to enable dynamic parallelization of irregular, sequential binaries.(More)
We propose a machine architecture for a high-performance processing node for a message-passing, MIMD concurrent computer. The principal mechanisms for attaining this goal are the direct execution and buffering of messages and a memory-based architecture that permits very fast context switches. Our architecture also includes a novel memory organization that(More)
With advances in semiconductor technology, processors are becoming larger and more complex. Future processor designers will face an enormous design space, and must evaluate more architecture design points to reach a final optimum design. This exploration is currently performed using cycle accurate simulators that are accurate but slow, limiting a(More)
Modern embedded processors are designed to maximize execution efficiency--the amount of performance achieved per unit of energy dissipated while meeting minimum performance levels. To increase this efficiency we propose utilizing <i>static strands</i>, dependence chains without fan-out which are exposed by a compiler pass. These dependent instructions are(More)
* © IEEE 1996, from Third Conference on Massively Parallel Processing using Optical Interconnect, Maui, HI, October 1996, pages 44-52. Abstract Focal plane processing applications present a growing computing need for portable telecomputing and videoputing systems. This paper demonstrates the integration of digital processing, analog interface circuitry, and(More)
A dynamic speculative multithreaded processor automatically extracts thread level parallelism from sequential binary applications without software support. The hardware is responsible for partitioning the program into threads and managing inter-thread dependencies. Current published dynamic thread partitioning algorithms work by detecting loops, procedures,(More)
As inexpensive imaging chips and wireless telecommunications are incorporated into an increasing array of portable products, the need for high efficiency, high throughput embedded processing will become an important challenge in computer architecture. Videocentric applications, such wireless videoconferencing, real-time video enhancement and analysis, and(More)