Three-dimensional (3D) graphics applications have become very important workloads running on today's computer systems. A cost-effective graphics solution is to perform geometry processing of 3D graphics on the host CPU and have specialized hardware handle the rendering task. In this paper, we analyze microarchitecture and SIMD instruction set enhancements…
The microprocessor industry is currently struggling with higher development costs and longer design times that arise from exceedingly complex processors that are pushing the limits of instruction-level parallelism. Meanwhile, such designs are especially ill-suited for important commercial applications, such as on-line transaction processing (OLTP), which…
Most Prolog machines have been based on specialized architectures. Our goal is to start with a general-purpose architecture and determine a minimal set of extensions for high-performance Prolog execution. We have developed both the architecture and optimizing compiler simultaneously, drawing on results of previous implementations. We find that most Prolog…
Three-dimensional (3D) graphics applications have become very important workloads running on today's computer systems. A cost-effective graphics solution is to perform geometry processing of 3D graphics on the host CPU and have specialized hardware handle the rendering task. In this paper, we analyze microarchitecture and SIMD instruction set enhancements…
The progress in the development of the 10-channel POLO (Parallel Optical Link Organization) module is described. The POLO program is a consortium of Hewlett-Packard, AMP, Du Pont, SDL, and the University of Southern California to develop low-cost, high-performance parallel optical data links for computer clusters, multimedia, and switching…
We demonstrate an 8 Gbps CMOS link interface designed for use with parallel fiber-optic interconnect technology. The link interface is implemented in 0.8 µm CMOS and consists of eight data and one frame control channel each operating at 1 Gbps along with a full-speed 1 GHz clock channel. The chip also provides dual-ported FIFO buffers for interface to a…
In this paper we describe the design and implementation of a 190-MHz pipelined 4-Kbyte instruction and data cache. The caches are designed in 1.0-µm CMOS and measure 0.78 × 0.47 cm². This paper describes the microarchitecture, cache timing, circuit implementation, and layout of both the instruction and the data cache. The key features of these caches are…
Most Prolog machines have been based on specialized architectures. Our goal is to start with a general-purpose architecture and determine a minimal set of extensions for high-performance Prolog execution. We have developed both the architecture and optimizing compiler simultaneously, drawing on results of previous implementations. We find that most Prolog…