• Publications
  • Influence
NVIDIA Tesla: A Unified Graphics and Computing Architecture
To enable flexible, programmable graphics and high-performance computing, NVIDIA developed the Tesla scalable unified graphics and parallel computing architecture.
  • 1,378
  • 144
IEEE Standard for Floating-Point Arithmetic
  • 1,100
  • 50
Division Algorithms and Implementations
We present a taxonomy of division algorithms which classifies the algorithms based upon their hardware implementations and impact on system design.
  • 299
  • 24
Floating point division and square root algorithms and implementation in the AMD-K7™ microprocessor
  • S. Oberman
  • Mathematics, Computer Science
  • Proceedings 14th IEEE Symposium on Computer…
  • 14 April 1999
This paper presents the AMD-K7 IEEE 754 and x87 compliant floating point division and square root algorithms and implementation.
  • 125
  • 20
Design issues in high performance floating point arithmetic units
In recent years computer applications have increased in their computational complexity.
  • 102
  • 15
High-speed function approximation using a minimax quadratic interpolator
A table-based method for high-speed function approximation in single-precision floating-point format is presented in this paper.
  • 119
  • 13
Design Issues in Division and Other Floating-Point Operations
This paper presents the system performance impact of floating-point division latency for varying instruction issue rates.
  • 177
  • 10
The SNAP project: design of floating point arithmetic units
The paper presents results of the Stanford subnanosecond arithmetic processor (SNAP) research effort in the design of hardware for floating point addition, multiplication and division.
  • 81
  • 7
SRT division architectures and implementations
We present an analysis of the effects of both circuit style and divider architecture on divider area and performance of divider implementations.
  • 92
  • 7
AMD 3DNow! technology: architecture and implementations
The AMD-K6-2 microprocessor is the first implementation of AMD 3DNow!, a technology innovation for the x86 architecture that drives today's personal computers. The microprocessor implements 21 new instructions designed to open the traditional processing bottlenecks for floating-point-intensive and multimedia applications.
  • 150
  • 3