Nhon T. Quach

0272-1732/00/$10.00 © 2000 IEEE The Itanium processor is the first implementation of the Intel IA-64 architecture. Designed for the high-end server market, the processor provides features (see the sidebar, next page) that maximize system reliability and availability. These features include fault coverage at the hardware level, a configurable multilevel(More)
This paper presents the concept of leading-one prediction (LOP) used in most high-speed floating-point adders in greater detail and describes two existing implementations. The first one is similar to that used in the IBM RS/6000 processor. The second is a distributed version of the first, consuming less hardware when multiple patterns need to be detected.(More)
This paper describes a fully static Complementary Metal-Oxide Semiconductor (CMOS) implementation of a Ling-type adder. The implementation described herein saves up to one gate delay and always reduces the number of serial transistors in the worst-case (critical) path over the conventional carry look-ahead (CLA) approach with a negligible increase in(More)
The increasing computation requirements of modern computer applications have stimulated a large interest in developing extremely high-performance floating-point dividers. A variety of division algorithms are available, with SRT being utilized in many computer systems. A careful analysis of SRT divider topologies has demonstrated that a relatively simple(More)
SNAP, the Stanford subnanosecond arithmetic processor, is an interdisciplinary effort to develop theory, tools, and technology for realizing an arithmetic processor with execution rates under 1 ns. Specific improvements in clocking methods, floating-point addition algorithms, floating-point multiplication algorithms, division and higher-level function(More)
We studied three possible strategies to overlap the operations in a floating-point add (FADD) and a floating-point multiply (FMPY) for implementing a multiply-add-fused (MAF) instruction, whose result would be compatible with the IEEE floating-point standard. The operations in FMPY and FADD are: (a) non-overlapped, (b) fully-overlapped, and (c)(More)