Michael J. Schulte

This paper presents a high-speed method for computing elementary functions using parallel table lookups and multi-operand addition. Increasing the number of tables and inputs to the multi-operand adder significantly reduces the amount of memory required. Symmetry and leading zeros in the table coefficients are used to reduce the amount of memory even …
This paper presents a high-speed method for function approximation that employs symmetric bipartite tables. This method performs two parallel table lookups to obtain a carry-save (borrow-save) function approximation, which is either converted to a two's complement number or is Booth encoded. Compared to previous methods for bipartite table approximations, …
Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents a novel design for fixed-point decimal multiplication that utilizes a simple recoding scheme to produce signed-magnitude representations of the operands thereby …
There is increasing interest in hardware support for decimal arithmetic as a result of recent growth in commercial, financial, and Internet-based applications. Consequently, new specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic. This paper introduces and analyzes …
State-of-the-art graphics processing units (GPUs) provide very high memory bandwidth, but the performance of many general-purpose GPU (GPGPU) workloads is still bounded by memory bandwidth. Although compression techniques have been adopted by commercial GPUs, they are only used for compressing texture and color data, not data for GPGPU workloads. …
Auditory stimuli are encoded by frequency-tuned neurons in the auditory cortex. There are a number of tonotopic maps, indicating that there are multiple representations, as in a mosaic. However, the cortical organization is not fixed due to the brain's capacity to adapt to current requirements of the environment. Several experiments on cerebral cortical …
This paper presents a methodology for designing bipartite tables for accurate function approximation. Bipartite tables use two parallel table lookups to obtain a carry-save (borrow-save) function approximation. A carry propagate adder can then convert this approximation to a two’s complement number or the approximation can be directly Booth encoded. Our …
The set-top and portable device market continues to grow, as does the demand for more performance under increasing cost, power, and thermal constraints. The integration of Graphics Processing Units (GPUs) into these devices and the emergence of general-purpose computations on graphics hardware enable a new set of highly parallel applications. In this paper, …