Hossam A. H. Fahmy

This work uses a partially redundant number system as an internal format for floating-point arithmetic operations. The redundant number system enables carry-free arithmetic operations, which improves performance. Conversion from the proposed internal format back to the standard IEEE format is done only when an operand is written to memory. A detailed discussion ...
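The carry-free idea can be illustrated with carry-save form, a classic redundant representation in which a value is held as a pair of words whose sum is the value. This is only a sketch of the general principle, not the partially redundant format proposed in the paper, which the truncated abstract does not specify; the function names are hypothetical.

```python
def carry_save_add(s, c, x):
    """Add x to the redundant pair (s, c) without propagating carries.

    Each output bit depends only on the three input bits in the same
    position, so the delay is independent of the word length.
    Invariant: new_s + new_c == s + c + x (non-negative integers).
    """
    new_s = s ^ c ^ x                            # bitwise sum
    new_c = ((s & c) | (s & x) | (c & x)) << 1   # bitwise carries, shifted left
    return new_s, new_c


def to_binary(s, c):
    """Convert back to a conventional number with one carry-propagating add.

    This is the slow step, which is why a design would defer it, for
    example until the operand has to leave the internal format.
    """
    return s + c


def accumulate(values):
    """Sum non-negative integers, staying in redundant form until the end."""
    s, c = 0, 0
    for v in values:
        s, c = carry_save_add(s, c, v)
    return to_binary(s, c)
```

For example, `accumulate([13, 7, 22])` keeps every intermediate result as a sum/carry pair and performs a single conventional addition at the end.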
This paper presents a new division algorithm, which requires two multiplication operations and a single lookup in a small table. The division algorithm takes two steps. The table lookup and the first multiplication are processed concurrently in the first step, and the second multiplication is executed in the next step. This divider uses a single multiplier ...
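One well-known way to obtain a quotient from two multiplications and one table lookup is the approximation X/Y ≈ X·(Yh − Yl)/Yh², where Yh holds the leading bits of Y and Yl the remaining bits. The sketch below follows that scheme; whether it matches the paper's exact formulation is an assumption, since the abstract is truncated. The parameters `M` and `N`, the table layout, and the function name are hypothetical.

```python
M = 8    # number of leading fraction bits of Y used as table index (assumed)
N = 24   # total number of fraction bits of Y (assumed)

# Small table holding 1 / Yh^2 for every possible Yh in [1, 2).
TABLE = [1.0 / (1.0 + i / 2**M) ** 2 for i in range(2**M)]


def divide(x, y):
    """Approximate x / y for a divisor normalized to 1 <= y < 2."""
    if not 1.0 <= y < 2.0:
        raise ValueError("divisor must be normalized to [1, 2)")
    y_fixed = int(y * 2**N)                 # y with N fraction bits
    idx = (y_fixed >> (N - M)) - 2**M       # leading M fraction bits
    y_h = 1.0 + idx / 2**M                  # high part of y
    y_l = y - y_h                           # low part, 0 <= y_l < 2**-M

    # Step 1: table lookup and first multiplication (independent of each
    # other, so hardware can perform them concurrently).
    inv_sq = TABLE[idx]
    partial = x * (y_h - y_l)

    # Step 2: second multiplication.
    return partial * inv_sq
```

Because X·(Yh − Yl)/Yh² equals (X/Y)·(1 − (Yl/Yh)²), the relative error of this sketch is bounded by (Yl/Yh)², which is below 2^(−2M) for the assumed split.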
This contribution presents a model of power consumption for the TMS320C6416T VLIW processor that takes into account typical software algorithms in signal and image processing. The modeling is performed at the functional level, which makes this approach distinctly different from low-level modeling techniques. This means that the power ...
In this paper, we investigate the read operation of memristor-based memories. We analyze the sneak-path problem and provide a noise margin metric to compare the various solutions proposed in the literature. We also analyze the power consumption associated with these solutions. Moreover, we study the effect of the aspect ratio of the memory array on the ...
The increasing demand for portable computing has made power consumption one of the most critical design parameters for embedded systems. In this paper, we present a qualitative study in which we examine the impact of code transformations on energy and power consumption. Three main categories of code transformations are investigated, namely data, ...
Decimal floating-point multiplication has become important in many commercial applications. This paper presents a fully parallel Decimal64 floating-point multiplier compliant with the IEEE 754r standard. The proposed multiplier uses novel methods to achieve low latency. The proposed design is based on a previously published fixed-point multiplier [1]. Several ...
The benefit of high-radix Booth encoders in reducing the number of partial products in fast multipliers has been hampered by the complexity of generating the hard multiples. Using a redundant binary (RB) Booth encoder can overcome this problem and avoid the error compensation vector, but at the cost of doubling the number of RB partial products. This ...
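As background for the partial-product count, the sketch below shows conventional radix-4 Booth recoding, which halves the number of partial products while needing only the easy multiples 0, ±x and ±2x. It is not the redundant binary encoder the abstract proposes; going to radix 8 would introduce the hard multiple 3x that the abstract refers to. The function names are hypothetical.

```python
def booth_radix4_digits(multiplier, bits):
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first, such that
    sum(d * 4**k for k, d in enumerate(digits)) == multiplier.
    """
    if not -(1 << (bits - 1)) <= multiplier < (1 << (bits - 1)):
        raise ValueError("multiplier does not fit in the given width")
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        b0 = (multiplier >> i) & 1        # Python shifts are sign-extending
        b1 = (multiplier >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, multiplier, bits):
    """Multiply using one partial product per radix-4 Booth digit."""
    product = 0
    for k, d in enumerate(booth_radix4_digits(multiplier, bits)):
        # d * x needs only a shift and/or a negation in hardware.
        product += (d * x) << (2 * k)
    return product
```

An 8-bit multiplier yields four digits here, so the multiplier array sums four partial products instead of eight.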
The increasing demand for portable computing has made power consumption one of the most critical design parameters for embedded systems. In this paper, we present a precise high-level power estimation methodology for software running on a VLIW processor, based on a functional-level power model. The processor targeted by our approach is the ...
Interest in decimal arithmetic has increased considerably in recent years. This paper presents new designs for decimal floating-point (DFP) addition, multiplication, fused multiply-add, division, and square root. It stresses the importance of the energy savings achieved by hardware implementations of the IEEE standard for decimal floating point. To the best of the ...