Ray C. C. Cheung

An automated static approach for optimizing bit widths of fixed-point feedforward designs with guaranteed accuracy, called MiniBit, is presented. Methods to minimize both the integer and fraction parts of fixed-point signals with the aim of minimizing the circuit area are described. For range analysis, the technique in this paper identifies the number of …
We present an automated methodology for producing hardware-based random number generator (RNG) designs for arbitrary distributions using the inverse cumulative distribution function (ICDF). The ICDF is evaluated via piecewise polynomial approximation with a hierarchical segmentation scheme that involves uniform segments and segments with size varying by …
This paper presents a method for producing hardware designs for elliptic curve cryptography (ECC) systems over the finite field GF(2^m), using the optimal normal basis for the representation of numbers. Our field multiplier design is based on a parallel architecture containing multiple m-bit serial multipliers; by changing the number of such serial …
This paper examines the hardware implementation trade-offs when evaluating functions via piecewise polynomial approximations and interpolations for precisions of up to 24 bits. In polynomial approximations, polynomials are evaluated using stored coefficients. Polynomial interpolations, however, require the coefficients to be computed on-the-fly by using …
We present the design and implementation of a Gaussian random number generator (GRNG) via hierarchical segmentation. Gaussian samples are generated using the inversion method, which involves the evaluation of the inverse Gaussian cumulative distribution function (IGCDF). The IGCDF is highly non-linear and is evaluated via piecewise polynomial approximations …
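The inversion method mentioned in this abstract can be illustrated in software with a minimal sketch. It assumes Python's standard-library `statistics.NormalDist.inv_cdf` as a stand-in for the IGCDF; the function name `gaussian_by_inversion` is hypothetical, and the sketch does not reproduce the paper's hierarchical segmentation or its hardware datapath.

```python
import random
from statistics import NormalDist


def gaussian_by_inversion(n, seed=0):
    """Draw n standard-normal samples by applying the inverse CDF to uniforms."""
    rng = random.Random(seed)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    samples = []
    for _ in range(n):
        u = rng.random()
        # inv_cdf needs 0 < u < 1; random() may return exactly 0.0, so redraw.
        while u == 0.0:
            u = rng.random()
        # Inversion step: map a uniform sample through the inverse Gaussian CDF.
        samples.append(std_normal.inv_cdf(u))
    return samples
```

A hardware design would replace the `inv_cdf` call with a table-driven polynomial evaluation; here the library routine only shows what function is being approximated.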
We present a flexible hardware architecture for precise gamma correction via piecewise linear polynomial approximations. Arbitrary gamma values, input bit widths, and output bit widths are supported. The gamma correction curve is segmented via a combination of uniform segments and segments whose sizes vary by powers of two. This segmentation method …
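As a rough illustration of piecewise linear gamma correction, the sketch below approximates `x ** exponent` on [0, 1] with one line per segment. It assumes uniform segments only and floating-point arithmetic; the names `build_gamma_table` and `gamma_correct` are hypothetical, and the power-of-two segment sizing described in the abstract is not modeled.

```python
def build_gamma_table(exponent, num_segments):
    """Return one (slope, intercept) pair per uniform segment of x**exponent on [0, 1]."""
    table = []
    for i in range(num_segments):
        x0 = i / num_segments
        x1 = (i + 1) / num_segments
        y0, y1 = x0 ** exponent, x1 ** exponent
        # Line through the two segment endpoints (chord interpolation).
        slope = (y1 - y0) / (x1 - x0)
        table.append((slope, y0 - slope * x0))
    return table


def gamma_correct(x, table):
    """Evaluate the piecewise linear approximation at x in [0, 1]."""
    n = len(table)
    # The segment index comes from the leading part of x; clamp so x == 1.0 is valid.
    idx = min(int(x * n), n - 1)
    slope, intercept = table[idx]
    return slope * x + intercept
```

For exponents below 1 the curve is steepest near zero, so with uniform segments the approximation error concentrates in the first segment. That is one motivation for mixing in smaller segments where the curve bends most.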
An architecture and implementation of a high-performance Gaussian random number generator (GRNG) is described. The GRNG uses the Ziggurat algorithm, which divides the area under the probability density function into three regions (rectangular, wedge and tail). The rejection method is then used and this amounts to determining whether a random point falls into …
Polynomial multiplication is the basic and most computationally intensive operation in ring-Learning With Errors (ring-LWE) encryption and “Somewhat” Homomorphic Encryption (SHE) cryptosystems. In this paper, the Fast Fourier Transform (FFT), with its linearithmic complexity of O(n log n), is exploited in the design of a high-speed polynomial multiplier. A …
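The idea of multiplying polynomials through the FFT can be sketched in a few lines. This is a plain floating-point, radix-2 recursive FFT over integer coefficients, written as an assumption-laden illustration: the names `fft` and `poly_multiply` are hypothetical, and the sketch performs no modular reduction, so it is not the ring arithmetic or the hardware multiplier of the paper.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def poly_multiply(a, b):
    """Multiply two non-empty integer coefficient lists (lowest degree first)."""
    result_len = len(a) + len(b) - 1
    size = 1
    while size < result_len:
        size *= 2
    # Zero-pad so the cyclic convolution equals the linear one.
    fa = fft([complex(x) for x in a] + [0j] * (size - len(a)))
    fb = fft([complex(x) for x in b] + [0j] * (size - len(b)))
    # Pointwise product in the frequency domain, then inverse transform.
    product = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # The inverse transform here is unscaled, so divide by size before rounding.
    return [round((v / size).real) for v in product[:result_len]]
```

For example, `poly_multiply([1, 2, 3], [4, 5])` multiplies 1 + 2x + 3x² by 4 + 5x. Floating-point rounding limits this approach to moderate coefficient sizes, which is one reason exact number-theoretic transforms are often preferred in cryptographic settings.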
This paper describes a novel hardware accelerator for Monte Carlo (MC) simulation, and illustrates its implementation in field programmable gate array (FPGA) technology for speeding up financial applications. Our accelerator is based on a generic architecture, which combines speed and flexibility by integrating a pipelined MC core with an on-chip instruction …