Tarek A. El-Ghazawi

Montgomery modular multiplication is one of the fundamental operations used in cryptographic algorithms, such as RSA and Elliptic Curve Cryptosystems. At CHES 1999, Tenca and Koç proposed the Multiple-Word Radix-2 Montgomery Multiplication (MWR2MM) algorithm and introduced a now-classic architecture for implementing Montgomery multiplication in hardware.
Several high-performance computers now use field-programmable gate arrays as reconfigurable coprocessors. The authors describe the two major contemporary HPRC architectures and explore the pros and cons of each using representative applications from remote sensing, molecular dynamics, bioinformatics, and cryptanalysis.
UPC extends ISO C into a Partitioned Global Address Space (PGAS) programming language. UPC allows programmers to exploit data locality and parallelism in their applications while maintaining ease of use. UPC runs on nearly all HPC platforms and has been gaining support from the community. UPC is relatively easy to use for ...
We designed hardware accelerators based on Xilinx XCV2000E FPGAs to speed up scalar multiplication on the elliptic curves recommended by NIST over two binary fields of the form GF(2^m), in polynomial basis representation. Linear-feedback shift registers (LFSRs) are exploited in the most-significant-digit-serial (MSD) multipliers in order to improve design efficiency. We ...
Summary form only given. Over the past decade, parallel programming paradigms have focused on how to harness the computational power of contemporary parallel machines. Ease of use and code development productivity have been secondary goals. Recently, however, there has been a growing interest in understanding the code development productivity issues and ...
Co-array Fortran (CAF) and Unified Parallel C (UPC) are two emerging languages for single-program, multiple-data global address space programming. These languages boost programmer productivity by providing shared variables for inter-process communication instead of message passing. However, the performance of these emerging languages still has room for ...
UPC, or Unified Parallel C, is a parallel extension of ANSI C. UPC is built around the distributed shared-memory programming model, with constructs that allow programmers to exploit memory locality by placing data close to the threads that manipulate them, minimizing remote accesses. Under the UPC memory sharing model, each thread owns a ...