High-radix systolic modular multiplication on reconfigurable hardware

@inproceedings{McIvor2005HighradixSM,
  title={High-radix systolic modular multiplication on reconfigurable hardware},
  author={Ciaran McIvor and M{\'a}ire O'Neill and John V. McCanny},
  booktitle={Proceedings. 2005 IEEE International Conference on Field-Programmable Technology, 2005.},
  year={2005},
  pages={13--18}
}
  • C. McIvor, M. O'Neill, J. McCanny
  • Published 11 December 2005
  • Computer Science, Mathematics
  • Proceedings. 2005 IEEE International Conference on Field-Programmable Technology, 2005.
The overall aim of the work presented in this paper has been to develop Montgomery modular multiplication architectures suitable for implementation on modern reconfigurable hardware. Accordingly, novel high-radix systolic array Montgomery multiplier designs are presented, as we believe that their inherent regular structure and absence of global interconnect make them well suited for implementation on modern FPGAs. Unlike previous approaches, each processing element (PE…
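
As a rough illustration of the arithmetic the abstract refers to, the following is a minimal software sketch of digit-serial, high-radix Montgomery multiplication. It is not the paper's systolic architecture: the function name `montgomery_multiply` and the parameters `k` (bits per digit) and `num_digits` are hypothetical names chosen for this example, and the sketch assumes an odd modulus `n`, operands below `n`, and `2**(k * num_digits) > n`.

```python
def montgomery_multiply(a, b, n, k, num_digits):
    """Return a * b * R^(-1) mod n, where R = 2**(k * num_digits).

    Assumptions (illustrative, not taken from the paper):
    n is odd, 0 <= a, b < n, and R > n.
    """
    radix = 1 << k
    mask = radix - 1
    # n_prime = -n^(-1) mod 2^k; the inverse exists because n is odd.
    # The three-argument pow with exponent -1 needs Python 3.8 or later.
    n_prime = (-pow(n, -1, radix)) % radix

    s = 0
    for i in range(num_digits):
        a_i = (a >> (k * i)) & mask          # i-th radix-2^k digit of a
        s += a_i * b                         # accumulate partial product
        q = ((s & mask) * n_prime) & mask    # quotient digit: makes s + q*n divisible by 2^k
        s = (s + q * n) >> k                 # exact division by the radix
    # With the assumptions above the loop leaves s < 2n,
    # so one conditional subtraction completes the reduction.
    if s >= n:
        s -= n
    return s


def to_montgomery(x, n, k, num_digits):
    """Map x to its Montgomery representation x * R mod n."""
    return (x << (k * num_digits)) % n
```

Multiplying two values in Montgomery form returns their product in Montgomery form, and multiplying by 1 converts back to the ordinary representation. Each loop iteration handles one radix-2^k digit, so a larger `k` means fewer iterations at the cost of wider digit products; the sketch does not model how a systolic array would distribute this work across processing elements.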

Montgomery Modular Multiplication on Reconfigurable Hardware: Systolic versus Multiplexed Implementation
TLDR
The systolic implementation can run the 1024-bit RSA decryption process in just 3.23 ms, and the multiplexed architecture executes the same operation in 4.36 ms; the second approach saves up to 28% of logical resources, and this performance is competitive with the state of the art.
Montgomery modular multiplication on reconfigurable hardware: Fully systolic array vs parallel implementation
TLDR
A comparison of two FPGA Montgomery modular multiplication architectures, a fully systolic array and a parallel implementation, in terms of time × area efficiency and performance in an RSA application.
High-Performance and Area-Efficient Hardware Design for Radix-2^k Montgomery Multipliers
TLDR
An improved design is proposed that is capable of carrying out the same computation in n clock cycles using an equivalent amount of hardware resources at a higher clock frequency, and that can reduce hardware resource utilization by up to 60% compared with previous architectures.
High Radix Montgomery Modular Multiplier on Modern FPGA
TLDR
This work proposes a new systolic architecture to perform high radix Montgomery algorithm on modern FPGA, which is rich in dedicated hardcore multiplier resources, and the new architecture is suitable to be used in public key coprocessors.
Generation of Finely-Pipelined GF(P) Multipliers for Flexible Curve Based Cryptography on FPGAs
TLDR
A tool, distributed as open source, for generating VHDL code from various parameters: operand width, number of logical multipliers per physical one, speed or area optimization, possible use of BRAMs, and target FPGA.
How to Maximize the Potential of FPGA-Based DSPs for Modular Exponentiation
TLDR
A modular exponentiation processing method and circuit architecture that can exhibit the maximum performance of FPGA resources and can perform fast operations using small-scale resources is described.
New Hardware Architectures for Montgomery Modular Multiplication Algorithm
TLDR
Two new hardware architectures that are able to perform the same operation in approximately n clock cycles with almost the same clock period are proposed, based on precomputing partial results using two possible assumptions regarding the most significant bit of the previous word.
Optimized Multiple Word Radix-2 Montgomery Multiplication Algorithm
TLDR
A modified, optimized algorithm for radix-2 Montgomery multiplication is presented, based on parallelizing the multiplications within each processing element and on pre-computing partial results using assumptions regarding the most significant bit of the previous design, thereby improving speed.
An Optimized Hardware Architecture for the Montgomery Multiplication Algorithm
TLDR
This paper proposes and discusses an optimized hardware architecture performing the same operation in approximately n clock cycles with almost the same clock period, and is only marginally more demanding in terms of the circuit area.
An efficient implementation of Montgomery powering ladder in reconfigurable hardware
TLDR
This paper describes an efficient architecture for modular exponentiation using the Montgomery Powering Ladder algorithm that can perform 1024-bit RSA decryption in 2.5 ms, and presents a countermeasure against SPA attacks.

References

SHOWING 1-10 OF 19 REFERENCES
High-Radix Montgomery Modular Exponentiation on Reconfigurable Hardware
TLDR
This contribution proposes arithmetic architectures which are optimized for modern field programmable gate arrays (FPGAs) that perform modular exponentiation with very long integers, at the heart of many practical public-key algorithms such as RSA and discrete logarithm schemes.
Modular Exponentiation on Reconfigurable Hardware
TLDR
It is shown that it is possible to implement modular exponentiation at secure bit lengths on a single commercially available FPGA and faster processing times are presented, more than ten times faster than any reported software implementation.
Montgomery modular exponentiation on reconfigurable hardware
  • Thomas Blum
  • Computer Science, Mathematics
    Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336)
  • 1999
TLDR
This contribution proposes arithmetic architectures which are optimized for modern field programmable gate arrays (FPGAs) and shows that it is possible to implement modular exponentiation at secure bit lengths on a single commercially available FPGA.
Montgomery modular-multiplication method and systolic arrays suitable for modular exponentiation
TLDR
This paper derives the general condition under which the size of the output need not be examined each time the Montgomery method is executed, and proposes two types of systolic arrays that execute the Montgomery method under that condition.
Systolic Modular Multiplication
  • C. D. Walter
  • Computer Science, Mathematics
    IEEE Trans. Computers
  • 1993
TLDR
A systolic array for modular multiplication is presented using the ideally suited algorithm of P.L. Montgomery (1985), where its main use would be where many consecutive multiplications are done, as in RSA cryptosystems.
New VLSI architectures of RSA public-key cryptosystem
  • P. A. Wang, W. Tsai, C. Shung
  • Computer Science
    Proceedings of 1997 IEEE International Symposium on Circuits and Systems. Circuits and Systems in the Information Age ISCAS '97
  • 1997
In this paper, we propose several new VLSI architectures to reduce the hardware complexity and to increase the computation speed of the RSA public-key cryptosystem. By applying LSB-first algorithm in
The Imagine Stream Processor
TLDR
The Imagine architecture and programming model are presented in the first half, and the scalability of the Imagine architecture is explored in the second half, to provide a scalable architecture that supports 48 ALUs on a single chip.
Modular multiplication without trial division
TLDR
A method for multiplying two integers modulo N while avoiding division by N, a representation of residue classes so as to speed modular multiplication without affecting the modular addition and subtraction algorithms.
Systolic modular exponentiation via Montgomery algorithm
TLDR
Using graph models, a pure systolic pipeline for modular exponentiation (as a whole) is designed and can be used to raise to any power via Montgomery multiplications and squarings.
The Softening of Hardware
In the 1940s, when modern computing began, engineers tended to view computers and the programs running on them as unified entities. Now, after decades in which software and hardware developed along