New Speed Records for Montgomery Modular Multiplication on 8-Bit AVR Microcontrollers

@inproceedings{Liu2014NewSR,
  title={New Speed Records for Montgomery Modular Multiplication on 8-Bit AVR Microcontrollers},
  author={Zhe Liu and Johann Gro{\ss}sch{\"a}dl},
  booktitle={AFRICACRYPT},
  year={2014}
}
Modular multiplication of large integers is a performance-critical arithmetic operation of many public-key cryptosystems such as RSA, DSA, Diffie-Hellman (DH) and their elliptic curve-based variants ECDSA and ECDH. The computational cost of modular multiplication and related operations (e.g. exponentiation) poses a practical challenge to the widespread deployment of public-key cryptography, especially on embedded devices equipped with 8-bit processors (smart cards, wireless sensor nodes, etc… 
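For orientation, the Montgomery modular multiplication named in the title can be sketched as follows. This is a minimal illustration on Python big integers, not the paper's AVR assembly implementation; the function names `montgomery_setup`, `mon_pro`, and `mod_mul` and the parameter `k` (with R = 2**k) are assumptions made for this example. It requires Python 3.8 or later for `pow(n, -1, r)`.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n and R = 2**k > n (hypothetical helper)."""
    if n % 2 == 0 or n >= (1 << k):
        raise ValueError("n must be odd and smaller than 2**k")
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime = -1 (mod R)
    r2 = (r * r) % n                # R^2 mod n, used to enter Montgomery form
    return r, n_prime, r2


def mon_pro(a_bar, b_bar, n, k, n_prime):
    """Montgomery product: returns a_bar * b_bar * R^-1 mod n for inputs below n."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n' mod R
    u = (t + m * n) >> k               # t + m*n is divisible by R, so the shift is exact
    return u - n if u >= n else u      # final conditional subtraction (input-dependent)


def mod_mul(a, b, n, k):
    """Compute a * b mod n via Montgomery form (illustrative only)."""
    r, n_prime, r2 = montgomery_setup(n, k)
    a_bar = mon_pro(a, r2, n, k, n_prime)         # a * R mod n
    b_bar = mon_pro(b, r2, n, k, n_prime)         # b * R mod n
    c_bar = mon_pro(a_bar, b_bar, n, k, n_prime)  # a * b * R mod n
    return mon_pro(c_bar, 1, n, k, n_prime)       # leave Montgomery form


if __name__ == "__main__":
    n = (1 << 61) - 1
    print(mod_mul(123456789, 987654321, n, 64) == (123456789 * 987654321) % n)
```

In practice the conversion into and out of Montgomery form is done once per exponentiation rather than once per multiplication, and an 8-bit implementation would process the operands byte by byte instead of relying on big-integer arithmetic.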
Efficient Ring-LWE Encryption on 8-Bit AVR Processors
TLDR
A carefully optimized implementation of a ring-LWE encryption scheme for 8-bit AVR processors like the ATxmega128 is presented; it outperforms related RSA and ECC implementations by an order of magnitude.
Reverse Product-Scanning Multiplication and Squaring on 8-Bit AVR Processors
High performance, small code size, and good scalability are important requirements for software implementations of multi-precision arithmetic algorithms to fit resource-limited embedded systems. In
Efficient arithmetic on ARM-NEON and its application for high-speed RSA implementation
TLDR
A novel Double Operand Scanning (DOS) method is introduced to speed up multi-precision squaring with non-redundant representations on SIMD architectures; it is compatible with separated Montgomery algorithms and highly efficient for the RSA cryptosystem.
Efficient modular exponential algorithms compatible with hardware implementation of public-key cryptography
TLDR
Bit forwarding (BFW) techniques that reduce the number of modular multiplications in hardware implementations of modular exponentiation are presented, resulting in increased throughput and decreased power consumption.
Multiprecision multiplication on AVR revisited
TLDR
This paper presents new speed records for multiprecision multiplication on the AVR ATmega family of 8-bit microcontrollers and shows that subquadratic-complexity Karatsuba multiplication is in fact faster than fully unrolled product-scanning multiplication already for surprisingly small inputs, starting at 48 bits.
A Synthesis of Multi-Precision Multiplication and Squaring Techniques for 8-Bit Sensor Nodes: State-of-the-Art Research and Future Challenges
TLDR
A survey of multi-precision multiplication and squaring techniques, with a special focus on comparing their performance and memory footprint on sensor nodes using 8-bit processors.
Study of Modular Multiplication Methods for Embedded Processors
TLDR
This study investigates Montgomery multiplication for public-key cryptography on embedded microprocessors; the reported results will become part of a reference book on advanced Montgomery multiplication methods for future researchers.
Bit Forwarding 3-Bits Technique for Efficient Modular Exponentiation
TLDR
The Bit Forwarding 3-bits (BFW3) technique for efficient implementation of modular exponentiation is presented and shown to reduce the frequency of multiplications by 18.20% for a 1024-bit exponent, resulting in increased throughput and reduced power consumption.
Low-Weight Primes for Lightweight Elliptic Curve Cryptography on 8-bit AVR Processors
TLDR
A special variant of Montgomery multiplication for Optimal Prime Fields (OPFs) is described that does not execute any input-dependent conditional statements and is, hence, resistant against certain side-channel attacks, improving the state of the art in lightweight ECC on 8-bit processors.
Area-Time Efficient Hardware Implementation of Modular Multiplication for Elliptic Curve Cryptography
In this paper, an area-time efficient hardware implementation of modular multiplication over five National Institute of Standards and Technology (NIST)-recommended prime fields is proposed for

References

Efficient and Side-Channel Resistant RSA Implementation for 8-bit AVR Microcontrollers
TLDR
A new variant of the hybrid method for multiple-precision multiplication that optimizes both memory accesses and register allocation and protects the RSA implementation against power analysis attacks via the integration of low-cost countermeasures is introduced.
Enabling Full-Size Public-Key Algorithms on 8-Bit Sensor Nodes
TLDR
This article presents the fastest known implementation of modular multiplication for a 160-bit standard-compliant elliptic curve (secp160r1) on 8-bit microcontrollers, which are typically used in WSNs, and presents an optimized arithmetic algorithm that significantly speeds up ECC schemes.
Exponentiation Cryptosystems on the IBM PC
  • P. Comba
  • Computer Science
    IBM Syst. J.
  • 1990
TLDR
A mixed system that combines the superior key management capabilities inherent in public key cryptosystems with the much higher bulk-encryption speed obtainable with the Data Encryption Algorithm is discussed.
Energy-Efficient Software Implementation of Long Integer Modular Arithmetic
TLDR
This paper investigates the performance and energy characteristics of software algorithms for long integer arithmetic, and shows that a combination of Karatsuba-Comba multiplication and Montgomery reduction achieves better performance than other algorithms for modular multiplication.
Multi-precision Multiplication for Public-Key Cryptography on Embedded Microprocessors
TLDR
This paper proposes a novel method, i.e., “consecutive operand caching”, which reduces the number of required load instructions by caching the operands and boosts the speed of multi-precision multiplication by 3.85%, as compared to previous best known results.
Efficient prime-field arithmetic for elliptic curve cryptography on wireless sensor nodes
  • Yang Zhang, J. Grossschadl
  • Computer Science, Mathematics
    Proceedings of 2011 International Conference on Computer Science and Network Technology
  • 2011
TLDR
A high-speed implementation of arithmetic in Optimal Prime Fields for the ATmega128, an 8-bit processor used in a number of sensor nodes including the MICAz mote, is described, along with an optimized variant of Montgomery multiplication, based on Gura et al.'s hybrid technique, that takes the low weight of such primes into account to minimize execution time.
Analyzing and comparing Montgomery multiplication algorithms
TLDR
The operations involved in computing the Montgomery product are studied, several high-speed, space-efficient algorithms for computing MonPro(a, b) are described, and their time and space requirements are analyzed.
Optimized Multi-Precision Multiplication for Public-Key Cryptography on Embedded Microprocessors
TLDR
A novel method, "Carry-Once", is proposed, which reduces the number of intermediate-result computations by the size of the result accumulation; it improves all multi-precision multiplication techniques that compute intermediate results and shows a speed improvement of up to 2.5% compared with the best known results.
Simple Power Analysis of Unified Code for ECC Double and Add
TLDR
It is shown that SPA attacks may still be possible on selected single point multiplications if there is sufficient side channel leakage at lower levels, and Montgomery modular multiplication (MMM) is assumed to give such leakage, but other modular multipliers may be equally susceptible to attack.
Comparing Elliptic Curve Cryptography and RSA on 8-bit CPUs
TLDR
To accelerate multiple-precision multiplication, a new algorithm that reduces the number of memory accesses is proposed, and elliptic curve point multiplication for 160-bit, 192-bit, and 224-bit NIST/SECG curves over GF(p), as well as RSA-1024 and RSA-2048, is implemented on two 8-bit microcontrollers.