Efficient Implementation of Ring-LWE Encryption on High-End IoT Platform

Zhe Liu, Reza Azarderakhsh, Howon Kim, Hwajeong Seo
The ARM NEON architecture holds a significant share of high-end Internet of Things platforms, such as mini computers, tablets, and smartphones, owing to its low cost and high performance. This paper studies efficient techniques for lattice-based cryptography on ARM processors and presents the first implementation of ring-LWE encryption on the ARM NEON architecture. We propose a vectorized version of the iterative Number Theoretic Transform (NTT) for high-speed computation and present a 32-bit variant…
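For readers unfamiliar with the iterative NTT mentioned in the abstract, the following is a minimal scalar sketch in Python of how it can be used to multiply two polynomials modulo x^n + 1, the core operation in ring-LWE encryption. It is not the paper's vectorized NEON code. The modulus 7681, the toy dimension 8, and all function names are assumptions chosen for illustration only.

```python
# Illustrative sketch only: scalar iterative NTT, not the paper's NEON implementation.
Q = 7681  # assumed prime modulus with Q ≡ 1 (mod 2N)
N = 8     # assumed toy dimension (power of two)


def find_psi(n, q):
    """Search for a primitive 2n-th root of unity modulo q (hypothetical helper)."""
    for g in range(2, q):
        psi = pow(g, (q - 1) // (2 * n), q)
        if pow(psi, n, q) == q - 1:  # psi^n = -1 implies order exactly 2n
            return psi
    raise ValueError("no primitive 2n-th root of unity found")


def bit_reverse_copy(a):
    """Return a copy of a with its elements in bit-reversed index order."""
    bits = (len(a) - 1).bit_length()
    return [a[int(format(i, f"0{bits}b")[::-1], 2)] for i in range(len(a))]


def ntt(a, omega, q):
    """Iterative Cooley-Tukey NTT; omega is a primitive len(a)-th root of unity."""
    n = len(a)
    a = bit_reverse_copy(a)
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, q)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % q
                a[start + k] = (u + v) % q         # butterfly: sum
                a[start + k + half] = (u - v) % q  # butterfly: difference
                w = w * w_len % q
        length *= 2
    return a


def negacyclic_mul(a, b, q=Q):
    """Multiply a and b modulo (x^n + 1, q) using the NTT."""
    n = len(a)
    psi = find_psi(n, q)
    omega = psi * psi % q
    psi_inv = pow(psi, -1, q)
    n_inv = pow(n, -1, q)
    # Pre-scale by powers of psi so that a cyclic transform yields a negacyclic product.
    a_hat = ntt([x * pow(psi, i, q) % q for i, x in enumerate(a)], omega, q)
    b_hat = ntt([x * pow(psi, i, q) % q for i, x in enumerate(b)], omega, q)
    c_hat = [x * y % q for x, y in zip(a_hat, b_hat)]
    c = ntt(c_hat, pow(omega, -1, q), q)  # inverse transform, up to the factor 1/n
    return [x * n_inv * pow(psi_inv, i, q) % q for i, x in enumerate(c)]


def schoolbook_negacyclic(a, b, q=Q):
    """Reference O(n^2) multiplication modulo (x^n + 1, q), used for checking."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q  # x^n wraps to -1
    return c
```

A vectorized implementation would process several butterflies of the inner loop at once in SIMD registers; the scalar loop above only shows the data flow.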
RLizard: Post-Quantum Key Encapsulation Mechanism for IoT Devices
The performance analysis showed that the RLizard KEM requires the fewest clock cycles for key generation, encapsulation, and decapsulation when the parameters are set to support a security level comparable with that of AES-128.


Efficient software implementation of ring-LWE encryption
This paper presents the new state of the art in efficient software implementations of a post-quantum secure public-key encryption scheme based on the ring-LWE problem using a 32-bit ARM Cortex-M4F microcontroller as the target platform and shows that the scheme beats ECC-based public-key encryption schemes by at least one order of magnitude.
Efficient Ring-LWE Encryption on 8-Bit AVR Processors
This paper presents a carefully optimized implementation of a ring-LWE encryption scheme for 8-bit AVR processors like the ATxmega128 that outperforms related RSA and ECC implementations by an order of magnitude.
Compact Ring-LWE Cryptoprocessor
This paper presents three optimizations for the Number Theoretic Transform (NTT) used for polynomial multiplication and proposes an optimization of the ring-LWE encryption system that reduces the number of NTT operations from five to four, resulting in a 20% speed-up.
Pseudo random number generator and Hash function for embedded microprocessors
This paper presents lightweight implementation techniques for an efficient pseudo-random number generator (PRNG) and hash function, adopting an AES-accelerator-based implementation to reduce memory consumption and improve performance.
Efficient Implementation of Bilinear Pairings on ARM Processors
This paper investigates the efficient computation of the Optimal-Ate pairing over Barreto-Naehrig curves in software at different security levels on ARM processors, exploiting state-of-the-art techniques and proposing new optimizations to speed up the computation in the tower field and curve arithmetic.
Implementing GCM on ARMv8
This work presents an optimized and timing-resistant implementation of GCM over AES-128 using instructions aimed to speed up binary polynomial multiplication, an operation which can be used to implement binary field multiplication.
Higher-Order Masking in Practice: A Vector Implementation of Masked AES for ARM NEON
A vector implementation of Coron et al.'s masking scheme (FSE 2012) for ARM NEON processors is developed, demonstrating that the performance penalty caused by the integration of higher-order masking is significantly lower than is generally assumed and reported in previous papers.
Implementation and Comparison of Lattice-based Identification Protocols on Smart Cards and Microcontrollers
This paper reports the implementation of several state-of-the-art and highly-secure lattice-based identification protocols on smart cards and microcontrollers, and shows that only a few of such protocols fit into the limitations of these devices.
Beyond ECDSA and RSA: Lattice-based digital signatures on constrained devices
This work presents an efficient implementation of BLISS, a recently proposed, post-quantum secure, and formally analyzed lattice-based signature scheme, which achieves 35.3 ms for signing and 6 ms for verification at a 128-bit security level on an ARM Cortex-M4F microcontroller.
Parallel Implementations of LEA
This paper proposes novel parallel LEA implementations on representative SIMT and SIMD architectures such as CUDA and NEON, taking advantage of both the desirable features of LEA and the parallel computing platforms and programming models of NEON and CUDA.