The Lutonium: a sub-nanojoule asynchronous 8051 microcontroller

  • Alain J. Martin, Mika Nyström, Karl Papadantonakis, Paul I. Pénzes, Piyush Prakash, Catherine G. Wong, Jonathan Chang, Kevin S. Ko, Benjamin Lee, Elaine Ou, James Pugh, Eino-Ville Talvala, James T. Tong, Ahmet Tura
  • Published 12 May 2003
  • Computer Science
  • Ninth International Symposium on Asynchronous Circuits and Systems, 2003. Proceedings.
We describe the Lutonium, an asynchronous 8051 microcontroller designed for low Et². In 0.18 µm CMOS, at nominal 1.8 V, we expect a performance of 0.5 nJ per instruction at 200 MIPS. At 0.5 V, we expect 4 MIPS and 40 pJ/instruction, corresponding to 25,000 MIPS/Watt. We describe the structure of a fine-grain pipeline optimized for Et² efficiency, the implementation of some of the peripherals, and the advantages of an asynchronous implementation of a deep-sleep mechanism.
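The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that t in Et² can be approximated by the average time per instruction (the reciprocal of the instruction rate), which the abstract does not state explicitly.

```python
# Cross-check of the abstract's energy/throughput figures.
# Assumption (not from the paper): t is taken as the average time per
# instruction, i.e. 1 / (instruction rate).

def mips_per_watt(energy_per_instruction_j: float) -> float:
    """Instructions per joule, expressed in millions (MIPS per watt)."""
    return 1.0 / energy_per_instruction_j / 1e6


def power_w(energy_per_instruction_j: float, mips: float) -> float:
    """Average power in watts = energy per instruction * instructions per second."""
    return energy_per_instruction_j * mips * 1e6


def et2(energy_per_instruction_j: float, mips: float) -> float:
    """E * t^2 in joule-seconds squared, with t = 1 / instruction rate."""
    t = 1.0 / (mips * 1e6)
    return energy_per_instruction_j * t * t


# Operating points as quoted in the abstract: (energy per instruction in J, MIPS).
OPERATING_POINTS = {
    "1.8 V": (0.5e-9, 200.0),
    "0.5 V": (40e-12, 4.0),
}

if __name__ == "__main__":
    for supply, (energy, mips) in OPERATING_POINTS.items():
        print(
            f"{supply}: {mips_per_watt(energy):,.0f} MIPS/W, "
            f"{power_w(energy, mips):.3g} W, "
            f"Et^2 = {et2(energy, mips):.3g} J*s^2"
        )
```

At 40 pJ per instruction, one joule buys 2.5 × 10¹⁰ instructions, which is the 25,000 MIPS/Watt stated in the abstract.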


The design of a sub-nanojoule asynchronous 8051 with interface to external commercial memory
An asynchronous 8051 microcontroller with interface to external commercial memory consisting of an asynchronous core implemented using dual-rail four-phase protocol, a 128 byte internal asynchronous RAM and other synchronous peripherals including interrupts, timers and serial port is presented.
Three generations of asynchronous microprocessors
We trace the evolution of Caltech asynchronous processors from a simple proof of concept, to a high-performance MIPS-like processor using a different buffer circuit for better performance, to the
A pipelined asynchronous 8051 soft-core implemented with Balsa
A novel pipelined asynchronous 8051 microcontroller is proposed which is implemented with Balsa language which is a CSP-based asynchronous HDL, and synthesized into Xilinx netlist by Balsa synthesis tool.
A low-energy low-voltage asynchronous 8051 microcontroller core
This paper proposes a low-energy, low-voltage (1.1 V) 8051 microcontroller core using asynchronous logic based on Austria Micro Systems 0.35 µm technology for hearing instrument (hearing aid) applications and proposes a unique indirect data memory fetching method where an indirect data can be fetched in one memory request cycle.
An eight-bit divider implemented in asynchronous pulse logic
The results show that it is possible to design, with a high degree of automation, complex systems with a throughput of 10 CMOS transitions (less than 15 FO4 delays) per cycle.
A Low-Power Implementation of Asynchronous 8051 Employing Adaptive Pipeline Structure
A low-power implementation of the A8051 processor employs an adaptive pipeline structure that can skip a redundant stage operation and merge a stage with a neighboring empty stage, reducing power dissipation and improving performance.
Low Power Techniques Applied to a 80C51 Microcontroller for High Temperature Applications
It shows that a low-power, high-temperature 80C51 microcontroller can achieve good performance by using extensive clock and data gating and by completely redesigning the micro-architecture.
An ultra-low energy asynchronous processor for wireless sensor networks
The design flow used for an asynchronous 8-bit processor implementing the Atmel AVR instruction set architecture is described, to show dramatic reductions in power and energy with respect to the synchronous case, while retaining essentially a traditional design flow.
A semi-custom memory design for an asynchronous 8051 microcontroller
The A8051 with the proposed ROM and RAM design operates at a 28% higher MIPS rate, dissipates 20% less energy per instruction, has 50% lower Et², and occupies 19% less area than the A8051 with register-based memory.
A self-timed dual-rail processor core implementation for microcontrollers
A possible implementation model of processor core with dual-rail Muller pipeline, designed to meet the DI/QDI constraints, is demonstrated and a new pipeline model is proposed in this paper.


The design of an asynchronous MIPS R3000 microprocessor
The paper describes the structure of a high-performance asynchronous pipeline, in particular precise exceptions, pipelined caches, arithmetic, and registers, and the circuit techniques developed to achieve high throughput.
An asynchronous low-power 80C51 microcontroller
This paper presents a low-power asynchronous implementation of the 80C51 microcontroller. It was realized in a 0.5 µ CMOS process and it shows a power advantage of a factor 4 compared to a
The design of an asynchronous microprocessor
This is the first entirely asynchronous microprocessor ever built; the authors note that asynchronous techniques may influence computer architects in completely new ways that this first design is only starting to explore.
A Pipelined Asynchronous Cache System
This cache is designed as a distributed message passing system and implemented with full-custom quasi delay-insensitive circuits, and achieves timing robustness, low latency, and high average-case throughput by making minimal assumptions on signal delays.
Low-energy asynchronous memory design
  • J. Tierno, Alain J. Martin
  • Computer Science
    Proceedings of 1994 IEEE Symposium on Advanced Research in Asynchronous Circuits and Systems
  • 1994
We introduce the concept of energy per operation as a measure of performance of an asynchronous circuit. We show how to model energy consumption based on the high-level language specification. This
Energy-delay complexity of asynchronous circuits
A circuit-level theory of energy-delay complexity is developed for asynchronous circuits and the notion of minimum-energy function is developed and applied to the parallel and sequential composition of circuits in general and, in particular, to circuits optimized through transistor sizing and voltage scaling.
Et²: a metric for time and energy efficiency of computation
An efficiency metric for VLSI computation that includes energy is investigated and an approximation for Etⁿ (for arbitrary n) of an optimally sized system that can be computed without actually sizing the transistors is derived; it is proved that when multiple, adjustable supply voltages are allowed, the optimal Et² for the sequential composition of components is achieved when the supply voltage is adjusted so that the components consume equal power.
Synthesis of Asynchronous VLSI Circuits
This work proposes a concurrent programming approach to digital VLSI design, where a digital circuit is the implementation of a concurrent algorithm, and the circuit to be designed is first implemented as a concurrent program that fulfills the logical specification of the circuit.
Towards an energy complexity of computation
Proceedings Ninth International Symposium on Asynchronous Circuits and Systems
  • Computer Science
    Ninth International Symposium on Asynchronous Circuits and Systems, 2003. Proceedings.
  • 2003
The following topics are dealt with: asynchronous processors; pipeline design; synchronization; circuit analysis; interconnect methods; synthesis; and power management in security and signal