Low-Power, Resilient Interconnection with Orthogonal Latin Squares

Abstract

Over the years, microprocessors have continually grown in accordance with Moore’s law. System integration has reached the stage at which a complete system can be placed on a single chip. As SoC designs expand to encompass an ever-increasing number of cores and other IP blocks to meet high-performance needs, the interconnection shared by these resources has emerged as another challenging SoC design issue. Transient failures caused by crosstalk from other interconnects, electromagnetic interference, alpha particle hits, cosmic radiation, and so forth make on-chip interconnections more susceptible to errors, altering the behavior of the interconnection fabric and degrading signal integrity. Furthermore, low-swing signaling aggravates the reliability problem. Providing resilience against such transient failures is critical for proper system operation.

In addition to resilience, energy consumption is another major challenge facing multicore SoC design. The interconnection network accounts for a significant fraction of the total system power budget. For instance, the on-chip network of the Massachusetts Institute of Technology’s Raw processor consumes 36% of total chip power, and the Alpha 21364 microprocessor dissipates 20% of its total power in the interconnection network. Hence, an interconnection network must be designed to be power aware.

An error-correcting code (ECC) can protect the system from transient errors in the interconnection network by encoding the data stream so that such errors can be corrected. In addition, adopting ECC allows a link’s supply voltage to be reduced without compromising system reliability. ECC is therefore an elegant, effective, and technology-independent technique that can both make an on-chip interconnection resilient and reduce its energy consumption.

In this article, we propose using a multibit ECC called Orthogonal Latin Square Code (OLSC) for on-chip interconnection. We have developed several OLSC schemes, each offering a different error correction capability. Because OLSC belongs to the class of one-step majority-decodable codes, it can be decoded at exceptionally high speed. Moreover, because the parity check matrices are constructed in modular form, the decoder can also be implemented in modular form, with each additional module adding further error correction capability. This property also provides the opportunity and flexibility to change the error correction capability adaptively with little hardware overhead. Each interconnection path can therefore set its own reliability level by enabling or disabling ECC modules according to noise intensity, which is defined by link constraints such as voltage and distance.
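To make the modular construction and one-step majority decoding concrete, the following is a minimal software sketch, not the authors' hardware design. It assumes a 5 × 5 grid of data bits (the prime side length `M = 5` is chosen only for illustration) and builds the check groups from rows, columns, and the Latin squares `(k*i + j) mod M`. The function names `check_groups`, `parities`, `encode`, and `decode` are hypothetical. The parameter `t` stands in for the number of enabled modules: each increment adds two check groups and one more correctable error.

```python
from itertools import product

M = 5  # assumed grid side; prime, so the squares (k*i + j) % M are mutually orthogonal


def check_groups(t):
    """Return 2*t functions; each maps data cell (i, j) to a check index in 0..M-1."""
    assert 1 <= t <= (M + 1) // 2, "only M + 1 orthogonal groups exist"
    groups = [lambda i, j: i, lambda i, j: j]  # row checks, column checks
    for k in range(1, 2 * t - 1):              # Latin-square checks L_1 .. L_(2t-2)
        groups.append(lambda i, j, k=k: (k * i + j) % M)
    return groups


def parities(data, t):
    """Compute the 2*t*M parity bits for M*M data bits."""
    out = []
    for g in check_groups(t):
        p = [0] * M
        for i, j in product(range(M), repeat=2):
            p[g(i, j)] ^= data[i * M + j]
        out.extend(p)
    return out


def encode(data, t):
    """Systematic codeword: data bits followed by parity bits."""
    return list(data) + parities(data, t)


def decode(word, t):
    """One-step majority decoding: flip a data bit if more than t of its 2*t checks fail."""
    data, received = word[:M * M], word[M * M:]
    syndrome = [a ^ b for a, b in zip(parities(data, t), received)]
    groups = check_groups(t)
    corrected = []
    for i, j in product(range(M), repeat=2):
        votes = sum(syndrome[n * M + g(i, j)] for n, g in enumerate(groups))
        corrected.append(data[i * M + j] ^ int(votes > t))
    return corrected


# Usage: protect 25 bits against two errors, then corrupt one data bit and one parity bit.
data = [n % 2 for n in range(M * M)]
word = encode(data, t=2)
word[4] ^= 1
word[30] ^= 1
assert decode(word, t=2) == data
```

Every data bit is covered by exactly one check in each group, and any two data bits share at most one check. Up to `t` errors elsewhere can therefore fail at most `t` of the `2*t` checks on a correct bit, while an erroneous bit fails at least `t + 1` of them. This is why a single voting step per bit is enough, and why lowering `t` in this sketch corresponds to disabling check groups rather than redesigning the code.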

DOI: 10.1109/MDT.2011.35

Cite this paper

@article{Lee2011LowPowerRI, title={Low-Power, Resilient Interconnection with Orthogonal Latin Squares}, author={Seung Eun Lee and Yoon Seok Yang and Gwan S. Choi and Wei Wu and Ravi R. Iyer}, journal={IEEE Design & Test of Computers}, year={2011}, volume={28}, pages={30-39} }