Chengmo Yang

Learn More
Ever increasing performance requirements have elevated deeply pipelined architectures to a standard even in the embedded processor domain, requiring the incorporation of dynamic branch prediction subsystems to hide the execution latency of control-altering instructions. In this paper a low power early branch identification technique which enables the design(More)
Hybrid cache architectures have been proposed to mitigate the increasing on-chip power dissipation through the exploitation of the emerging non-volatile memories (NVMs). To overcome the high energy and long latency associated with write operations of NVMs, a small SRAM is typically incorporated into the hybrid cache for accommodating write-intensive cache(More)
Advances in semiconductor technologies have placed MPSoCs center stage as a standard architecture for embedded applications of ever increasing complexity. Because of real-time constraints, applications are usually statically parallelized and scheduled onto the target MPSoC so as to obtain predictable worst-case performance. However, both technology scaling(More)
Multiprocessor system-on-chip (MPSoC) platforms face some of the most demanding security concerns, as they process, store, and communicate sensitive information using third-party intellectual property (3PIP) cores. The complexity of MPSoC makes it expensive and time consuming to fully analyze and test during the design stage. This has given rise to the(More)
NAND flash memory has been widely adopted in embedded systems as secondary storage. Yet the further development of flash memory strongly hinges on the tackling of its inherent implausible characteristics, including read and write speed asymmetry, inability of in-place update, and performance harmful erase operations. While Write Buffer Cache (WBC) has been(More)
Due to its low power consumption and high density, phase change memory (PCM) becomes a promising main-memory alternative to DRAM in embedded systems. PCM, however, has the endurance problem in which the number of rewrites to each cell is quite limited compared with DRAM. Therefore, it is fundamental to eliminate unnecessary writes in PCM-based embedded(More)
The ever scaling-down feature size and noise margin keep elevating hardware failure rates, requiring the incorporation of fault tolerance into computer systems. One fault tolerance scheme that receives a lot of research attention is redundant execution. However, existing solutions are developed under the assumption that the fault rate is low. These(More)
Phase change memory (PCM) has demonstrated great potential as an alternative of DRAM to serve as main memory due to its favorable characteristics of non-volatility, scalability and near-zero leakage power. However, the comparatively poor endurance of PCM largely limits its adoption. Wear leveling strategies targeting to even write distributions have been(More)
Phase change memory (PCM) is one of the most promising candidates to replace DRAM as main memory in deep submicron regime. Regardless of single-level or multiple-level cells, the programming costs to each state exhibit significant asymmetries in latency, energy and endurance. In this paper, we exploit the potential of reducing programming costs in terms of(More)