Learn More
This paper presents a study of board-level interconnection requirements for highly parallel and massively parallel computing. Analytical models of the I/O bandwidth of popular interconnection networks have been developed and show that current electronic technologies are poor in supporting the necessary I/O density and bandwidth. Optical interconnects appear(More)
The interest in translation-based virtual execution environments (VEEs) is growing with the recognition of their importance in a variety of applications. However, due to constrained memory and energy resources, developing a VEE for an embedded system presents a number of challenges. In this paper we focus on the VEE’s memory overhead, and in particular, the(More)
Dynamic binary translators(DBTs) provide powerful platforms for building dynamic program monitoring and adaptation tools. DBTs, however, have high memory demands because they cache translated code and auxiliary code to a software code cache and must also maintain data structures to support the code cache. The high memory demands make it difficult for(More)
Dynamic binary translators (DBTs) are becoming increasingly important because of their power and flexibility. However, the high memory demands of DBTs present an obstacle for all platforms, and especially embedded systems. The memory demand is typically controlled by placing a limit on cached translations and forcing the DBT to flush all translations upon(More)
Emerging exascale architectures bring forth new challenges related to heterogeneous systems power, energy, cost, and resilience. These new challenges require a shift from conventional paradigms in understanding how to best exploit and optimize these features and limitations. Our objective is to identify the top few dominant characteristics in a set of(More)
A central tenet behind accelerators is to partition a program execution into regions with different behavior (e.g., SIMD, Irregular, Compute-Intensive) and then use behaviorspecialized architectures [1] for each region. It is unclear whether the gains in efficiency arise from recognizing that a simpler microarchitecture is sufficient for the acceleratable(More)
The need for adaptability in a rapidly expanding embedded systems market makes it important to design virtual execution environments (VEEs) specifically targeting embedded platforms. We believe the first step in this direction should be to replace the performance focus of traditional VEE design with a combined memory and performance focus, given the memory(More)
Chip power consumption has reached its limits, leading to the flattening of single-core performance. We propose the 10x10 processor, a federated heterogeneous multi-core architecture, where each core is an ensemble of u-engines (micro-engines, similar to accelerators) specialized for different workload groups to achieve dramatically higher energy(More)
The performance and energy-efficiency advantages of customized architectures are well-known and widely-pursued. To date there has been no systematic basis to balance benefit from specialization with general-purpose coverage, much less to assemble accelerators to collectively support a generalpurpose workload. Our study is a first step to create a systematic(More)
Dynamic binary translators (DBTs) are becoming increasingly important because of their power and flexibility. DBT-based services are valuable for all types of platforms. However, the high memory demands of DBTs present an obstacle for embedded systems. Most research on DBT design has a performance focus, which often drives up the DBT memory demand. In this(More)