Per Larsson-Edefors

We introduce FlexCore, the first exemplar of an architecture based on the FlexSoC framework. Comprising the same datapath units found in a conventional five-stage pipeline, the FlexCore has an exposed datapath control and a flexible interconnect to allow the datapath to be dynamically reconfigured as a consequence of code generation. Additionally, the …
A twin-precision multiplier that uses reconfigurable power gating is presented. Employing power cutoff techniques in independently controlled power-gating regions yields significant static leakage reductions when half-precision multiplications are carried out. In comparison to a conventional 8-bit tree multiplier, the power overhead of a 16-bit …
We propose an accurate architecture-level power estimation method for SRAM memories. This hybrid method is composed of an analytical part for dynamic power estimation and a circuit-simulation backend used to obtain static leakage power values of all basic memory components. The method is flexible in that memory size is an arbitrary parameter. In a …
A novel partial-product reduction circuit for use in integer multiplication is presented. The High-Performance Multiplier (HPM) reduction tree has the ease of layout of a simple carry-save reduction … in the reduction tree. However, first we need to select a multiplier to start from, our model multiplier. In principle, we can select any of the three …
In this paper, we present a new technique which indirectly separates and extracts the total short-circuit power consumption of digital CMOS circuits. We avoid a direct encounter with the complex behavior of the short-circuit currents. Instead, we separate the dynamic power consumption from the total power and extract the total short-circuit power. The …
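The indirect extraction described above can be written as a simple power balance. This is only a sketch under textbook assumptions, not the paper's own formulation: the symbols $P_{\text{tot}}$, $P_{\text{dyn}}$, $P_{\text{sc}}$ and $P_{\text{leak}}$, the switching-power expression, and the explicit leakage term are all introduced here for illustration and do not appear in the abstract.

```latex
% Assumed decomposition of total power (leakage term is an assumption;
% it may be negligible or folded into another term in the original work)
P_{\text{tot}} = P_{\text{dyn}} + P_{\text{sc}} + P_{\text{leak}}

% Standard textbook switching power: activity factor \alpha,
% load capacitance C_L, supply voltage V_{DD}, clock frequency f
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f

% Short-circuit power obtained indirectly, by subtraction
P_{\text{sc}} = P_{\text{tot}} - P_{\text{dyn}} - P_{\text{leak}}
```

Under these assumptions the short-circuit currents never have to be modeled directly: only the total power and the capacitive switching power need to be known.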
The FlexSoC project aims at developing a design framework that makes it possible to combine the computational speed and energy-efficiency of specialized hardware accelerators with the flexibility of programmable processors. FlexSoC approaches this problem by defining a uniform programming interface across the heterogeneous structure of processing resources. …
We investigate the effects of introducing a flexible interconnect into an exposed datapath. We define an exposed datapath as a traditional GPP datapath that has its normal control removed, leading to the exposure of a wide control word. For an FFT benchmark, the introduction of a flexible interconnect reduces the total execution time by 16%. Compared to a …
The modified-Booth algorithm is extensively used for high-speed multiplier circuits. Once, when array multipliers were used, the reduced number of generated partial products significantly improved multiplier performance. In designs based on reduction trees with logarithmic logic depth, however, the reduced number of partial products has a limited impact …
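To make the "reduced number of generated partial products" concrete, here is a minimal software sketch of radix-4 (modified) Booth recoding. It illustrates the general algorithm only, not the circuits evaluated in the paper; the function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the multiplier `y` is assumed to fit in `n_bits` as a two's-complement value.

```python
def booth_radix4_digits(y: int, n_bits: int) -> list[int]:
    """Recode a signed multiplier into radix-4 Booth digits in {-2, -1, 0, 1, 2}.

    Assumes -2**(n_bits - 1) <= y < 2**(n_bits - 1).
    """
    if n_bits % 2:
        n_bits += 1  # sign-extend to an even width
    y_bits = y & ((1 << n_bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit y[-1] = 0
    for i in range(0, n_bits, 2):
        b0 = (y_bits >> i) & 1
        b1 = (y_bits >> (i + 1)) & 1
        # Each overlapping triplet (b1, b0, prev) maps to one signed digit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, n_bits: int) -> tuple[int, int]:
    """Return (x * y, number of partial products) using radix-4 Booth digits."""
    digits = booth_radix4_digits(y, n_bits)
    # One partial product per digit, weighted by 4**i.
    partial_products = [(d * x) << (2 * i) for i, d in enumerate(digits)]
    return sum(partial_products), len(partial_products)


product, count = booth_multiply(-13, 11, 8)
print(product, count)
```

For an 8-bit multiplier this produces four partial products instead of the eight a plain bit-by-bit scheme would generate. In an array multiplier that halves the number of adder rows, while in a logarithmic-depth reduction tree it removes only about one level, which is consistent with the limited impact the abstract describes.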