Erik Brockmeyer

Learn More
Interestingly, data access and storage management for embedded programmable processors that you really wait for now is coming. It's significant to wait for the representative and beneficial books to read. Every book that is provided in better way and utterance will be expected by many peoples. Even you are a good reader or not, feeling to read this book(More)
Nearly all platforms use a multi-layer memory hierarchy to bridge the enormous latency gap between the large off-chip memories and local register files. However, most of previous work on HW or SW controlled techniques for layer assignment have been mainly focussed on performance. As a result, the intermediate layers have been assigned too large sizes(More)
We present a survey of the state-of-the-art techniques used in performing data and memory-related optimizations in embedded systems. The optimizations are targeted directly or indirectly at the memory subsystem, and impact one or more out of three important cost metrics: area, performance, and power dissipation of the resulting implementation. We first(More)
Loop fusion and loop shifting are important transformations for improving data locality to reduce the number of costly accesses to off-chip memories. Since exploring the exact platform mapping for all the loop transformation alternatives is a time consuming process, heuristics steered by improved data locality are generally used. However, pure locality(More)
automation (EDA) researchers have paid considerable attention to data memory issues because of the memory subsystem’s significance in determining such important design parameters as area, power, and performance. Designers have studied different approaches for the memory subsystem, ranging from the standard processor-memory hierarchy to fully customized(More)
In multimedia and other streaming applications, a significant portion of energy is spent on data transfers. Exploiting data reuse opportunities in the application, we can reduce this energy by making copies of frequently used data in a small local memory and replacing speed- and power-inefficient transfers from main off-chip memory by more efficient local(More)
This paper presents a tool for exploring different parallelization options for an application. It can be used to quickly find a high-quality match between an application and a multi-processor platform architecture. By specifying the parallelization at a high abstraction level, and leaving the actual source code transformations to the tool, a designer can(More)
In multimedia and other streaming applications a significant portion of energy is spent on data transfers. Exploiting data reuse opportunities in the application, we can reduce this energy by making copies of frequently used data in a small local memory and replacing speed and power inefficient transfers from main off-chip memory by more efficient local(More)
In an MP-SoC environment, a customized run-time management should be incorporated on top of the basic OS services to globally optimize costs (e.g. energy consumption) across all active applications, according to constraints (e.g. performance, user requirements) and available platform resources. To that end, we have proposed a Pareto-based approach combining(More)
Memory latency has always been a major issue in embedded systems that execute memory-intensive applications. This is even more true as the gap between processor and memory speed continues to grow. Hardware and software prefetching have been shown to be effective in tolerating the large memory latencies inherit in large off-chip memories; however, both types(More)