HIPAcc: A Domain-Specific Language and Compiler for Image Processing
TLDR
It is shown that domain knowledge can be captured in the language and that this knowledge enables us to generate tailored implementations for a given target architecture, which outperform state-of-the-art domain-specific languages and libraries significantly.
A highly parameterizable parallel processor array architecture
TLDR
A new class of highly parameterizable coarse-grained reconfigurable architectures, called weakly programmable processor arrays, is discussed, covering their support for partial and differential reconfiguration and a systematic classification of their architectural parameters.
Invasive Tightly-Coupled Processor Arrays
TLDR
This work introduces a novel class of massively parallel processor architectures called invasive Tightly-Coupled Processor Arrays (TCPAs) and presents a seamless mapping flow for TCPAs, based on a domain-specific language, and outlines a complete symbolic mapping approach.
Power Density-Aware Resource Management for Heterogeneous Tiled Multicores
TLDR
This paper presents a resource management technique that introduces power density as a novel system-level constraint and adapts this constraint at runtime according to the characteristics of the executed applications and to workload changes.
PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications
TLDR
The PARO design tool for the automated hardware synthesis of massively parallel embedded architectures for given dataflow-dominant applications is presented, and advanced partitioning techniques are applied to balance the trade-offs between cost and performance while meeting the required throughput.
Generating FPGA-based image processing accelerators with Hipacc: (Invited paper)
TLDR
It is shown that domain knowledge can be captured to generate tailored implementations for C-based HLS from a common high-level DSL description targeting FPGAs, and the resulting hardware accelerators are compared with GPU implementations generated from exactly the same DSL source code.
Frameworks for Multi-core Architectures: A Comprehensive Evaluation Using 2D/3D Image Registration
TLDR
Five frameworks for parallelization on shared-memory multi-core architectures are presented, namely OpenMP, Cilk++, Threading Building Blocks, RapidMind, and OpenCL, and a real-world application from medical imaging, 2D/3D image registration, is investigated.
Code generation from a domain-specific language for C-based HLS of hardware accelerators
TLDR
This work proposes code generation techniques for C-based HLS from a common high-level DSL description targeting FPGAs, and compares the achieved energy efficiency with that of software implementations generated by HIPAcc from the same code base and executed on GPUs.
ExaStencils: Advanced Stencil-Code Engineering
TLDR
The ExaStencils approach will enable highly automated code generation at all layers and has been demonstrated successfully before in the U.S. projects FFTW and SPIRAL for certain linear transforms.