Graph Partitioning with Acyclicity Constraints
This work shows that this more constrained version of the graph partitioning problem is NP-complete and presents heuristics that achieve a close approximation of the optimal solution found by an exhaustive search for small problem instances and much better scalability for larger instances.
Evolutionary multi-level acyclic graph partitioning
This work engineers an evolutionary algorithm to further reduce the cut, achieving a 30% reduction on average compared to the state of the art, and shows that this can reduce the amount of communication for a real-world imaging application and thereby accelerate it by up to 5% on an embedded multiprocessor architecture.
A note on the extremes of a particular moving average count data model
In this note we present a study of the extremal properties of a particular moving average count data model introduced by McKenzie (1986) [Auto regressive-moving-average processes with negative…
LIME: a future-proof programming model for multi-cores
The Less Is More (LIME) programming model addresses known programmability, compositionality, predictability, and scalability problems related to parallel programming in embedded systems of new as…
Evolutionary Acyclic Graph Partitioning
This work contributes a multi-level algorithm for the acyclic graph partitioning problem and an evolutionary algorithm to further reduce communication cost, as well as to improve load balancing and the scheduling makespan on embedded multiprocessor architectures.
Automatic HAL generation for embedded multiprocessor systems
This work demonstrates how a Hardware Abstraction Layer (HAL) for device addresses and properties can be automatically generated from a formal system description while providing sufficient abstraction from hardware details.
Disciplined Multi-core Programming in C
This work proves that the API-less programming model LIME is effective in practice by porting a radio application to LIME and showing a significant decrease in code complexity with no significant increase in run-time overhead due to code generation.
Automatic Control Flow Generation for OpenVX Graphs
A new heuristic to reduce communication among PEs and to external memory by aggregating inter-process communication and pipelining image processing functions is presented, which can yield a reduction of up to 53% compared to other OpenVX implementations.
A novel approach to minimising the logic of combinatorial multiplexing circuits in product-term-based hardware
An optimisation algorithm based on simulated annealing is developed, which targets circuits implemented in a PT-based functional unit of a reconfigurable processor, and shows that a considerable reduction in logic can be achieved.
Compiling Applications for ConCISe: An Example of Automatic HW/SW Partitioning and Synthesis
In the ConCISe project, an embedded programmable processor is augmented with a Reconfigurable Functional Unit (RFU) based on Field-Programmable Logic (FPL), in a technique that aims at being…