Energy-efficient sorting using solid state disks
TLDR
Using a low-power processor, solid state disks, and efficient algorithms, this work beats the current records in the JouleSort benchmark for 10 GB to 1 TB of data by factors of up to 5.1.
Building a parallel pipelined external memory algorithm library
TLDR
STXXL library provides a framework for external memory algorithms with an easy-to-use interface for large and fast hard disks, but the clock speed of processors cannot keep up with the increasing bandwidth of parallel disks.
A CUDA fast multipole method with highly efficient M2L far field evaluation
TLDR
This work presents a CUDA-accelerated C++ FMM implementation for multi-particle systems with a 1/r potential, as found, e.g., in biomolecular simulations, and proposes, implements, and benchmarks three different M2L parallelization approaches.
Status of the Undulator Systems for the European X-ray Free Electron Laser
For the European X-ray Free Electron Laser (XFEL.EU) three undulator systems with a net magnetic length of 455 meters are planned, employing 91 undulator segments each 5m long. They are gap variable
I/O-efficient approximation of graph diameters by parallel cluster growing — A first experimental study
TLDR
An implementation of Meyer's proposed parametrized algorithm, which computes an approximation of the graph diameter with fewer I/Os than an exact BFS traversal of the graph requires, is presented, and it is confirmed that there are graph classes where the parametrized approach runs into bad approximation ratios, just as the theoretical analysis in (Meyer, 2008) suggests.
On Computational Models for Flash Memory Devices
TLDR
A broad range of existing external-memory algorithms and data structures based on the merging paradigm can be adapted efficiently into the unit-cost model, and the theoretical analysis of algorithms on these models corresponds to the empirical behavior of algorithms when using solid-state disks as external memory.
A structural analysis of the A5/1 state transition graph
TLDR
Efficient algorithms to analyze the cycle structure of the graph induced by the state transition function of the A5/1 stream cipher used in GSM mobile phones are described and structural results for the full graph are presented for the first time.
The Large Scale European XFEL Control System: Overview and Status of the Commissioning
The European XFEL is a 3.4 km long X-ray Free Electron Laser in the final construction and commissioning phase in Hamburg. It will produce 27000 bunches per second at 17.5 GeV. Early 2015 a first
Accelerating an FMM-Based Coulomb Solver with GPUs
TLDR
This work reports on the parallelization of those operators that have been implemented for a GPU cluster to speed up the FMM calculations of Coulomb interactions.
Eventify: Event-Based Task Parallelism for Strong Scaling
TLDR
This work introduces event-based task parallelism to solve the performance and programmability issues of algorithms that exhibit fine-grained task parallelism and contain repetitive task patterns, and shows how these event lists are processed by a task engine that reuses user-defined, algorithmic data structures.