Analytic performance models are essential for understanding the performance characteristics of loop kernels, which consume a major part of CPU cycles in computational science. Starting from a validated performance model one can infer the relevant hardware bottlenecks and promising optimization opportunities. Unfortunately, analytic performance modeling is …
In this paper we present our findings from parallelizing a material science application which simulates dendritic growth in molten metal alloys. The simulation itself is based on an iterative 2D meshfree model. The simulation cells are tightly coupled and depend on neighbors in a relatively large radius, so the code turned out to be communication bound. We …
Achieving optimal program performance requires deep insight into the interaction between hardware and software. For software developers without an in-depth background in computer architecture, understanding and fully utilizing modern architectures is close to impossible. Analytic loop performance modeling is a useful way to understand the relevant …