Optimization And Profiling Of The Cache Performance Of Parallel Lattice Boltzmann Codes

When designing and implementing highly efficient scientific applications for parallel computers such as clusters of workstations, it is essential to consider and optimize the single-CPU performance of the codes. For this purpose, it is particularly important that the codes respect the hierarchical memory designs that computer architects employ in order…
