Emanuel H. Rubensson

The performance of linear-scaling electronic structure calculations depends critically on matrix sparsity. This article gives an overview of different strategies for removal of small matrix elements, with emphasis on schemes that allow for rigorous control of errors. In particular, a novel scheme is proposed that has significantly smaller computational …
We propose Chunks and Tasks, a parallel programming model built on abstractions for both data and work. The application programmer specifies how data and work can be split into smaller pieces, chunks and tasks, respectively. The Chunks and Tasks library maps the chunks and tasks to physical resources. In this way we seek to combine user friendliness with …
Efficient truncation criteria used in multiatom blocked sparse matrix operations for ab initio calculations are proposed. As system size increases, so does the need to keep errors under control while still achieving high performance. A variant of a blocked sparse matrix algebra to achieve strict error control with good performance is proposed. The presented idea is …
Methods for the removal of small symmetric matrix elements based on the Euclidean norm of the error matrix are presented in this article. In large-scale Hartree-Fock and Kohn-Sham calculations it is important to be able to enforce matrix sparsity while keeping errors under control. Truncation based on some unitary-invariant norm allows for control of errors …
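To illustrate the general idea of truncation with a bound on the error matrix, here is a minimal sketch in Python. It is not the Euclidean-norm method of the article: it uses the simpler Frobenius norm, which is never smaller than the Euclidean (spectral) norm, so the bound it enforces is more conservative. The function name `truncate_symmetric` and the threshold `tau` are hypothetical names chosen for this example.

```python
import math


def truncate_symmetric(a, tau):
    """Zero small entries of a symmetric matrix (list of lists) so that the
    Frobenius norm of the removed part stays at or below tau.

    Assumption: this is a greedy Frobenius-norm sketch, not the
    Euclidean-norm scheme described in the abstract above.
    Returns (truncated_matrix, frobenius_norm_of_error).
    """
    n = len(a)
    candidates = []
    for i in range(n):
        for j in range(i, n):
            if a[i][j] != 0.0:
                # An off-diagonal entry appears twice in a symmetric matrix,
                # so removing it contributes twice to the squared error.
                weight = a[i][j] ** 2 * (1 if i == j else 2)
                candidates.append((weight, i, j))
    candidates.sort()  # smallest contributions first

    out = [list(row) for row in a]
    removed_sq = 0.0
    for weight, i, j in candidates:
        if removed_sq + weight > tau * tau:
            break  # removing more would exceed the error budget
        removed_sq += weight
        out[i][j] = 0.0
        out[j][i] = 0.0  # keep the result symmetric
    return out, math.sqrt(removed_sq)


a = [[1.0, 0.01, 0.002],
     [0.01, 2.0, 0.3],
     [0.002, 0.3, 3.0]]
truncated, err = truncate_symmetric(a, tau=0.02)
print(truncated, err)
```

Removing the smallest contributions first tends to zero as many entries as the error budget allows, and handling each off-diagonal pair together keeps the truncated matrix symmetric.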
Matrices arising in Hartree-Fock or density functional theory from discretization with atom-centered local basis sets become sparse when the separation between atoms exceeds some system-dependent threshold value. Efficient implementation of sparse matrix algebra is therefore essential in large-scale quantum calculations. We describe a …
We present a library for parallel block-sparse matrix-matrix multiplication on distributed memory clusters. By using a quadtree matrix representation, data locality is exploited without any prior information about the matrix sparsity pattern. A distributed quadtree matrix representation is straightforward to implement due to our recent development of the …
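The quadtree idea can be sketched serially in Python as follows. This is only an illustration under stated assumptions, not the library's distributed implementation or its API: the names `build`, `add`, `multiply`, `to_dense`, and the leaf size `LEAF` are hypothetical, and the matrix dimension is assumed to be a power of two. An all-zero block is stored as `None`, so the recursion skips it without knowing the sparsity pattern in advance.

```python
LEAF = 2  # hypothetical leaf block size


def build(dense, r0, c0, size):
    """Quadtree for the size x size block at (r0, c0); None marks a zero block."""
    if all(dense[r0 + i][c0 + j] == 0.0
           for i in range(size) for j in range(size)):
        return None
    if size <= LEAF:
        return [[dense[r0 + i][c0 + j] for j in range(size)]
                for i in range(size)]
    h = size // 2
    return (build(dense, r0, c0, h), build(dense, r0, c0 + h, h),
            build(dense, r0 + h, c0, h), build(dense, r0 + h, c0 + h, h))


def add(x, y):
    """Sum of two quadtrees of equal size."""
    if x is None:
        return y
    if y is None:
        return x
    if isinstance(x, tuple):
        return tuple(add(p, q) for p, q in zip(x, y))
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def multiply(x, y):
    """Product of two quadtrees of equal size; zero blocks are skipped."""
    if x is None or y is None:
        return None
    if isinstance(x, tuple):
        a11, a12, a21, a22 = x
        b11, b12, b21, b22 = y
        result = (add(multiply(a11, b11), multiply(a12, b21)),
                  add(multiply(a11, b12), multiply(a12, b22)),
                  add(multiply(a21, b11), multiply(a22, b21)),
                  add(multiply(a21, b12), multiply(a22, b22)))
        return result if any(c is not None for c in result) else None
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def to_dense(node, size):
    """Expand a quadtree back into a dense list of lists."""
    if node is None:
        return [[0.0] * size for _ in range(size)]
    if isinstance(node, tuple):
        h = size // 2
        b = [to_dense(child, h) for child in node]
        top = [r1 + r2 for r1, r2 in zip(b[0], b[1])]
        bottom = [r1 + r2 for r1, r2 in zip(b[2], b[3])]
        return top + bottom
    return [list(row) for row in node]


A = [[1.0, 2.0, 0.0, 0.0], [3.0, 4.0, 0.0, 0.0],
     [0.0, 0.0, 5.0, 0.0], [0.0, 0.0, 0.0, 6.0]]
B = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0],
     [2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
C = to_dense(multiply(build(A, 0, 0, 4), build(B, 0, 0, 4)), 4)
print(C)
```

Because each node owns a contiguous sub-block, the recursion works on one quadrant at a time, which is one way to picture how such a representation can exploit locality.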
We investigate effects of ordering in blocked matrix-matrix multiplication. We find that submatrices do not have to be stored contiguously in memory to achieve near-optimal performance. Instead it is the choice of execution order of the submatrix multiplications that leads to a speedup of up to four times for small block sizes. This is in contrast to …
A hierarchic sparse matrix data structure for Hartree-Fock/Kohn-Sham calculations is presented. The data structure makes the implementation of matrix manipulations needed for large systems faster, easier, and more maintainable without loss of performance. Algorithms for symmetric matrix square and inverse Cholesky decomposition within the hierarchic …