The performance of linear-scaling electronic structure calculations depends critically on matrix sparsity. This article gives an overview of different strategies for removal of small matrix elements, with emphasis on schemes that allow for rigorous control of errors. In particular, a novel scheme is proposed that has significantly smaller computational…

Methods for the removal of small symmetric matrix elements based on the Euclidean norm of the error matrix are presented in this article. In large scale Hartree-Fock and Kohn-Sham calculations it is important to be able to enforce matrix sparsity while keeping errors under control. Truncation based on some unitary-invariant norm allows for control of errors…

Efficient truncation criteria used in multiatom blocked sparse matrix operations for ab initio calculations are proposed. As system size increases, so does the need to stay on top of errors and still achieve high performance. A variant of a blocked sparse matrix algebra to achieve strict error control with good performance is proposed. The presented idea is…

We propose Chunks and Tasks, a parallel programming model built on abstractions for both data and work. The application programmer specifies how data and work can be split into smaller pieces, chunks and tasks, respectively. The Chunks and Tasks library maps the chunks and tasks to physical resources. In this way we seek to combine user friendliness with…

Matrices appearing in Hartree–Fock or density functional theory coming from discretization with the help of atom-centered local basis sets become sparse when the separation between atoms exceeds some system-dependent threshold value. Efficient implementation of sparse matrix algebra is therefore essential in large-scale quantum calculations. We describe a…

We present a library for parallel block-sparse matrix-matrix multiplication on distributed memory clusters. By using a quadtree matrix representation, data locality is exploited without any prior information about the matrix sparsity pattern. A distributed quadtree matrix representation is straightforward to implement due to our recent development of the…

Purification and minimization methods for computation of the one-particle density matrix are compared. This is done by considering the work needed by each method to achieve a given accuracy in terms of the difference to the exact solution. Simulations employing orthogonal as well as non-orthogonal versions of the methods are performed using both element…

We investigate effects of ordering in blocked matrix–matrix multiplication. We find that sub-matrices do not have to be stored contiguously in memory to achieve near optimal performance. Instead it is the choice of execution order of the submatrix multiplications that leads to a speedup of up to four times for small block sizes. This is in contrast to…