Tor Sørevik

Achieving an even load balance with a low communication overhead is a fundamental task in parallel computing. In this paper we consider the problem of partitioning an array into a number of blocks such that the maximum amount of work in any block is as low as possible. We review different proposed schemes for this problem and the complexity of their…
The problem of partitioning a sequence of n real numbers into p intervals is considered. The goal is to find a partition such that the cost of the most expensive interval measured with a cost function f is minimized. An efficient algorithm which solves the problem in time O(p(n − p) log p) is developed. The algorithm is based on finding a sequence of…
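To make the problem statement above concrete, here is a minimal sketch that finds an optimal partition by plain dynamic programming. It is not the O(p(n − p) log p) algorithm the abstract refers to; it runs in O(p·n²) time and is only meant to illustrate the objective. The function name `min_bottleneck_partition` is hypothetical, and the cost function f is assumed to be the sum of the numbers in an interval.

```python
from itertools import accumulate


def min_bottleneck_partition(values, p):
    """Split `values` into p contiguous, non-empty intervals so that the
    largest interval cost is as small as possible.

    Assumption: the cost f of an interval is the sum of its elements.
    Returns (bottleneck, intervals), where intervals is a list of
    half-open index pairs (start, end).
    """
    n = len(values)
    if not 1 <= p <= n:
        raise ValueError("need 1 <= p <= len(values)")

    prefix = [0, *accumulate(values)]  # prefix[j] = sum(values[:j])

    def cost(i, j):
        return prefix[j] - prefix[i]  # sum of values[i:j]

    inf = float("inf")
    # best[k][j]: smallest achievable bottleneck for values[:j] in k intervals
    best = [[inf] * (n + 1) for _ in range(p + 1)]
    cut = [[0] * (n + 1) for _ in range(p + 1)]
    best[0][0] = -inf  # no intervals over an empty prefix: neutral for max()

    for k in range(1, p + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # last interval is values[i:j]
                candidate = max(best[k - 1][i], cost(i, j))
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    cut[k][j] = i

    # Walk the stored cut points backwards to recover the intervals.
    intervals = []
    j = n
    for k in range(p, 0, -1):
        i = cut[k][j]
        intervals.append((i, j))
        j = i
    intervals.reverse()
    return best[p][n], intervals
```

For example, `min_bottleneck_partition([1, 2, 3, 4, 5, 6, 7, 8, 9], 3)` yields a bottleneck of 17, attained by the blocks `[1..5]`, `[6, 7]` and `[8, 9]`.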
We introduce the class of skew-circulant lattice rules. These are s-dimensional lattice rules that may be generated by the rows of an s×s skew-circulant matrix. (This is a minor variant of the familiar circulant matrix.) We present briefly some of the underlying theory of these matrices and rules. We are particularly interested in finding rules of specified…
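As a small illustration of the matrix type mentioned above, the sketch below builds a skew-circulant matrix from its first row. It assumes the common convention in which each row is the previous row shifted one place to the right, with the entry that wraps around changing sign; the paper may use a different but equivalent convention. The function name `skew_circulant` is hypothetical, and the sketch does not construct the lattice rule itself.

```python
def skew_circulant(first_row):
    """Return the s-by-s skew-circulant matrix with the given first row.

    Assumed convention: entry (i, j) is c[j - i] on and above the diagonal,
    and -c[s + j - i] below it, where c is the first row.
    """
    s = len(first_row)
    return [
        [first_row[j - i] if j >= i else -first_row[s + j - i] for j in range(s)]
        for i in range(s)
    ]


matrix = skew_circulant([1, 2, 3])
# matrix == [[1, 2, 3], [-3, 1, 2], [-2, -3, 1]]
```

An ordinary circulant matrix differs only in that the wrapped entries keep their sign.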
Many problems have multiple layers of parallelism. The outer level may consist of few and coarse-grained tasks. Next, each of these tasks may also be rich in parallelism, and be split into a number of fine-grained tasks, which again may consist of even finer subtasks, and so on. Here we argue and demonstrate by examples that utilizing multiple layers of…
In this paper we discuss the use of nested parallelism. Our claim is that if the problem naturally possesses multiple levels of parallelism, then applying parallelism to all levels may significantly enhance the scalability of your algorithm. This claim is sustained by numerical experiments. We also discuss how to implement multi-level parallelism using…
We describe a spectral method for the direct numerical calculation of the time-dependent Schrödinger equation described in hyperspherical coordinates. The method is based on the split-step technique where the wavefunction is expanded in the appropriate eigenfunctions for the partial operators, making the time integration efficient, accurate and simple. The…
In this paper we describe some of the salient features of our search program for finding good lattices. The reciprocals of these lattices are used in lattice integration rules, of which number theoretic rules form a major subset. We describe algorithms for ρ(Λ), the Zaremba index (or figure of merit) of an integer lattice Λ. We describe a search algorithm…
In this paper we describe how to apply fine-grained parallelism to augmenting path algorithms for the dense linear assignment problem. We prove by doing that the technique we suggest can be efficiently implemented on commercially available, massively parallel computers. Using n processors, our method reduces the computational complexity from the sequential O(n…