- James Demmel
- 1997

- Vasily Volkov, James Demmel
- SC
- 2008

We present performance results for dense linear algebra using recent NVIDIA GPUs. Our matrix-matrix multiply routine (GEMM) runs up to 60% faster than the vendor’s implementation and approaches the peak of hardware capabilities. Our LU, QR and Cholesky factorizations achieve up to 80–90% of the peak GEMM rate. Our parallel LU running on two GPUs achieves up…

- James Demmel, Stanley C. Eisenstat, John R. Gilbert, Xiaoye S. Li, Joseph W. H. Liu
- SIAM J. Matrix Analysis Applications
- 1999

We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. We introduce the notion of unsymmetric supernodes to perform most of the numerical computation in dense matrix kernels. We introduce unsymmetric supernode-panel updates and two-dimensional data partitioning to…

We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific…

The Optimized Sparse Kernel Interface (OSKI) is a collection of low-level primitives that provide automatically tuned computational kernels on sparse matrices, for use by solver libraries and applications. These kernels include sparse matrix-vector multiply and sparse triangular solve, among others. The primary aim of this interface is to hide the complex…

- James Demmel
- 1992

We show that Jacobi's method (with a proper stopping criterion) computes small eigenvalues of symmetric positive definite matrices with a uniformly better relative accuracy bound than QR, divide and conquer, traditional bisection, or any algorithm which first involves tridiagonalizing the matrix. In fact, modulo an assumption based on extensive numerical…

- James Demmel
- SIAM Review
- 1990

- Ed Anderson, Zhaojun Bai, +7 authors Danny C. Sorensen
- SC
- 1990

The goal of the LAPACK project is to design and implement a portable linear algebra library for efficient use on a variety of high-performance computers. The library is based on the widely used LINPACK and EISPACK packages for solving linear equations, eigenvalue problems, and linear least-squares problems, but extends their functionality in a number of…

- Sukun Kim, Shamim Pakzad, +4 authors Martin Turon
- 2007 6th International Symposium on Information…
- 2007

A Wireless Sensor Network (WSN) for Structural Health Monitoring (SHM) is designed, implemented, deployed and tested on the 4200ft long main span and the south tower of the Golden Gate Bridge (GGB). Ambient structural vibrations are reliably measured at a low cost and without interfering with the operation of the bridge. Requirements that SHM imposes on WSN…

- L. Susan Blackford, Jaeyoung Choi, +9 authors R. Clinton Whaley
- SC
- 1995

This paper outlines the content and performance of ScaLAPACK, a collection of mathematical software for linear algebra computations on distributed memory computers. The importance of developing standards for computational and message passing interfaces is discussed. We present the different components and building blocks of ScaLAPACK, and indicate the…