The LINPACK Benchmark: past, present and future

@article{Dongarra2003TheLB,
  title={The LINPACK Benchmark: past, present and future},
  author={J. Dongarra and P. Luszczek and A. Petitet},
  journal={Concurrency and Computation: Practice and Experience},
  year={2003},
  volume={15}
}
This paper describes the LINPACK Benchmark and some of its variations commonly used to assess the performance of computer systems. Aside from the LINPACK Benchmark suite, the TOP500 and the HPL codes are presented. The latter is frequently used to obtain results for TOP500 submissions. Information is also given on how to interpret the results of the benchmark and how the results fit into the performance evaluation process. Copyright © 2003 John Wiley & Sons, Ltd.
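For context, the benchmark's figure of merit comes from timing the solution of a dense n×n linear system Ax = b and dividing the standard operation count 2n³/3 + 2n² by the elapsed time. The following is a toy pure-Python sketch of that measurement, not HPL itself; `solve_dense` and the problem size are illustrative choices:

```python
import random
import time

def solve_dense(a, b):
    """Gaussian elimination with partial pivoting on copies of (a, b)."""
    n = len(b)
    a = [row[:] for row in a]
    x = b[:]
    for k in range(n):
        # partial pivoting: bring the largest |a[i][k]| up to row k
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        x[k], x[p] = x[p], x[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            x[i] -= m * x[k]
    for k in range(n - 1, -1, -1):  # back substitution
        s = sum(a[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (x[k] - s) / a[k][k]
    return x

n = 100
random.seed(0)
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
b = [random.uniform(-1.0, 1.0) for _ in range(n)]

t0 = time.perf_counter()
x = solve_dense(a, b)
elapsed = time.perf_counter() - t0

# operation count conventionally credited by the benchmark
flops = 2.0 * n**3 / 3.0 + 2.0 * n**2
print(f"n={n}  time={elapsed:.4f}s  rate={flops / elapsed / 1e6:.1f} MFLOPS")

# sanity check: the residual of the computed solution should be tiny
residual = max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i])
               for i in range(n))
print("max residual:", residual)
```

A real HPL run does the same thing at scale: a tuned, distributed LU factorization, the same 2n³/3 + 2n² flop credit, and a scaled residual check to validate the answer.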
Complex version of high performance computing LINPACK benchmark (HPL)
TLDR
The results show that the modified HPL software brings a significant increase in the performance of the solver when simulating the highest resolution experiments thus far configured, achieving 87.5 TFLOPS on over 20 000 processors on the Cray XT4.
A Few of the Most Popular Tools for Evaluating Supercomputers
  • Gang Xie, Ya-lin Zhang
  • Computer Science
  • 2018 17th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES)
  • 2018
TLDR
The purpose, significance and method of benchmarking supercomputers, the state of the art, and a few of the mainstream benchmarks for supercomputer evaluation are discussed.
Implementation of the Mixed-Precision High Performance LINPACK Benchmark on the CELL Processor
TLDR
This paper describes in detail the implementation of code to solve a linear system of equations using Gaussian elimination in single precision, with iterative refinement of the solution to full double precision accuracy.
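The scheme described above (factor in fast low precision, then refine the solution to full double precision) can be sketched as follows. This is an illustrative sketch, not the CELL implementation: `to_f32` simulates single-precision arithmetic, and the test matrix and iteration count are assumptions.

```python
import random
import struct

def to_f32(v):
    """Round a Python float to IEEE single precision (simulated float32)."""
    return struct.unpack('f', struct.pack('f', v))[0]

def lu_factor_f32(a):
    """LU with partial pivoting, all arithmetic rounded to single precision."""
    n = len(a)
    lu = [[to_f32(v) for v in row] for row in a]
    piv = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] = to_f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = to_f32(lu[i][j] - to_f32(lu[i][k] * lu[k][j]))
    return lu, piv

def lu_solve_f32(lu, piv, b):
    """Forward/back substitution using the single-precision factors."""
    n = len(b)
    x = [to_f32(b[p]) for p in piv]
    for k in range(n):              # L has unit diagonal
        for j in range(k):
            x[k] = to_f32(x[k] - to_f32(lu[k][j] * x[j]))
    for k in range(n - 1, -1, -1):
        for j in range(k + 1, n):
            x[k] = to_f32(x[k] - to_f32(lu[k][j] * x[j]))
        x[k] = to_f32(x[k] / lu[k][k])
    return x

def refine(a, b, iters=5):
    """Factor once in single precision; refine the solution in double."""
    lu, piv = lu_factor_f32(a)
    x = [float(v) for v in lu_solve_f32(lu, piv, b)]
    n = len(b)
    for _ in range(iters):
        # residual computed in full double precision
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        d = lu_solve_f32(lu, piv, r)  # cheap correction from the f32 factors
        x = [x[i] + d[i] for i in range(n)]
    return x

random.seed(1)
n = 30
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
for i in range(n):
    a[i][i] += n  # diagonally dominant, so the toy problem is well conditioned
b = [random.uniform(-1.0, 1.0) for _ in range(n)]
x = refine(a, b)
res = max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))
print("max residual after refinement:", res)
```

The payoff on hardware like the CELL is that the O(n³) factorization runs at the (much higher) single-precision rate, while the O(n²) refinement steps restore double-precision accuracy for well-conditioned systems.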
Benchmarking and Analysis of High Productivity Computing Systems (HPCS)
TLDR
The overall objective of this effort was to survey a number of DoD-related applications in order to ascertain their needs: what metrics exist, and what metrics still need to be developed.
Measuring the performance of parallel computers with distributed memory
TLDR
The results obtained via the de facto standard LINPACK benchmark suite are shown to be only weakly related to the efficiency of applied parallel programs, and the models and methods proposed by V. M. Glushkov in the late 1970s become topical again.
The LAPACK for clusters project: an example of self adapting numerical software
TLDR
The context, design, and recent development of the LAPACK for clusters (LFC) project, which has been developed in the framework of self-adapting numerical software (SANS), are described.
Self-adapting software for numerical linear algebra and LAPACK for clusters
TLDR
The context, design, and recent development of the LAPACK for clusters (LFC) project, developed in the framework of Self-Adapting Numerical Software (SANS), are described; the authors believe such an approach can deliver the convenience and ease of use of existing sequential environments bundled with the power and versatility of highly tuned parallel codes that execute on clusters.
The LINPACK Benchmark on a Multi-Core Multi-FPGA System
TLDR
Results show that, when using small sets of data, one FPGA can provide a speedup of 1.94 over a high-end processor running the LINPACK Benchmark with Level 1 BLAS; however, there is still opportunity to do better, especially when scaling to larger systems.
Self-Adapting Software for Numerical Linear Algebra Library Routines on Clusters
TLDR
The context, design, and recent development of the LAPACK for Clusters (LFC) project, developed in the framework of Self-Adapting Numerical Software (SANS), are described; the authors argue this approach can deliver the convenience and ease of use of existing sequential environments bundled with the power and versatility of highly tuned parallel codes that execute on clusters.

References

Showing 1-10 of 121 references
LINPACK user's guide
J. J. Dongarra, J. R. Bunch, C. B. Moler and G. W. Stewart, LINPACK Users' Guide, Society for Industrial and Applied Mathematics (SIAM), 1979.
Implementation of the BLAS level 3 and LINPACK Benchmark on the AP1000
TLDR
An implementation of Level 3 of the Basic Linear Algebra Subprogram (BLAS-3) library and the LINPACK Benchmark on the Fujitsu AP1000 is described, which achieves 85-90% of the AP1000's theoretical peak speed for the BLAS Level 3 procedures and up to 80% for the LINPACK benchmark.
New trends in high performance computing
TLDR
The automatically tuned linear algebra software (ATLAS) project is described, along with the fundamental principles that underlie it, with the present emphasis on the basic linear algebra subprograms (BLAS), a widely used, performance-critical linear algebra kernel library.
Performance of various computers using standard linear equations software
TLDR
This report compares the performance of different computer systems in solving dense systems of linear equations, ranging from a CRAY Y-MP to scientific workstations such as the Apollo and Sun to IBM PCs.
High Performance Software on Intel Pentium Pro Processors or Micro-Ops to TeraFLOPS
TLDR
An account of the efforts to obtain the world's first TeraFLOPS MP LINPACK run (on the Intel ASCI Option Red supercomputer), based on Pentium Pro processor technology, and of the optimization strategies used to achieve high performance on scientific applications.
Basic Linear Algebra Subprograms for Fortran Usage
TLDR
A package of 38 low-level subprograms for many of the basic operations of numerical linear algebra is presented, intended to be used with FORTRAN.
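As a reminder of what these Level 1 subprograms compute, here is a sketch of two of them, AXPY and DOT, on plain Python lists; the real routines operate on strided Fortran arrays with increment arguments:

```python
def daxpy(alpha, x, y):
    """y := alpha*x + y, the BLAS AXPY operation (here on plain lists)."""
    for i in range(len(x)):
        y[i] += alpha * x[i]
    return y

def ddot(x, y):
    """Inner product x^T y, the BLAS DOT operation."""
    return sum(a * b for a, b in zip(x, y))

y = daxpy(2.0, [1.0, 2.0, 3.0], [10.0, 10.0, 10.0])
print(y)                              # [12.0, 14.0, 16.0]
print(ddot([1.0, 2.0], [3.0, 4.0]))   # 11.0
```

The original LINPACK benchmark spends most of its time in exactly such Level 1 kernels, which is why later benchmark variants moved to Level 3 operations with better cache reuse.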
Parallel implementation of BLAS: general techniques for Level 3 BLAS
TLDR
It is shown that the techniques used for matrix-matrix multiplication extend naturally to all important Level 3 BLAS, making this approach an enabling technology for efficient parallel implementations of these routines and of the libraries that use them.
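The core idea behind efficient Level 3 BLAS, operating on cache-sized blocks rather than single elements, can be sketched serially as follows; `nb` is an assumed block size, and real implementations additionally pack panels and hand-tune the inner kernel:

```python
def blocked_matmul(a, b, nb=32):
    """C := A*B computed block by block. Each (ii, kk, jj) panel is a small
    matrix-matrix product that a tuned BLAS would keep resident in cache."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, nb):
        for kk in range(0, m, nb):
            for jj in range(0, p, nb):
                # multiply the A(ii,kk) block into the C(ii,jj) block
                for i in range(ii, min(ii + nb, n)):
                    for k in range(kk, min(kk + nb, m)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + nb, p)):
                            c[i][j] += aik * b[k][j]
    return c

c = blocked_matmul([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]], nb=1)
print(c)
```

Because each block is touched O(n/nb) times while staying in fast memory, the flop-to-memory-traffic ratio grows with `nb`; this is the same reuse argument that makes Level 3 BLAS (and HPL on top of them) approach peak speed.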
Sparse matrix calculations on the CRAY-2
TLDR
The authors tested kernels and codes on a CRAY-2 prior to the delivery of a machine to Harwell in 1987, and report results on the solution of sparse equations which indicate that high efficiency can be obtained.
The Multicomputer Toolbox approach to concurrent BLAS and LACS
TLDR
There is limited leverage in LACS per se as a stand-alone message-passing standard, and it is proposed that the needed capabilities instead be integrated in a general, application-level message-passing standard, focusing attention on CBLAS and large-scale application needs.
Parallel LU Decomposition on a Transputer Network
TLDR
A general Cartesian data distribution scheme is presented which contains many of the existing distribution schemes as special cases and is used to prove optimality of load balance for the grid distribution.
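The grid (block-cyclic) style of distribution discussed in several of these references can be sketched as a simple ownership map; the P×Q grid shape and block indices below are illustrative:

```python
def owner(i_blk, j_blk, P, Q):
    """Process grid coordinates owning matrix block (i_blk, j_blk) under a
    2-D block-cyclic distribution over a P x Q process grid, the layout
    family used by libraries such as ScaLAPACK and HPL."""
    return (i_blk % P, j_blk % Q)

# ownership of the blocks of a 4x4 block matrix on a 2x2 process grid
for i in range(4):
    print([owner(i, j, 2, 2) for j in range(4)])
```

Cycling blocks over the grid keeps the load balanced as the factorization shrinks the active submatrix: every process keeps a share of the remaining work at every step, which is the optimality property these distribution papers analyze.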