# Performance Optimization for the K-Nearest Neighbors Kernel using Strassen’s Algorithm

@inproceedings{Rice2017PerformanceOF, title={Performance Optimization for the K-Nearest Neighbors Kernel using Strassen’s Algorithm}, author={Leslie Rice}, year={2017} }

Strassen’s algorithm computes a matrix-matrix product, with each operand partitioned into 2×2 blocks, using only 7 submatrix multiplications rather than the usual 8. Recent advances have shown the benefit of using Strassen’s algorithm to improve the performance of general matrix-matrix multiplication (GEMM) for matrices of varying shapes and sizes. These advances have created an opportunity to incorporate Strassen’s algorithm into other operations that resemble matrix-matrix multiplication. In this paper, we do so for the GSKNN (General…
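The seven-multiplication scheme mentioned in the abstract can be illustrated with a minimal sketch. This is the textbook recursive formulation on square list-of-lists matrices, not the paper's optimized kernel; the function names (`strassen`, `mat_mul_naive`, `split_blocks`) and the `cutoff` parameter are assumptions introduced here for illustration.

```python
def mat_add(X, Y):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    """Element-wise difference of two equally sized matrices."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_mul_naive(X, Y):
    """Classical triple-loop product, used as the recursion base case."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def split_blocks(M):
    """Partition a square matrix of even order into four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Multiply square matrices A and B (same order) with Strassen's scheme.

    Falls back to the classical product when the order is at most `cutoff`
    or is odd (this sketch does not pad odd-sized matrices).
    """
    n = len(A)
    if n <= cutoff or n % 2:
        return mat_mul_naive(A, B)

    A11, A12, A21, A22 = split_blocks(A)
    B11, B12, B21, B22 = split_blocks(B)

    # Seven recursive products instead of the classical eight.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22), cutoff)
    M2 = strassen(mat_add(A21, A22), B11, cutoff)
    M3 = strassen(A11, mat_sub(B12, B22), cutoff)
    M4 = strassen(A22, mat_sub(B21, B11), cutoff)
    M5 = strassen(mat_add(A11, A12), B22, cutoff)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12), cutoff)
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22), cutoff)

    # Recombine the products into the four quadrants of C = A * B.
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)

    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

Calling `strassen(A, B, cutoff=1)` on two 4×4 integer matrices exercises two levels of recursion and should agree with `mat_mul_naive(A, B)`. In practice the cutoff is set much higher, since the extra additions outweigh the saved multiplication for small blocks.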

## References

Performance optimization for the k-nearest neighbors kernel on x86 architectures

- Computer Science; SC15: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2015

This work proposes an efficient implementation of the kNN kernel on x86 architectures, analyzes the algorithm, explains parameter selection, and reports significant speedups when searching for 16 neighbors in a dataset of 1.6 million points in 64 dimensions.

Strassen's Algorithm Reloaded

- Computer Science; SC16: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2016

The practical implementation of Strassen's algorithm for matrix-matrix multiplication (DGEMM) requires no workspace beyond buffers already incorporated into conventional high-performance DGEMM implementations and can be plug-compatible with the standard DGEMM interface.

Gaussian elimination is not optimal

- Mathematics
- 1969

Below we will give an algorithm which computes the coefficients of the product of two square matrices A and B of order n from the coefficients of A and B with less than $4.7 \cdot n^{\log 7}$ arithmetical…

A set of level 3 basic linear algebra subprograms

- Computer Science; TOMS
- 1990

This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations that should provide for efficient and portable…

Generating Families of Practical Fast Matrix Multiplication Algorithms

- Computer Science; 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- 2017

This study shows that Strassen-like fast matrix multiplication can be incorporated into libraries for practical use and demonstrates a performance benefit over conventional GEMM on single core and multi-core systems.

Anatomy of high-performance matrix multiplication

- Computer Science; TOMS
- 2008

We present the basic principles that underlie the high-performance implementation of the matrix-matrix multiplication that is part of the widely used GotoBLAS library. Design decisions are justified…

BLIS: A Framework for Rapidly Instantiating BLAS Functionality

- Computer Science; ACM Trans. Math. Softw.
- 2015

Preliminary performance of level-2 and level-3 operations is observed to be competitive with two mature open source libraries (OpenBLAS and ATLAS) as well as an established commercial product (Intel MKL).

A framework for practical parallel fast matrix multiplication

- Computer Science; Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
- 2015

It is shown that novel fast matrix multiplication algorithms can significantly outperform vendor implementations of the classical algorithm and Strassen's fast algorithm on modest problem sizes and shapes, and that the best choice of fast algorithm depends not only on the size of the matrices but also on their shape.
