Corpus ID: 244799272

Dynamic Sparse Tensor Algebra Compilation

@article{Chou2021DynamicST,
  title={Dynamic Sparse Tensor Algebra Compilation},
  author={Stephen Chou and Saman P. Amarasinghe},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.01394}
}
This paper shows how to generate efficient tensor algebra code that computes on dynamic sparse tensors, which have sparsity structures that evolve over time. We propose a language for precisely specifying recursive, pointer-based data structures, and we show how this language can express a wide range of dynamic data structures that support efficient modification, such as linked lists, binary search trees, and B-trees. We then describe how, given high-level specifications of such data structures… 
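The pointer-based structures the abstract alludes to can be pictured with a small hand-written example. The C++ sketch below stores a dynamic sparse vector as a sorted singly linked list of (coordinate, value) nodes, so new nonzeros can be inserted without shifting an array; the Node layout and the insert helper are illustrative assumptions, not the paper's specification language.

// Minimal sketch (an assumption for illustration, not the paper's notation):
// a dynamic sparse vector stored as a sorted singly linked list of
// (coordinate, value) nodes, so inserting a nonzero needs no array shifting.
#include <cstdio>

struct Node {
  int    crd;   // coordinate of the nonzero
  double val;   // stored value
  Node*  next;  // next nonzero, in increasing coordinate order
};

// Insert a nonzero, or accumulate into an existing one, keeping coordinates sorted.
Node* insert(Node* head, int crd, double val) {
  Node** link = &head;
  while (*link && (*link)->crd < crd) link = &(*link)->next;
  if (*link && (*link)->crd == crd) { (*link)->val += val; return head; }
  *link = new Node{crd, val, *link};
  return head;
}

int main() {
  Node* v = nullptr;
  v = insert(v, 7, 1.0);
  v = insert(v, 2, 3.0);
  v = insert(v, 7, 0.5);  // accumulates into the existing entry at coordinate 7
  for (Node* n = v; n; n = n->next) std::printf("v[%d] = %g\n", n->crd, n->val);
  while (v) { Node* next = v->next; delete v; v = next; }  // release the list
}

The binary search trees and B-trees mentioned in the abstract play the same role, trading the list's linear-time search for logarithmic-time insertion and lookup.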

References

SHOWING 1-10 OF 58 REFERENCES
Tensor Algebra Compilation with Workspaces
TLDR
The results show that the workspace transformation brings the performance of these kernels on par with hand-optimized implementations, and enables generating sparse matrix multiplication and MTTKRP with sparse output, neither of which were supported by prior tensor algebra compilers.
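As context for the workspace idea, the following hand-written sketch computes a sparse matrix product C = A * B with all matrices in CSR, using a dense temporary row (the workspace) to accumulate scattered partial products before compressing them into the output. The CSR struct and the field names pos, crd, and vals are assumptions for illustration, not code from the cited paper.

// Sketch (illustrative assumption): Gustavson-style sparse matrix multiply
// C = A * B in CSR, accumulating each output row in a dense workspace.
#include <algorithm>
#include <vector>

struct CSR {
  int rows = 0, cols = 0;
  std::vector<int>    pos;   // row pointers, size rows + 1
  std::vector<int>    crd;   // column indices of the nonzeros
  std::vector<double> vals;  // nonzero values
};

CSR spgemm(const CSR& A, const CSR& B) {
  CSR C;
  C.rows = A.rows;
  C.cols = B.cols;
  C.pos.assign(A.rows + 1, 0);
  std::vector<double> w(B.cols, 0.0);      // dense workspace for one output row
  std::vector<bool>   occupied(B.cols, false);
  std::vector<int>    nzs;                 // columns touched in the current row
  for (int i = 0; i < A.rows; ++i) {
    nzs.clear();
    for (int pA = A.pos[i]; pA < A.pos[i + 1]; ++pA) {
      int k = A.crd[pA];
      for (int pB = B.pos[k]; pB < B.pos[k + 1]; ++pB) {
        int j = B.crd[pB];
        if (!occupied[j]) { occupied[j] = true; nzs.push_back(j); }
        w[j] += A.vals[pA] * B.vals[pB];   // scatter into the workspace
      }
    }
    std::sort(nzs.begin(), nzs.end());
    for (int j : nzs) {                    // gather the workspace into sparse C
      C.crd.push_back(j);
      C.vals.push_back(w[j]);
      w[j] = 0.0;
      occupied[j] = false;
    }
    C.pos[i + 1] = static_cast<int>(C.crd.size());
  }
  return C;
}

The workspace trades memory proportional to the number of columns for the ability to accumulate a sparse output without repeatedly merging sorted index lists.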
Compilation of sparse array programming models
TLDR
This paper shows how to compile sparse array programming languages using a compiler strategy that generalizes prior work in the literature on sparse tensor algebra compilation to support any function applied to sparse arrays, instead of only addition and multiplication.
Format abstraction for sparse tensor algebra compilers
TLDR
An interface that describes formats in terms of their capabilities and properties is developed; a modular code generator design makes it simple to add support for new tensor formats, and the performance of the generated code is competitive with hand-optimized implementations.
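The capability-based view of formats can be sketched, under the assumption of a per-dimension "level" interface, as two level types that expose the same minimal iteration API. This is an illustrative simplification, not the interface defined in the cited paper.

// Sketch (illustrative assumption): each tensor dimension is a "level" that
// exposes how to iterate the coordinates stored under one parent position.
#include <utility>
#include <vector>

// A dense level stores every coordinate 0..size-1 implicitly.
struct DenseLevel {
  int size;
  std::pair<int, int> posBounds(int parentPos) const {   // [begin, end)
    return {parentPos * size, (parentPos + 1) * size};
  }
  int coord(int p) const { return p % size; }
};

// A compressed level stores only the coordinates that are actually present.
struct CompressedLevel {
  std::vector<int> pos;   // segment boundaries, one segment per parent position
  std::vector<int> crd;   // stored coordinates
  std::pair<int, int> posBounds(int parentPos) const {
    return {pos[parentPos], pos[parentPos + 1]};
  }
  int coord(int p) const { return crd[p]; }
};

Composing a dense row level with a compressed column level gives a CSR-like matrix; making both levels compressed gives a DCSR-like format, which is how a small set of level types can cover many concrete formats.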
A Relational Approach to the Automatic Generation of Sequential Sparse Matrix Codes
TLDR
This thesis presents techniques for automatically generating sparse codes from dense matrix algorithms through a process called sparse compilation, and discusses the Bernoulli Sparse Compiler, which provides a novel mechanism that allows the user to extend its repertoire of sparse matrix storage formats.
Data-Parallel Language for Correct and Efficient Sparse Matrix Codes
TLDR
LL, a small functional language suitable for implementing operations on sparse matrices, is presented, along with a compiler that translates LL programs into efficient, parallel C code via a straightforward, syntax-directed translation.
The tensor algebra compiler
TLDR
The first compiler technique to automatically generate kernels for any compound tensor algebra operation on dense and sparse tensors is introduced, which is competitive with best-in-class hand-optimized kernels in popular libraries, while supporting far more tensor operations.
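To make the kind of kernel such a compiler emits concrete, here is a hand-written sparse matrix-vector multiply y = A*x with A stored in CSR; the loop shape is representative, but the code is an illustration, not actual compiler output.

// Sketch (illustrative, not actual compiler output): y = A * x with A in CSR.
// The outer loop walks the dense row dimension; the inner loop walks the
// compressed column coordinates stored for that row.
#include <vector>

void spmv_csr(int rows,
              const std::vector<int>& pos,      // row pointers, size rows + 1
              const std::vector<int>& crd,      // column indices of nonzeros
              const std::vector<double>& vals,  // nonzero values
              const std::vector<double>& x,     // dense input vector
              std::vector<double>& y) {         // dense output, size rows
  for (int i = 0; i < rows; ++i) {
    double sum = 0.0;
    for (int p = pos[i]; p < pos[i + 1]; ++p) {
      sum += vals[p] * x[crd[p]];
    }
    y[i] = sum;
  }
}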
Efficient and scalable computations with sparse tensors
TLDR
This paper describes new sparse tensor storage formats that provide storage benefits and are flexible and efficient for performing tensor computations and proposes an optimization that improves data reuse and reduces redundant or unnecessary computations in tensor decomposition algorithms.
SPLATT: Efficient and Parallel Sparse Tensor-Matrix Multiplication
Multi-dimensional arrays, or tensors, are increasingly found in fields such as signal processing and recommender systems. Real-world tensors can be enormous in size and often very sparse. …
Relational Algebraic Techniques for the Synthesis of Sparse Matrix Programs
TLDR
This dissertation presents a relational algebraic model for automatically generating efficient sparse codes from dense matrix codes and specifications of sparse matrix formats, and it presents experimental data demonstrating that the compiler-generated code achieves performance competitive with hand-written implementations of important computational kernels.
Hornet: An Efficient Data Structure for Dynamic Sparse Graphs and Matrices on GPUs
TLDR
Hornet is a novel data representation that targets dynamic data problems; it scales with the input size and requires no data re-allocation or re-initialization as the data evolves.