Corpus ID: 245329838

Sublinear Time Approximation of Text Similarity Matrices

Archan Ray, Nicholas Monath, Andrew McCallum, Cameron Musco
We study algorithms for approximating pairwise similarity matrices that arise in natural language processing. Generally, computing a similarity matrix for n data points requires Ω(n²) similarity computations. This quadratic scaling is a significant bottleneck, especially when similarities are computed via expensive functions, e.g., via transformer models. Approximation methods reduce this quadratic complexity, often by using a small subset of exactly computed similarities to approximate the…
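To make the idea of approximating a full similarity matrix from a small subset of exact similarities concrete, here is a minimal sketch of the classic Nyström approximation. This is an assumption chosen for illustration: the visible part of the abstract does not say which method the paper uses, and classic Nyström is only well behaved for positive semidefinite similarity matrices. The names `nystrom_approx`, `sim`, and `landmark_idx`, and the toy dot-product similarity, are hypothetical stand-ins for an expensive similarity function such as a transformer model.

```python
import numpy as np


def nystrom_approx(similarity, n, landmark_idx, rcond=1e-10):
    """Approximate an n x n similarity matrix using n * s exact similarity calls.

    similarity(i, j) is a hypothetical, possibly expensive, pairwise function.
    landmark_idx holds the s sampled "landmark" points.
    """
    landmark_idx = np.asarray(landmark_idx)
    # C[i, k] = similarity(i, landmark k): the only exact evaluations made.
    C = np.array([[similarity(i, j) for j in landmark_idx] for i in range(n)])
    # W is the s x s block of similarities among the landmarks themselves.
    W = C[landmark_idx, :]
    # Classic Nystrom estimate: K is approximated by C W^+ C^T.
    return C @ np.linalg.pinv(W, rcond=rcond) @ C.T


# Toy data: a rank-5 dot-product similarity over 200 points (illustrative only).
rng = np.random.default_rng(0)
n, s = 200, 20
X = rng.standard_normal((n, 5))
K_exact = X @ X.T  # the full matrix, computed here only to check the sketch

calls = 0


def sim(i, j):
    """Stand-in for an expensive similarity; counts how often it is invoked."""
    global calls
    calls += 1
    return float(X[i] @ X[j])


landmarks = rng.choice(n, size=s, replace=False)
K_approx = nystrom_approx(sim, n, landmarks)

rel_err = np.linalg.norm(K_approx - K_exact) / np.linalg.norm(K_exact)
print(f"exact similarity calls: {calls} (full matrix would need {n * n})")
print(f"relative Frobenius error: {rel_err:.2e}")
```

In this toy setup the similarity matrix has low rank, so a few landmarks recover it almost exactly while making n·s rather than n² similarity calls. Real text similarity matrices need not be low rank or positive semidefinite, in which case this plain variant can be much less accurate.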


