Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization
@article{Yadav2022EfficientNN,
  title   = {Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization},
  author  = {Nishant Yadav and Nicholas Monath and Rico Angell and Manzil Zaheer and Andrew McCallum},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2210.12579}
}
Efficient k-nearest neighbor search is a fundamental task, foundational for many problems in NLP. When the similarity is measured by dot-product between dual-encoder vectors or ℓ2-distance, there already exist many scalable and efficient search methods. But not so when similarity is measured by more accurate and expensive black-box neural similarity models, such as cross-encoders, which jointly encode the query and candidate neighbor. The cross-encoders’ high computational cost typically limits…
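The core idea can be illustrated with a toy sketch (not the paper's exact algorithm): if cross-encoder scores for a query against a small set of anchor items suffice to recover an implicit query embedding, the remaining items can be ranked with cheap dot products. The low-rank "cross-encoder" score matrix and all names below are fabricated for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated low-rank score matrix, for illustration only: scores[i, j]
# stands in for the expensive cross-encoder score of query i against item j.
n_queries, n_items, rank = 8, 50, 4
query_true = rng.normal(size=(n_queries, rank))
item_emb = rng.normal(size=(n_items, rank))
scores = query_true @ item_emb.T

# Score each query against a few anchor items only (few expensive calls),
# then fit a query embedding by least squares against the anchor embeddings.
anchors = rng.choice(n_items, size=10, replace=False)
anchor_scores = scores[:, anchors]                        # (n_queries, 10)
query_fit, *_ = np.linalg.lstsq(item_emb[anchors], anchor_scores.T, rcond=None)
approx = query_fit.T @ item_emb.T                         # cheap dot-product search

top_true = scores.argmax(axis=1)
top_approx = approx.argmax(axis=1)
print((top_true == top_approx).all())
```

Because the fabricated score matrix is exactly low rank and the anchors span it, the recovery here is exact; real cross-encoder scores are only approximately low rank, which is where the approximation error of such methods comes from.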
References
Showing 1-10 of 50 references
Scalable Zero-shot Entity Linking with Dense Entity Retrieval
- Computer Science · EMNLP
- 2020
This paper introduces a simple and effective two-stage approach for zero-shot linking, based on fine-tuned BERT architectures, and shows that it performs well in the non-zero-shot setting, obtaining the state-of-the-art result on TACKBP-2010.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Computer Science · NAACL
- 2019
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Adam: A Method for Stochastic Optimization
- Computer Science · ICLR
- 2015
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
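The update rule summarized above can be written out directly; this is a minimal sketch of Adam applied to a one-dimensional quadratic, using the paper's standard hyperparameter defaults (the learning rate and step count here are arbitrary choices for the demo).

```python
import math

def adam(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimal Adam: adaptive estimates of first and second moments."""
    m = v = 0.0
    x = x0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # biased first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # biased second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = x**2, whose gradient is 2x; the minimizer is x = 0.
x_min = adam(lambda x: 2.0 * x, x0=5.0)
print(abs(x_min) < 0.5)
```

The bias-correction terms matter early on, when the exponential moving averages are still dominated by their zero initialization.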
CUR matrix decompositions for improved data analysis
- Computer Science · Proceedings of the National Academy of Sciences
- 2009
An algorithm is presented that preferentially chooses columns and rows that exhibit high “statistical leverage” and exert a disproportionately large “influence” on the best low-rank fit of the data matrix, obtaining improved relative-error and constant-factor approximation guarantees in worst-case analysis, as opposed to the much coarser additive-error guarantees of prior work.
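A CUR decomposition can be sketched in a few lines. The sampling below uses squared column/row norms as a crude stand-in for the statistical-leverage scores the paper advocates, and the test matrix is exactly low rank so the approximation is exact.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 30))   # exactly rank 3

# Sample columns and rows with probability proportional to squared norm
# (a simple proxy for the paper's statistical-leverage sampling).
col_p = (A ** 2).sum(axis=0); col_p /= col_p.sum()
row_p = (A ** 2).sum(axis=1); row_p /= row_p.sum()
cols = rng.choice(A.shape[1], size=8, replace=False, p=col_p)
rows = rng.choice(A.shape[0], size=8, replace=False, p=row_p)

C, R = A[:, cols], A[rows, :]                   # actual columns/rows of A
U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)   # link matrix
err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
print(err < 1e-8)
```

Unlike the SVD, the factors C and R here are actual columns and rows of the data matrix, which is what makes CUR interpretable for data analysis.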
Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
- Computer Science · NAACL
- 2021
This work presents a simple yet efficient data augmentation strategy called Augmented SBERT, where the cross-encoder is used to label a larger set of input pairs to augment the training data for the bi-encoder, and shows that selecting the sentence pairs in this process is non-trivial and crucial for the success of the method.
RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering
- Computer Science · NAACL
- 2021
This work proposes an optimized training approach, called RocketQA, to improve dense passage retrieval; it significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions, and demonstrates that end-to-end QA performance can be improved by building on the RocketQA retriever.
DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling
- Computer Science · Findings of EMNLP
- 2020
This work proposes DiPair — a novel framework for distilling fast and accurate models on text pair tasks that is both highly scalable and offers improved quality-speed tradeoffs.
Accelerating Large-Scale Inference with Anisotropic Vector Quantization
- Computer Science · ICML
- 2020
A family of anisotropic quantization loss functions is developed, leading to a new variant of vector quantization that penalizes the component of a datapoint's residual parallel to the datapoint more heavily than its orthogonal component.
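The parallel/orthogonal decomposition behind that loss is easy to make concrete. The weights `h_par` and `h_orth` below are illustrative placeholders, not the paper's derived values.

```python
import numpy as np

def anisotropic_loss(x, x_quant, h_par=4.0, h_orth=1.0):
    """Weighted quantization loss: the residual component parallel to the
    datapoint x is penalized more heavily than the orthogonal component."""
    r = x - x_quant
    x_unit = x / np.linalg.norm(x)
    r_par = (r @ x_unit) * x_unit     # residual component along x
    r_orth = r - r_par                # remaining orthogonal component
    return h_par * (r_par @ r_par) + h_orth * (r_orth @ r_orth)

x = np.array([1.0, 0.0])
loss_par = anisotropic_loss(x, np.array([0.8, 0.0]))   # error along x
loss_orth = anisotropic_loss(x, np.array([1.0, 0.2]))  # error orthogonal to x
print(loss_par > loss_orth)
```

Two quantization errors of equal Euclidean size thus incur different losses, reflecting that parallel error distorts dot products with the datapoint while orthogonal error does not.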
Zero-Shot Entity Linking by Reading Entity Descriptions
- Computer Science · ACL
- 2019
It is shown that strong reading-comprehension models pre-trained on large unlabeled data can generalize to unseen entities, and domain-adaptive pre-training (DAP) is proposed to address the domain shift associated with linking unseen entities in a new domain.
Billion-Scale Similarity Search with GPUs
- Computer Science · IEEE Transactions on Big Data
- 2021
This paper proposes a novel design for k-selection that enables the construction of high-accuracy, brute-force, approximate, and compressed-domain search based on product quantization, and applies it in different similarity search scenarios.
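Since that search builds on product quantization, a small sketch of PQ's lookup-table distance computation may help. The codebooks below are random rather than trained with k-means, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, k, n = 8, 4, 16, 100        # dim, sub-blocks, codewords per block, database size
codebooks = rng.normal(size=(m, k, d // m))
db = rng.normal(size=(n, d))

# Encode: replace each sub-block of each vector by its nearest codeword index.
blocks = db.reshape(n, m, d // m)
codes = np.array([
    np.argmin(((blocks[:, j, None, :] - codebooks[j]) ** 2).sum(-1), axis=1)
    for j in range(m)
]).T                               # (n, m) table of small integer codes

# Query: precompute one table of k squared distances per sub-block, then
# score every database vector with m lookups instead of d multiplications.
q = rng.normal(size=d).reshape(m, d // m)
tables = ((q[:, None, :] - codebooks) ** 2).sum(-1)     # (m, k)
approx_dists = tables[np.arange(m), codes].sum(axis=1)  # (n,)
print(approx_dists.shape)
```

The table lookup is exactly the squared distance from the query to each vector's quantized reconstruction, which is what makes PQ search both compressed and fast.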