Word embeddings are ubiquitous in NLP and information retrieval, but it’s unclear what they represent when the word is polysemous, i.e., has multiple senses. Here it is shown that multiple word senses reside in linear superposition within the word embedding and can be recovered by simple sparse coding. The success of the method, which applies to several ...
Semantic word embeddings represent the meaning of a word via a vector, and are created by diverse methods. Many use nonlinear operations on co-occurrence statistics, and have hand-tuned hyperparameters and reweighting methods. This paper proposes a new generative model, a dynamic version of the log-linear topic model of Mnih and Hinton (2007). The ...
Recent studies have shown that accounting for intraspecific trait variation (ITV) may better address major questions in community ecology. However, a general picture of the relative extent of ITV compared to interspecific trait variation in plant communities is still missing. Here, we conducted a meta-analysis of the relative extent of ITV within and among ...
A central problem in ranking is to design a ranking measure for evaluation of ranking functions. In this paper we study, from a theoretical perspective, the widely used Normalized Discounted Cumulative Gain (NDCG)-type ranking measures. Although there are extensive empirical studies of NDCG, little is known about its theoretical properties. We first show ...
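For readers unfamiliar with the measure named in this abstract, a minimal sketch of one common NDCG variant follows. It assumes the exponential gain 2^rel - 1 and the logarithmic discount 1/log2(rank + 1), which is a widespread convention and not necessarily the exact form the paper analyzes; the function names `dcg` and `ndcg` are illustrative only.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance grades.

    Assumes gain 2**rel - 1 and discount 1 / log2(rank + 1), with ranks
    starting at 1. Other gain and discount choices are also in use.
    """
    if k is not None:
        relevances = relevances[:k]
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)  # position is 0-based
        for position, rel in enumerate(relevances)
    )


def ndcg(relevances, k=None):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant items: define the score as 0
    return dcg(relevances, k) / ideal


if __name__ == "__main__":
    # Relevance grades listed in the order a ranking function returned them.
    print(ndcg([3, 2, 1]))  # already in ideal order
    print(ndcg([1, 2, 3]))  # reversed order scores lower
```

Normalizing by the ideal ordering keeps the score in [0, 1], so rankings for queries with different numbers of relevant documents can be compared.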
We study k-GenEV, the problem of finding the top k generalized eigenvectors, and k-CCA, the problem of finding the top k vectors in canonical correlation analysis. We propose algorithms LazyEV and LazyCCA to solve the two problems with running times linearly dependent on the input size and on k. Furthermore, our algorithms are doubly-accelerated: our running ...