Deep Canonical Correlation Analysis
DCCA is introduced, a method for learning complex nonlinear transformations of two views of data such that the resulting representations are highly linearly correlated. The parameters of both transformations are learned jointly to maximize the (regularized) total correlation.
On Deep Multi-View Representation Learning
This work finds an advantage for correlation-based representation learning, while the best results on most tasks are obtained with the new variant, deep canonically correlated autoencoders (DCCAE).
Towards Universal Paraphrastic Sentence Embeddings
This work considers the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database, and compares six compositional architectures, finding that the most complex architectures, such as long short-term memory (LSTM) recurrent neural networks, perform best on the in-domain data.
Multi-view clustering via canonical correlation analysis
Under the assumption that the views are uncorrelated given the cluster label, it is shown that the separation conditions required for the algorithm to succeed are significantly weaker than those in prior results in the literature.
From Paraphrase Database to Compositional Paraphrase Model and Back
This work proposes models to leverage the phrase pairs from the Paraphrase Database to build parametric paraphrase models that score paraphrase pairs more accurately than the PPDB’s internal scores while simultaneously improving its coverage.
Tailoring Continuous Word Representations for Dependency Parsing
It is found that all embeddings yield significant parsing gains, including some recent ones that can be trained in a fraction of the time of others, suggesting their complementarity.
Deep Variational Canonical Correlation Analysis
We present deep variational canonical correlation analysis (VCCA), a deep multi-view learning model that extends the latent variable model interpretation of linear CCA to nonlinear observation models.
Charagram: Embedding Words and Sentences via Character n-grams
It is demonstrated that Charagram embeddings outperform more complex architectures based on character-level recurrent and convolutional neural networks, achieving new state-of-the-art performance on several similarity tasks.
Deep convolutional acoustic word embeddings using word-pair side information
This work uses side information in the form of known word pairs to train a Siamese convolutional neural network (CNN): a pair of tied networks that take two speech segments as input and produce their embeddings, trained with a hinge loss that separates same-word pairs and different-word pairs by some margin.
Stochastic optimization for PCA and PLS
Several stochastic approximation methods for PCA and PLS are suggested, and empirical performance of these methods is investigated.