Co-regularized Multi-view Spectral Clustering
A spectral clustering framework for multi-view data is proposed that co-regularizes the clustering hypotheses so that the views agree with one another; two co-regularization schemes are presented to accomplish this.
Frustratingly Easy Domain Adaptation
We describe an approach to domain adaptation that is appropriate exactly in the case when one has enough “target” data to do slightly better than just using only “source” data. Our approach is
Generalized Multiview Analysis: A discriminative latent space
GMA solves a joint, relaxed QCQP over different feature spaces to obtain a single (non)linear subspace. It is a supervised extension of Canonical Correlation Analysis (CCA) that is useful for cross-view classification and retrieval.
Deep Unordered Composition Rivals Syntactic Methods for Text Classification
This work presents a simple deep neural network that competes with and, in some cases, outperforms syntactic models on sentiment analysis and factoid question answering tasks while taking only a fraction of the training time.
A Co-training Approach for Multi-view Spectral Clustering
A spectral clustering algorithm with a co-training flavor is proposed for the multi-view setting, where multiple views of the data are available and each can be used independently for clustering.
Datasheets for Datasets
Documentation to facilitate communication between dataset creators and dataset consumers is presented.
Search-based Structured Prediction
Searn is an algorithm for integrating search and learning to solve complex structured prediction problems such as those that occur in natural language, speech, computational biology, and vision. It comes with a strong, natural theoretical guarantee: good performance on the derived classification problems implies good performance on the structured prediction problem.
Learning Task Grouping and Overlap in Multi-task Learning
This work proposes a framework for multi-task learning that selectively shares information across tasks. It assumes that task parameters within a group lie in a low-dimensional subspace, but allows tasks in different groups to overlap with each other in one or more bases.
Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need?
This first systematic investigation of commercial product teams' challenges and needs for support in developing fairer ML systems identifies areas of alignment and disconnect between the challenges faced by teams in practice and the solutions proposed in the fair ML research literature.
Incorporating Lexical Priors into Topic Models
This work proposes a simple and effective way to guide topic models to learn topics of specific interest to a user by providing sets of seed words that a user believes are representative of the underlying topics in a corpus.