From Word Embeddings To Document Distances
TLDR
It is demonstrated on eight real-world document classification data sets, in comparison with seven state-of-the-art baselines, that the Word Mover's Distance metric leads to unprecedentedly low k-nearest neighbor document classification error rates.
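The metric summarized above casts document distance as an optimal-transport problem between word embeddings. A minimal sketch of that transport formulation, solved as a linear program; the embedding matrices and frequency weights below are illustrative stand-ins, not the paper's data or code:

```python
import numpy as np
from scipy.optimize import linprog

def word_movers_distance(X_a, w_a, X_b, w_b):
    """Word Mover's Distance as an optimal-transport linear program.

    X_a, X_b: (n, d) and (m, d) word-embedding matrices for two documents.
    w_a, w_b: normalized word frequencies (each sums to 1).
    """
    n, m = len(w_a), len(w_b)
    # Cost matrix: Euclidean distance between every pair of word embeddings.
    C = np.linalg.norm(X_a[:, None, :] - X_b[None, :, :], axis=2)
    # Flow variables T[i, j] >= 0, flattened row-major to length n * m.
    A_eq, b_eq = [], []
    for i in range(n):                      # row marginals: sum_j T[i, j] = w_a[i]
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_eq.append(row)
        b_eq.append(w_a[i])
    for j in range(m):                      # column marginals: sum_i T[i, j] = w_b[j]
        col = np.zeros(n * m)
        col[j::m] = 1.0
        A_eq.append(col)
        b_eq.append(w_b[j])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun
```

Identical documents have distance zero, and the distance is symmetric; in practice the paper's fast lower bounds prune most exact computations, which this sketch omits.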
Counterfactual Fairness
TLDR
This paper develops a framework for modeling fairness using tools from causal inference and demonstrates the framework on a real-world problem of fair prediction of success in law school.
Grammar Variational Autoencoder
TLDR
Surprisingly, it is shown that not only does the model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs.
Bayesian Optimization with Inequality Constraints
TLDR
This work presents constrained Bayesian optimization, which places a prior distribution on both the objective and the constraint functions, and evaluates this method on simulated and real data, demonstrating that constrained Bayesian optimization can quickly find optimal and feasible points, even when small feasible regions cause standard methods to fail.
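One common way to combine the two modeled functions, as in the summary above, is to weight expected improvement by the probability that the constraint is satisfied. A minimal sketch of that acquisition function, assuming Gaussian process posterior means and standard deviations are already available (the inputs here are hypothetical):

```python
import numpy as np
from scipy.stats import norm

def constrained_expected_improvement(mu_f, sigma_f, mu_c, sigma_c, f_best):
    """Expected improvement weighted by the probability of feasibility.

    mu_f, sigma_f: GP posterior mean/std of the objective at candidate points.
    mu_c, sigma_c: GP posterior mean/std of the constraint (feasible iff c <= 0).
    f_best: best feasible objective value observed so far (minimization).
    """
    z = (f_best - mu_f) / sigma_f
    ei = (f_best - mu_f) * norm.cdf(z) + sigma_f * norm.pdf(z)
    prob_feasible = norm.cdf(-mu_c / sigma_c)   # P(c(x) <= 0) under the GP
    return ei * prob_feasible
```

Candidates that look promising but are likely infeasible are down-weighted toward zero, which is what lets the search concentrate on small feasible regions.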
Supervised Word Mover's Distance
TLDR
This paper proposes an efficient technique to learn a supervised metric, called Supervised WMD (S-WMD), and provides an arbitrarily close approximation of the original WMD that results in a practical and efficient update rule.
GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution
TLDR
This work evaluates GANs based on recurrent neural networks with Gumbel-softmax output distributions on the task of generating sequences of discrete elements, using a continuous approximation to the multinomial distribution parameterized in terms of the softmax function.
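The continuous approximation referred to above adds Gumbel noise to the logits and applies a temperature-scaled softmax, so the sample stays differentiable. A minimal NumPy sketch of drawing one such sample (illustrative only; the paper's models are recurrent networks, not shown here):

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw a continuous relaxation of a one-hot categorical sample.

    Adds i.i.d. Gumbel(0, 1) noise to the logits and applies a
    temperature-scaled softmax; as temperature -> 0 the sample
    approaches a discrete one-hot vector.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()                 # shift for numerical stability
    e = np.exp(y)
    return e / e.sum()
```

At high temperature the output is close to uniform; at low temperature it concentrates almost all mass on one element, approximating a discrete sample while remaining differentiable with respect to the logits.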
Stochastic Neighbor Compression
TLDR
Stochastic Neighbor Compression is presented, an algorithm to compress a dataset for the purpose of k-nearest neighbor (kNN) classification; it is complementary to existing state-of-the-art algorithms for speeding up kNN classification and leads to substantial further improvements.
Cost-Sensitive Tree of Classifiers
TLDR
This paper addresses the challenge of balancing the test-time cost and the classifier accuracy in a principled fashion by constructing a tree of classifiers, through which test inputs traverse along individual paths.
Classifier cascades and trees for minimizing feature evaluation cost
TLDR
Two algorithms are developed to efficiently balance the performance of a classifier with its test-time cost in real-world settings, and the resulting trained classifiers are found to achieve high accuracies at a small fraction of the computational cost.
TAPAS: Tricks to Accelerate (encrypted) Prediction As a Service
TLDR
This work combines ideas from the machine learning literature, particularly on binarization and sparsification of neural networks, with algorithmic tools to speed up and parallelize computation over encrypted data.