• Publications
Semantic Parsing on Freebase from Question-Answer Pairs
TLDR
In this paper, we train a semantic parser that scales up to Freebase.
  • 1,027
  • 184
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
TLDR
We develop a general duality between neural networks and compositional kernels, striving towards a better understanding of deep learning.
  • 183
  • 31
Measuring the Effects of Data Parallelism on Neural Network Training
TLDR
We study the effects of increasing the batch size on training time, as measured by the number of steps necessary to reach a goal out-of-sample error.
  • 152
  • 18
Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization
TLDR
This is an extended and updated version of our conference paper that appeared in Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015.
  • 117
  • 12
Competing with the Empirical Risk Minimizer in a Single Pass
TLDR
In many estimation problems, e.g. linear and logistic regression, we wish to minimize an unknown objective given only unbiased samples of the objective function.
  • 77
  • 10
Principal Component Projection Without Principal Component Analysis
TLDR
We show how to efficiently project a vector onto the top principal components of a matrix, without explicitly computing these components.
  • 21
  • 7
Compiling machine learning programs via high-level tracing
TLDR
We describe JAX, a domain-specific tracing JIT compiler for generating high-performance accelerator code from pure Python and NumPy machine learning programs, capable of scaling to multi-core Cloud TPUs.
  • 46
  • 6
Simple MAP Inference via Low-Rank Relaxations
TLDR
We focus on the problem of maximum a posteriori (MAP) inference in Markov random fields with binary variables and pairwise interactions, using low-rank relaxations that interpolate between the discrete problem and its full-rank semidefinite relaxation.
  • 16
  • 3
Parente2: a fast and accurate method for detecting identity by descent.
Identity-by-descent (IBD) inference is the problem of establishing a genetic connection between two individuals through a genomic segment that is inherited by both individuals from a recent common ...
  • 23
  • 2
The advantages of multiple classes for reducing overfitting from test set reuse
TLDR
We show a new upper bound of $\tilde O(\max\{\sqrt{k\log(n)/(mn)},k/n\})$ on the worst-case bias that any attack can achieve in a prediction problem with $m$ classes.
  • 13
  • 1