Publications
Tensor decompositions for learning latent variable models
TLDR
A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices; this implies a robust and computationally tractable estimation approach for several popular latent variable models.
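A minimal sketch of the basic tensor power update the paper analyses, assuming a symmetric order-3 tensor with a (near-)orthogonal decomposition; the function name and toy data are illustrative, and the paper's full robust procedure adds deflation, random restarts, and a perturbation analysis not shown here.

```python
import numpy as np

def tensor_power_iteration(T, n_iters=100, seed=0):
    """One run of the power map u <- T(I, u, u) / ||T(I, u, u)|| for a symmetric 3-way tensor."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iters):
        v = np.einsum('ijk,j,k->i', T, u, u)   # contract T against u twice
        u = v / np.linalg.norm(v)
    eigenvalue = np.einsum('ijk,i,j,k->', T, u, u, u)
    return eigenvalue, u

# Toy example: a rank-1 symmetric tensor 2 * e1 (x) e1 (x) e1, plus small noise.
e1 = np.eye(3)[0]
T = 2.0 * np.einsum('i,j,k->ijk', e1, e1, e1)
T += 1e-3 * np.random.default_rng(1).standard_normal(T.shape)
lam, vec = tensor_power_iteration(T)
print(lam, vec)   # lam close to 2, vec close to +/- e1
```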
signSGD: compressed optimisation for non-convex problems
TLDR
SignSGD can get the best of both worlds: compressed gradients and an SGD-level convergence rate; its momentum counterpart, Signum, is able to match the accuracy and convergence speed of Adam on deep ImageNet models.
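A minimal sketch of the sign-based update on a toy least-squares problem; only the sign of the (optionally momentum-averaged) gradient is transmitted and applied. The step size, momentum value, and toy problem are illustrative, not the paper's settings.

```python
import numpy as np

def sign_sgd(grad_fn, w0, lr=0.02, momentum=0.0, n_steps=1000):
    w = w0.copy()
    m = np.zeros_like(w)
    for _ in range(n_steps):
        g = grad_fn(w)
        m = momentum * m + (1 - momentum) * g   # moving average (Signum when momentum > 0)
        w -= lr * np.sign(m)                    # only the sign of the gradient is used
    return w

# Toy problem: minimize ||Xw - y||^2 / n.
rng = np.random.default_rng(0)
X, w_true = rng.standard_normal((200, 5)), np.arange(1.0, 6.0)
y = X @ w_true
grad = lambda w: 2 * X.T @ (X @ w - y) / len(y)
print(sign_sgd(grad, np.zeros(5), momentum=0.9))   # approximately w_true, up to lr-sized oscillation
```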
Non-convex Robust PCA
TLDR
A new provable method for robust PCA, where the task is to recover a low-rank matrix corrupted by sparse perturbations; the method represents one of the few instances of global convergence guarantees for non-convex methods.
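A minimal sketch of the alternating low-rank / sparse projection idea: alternately fit a rank-r approximation by truncated SVD and peel off large residual entries by hard thresholding. The geometric threshold schedule below is a simple heuristic stand-in, not the paper's carefully tuned schedule or its guarantees.

```python
import numpy as np

def altproj_rpca(M, rank, n_iters=30, decay=0.7):
    S = np.zeros_like(M)
    thresh = np.max(np.abs(M)) / 2
    for _ in range(n_iters):
        # Low-rank step: best rank-r approximation of M - S.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep only residual entries above the current threshold.
        R = M - L
        S = R * (np.abs(R) > thresh)
        thresh *= decay
    return L, S

# Toy example: rank-2 matrix plus large corruptions on ~5% of the entries.
rng = np.random.default_rng(0)
L0 = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
S0 = 10.0 * rng.standard_normal((60, 60)) * (rng.random((60, 60)) < 0.05)
L_hat, S_hat = altproj_rpca(L0 + S0, rank=2)
print(np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))   # relative error, small if corruptions are removed
```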
A Method of Moments for Mixture Models and Hidden Markov Models
TLDR
This work develops an efficient method-of-moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models.
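A small numerical illustration of the moment structure such methods exploit: for two views that are conditionally independent given the hidden component, the cross-moment E[x1 x2^T] equals a weighted sum of rank-1 terms, so its spectrum reveals the number of components and its singular vectors span the component means. The toy dimensions and noise level are illustrative; the full algorithm also uses higher-order moments to recover the means themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d, n = 3, 10, 200_000
weights = np.array([0.5, 0.3, 0.2])
mu1, mu2 = rng.standard_normal((k, d)), rng.standard_normal((k, d))

h = rng.choice(k, size=n, p=weights)              # hidden component
x1 = mu1[h] + 0.5 * rng.standard_normal((n, d))   # view 1
x2 = mu2[h] + 0.5 * rng.standard_normal((n, d))   # view 2 (independent of view 1 given h)

cross_moment = x1.T @ x2 / n                      # empirical estimate of E[x1 x2^T]
print(np.round(np.linalg.svd(cross_moment, compute_uv=False), 3))
# Only the first k = 3 singular values are far from zero.
```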
Learning Latent Tree Graphical Models
TLDR
This work proposes two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes, and applies these algorithms to both discrete and Gaussian random variables.
A Spectral Algorithm for Latent Dirichlet Allocation
TLDR
This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA).
Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret
TLDR
This work proposes policies for distributed learning and access that achieve order-optimal cognitive system throughput under self-play, i.e., when implemented by all the secondary users, including a policy whose sum regret grows only slightly faster than logarithmically in the number of transmission slots.
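The logarithmic-regret guarantees in this line of work build on UCB-style index policies. Below is a minimal single-user UCB1 sketch on Bernoulli channels; the paper's contribution is the distributed multi-user setting with collisions, which this toy deliberately does not model.

```python
import numpy as np

def ucb1(means, horizon, seed=0):
    """Play Bernoulli arms with the UCB1 index; return the cumulative (pseudo-)regret."""
    rng = np.random.default_rng(seed)
    k = len(means)
    counts, sums = np.zeros(k), np.zeros(k)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1                                        # play each arm once to initialise
        else:
            index = sums / counts + np.sqrt(2 * np.log(t) / counts)
            arm = int(np.argmax(index))
        reward = float(rng.random() < means[arm])              # Bernoulli reward
        counts[arm] += 1
        sums[arm] += reward
        regret += max(means) - means[arm]
    return regret

print(ucb1(np.array([0.9, 0.8, 0.5]), horizon=20_000))         # grows roughly like log(horizon)
```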
Born Again Neural Networks
TLDR
This work studies KD from a new perspective: rather than compressing models, students are parameterized identically to their teachers. It shows significant advantages from transferring knowledge between DenseNets and ResNets in either direction.
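A minimal sketch of the soft-target distillation loss underlying knowledge distillation; in Born Again Networks the student shares the teacher's architecture and is trained against the teacher's predictions alongside the usual label loss. The temperature, weighting, and toy logits are illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) at temperature T + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels])
    return np.mean(alpha * (T ** 2) * kl + (1 - alpha) * ce)

teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])
student = np.array([[2.0, 1.5, 0.5], [0.5, 2.0, 0.3]])
print(distillation_loss(student, teacher, labels=np.array([0, 1])))
```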
Deep Active Learning for Named Entity Recognition
TLDR
By combining deep learning with active learning, this work shows that classical methods can be outperformed even with a significantly smaller amount of training data.
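A minimal sketch of the uncertainty-sampling active-learning loop: train on a small labelled pool, score the unlabelled pool by least confidence, and move the most uncertain examples into the training set. A logistic-regression classifier on synthetic data stands in for the paper's neural NER tagger; pool sizes and acquisition batch size are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
labelled = list(range(20))                        # start with a tiny labelled pool
unlabelled = list(range(20, len(X)))

for round_ in range(10):
    model = LogisticRegression(max_iter=1000).fit(X[labelled], y[labelled])
    probs = model.predict_proba(X[unlabelled])
    confidence = probs.max(axis=1)                # least-confidence acquisition score
    pick = np.argsort(confidence)[:20]            # the 20 most uncertain examples
    newly = [unlabelled[i] for i in pick]
    labelled += newly
    unlabelled = [i for i in unlabelled if i not in set(newly)]
    print(round_, model.score(X, y))              # accuracy improves with few added labels
```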
Fourier Neural Operator for Parametric Partial Differential Equations
TLDR
This work formulates a new neural operator by parameterizing the integral kernel directly in Fourier space, yielding an expressive and efficient architecture, and shows state-of-the-art performance compared to existing neural network methodologies.
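A minimal 1D sketch of the spectral convolution at the heart of this idea: transform to Fourier space, multiply a truncated set of low-frequency modes by learned complex weights, and transform back. The real architecture stacks such layers with pointwise linear maps, multiple channels, and nonlinearities; the resolution, number of modes, and random weights below are illustrative.

```python
import numpy as np

def spectral_conv_1d(x, weights):
    """x: real signal of length n; weights: complex coefficients for the lowest Fourier modes."""
    n, n_modes = len(x), len(weights)
    x_hat = np.fft.rfft(x)                            # to Fourier space
    out_hat = np.zeros_like(x_hat)
    out_hat[:n_modes] = weights * x_hat[:n_modes]     # act only on the retained low modes
    return np.fft.irfft(out_hat, n=n)                 # back to physical space, same resolution

rng = np.random.default_rng(0)
grid = np.linspace(0, 2 * np.pi, 128, endpoint=False)
x = np.sin(grid) + 0.3 * np.sin(5 * grid)
w = rng.standard_normal(16) + 1j * rng.standard_normal(16)   # learned parameters in practice
print(spectral_conv_1d(x, w).shape)                   # (128,), matching the input grid
```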