Tensor decompositions for learning latent variable models
- Anima Anandkumar, Rong Ge, Daniel J. Hsu, S. Kakade, Matus Telgarsky
- Computer Science, Mathematics · Journal of Machine Learning Research
- 28 October 2012
A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices, which implies a robust and computationally tractable estimation approach for several popular latent variable models.
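A minimal sketch of the power iteration at the heart of such methods, assuming a dense symmetric third-order tensor in numpy; the function names, restart count, and deflation step shown are illustrative, not the paper's exact robust procedure.

```python
import numpy as np

def tensor_power_iteration(T, n_iters=100, n_restarts=10, rng=None):
    """Estimate one (eigenvalue, eigenvector) pair of a symmetric tensor T."""
    rng = np.random.default_rng() if rng is None else rng
    d = T.shape[0]
    best_lam, best_u = -np.inf, None
    for _ in range(n_restarts):              # random restarts add robustness
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        for _ in range(n_iters):             # u <- T(I, u, u) / ||T(I, u, u)||
            u = np.einsum('ijk,j,k->i', T, u, u)
            u /= np.linalg.norm(u)
        lam = np.einsum('ijk,i,j,k->', T, u, u, u)
        if lam > best_lam:
            best_lam, best_u = lam, u
    return best_lam, best_u

def deflate(T, lam, u):
    """Remove a recovered rank-1 component lam * (u ⊗ u ⊗ u)."""
    return T - lam * np.einsum('i,j,k->ijk', u, u, u)
```

Alternating power iteration and deflation recovers the components one at a time; the paper's analysis bounds how perturbations of T propagate to these estimates.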
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
- Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, J. Álvarez, Ping Luo
- Computer Science · Neural Information Processing Systems
- 31 May 2021
SegFormer is presented: a simple, efficient, yet powerful semantic segmentation framework that unifies Transformers with lightweight multilayer perceptron (MLP) decoders and shows excellent zero-shot robustness on Cityscapes-C.
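A hedged sketch of the all-MLP decoder idea in PyTorch; the stage channel widths, embedding dimension, and class count below are illustrative defaults, not the released model's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AllMLPDecoder(nn.Module):
    """Fuse multi-scale Transformer features using only linear layers."""
    def __init__(self, in_channels=(32, 64, 160, 256), embed_dim=256, n_classes=19):
        super().__init__()
        # One 1x1 projection per encoder stage (a per-pixel linear layer).
        self.proj = nn.ModuleList(nn.Conv2d(c, embed_dim, 1) for c in in_channels)
        self.fuse = nn.Conv2d(embed_dim * len(in_channels), embed_dim, 1)
        self.classify = nn.Conv2d(embed_dim, n_classes, 1)

    def forward(self, feats):
        # feats: encoder outputs ordered from fine to coarse resolution.
        target = feats[0].shape[2:]
        ups = [F.interpolate(p(f), size=target, mode='bilinear', align_corners=False)
               for p, f in zip(self.proj, feats)]
        return self.classify(self.fuse(torch.cat(ups, dim=1)))
```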
Fourier Neural Operator for Parametric Partial Differential Equations
- Zong-Yi Li, Nikola B. Kovachki, Anima Anandkumar
- Computer Science · International Conference on Learning Representations
- 18 October 2020
This work formulates a new neural operator by parameterizing the integral kernel directly in Fourier space, yielding an expressive and efficient architecture that shows state-of-the-art performance compared to existing neural network methodologies.
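A minimal 1-D sketch of such a Fourier layer in PyTorch, assuming the input is a regularly sampled function; the class name, mode count, and weight initialization are illustrative.

```python
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Linear transform applied pointwise to the lowest Fourier modes."""
    def __init__(self, in_ch, out_ch, modes=16):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (in_ch * out_ch)
        self.weight = nn.Parameter(
            scale * torch.randn(in_ch, out_ch, modes, dtype=torch.cfloat))

    def forward(self, x):                          # x: (batch, in_ch, n_points)
        x_hat = torch.fft.rfft(x)                  # to Fourier space
        out = torch.zeros(x.size(0), self.weight.size(1), x_hat.size(-1),
                          dtype=torch.cfloat, device=x.device)
        k = min(self.modes, x_hat.size(-1))
        # Multiply the retained low-frequency modes by learned complex weights.
        out[..., :k] = torch.einsum('bik,iok->bok', x_hat[..., :k], self.weight[..., :k])
        return torch.fft.irfft(out, n=x.size(-1))  # back to physical space
```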
signSGD: compressed optimisation for non-convex problems
- Jeremy Bernstein, Yu-Xiang Wang, K. Azizzadenesheli, Anima Anandkumar
- Computer Science · International Conference on Machine Learning
- 13 February 2018
signSGD gets the best of both worlds: compressed gradients and an SGD-level convergence rate; its momentum counterpart matches the accuracy and convergence speed of Adam on deep ImageNet models.
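A minimal sketch of the sign-based update with momentum (the Signum variant); the step size, momentum constant, and buffer handling are illustrative.

```python
import torch

@torch.no_grad()
def signum_step(params, momenta, lr=1e-4, beta=0.9):
    """One step; momenta must be zero-initialized buffers, one per parameter."""
    for p, m in zip(params, momenta):
        if p.grad is None:
            continue
        m.mul_(beta).add_(p.grad, alpha=1 - beta)  # exponential moving average
        p.add_(torch.sign(m), alpha=-lr)           # step by the sign alone
```

Because only the sign of each coordinate matters, a distributed worker needs to communicate just one bit per parameter, which is the source of the gradient compression.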
Born Again Neural Networks
- Tommaso Furlanello, Zachary Chase Lipton, M. Tschannen, L. Itti, Anima Anandkumar
- Computer Science · International Conference on Machine Learning
- 12 May 2018
This work studies knowledge distillation (KD) from a new perspective: rather than compressing models, students are parameterized identically to their teachers, and significant advantages are shown from transferring knowledge between DenseNets and ResNets in either direction.
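A hedged sketch of the distillation objective such training uses, assuming the student shares the teacher's architecture; the temperature and loss weighting are illustrative.

```python
import torch.nn.functional as F

def born_again_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Mix the usual label loss with a soft-target loss from the teacher."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction='batchmean') * (T * T)
    return alpha * hard + (1 - alpha) * soft
```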
A Method of Moments for Mixture Models and Hidden Markov Models
- Anima Anandkumar, Daniel J. Hsu, S. Kakade
- Computer Science · Annual Conference on Computational Learning Theory
- 3 March 2012
This work develops an efficient method-of-moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models.
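A minimal sketch of the empirical moment estimates such a method consumes, assuming three conditionally independent views stored row-wise in numpy arrays; the names are illustrative.

```python
import numpy as np

def cross_moments(x1, x2, x3):
    """Empirical cross-view moments E[x1 ⊗ x2] and E[x1 ⊗ x2 ⊗ x3]."""
    n = x1.shape[0]
    M2 = x1.T @ x2 / n
    M3 = np.einsum('ni,nj,nk->ijk', x1, x2, x3) / n
    return M2, M3
```

Under the multi-view assumption these moments decompose over the mixture components, which is what the parameter-recovery step exploits.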
Non-convex Robust PCA
- Praneeth Netrapalli, U. Niranjan, S. Sanghavi, Anima Anandkumar, Prateek Jain
- Computer Science · Neural Information Processing Systems
- 28 October 2014
A new provable method for robust PCA is presented, where the task is to recover a low-rank matrix corrupted by sparse perturbations; it represents one of the few instances of global convergence guarantees for non-convex methods.
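A hedged sketch of the alternating-projection idea, assuming numpy; the fixed threshold here is a stand-in for the paper's decreasing threshold schedule.

```python
import numpy as np

def robust_pca(M, rank, n_iters=50, thresh=0.1):
    """Alternate a rank-r projection with hard thresholding of the residual."""
    S = np.zeros_like(M)
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # truncated SVD: rank-r part
        R = M - L
        S = np.where(np.abs(R) > thresh, R, 0.0)   # keep large entries as sparse
    return L, S
```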
Learning Latent Tree Graphical Models
- M. Choi, V. Tan, Anima Anandkumar, A. Willsky
- Computer Science · Journal of Machine Learning Research
- 14 September 2010
This work proposes two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes, and applies these algorithms to both discrete and Gaussian random variables.
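A minimal sketch of the additive information distance those algorithms build on for Gaussian variables, assuming samples in a numpy array with one column per variable; the epsilon guard is illustrative.

```python
import numpy as np

def information_distances(X):
    """d_ij = -log |rho_ij|, additive along paths of the true latent tree."""
    rho = np.corrcoef(X, rowvar=False)      # pairwise correlations
    return -np.log(np.abs(rho) + 1e-12)     # guard against log(0)
```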
A Spectral Algorithm for Latent Dirichlet Allocation
- Anima Anandkumar, Dean Phillips Foster, Daniel J. Hsu, S. Kakade, Yi-Kai Liu
- Computer Science · Algorithmica
- 30 April 2012
This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA).
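A hedged sketch of the whitening step common to such spectral methods, assuming numpy and given moment matrices M2 and M3; after whitening, the tensor power method (see the first sketch above) applies.

```python
import numpy as np

def whiten(M2, k):
    """W such that W.T @ M2 @ W = I_k, via a truncated eigendecomposition."""
    U, s, _ = np.linalg.svd(M2)
    return U[:, :k] / np.sqrt(s[:k])

def whitened_tensor(M3, W):
    """Apply the whitening map along all three modes of M3."""
    return np.einsum('ijk,ia,jb,kc->abc', M3, W, W, W)
```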
Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret
- Anima Anandkumar, Nithin Michael, A. Tang, A. Swami
- Computer Science · IEEE Journal on Selected Areas in Communications
- 8 June 2010
This work proposes policies for distributed learning and access that achieve order-optimal cognitive system throughput under self-play, i.e., when implemented at all the secondary users, as well as a policy whose sum regret grows only slightly faster than logarithmically in the number of transmission slots.
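A hedged sketch of the single-user UCB-style index such distributed policies extend; the exploration constant and tie-breaking are illustrative, and the paper's randomized-rank coordination across users is omitted.

```python
import numpy as np

def ucb_channel(mean_rewards, counts, t, c=2.0):
    """Pick the channel with the highest UCB1-style index at slot t."""
    if np.any(counts == 0):                 # try every channel once first
        return int(np.argmin(counts))
    return int(np.argmax(mean_rewards + np.sqrt(c * np.log(t) / counts)))
```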
...