Scalable Kernel Methods via Doubly Stochastic Gradients
TLDR
An approach that scales up kernel methods with a novel technique, "doubly stochastic functional gradients", building on the fact that many kernel methods can be expressed as convex optimization problems; this readily brings kernel methods into regimes that have been dominated by neural nets.
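As an illustration of the idea (not the authors' exact algorithm), the sketch below runs doubly stochastic functional gradient descent for kernel ridge regression with an RBF kernel: each step samples one random training point and one random Fourier feature, so the learned function is stored as a growing list of sampled features and coefficients.

```python
import numpy as np

def doubly_stochastic_krr(X, y, gamma=1.0, reg=1e-3, steps=2000, lr=0.5):
    """Minimal sketch of doubly stochastic functional gradient descent
    for kernel ridge regression with an RBF kernel.

    Every step draws one random training point (data stochasticity) and
    one random Fourier feature of the RBF kernel (feature stochasticity),
    hence "doubly stochastic".
    """
    n, d = X.shape
    omegas, phases, coefs = [], [], []

    def predict(x):
        if not coefs:
            return 0.0
        W = np.array(omegas)             # (t, d) sampled frequencies
        b = np.array(phases)             # (t,) sampled phases
        feats = np.sqrt(2.0) * np.cos(W @ x + b)
        return float(np.array(coefs) @ feats)

    for t in range(1, steps + 1):
        i = np.random.randint(n)                         # random data point
        omega = np.random.randn(d) * np.sqrt(2 * gamma)  # random RBF Fourier feature
        phase = np.random.uniform(0, 2 * np.pi)
        step = lr / np.sqrt(t)

        err = predict(X[i]) - y[i]                       # functional gradient of squared loss
        coefs = [(1 - step * reg) * c for c in coefs]    # shrinkage from the RKHS penalty
        omegas.append(omega)
        phases.append(phase)
        coefs.append(-step * err * np.sqrt(2.0) * np.cos(omega @ X[i] + phase))

    return predict

# toy usage: regress a smooth 1-D function
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.randn(200)
f = doubly_stochastic_krr(X, y)
print(f(np.array([1.0])), np.sin(1.0))
```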
Diverse Neural Network Learns True Target Functions
TLDR
This paper analyzes one-hidden-layer neural networks with ReLU activation and shows that, despite the non-convexity, networks with diverse units have no spurious local minima; it also suggests a novel regularization function that promotes unit diversity for potentially better generalization.
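As a hedged illustration of what a diversity-promoting regularizer might look like (the paper's exact functional may differ), the snippet below penalizes pairwise cosine similarity between the hidden-unit weight vectors of a one-hidden-layer ReLU network.

```python
import numpy as np

def diversity_penalty(W, eps=1e-12):
    """Illustrative diversity regularizer: penalize squared cosine
    similarity between every pair of hidden-unit weight vectors, so
    units are pushed toward distinct directions.

    W: (k, d) matrix whose rows are the incoming weights of k hidden ReLU units.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True) + eps
    U = W / norms                       # unit-normalized rows
    G = U @ U.T                         # pairwise cosine similarities
    off_diag = G - np.diag(np.diag(G))  # ignore self-similarity
    return np.sum(off_diag ** 2) / 2.0

# usage: near-duplicate units are penalized more than diverse ones
W_similar = np.array([[1.0, 0.0], [0.99, 0.1]])
W_diverse = np.array([[1.0, 0.0], [0.0, 1.0]])
print(diversity_penalty(W_similar) > diversity_penalty(W_diverse))  # True
```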
On the Complexity of Learning Neural Networks
TLDR
A comprehensive lower bound rules out the possibility that data generated by neural networks with a single hidden layer, smooth activation functions, and benign input distributions can be learned efficiently; the result is robust to small perturbations of the true weights.
Isotonic Hawkes Processes
TLDR
It is shown that isotonic Hawkes processes can fit a variety of nonlinear patterns that cannot be captured by conventional Hawkes processes, and that they achieve superior empirical performance in real-world applications.
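For intuition, the sketch below contrasts the standard linear Hawkes intensity with an isotonic-style variant that passes the excitation through a nondecreasing link; the paper learns that link from data, whereas the example simply plugs in a fixed monotone function.

```python
import numpy as np

def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0, link=None):
    """Sketch of a Hawkes-process intensity with an optional monotone link.

    A conventional Hawkes process uses the linear excitation
        lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)).
    An isotonic (nonlinear) Hawkes process instead applies a nondecreasing
    link g, giving lambda(t) = g(excitation).
    """
    history = np.asarray(history)
    past = history[history < t]
    excitation = mu + alpha * np.sum(np.exp(-beta * (t - past)))
    return excitation if link is None else link(excitation)

events = [0.5, 1.2, 1.3, 2.0]
saturating = lambda x: 2.0 * x / (1.0 + x)              # an example nondecreasing link
print(hawkes_intensity(2.5, events))                    # linear Hawkes intensity
print(hawkes_intensity(2.5, events, link=saturating))   # isotonic-style intensity
```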
Communication Efficient Distributed Kernel Principal Component Analysis
TLDR
The algorithm combines subspace embedding and adaptive sampling techniques; it can take as input an arbitrary configuration of distributed datasets and compute a set of global kernel principal components with relative-error guarantees independent of the dimension of the feature space or the total number of data points.
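The following sketch is only a rough illustration of the communication pattern, not the paper's algorithm (which relies on subspace embeddings and adaptive sampling): each site summarizes its data through a shared random-feature map and ships a small covariance summary, and the coordinator extracts approximate global kernel principal components from the aggregated summaries.

```python
import numpy as np

def distributed_rbf_kpca(local_datasets, n_components=2, n_features=500, gamma=0.5, seed=0):
    """Illustrative distributed approximation of RBF kernel PCA: each node
    communicates only an (n_features x n_features) covariance summary built
    from shared random Fourier features, never its raw data."""
    rng = np.random.RandomState(seed)
    d = local_datasets[0].shape[1]
    W = rng.randn(d, n_features) * np.sqrt(2 * gamma)   # shared random feature map
    b = rng.uniform(0, 2 * np.pi, n_features)

    def featurize(X):
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    total_cov = np.zeros((n_features, n_features))
    total_sum = np.zeros(n_features)
    total_n = 0
    for X in local_datasets:                             # one summary per site
        Phi = featurize(X)
        total_cov += Phi.T @ Phi
        total_sum += Phi.sum(axis=0)
        total_n += X.shape[0]

    mean = total_sum / total_n
    cov = total_cov / total_n - np.outer(mean, mean)     # centered feature-space covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    components = eigvecs[:, ::-1][:, :n_components]      # top principal directions

    def project(X):
        return (featurize(X) - mean) @ components

    return project

# usage with two "sites" holding parts of the same dataset
rng = np.random.RandomState(1)
X1, X2 = rng.randn(100, 3), rng.randn(120, 3) + 1.0
project = distributed_rbf_kpca([X1, X2])
print(project(X1[:3]).shape)   # (3, 2)
```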
Learning Latent Variable Models by Improving Spectral Solutions with Exterior Point Method
TLDR
This work introduces a two-stage learning algorithm for latent variable models: a spectral method-of-moments step first finds a solution that is close to the optimum but not necessarily within the valid set of model parameters, which is then refined with an exterior point method.
Poly(A) motif prediction using spectral latent features from human DNA sequences
TLDR
This work proposes a novel machine-learning method for poly(A) motif prediction that marries generative and discriminative learning, and develops an efficient spectral algorithm for extracting latent-variable information from hidden Markov models that fit DNA sequence dynamics.
Scale Up Nonlinear Component Analysis with Doubly Stochastic Gradients
TLDR
This work proposes a simple, computationally efficient, and memory-friendly algorithm based on "doubly stochastic gradients" to scale up a range of kernel nonlinear component analysis methods, such as kernel PCA, CCA, and SVD; the algorithm enjoys theoretical guarantees, converging to the global optimum at rate O(1/t).
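As a simplified sketch of the idea (the paper resamples random features at every step, which this example does not), the code below runs Oja-style stochastic updates over fixed random Fourier features to approximate kernel PCA without ever forming the n-by-n kernel matrix.

```python
import numpy as np

def streaming_rbf_kpca_oja(X, n_components=2, n_features=300, gamma=0.5,
                           lr=0.05, epochs=5, seed=0):
    """Simplified sketch of scaling kernel PCA with stochastic gradients:
    fix one block of random Fourier features up front, then run Oja-style
    updates over single data points to track the top feature-space subspace."""
    rng = np.random.RandomState(seed)
    n, d = X.shape
    W = rng.randn(d, n_features) * np.sqrt(2 * gamma)
    b = rng.uniform(0, 2 * np.pi, n_features)
    phi = lambda x: np.sqrt(2.0 / n_features) * np.cos(x @ W + b)

    U = rng.randn(n_features, n_components) * 0.01   # current estimate of top subspace
    for _ in range(epochs):
        for i in rng.permutation(n):
            z = phi(X[i])                             # feature vector of one sample
            U += lr * np.outer(z, z @ U)              # Oja update toward top eigenspace
            U, _ = np.linalg.qr(U)                    # re-orthonormalize
    return lambda Xnew: phi(Xnew) @ U

rng = np.random.RandomState(1)
X = rng.randn(500, 4)
project = streaming_rbf_kpca_oja(X)
print(project(X[:3]).shape)   # (3, 2)
```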
Nonparametric Estimation of Multi-View Latent Variable Models
TLDR
A kernel method for learning multi-view latent variable models that allows each mixture component to be nonparametric; the latent parameters are recovered using a robust tensor power method.
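The sketch below shows the core of a robust symmetric tensor power method on a toy third-order moment tensor: power updates with several random restarts, followed by deflation; the paper's kernel and multi-view machinery are omitted.

```python
import numpy as np

def tensor_power_method(T, n_components, n_restarts=10, n_iters=100, seed=0):
    """Minimal robust symmetric tensor power method: repeated power updates
    with random restarts, keeping the best candidate, then deflating."""
    rng = np.random.RandomState(seed)
    T = T.copy()
    k = T.shape[0]
    eigvals, eigvecs = [], []
    for _ in range(n_components):
        best_u, best_val = None, -np.inf
        for _ in range(n_restarts):                    # robustness: several random starts
            u = rng.randn(k)
            u /= np.linalg.norm(u)
            for _ in range(n_iters):
                u = np.einsum('ijk,j,k->i', T, u, u)   # power update u <- T(I, u, u)
                u /= np.linalg.norm(u)
            val = np.einsum('ijk,i,j,k->', T, u, u, u) # candidate eigenvalue T(u, u, u)
            if val > best_val:
                best_u, best_val = u, val
        eigvecs.append(best_u)
        eigvals.append(best_val)
        T -= best_val * np.einsum('i,j,k->ijk', best_u, best_u, best_u)  # deflate
    return np.array(eigvals), np.array(eigvecs)

# usage on a synthetic orthogonally decomposable tensor
a = np.array([1.0, 0.0, 0.0]); b = np.array([0.0, 1.0, 0.0])
T = 3.0 * np.einsum('i,j,k->ijk', a, a, a) + 1.5 * np.einsum('i,j,k->ijk', b, b, b)
lam, V = tensor_power_method(T, n_components=2)
print(np.round(lam, 2), np.round(V, 2))
```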
Large-scale insider trading analysis: patterns and discoveries
TLDR
This work presents the first academic, large-scale exploratory study of insider filings and related data, based on the complete Form 4 filings from the U.S. Securities and Exchange Commission; the aim is to help financial regulators and policymakers understand the dynamics of these trades and adapt their detection strategies accordingly.
...
...