Stochastic gradient descent (SGD) is a popular technique for large-scale optimization problems in machine learning. In order to parallelize SGD, minibatch training needs to be employed to reduce the communication cost. However, an increase in minibatch size typically decreases the rate of convergence. This paper introduces a technique based on approximate …
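For context on the trade-off this abstract describes, here is a minimal minibatch SGD sketch for least squares in Python/NumPy; the function name, loss, and hyperparameters are illustrative assumptions, not this paper's method.

```python
import numpy as np

def minibatch_sgd(X, y, batch_size=32, lr=0.01, epochs=10, seed=0):
    """Minimal minibatch SGD for least squares: min_w ||Xw - y||^2 / (2n).

    A larger batch_size means fewer updates (and less communication) per
    epoch but, as the abstract notes, typically slower convergence per
    gradient evaluation.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            b = order[start:start + batch_size]
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b)  # minibatch gradient
            w -= lr * grad
    return w
```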
Hard Thresholding Pursuit (HTP) is an iterative greedy selection procedure for finding sparse solutions of underdetermined linear systems. This method has been shown to have strong theoretical guarantees and impressive numerical performance. In this paper, we generalize HTP from compressive sensing to a generic problem setup of sparsity-constrained convex …
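As a reference point, below is the standard compressive-sensing HTP iteration in Python/NumPy: a gradient step, keep the k entries of largest magnitude to pick a support, then debias by least squares on that support. The step size and iteration count are illustrative, and the paper's generalized sparsity-constrained convex version is not shown here.

```python
import numpy as np

def htp(A, y, k, step=1.0, n_iters=50):
    """Standard HTP for min ||y - Ax||_2 subject to ||x||_0 <= k."""
    m, d = A.shape
    x = np.zeros(d)
    for _ in range(n_iters):
        g = x + step * A.T @ (y - A @ x)       # gradient step
        support = np.argsort(np.abs(g))[-k:]   # k largest-magnitude entries
        sol = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
        x = np.zeros(d)
        x[support] = sol                       # debiased sparse iterate
    return x
```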
An efficient system for detection of epileptic activity in ambulatory electroencephalogram (EEG) recordings must be sensitive to abnormalities while keeping the false-detection rate low. Such requirements can be met neither by a single-stage design nor by a simple-method strategy, owing to the extreme variety of EEG morphologies and the frequency of artifacts. The …
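To illustrate the multi-stage idea the abstract argues for, here is a toy two-stage detector sketch in Python/NumPy; the features, thresholds, and function name are hypothetical and far simpler than a real epileptiform-activity detector.

```python
import numpy as np

def two_stage_detect(eeg, fs, amp_thresh=3.0, energy_thresh=2.0, win_s=1.0):
    """Illustrative two-stage detection on a 1-D EEG trace.

    Stage 1 (sensitive, cheap): flag windows whose peak deviation exceeds
    amp_thresh standard deviations -- high recall, many false alarms.
    Stage 2 (stricter): keep only flagged windows with elevated energy,
    cutting the false-detection rate.
    """
    win = int(win_s * fs)
    mu, sd = eeg.mean(), eeg.std()
    events = []
    for start in range(0, len(eeg) - win + 1, win):
        seg = eeg[start:start + win]
        if np.max(np.abs(seg - mu)) > amp_thresh * sd:            # stage 1
            if np.mean((seg - mu) ** 2) > energy_thresh * sd**2:  # stage 2
                events.append(start / fs)                # event time (s)
    return events
```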
Stochastic Gradient Descent (SGD) is a popular optimization method which has been applied to many important machine learning tasks such as Support Vector Machines and Deep Neural Networks. In order to parallelize SGD, minibatch training is often employed. The standard approach is to uniformly sample a minibatch at each step, which often leads to high …
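One way to see the issue with uniform minibatch sampling empirically: the sketch below measures the spread of a uniformly sampled minibatch gradient around the full gradient for a least-squares objective. The function name and choice of objective are illustrative assumptions.

```python
import numpy as np

def grad_spread(X, y, w, batch_size, n_trials=1000, seed=0):
    """Mean squared deviation of a uniform minibatch gradient from the
    full gradient for least squares: the estimator is unbiased, but its
    deviation can be large when batch_size is small."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    full = X.T @ (X @ w - y) / n                     # full-batch gradient
    devs = []
    for _ in range(n_trials):
        b = rng.choice(n, size=batch_size, replace=False)
        g = X[b].T @ (X[b] @ w - y[b]) / batch_size  # minibatch estimate
        devs.append(np.sum((g - full) ** 2))
    return np.mean(devs)
```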
The power method and the block Lanczos method are popular numerical algorithms for computing truncated singular value decompositions (SVD) and solving eigenvalue problems. Especially in the literature of randomized numerical linear algebra, the power method is widely applied to improve the quality of randomized sketching, and relative-error bounds have …
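For reference, here is a standard randomized power (subspace) iteration for a rank-k truncated SVD, in the spirit of the sketching methods the abstract mentions; the oversampling and iteration counts are illustrative defaults, not this paper's algorithm.

```python
import numpy as np

def randomized_svd(A, k, n_power=2, oversample=10, seed=0):
    """Randomized power iteration for a rank-k truncated SVD.

    Sketch the range of A with a Gaussian test matrix, sharpen it with
    power iterations (helpful when singular values decay slowly), then
    take the SVD of the small projected matrix Q^T A.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))  # Gaussian test matrix
    Y = A @ Omega
    for _ in range(n_power):
        Y = A @ (A.T @ Y)                  # power iteration on A A^T
        Y, _ = np.linalg.qr(Y)             # re-orthonormalize for stability
    Q, _ = np.linalg.qr(Y)
    B = Q.T @ A                            # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]
```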
We address the needs of researchers in nanotechnology who desire an increased level of perceptualization of their simulation data by adding haptic feedback to existing multidimensional volumetric visualizations. Our approach uses volumetric data from simulation of an LED heteronanostructure, and it translates projected values of amplitude of an …
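As a sketch of how a scalar volume can drive haptic feedback, the example below pushes a probe along the negative gradient of the field; this gradient-force mapping and all names here are assumptions for illustration, not the paper's actual transfer function.

```python
import numpy as np

def haptic_force(volume, pos, stiffness=1.0):
    """Map a probe position in a 3-D scalar volume to a force vector.

    One common scheme (an assumption here, not necessarily the paper's
    mapping): sample the field gradient at the nearest voxel and push the
    probe down the gradient, scaled by a stiffness constant.
    """
    gx, gy, gz = np.gradient(volume)            # per-axis finite differences
    idx = tuple(np.clip(np.rint(np.asarray(pos)).astype(int),
                        0, np.array(volume.shape) - 1))
    g = np.array([gx[idx], gy[idx], gz[idx]])   # gradient at nearest voxel
    return -stiffness * g                       # force opposing increase
```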
Uniform sampling of training data has been commonly used in traditional stochastic optimization algorithms such as Proximal Stochastic Mirror Descent (prox-SMD) and Proximal Stochastic Dual Coordinate Ascent (prox-SDCA). Although uniform sampling can guarantee that the sampled stochastic quantity is an unbiased estimate of the corresponding true quantity, …
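To make the unbiasedness point concrete, here is a small sketch of non-uniform sampling with the standard 1/(n p_i) reweighting that keeps the estimate unbiased; the function name and setup are illustrative, and with p_i = 1/n it reduces to the uniform sampling the abstract describes.

```python
import numpy as np

def sampled_grad(grads, p, rng):
    """Sample one component gradient with probabilities p and reweight.

    grads: (n, d) array of per-example gradients at the current iterate.
    Dividing by n * p[i] keeps the estimator unbiased for the mean:
    E[g] = sum_i p_i * grads[i] / (n * p_i) = (1/n) * sum_i grads[i].
    """
    n = grads.shape[0]
    i = rng.choice(n, p=p)
    return grads[i] / (n * p[i])

# With uniform probabilities this recovers the standard unbiased estimate.
rng = np.random.default_rng(0)
grads = rng.standard_normal((100, 5))
g = sampled_grad(grads, np.full(100, 1 / 100), rng)
```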