We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove …
We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through the development of algorithms such as SAG, SVRG, and SAGA. These algorithms have been shown to outperform SGD, both theoretically and empirically. However, asynchronous versions of these algorithms, a …
Mining of transliterations from comparable or parallel text can enhance natural language processing applications such as machine translation and cross-language information retrieval. This paper presents an enhanced transliteration mining technique that uses a generative graph reinforcement model to infer mappings between source and target character …
Recently there has been substantial interest in spectral methods for learning dynamical systems. These methods are popular since they often offer a good tradeoff between computational and statistical efficiency. Unfortunately, they can be difficult to use and extend in practice: e.g., they can make it difficult to incorporate prior information such as …
A single, stationary topic model such as latent Dirichlet allocation is inappropriate for modeling corpora that span long time periods, as the popularity of topics is likely to change over time. A number of models that incorporate time have been proposed, but in general they either exhibit limited forms of temporal variation, or require computationally …
We develop randomized block coordinate descent (CD) methods for linearly constrained convex optimization. Unlike other large-scale CD methods, we do not assume the constraints to be separable, but allow them to be coupled linearly. To our knowledge, ours is the first CD method that allows linear coupling constraints, without making the global iteration …
Collaborative software is gaining pace as a vital means of information sharing between users. This paper discusses one of the key challenges affecting such systems: identifying spammers. We discuss potential features that describe the system's users and illustrate how those features can be used to determine potential spamming users through …
Coordinate descent methods are enjoying renewed interest due to their simplicity and success in many machine learning applications. Given recent theoretical results on random coordinate descent with linear coupling constraints, we develop a software architecture for this class of algorithms. A software architecture has to (1) maintain solution feasibility, …