We describe Asaga, an asynchronous parallel version of the incremental gradient algorithm Saga that enjoys fast linear convergence rates. Through a novel perspective, we revisit and clarify a subtle but important technical issue present in a large fraction of the recent convergence rate proofs for asynchronous parallel optimization algorithms, and propose a …
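The abstract describes the asynchronous algorithm only at a high level. As a point of reference, here is a minimal sketch of the sequential SAGA update that Asaga parallelizes; the least-squares objective, step size, and function names are illustrative assumptions rather than the paper's implementation.

```python
# A minimal sketch of the sequential SAGA update that Asaga runs asynchronously;
# the least-squares objective and all names are illustrative assumptions.
import numpy as np

def saga_least_squares(A, b, step=0.05, n_epochs=50, seed=0):
    """SAGA on f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grad_table = np.zeros((n, d))        # most recently evaluated per-sample gradients
    grad_avg = grad_table.mean(axis=0)   # running average of the stored gradients
    for _ in range(n_epochs * n):
        i = rng.integers(n)
        g_new = (A[i] @ x - b[i]) * A[i]             # fresh gradient of sample i
        x -= step * (g_new - grad_table[i] + grad_avg)  # variance-reduced step
        grad_avg += (g_new - grad_table[i]) / n      # keep the average consistent
        grad_table[i] = g_new
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 5))
    x_true = rng.standard_normal(5)
    x_hat = saga_least_squares(A, A @ x_true)
    print("distance to solution:", np.linalg.norm(x_hat - x_true))
```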
We study the time that the simple exclusion process on the complete graph needs to reach equilibrium in terms of total variation distance. For the graph with n vertices and 1 ≪ k < n/2 particles, we show that the mixing time is of order (1/2) n log min(k, √n), and that around this time, for any ε, the total variation distance drops from 1 − ε to ε in a time …
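The mixing behaviour can be illustrated numerically. Below is a small Monte Carlo sketch that tracks only the overlap between the current configuration and the initial set of k occupied vertices, a birth-death projection of the exclusion process on the complete graph; the discrete-time transposition dynamics and the use of this projection as a proxy for the full total variation distance are simplifying assumptions, not the paper's argument.

```python
# Monte Carlo sketch: overlap X_t between the current configuration and the
# initial k occupied vertices, compared to its hypergeometric stationary law.
import math
import random
from collections import Counter

def overlap_step(x, n, k):
    """One uniform random transposition, seen through the overlap statistic."""
    pairs = n * (n - 1) / 2
    p_down = x * (n - 2 * k + x) / pairs   # occupied in S0 swapped with empty outside
    p_up = (k - x) ** 2 / pairs            # empty in S0 swapped with occupied outside
    u = random.random()
    if u < p_down:
        return x - 1
    if u < p_down + p_up:
        return x + 1
    return x

def tv_to_stationary(n, k, t_steps, n_runs=1000):
    """Estimate the TV distance between X_t (started at X_0 = k) and stationarity."""
    counts = Counter()
    for _ in range(n_runs):
        x = k
        for _ in range(t_steps):
            x = overlap_step(x, n, k)
        counts[x] += 1
    tv = 0.0
    for x in range(k + 1):
        pi_x = math.comb(k, x) * math.comb(n - k, k - x) / math.comb(n, k)
        tv += abs(counts[x] / n_runs - pi_x)
    return tv / 2

if __name__ == "__main__":
    n, k = 200, 20
    for t in [0, 500, 1000, 2000, 4000]:
        print(f"t = {t:5d}   estimated TV = {tv_to_stationary(n, k, t):.3f}")
```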
We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the “learning to search” (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this …
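For intuition about the L2S ingredient, here is a toy sketch of a SEARNN-style training step: roll in along a prefix, obtain a cost for every candidate token via a reference roll-out, and apply a cost-sensitive log-loss at that step. The linear scorer, the Hamming cost, and the roll-out rule are toy assumptions and not the paper's RNN setup.

```python
# Toy sketch of one cost-sensitive step in the SEARNN / L2S spirit.
import numpy as np

VOCAB = 5
rng = np.random.default_rng(0)
W = rng.standard_normal((VOCAB, VOCAB)) * 0.1   # toy scorer: previous token -> scores

def rollout_cost(prefix, candidate, target):
    """Cost of choosing `candidate` now: finish the sequence with a reference
    policy (here: copy the rest of the target) and count token errors."""
    completed = prefix + [candidate] + list(target[len(prefix) + 1:])
    return sum(c != t for c, t in zip(completed, target))

def searnn_step_loss(prefix, target):
    """Cost-sensitive log-loss at one decoding step (the 'log-loss' flavour)."""
    prev = prefix[-1] if prefix else 0
    scores = W[prev]                              # model scores for every candidate
    costs = np.array([rollout_cost(prefix, a, target) for a in range(VOCAB)])
    best = int(np.argmin(costs))                  # cheapest candidate per roll-out
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[best], costs

if __name__ == "__main__":
    target = [1, 3, 2, 4]
    for t in range(len(target)):
        loss, costs = searnn_step_loss(target[:t], target)
        print(f"step {t}: costs={costs.tolist()} loss={loss:.3f}")
```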