Revisit Long Short-Term Memory: An Optimization Perspective

Qi Lyu and Jun Zhu
Long Short-Term Memory (LSTM) is a deep recurrent neural network architecture with high computational complexity. Contrary to the standard practice of training LSTM online with stochastic gradient descent (SGD) methods, we propose a matrix-based batch learning method for LSTM with full Backpropagation Through Time (BPTT). We further address the state-drifting issue and improve the overall performance of LSTM by using revised activation functions for the gates. With these changes, advanced…
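
To make the phrase "matrix-based batch learning" concrete, here is a minimal NumPy sketch of a batched LSTM forward pass, in which every gate for every sequence in the mini-batch is computed with a single matrix product per time step. It is an illustration under stated assumptions, not the authors' implementation: it uses the standard LSTM cell with a forget gate and ordinary sigmoid/tanh activations (the paper's revised gate activations are not reproduced here), and the names lstm_step_batch, lstm_forward_batch, W and b are hypothetical. Only the forward pass is shown; full BPTT would propagate gradients back through all T stored steps.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step_batch(x, h_prev, c_prev, W, b):
    """One LSTM time step for a whole mini-batch (standard cell, an assumption).

    x:       (batch, n_in)   inputs at this time step
    h_prev:  (batch, n_hid)  previous hidden states
    c_prev:  (batch, n_hid)  previous cell states
    W:       (n_in + n_hid, 4 * n_hid) stacked weights for all four blocks
    b:       (4 * n_hid,)    stacked biases
    """
    n_hid = h_prev.shape[1]
    # A single matrix product yields the pre-activations of all gates
    # for every sequence in the batch.
    z = np.concatenate([x, h_prev], axis=1) @ W + b
    i = sigmoid(z[:, 0 * n_hid:1 * n_hid])   # input gate
    f = sigmoid(z[:, 1 * n_hid:2 * n_hid])   # forget gate
    o = sigmoid(z[:, 2 * n_hid:3 * n_hid])   # output gate
    g = np.tanh(z[:, 3 * n_hid:4 * n_hid])   # candidate cell input
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c


def lstm_forward_batch(xs, W, b, n_hid):
    """Run the cell over a batch of sequences.

    xs: (T, batch, n_in). Returns hidden states of shape (T, batch, n_hid).
    """
    T, batch, _ = xs.shape
    h = np.zeros((batch, n_hid))
    c = np.zeros((batch, n_hid))
    hs = []
    for t in range(T):
        h, c = lstm_step_batch(xs[t], h, c, W, b)
        hs.append(h)
    return np.stack(hs)
```

Stacking the four weight blocks into one matrix W is a common design choice: the per-step cost is dominated by one dense product over the whole batch, which is where a matrix-based formulation gains over processing sequences one at a time.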
This paper has 18 citations.

