On Evaluating the Generalization of LSTM Models in Formal Languages

@article{Suzgun2018OnET,
  title={On Evaluating the Generalization of LSTM Models in Formal Languages},
  author={Mirac Suzgun and Yonatan Belinkov and Stuart M. Shieber},
  journal={ArXiv},
  year={2018},
  volume={abs/1811.01001}
}
Recurrent Neural Networks (RNNs) are theoretically Turing-complete and have established themselves as a dominant model for language processing. [...] We find striking differences in model performance under different training settings and highlight the need for careful analysis and assessment when making claims about the learning capabilities of neural network models.
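The abstract's central question is whether models trained on short strings of a formal language generalize to strictly longer ones. A minimal sketch of such a length-split evaluation harness, using the canonical context-free language a^n b^n as an example (function names are illustrative, not from the paper, and no actual LSTM is trained here):

```python
def gen_anbn(n):
    """Return the string a^n b^n, a canonical context-free test language."""
    return "a" * n + "b" * n

def is_anbn(s):
    """Ground-truth membership check for a^n b^n with n >= 1."""
    n = len(s) // 2
    return len(s) >= 2 and len(s) % 2 == 0 and s == "a" * n + "b" * n

def length_split(max_train_n=50, max_test_n=100):
    """Train on short strings, test only on strictly longer ones,
    so the test probes extrapolation rather than interpolation."""
    train = [gen_anbn(n) for n in range(1, max_train_n + 1)]
    test = [gen_anbn(n) for n in range(max_train_n + 1, max_test_n + 1)]
    return train, test

train, test = length_split()
# Every test string is longer than every training string.
assert max(map(len, train)) < min(map(len, test))
```

A model that merely memorizes lengths seen in training will fail on the test split; comparing accuracy across such splits is one way to substantiate (or refute) claims about learned counting behavior.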

