Language Modeling with Deep Transformers

@inproceedings{Irie2019LanguageMW,
  title={Language Modeling with Deep Transformers},
  author={Kazuki Irie and Albert Zeyer and Ralf Schl{\"u}ter and Hermann Ney},
  booktitle={INTERSPEECH},
  year={2019}
}
  • Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
  • Published in INTERSPEECH 2019
  • Computer Science
  • We explore deep autoregressive Transformer models in language modeling for speech recognition. [...] Key result: we find that removing the positional encoding even slightly improves the performance of these models.
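
The key result above can be illustrated with a short, self-contained sketch: a decoder-only (autoregressive) Transformer language model that uses token embeddings only, with no positional encoding, and relies on the causal self-attention mask for ordering. This is a minimal PyTorch illustration, not the authors' implementation; the vocabulary size, model width, and layer count below are placeholder values rather than the configurations studied in the paper.

import torch
import torch.nn as nn

class CausalTransformerLM(nn.Module):
    """Autoregressive Transformer LM without positional encoding (sketch)."""

    def __init__(self, vocab_size, d_model=512, n_heads=8, n_layers=12, d_ff=2048):
        super().__init__()
        # Token embedding only -- deliberately no positional encoding is added.
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=d_ff, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer token ids.
        seq_len = tokens.size(1)
        # Additive causal mask: position t may only attend to positions <= t.
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1)
        hidden = self.encoder(self.embed(tokens), mask=causal_mask)
        return self.out(hidden)  # next-token logits, (batch, seq_len, vocab_size)

model = CausalTransformerLM(vocab_size=10_000)
logits = model(torch.randint(0, 10_000, (2, 16)))
print(logits.shape)  # torch.Size([2, 16, 10000])

In a setup like this, only the strictly causal attention mask distinguishes token positions; the paper's observation is that this implicit order information is already sufficient for deep autoregressive models, so an explicit positional encoding can be dropped.
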
    Citations

    An Empirical Study of Efficient ASR Rescoring with Transformers
    How Much Self-Attention Do We Need? Trading Attention for Feed-Forward Layers
    Fully Quantizing a Simplified Transformer for End-to-end Speech Recognition
    Long-span language modeling for speech recognition
    Pseudolikelihood Reranking with Masked Language Models
    An Empirical Study of Transformer-Based Neural Language Model Adaptation
