Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection

@inproceedings{Liu2020LowLatencySS,
  title={Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection},
  author={Danni Liu and Gerasimos Spanakis and J. Niehues},
  booktitle={INTERSPEECH},
  year={2020}
}
  • Danni Liu, Gerasimos Spanakis, J. Niehues
  • Published in INTERSPEECH 2020
  • Computer Science, Engineering
  • Encoder-decoder models provide a generic architecture for sequence-to-sequence tasks such as speech recognition and translation. While offline systems are often evaluated on quality metrics like word error rates (WER) and BLEU, latency is also a crucial factor in many practical use-cases. We propose three latency reduction techniques for chunk-based incremental inference and evaluate their efficiency in terms of accuracy-latency trade-off. On the 300-hour How2 dataset, we reduce latency by 83…
    4 Citations

    Dynamic latency speech recognition with asynchronous revision
    SDST: Successive Decoding for Speech-to-text Translation
    TED: Triple Supervision Decouples End-to-end Speech-to-text Translation
