English Conversational Telephone Speech Recognition by Humans and Machines

@inproceedings{Saon2017EnglishCT,
  title={English Conversational Telephone Speech Recognition by Humans and Machines},
  author={George Saon and Gakuto Kurata and Tom Sercu and Kartik Audhkhasi and Samuel Thomas and Dimitrios Dimitriadis and Xiaodong Cui and Bhuvana Ramabhadran and Michael Picheny and Lynn-Li Lim and Bergul Roomi and Phil Hall},
  booktitle={INTERSPEECH},
  year={2017}
}
One of the most difficult speech recognition tasks is accurate recognition of human-to-human communication. Advances in deep learning over the last few years have produced major speech recognition improvements on the representative Switchboard conversational corpus. Word error rates that just a few years ago were 14% have dropped to 8.0%, then 6.6% and most recently 5.8%, and are now believed to be within striking range of human performance. This then raises two issues - what IS human…

Citations

Publications citing this paper.
147 citations in total; selected entries follow.

Speaker-Invariant Training Via Adversarial Learning

  • 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2018
CITES METHODS
HIGHLY INFLUENCED

Kernel Approximation Methods for Speech Recognition

  • J. Mach. Learn. Res.
  • 2017
CITES BACKGROUND & METHODS

English Broadcast News Speech Recognition by Humans and Machines

  • ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2019
CITES METHODS, RESULTS & BACKGROUND

Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition

  • 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2017
CITES METHODS

Deep Learning-Based Telephony Speech Recognition in the Wild

CITES RESULTS & BACKGROUND
HIGHLY INFLUENCED

Learning to Recognize Speech From Chaotically Synthesized Data

Faraz Fadavi, Samuel Ginn
  • 2017
CITES BACKGROUND
HIGHLY INFLUENCED

Toward Human Parity in Conversational Speech Recognition

  • IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2017
CITES BACKGROUND, RESULTS & METHODS
HIGHLY INFLUENCED

CITATION STATISTICS

  • 15 Highly Influenced Citations

  • Averaged 49 Citations per year from 2017 through 2019
