A flat direct model for speech recognition

@inproceedings{Heigold2009AFD,
  title={A flat direct model for speech recognition},
  author={Georg Heigold and Geoffrey Zweig and Xiao Li and Patrick Nguyen},
  booktitle={2009 IEEE International Conference on Acoustics, Speech and Signal Processing},
  year={2009},
  pages={3861--3864}
}
We introduce a direct model for speech recognition that assumes an unstructured, i.e., flat text output. The flat model allows us to model arbitrary attributes and dependences of the output. This is different from the HMMs typically used for speech recognition. This conventional modeling approach is based on sequential data and makes rigid assumptions on the dependences. HMMs have proven to be convenient and appropriate for large vocabulary continuous speech recognition. Our task under…

Citations

Publications citing this paper.

Maximum mutual information multi-phone units in direct modeling

From flat direct models to segmental CRF models

  • 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
  • 2010

Speech Recognition With Flat Direct Models

  • IEEE Journal of Selected Topics in Signal Processing
  • 2010

Porting concepts from DNNs back to GMMs

  • 2013 IEEE Workshop on Automatic Speech Recognition and Understanding
  • 2013

Structured SVMs for Automatic Speech Recognition

  • IEEE Transactions on Audio, Speech, and Language Processing
  • 2013

Classification and recognition with direct segment models

  • 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2012
