Accurate speech segmentation by mimicking human auditory processing

@inproceedings{King2013AccurateSS,
  title={Accurate speech segmentation by mimicking human auditory processing},
  author={Sarah King and Mark Hasegawa-Johnson},
  booktitle={2013 IEEE International Conference on Acoustics, Speech and Signal Processing},
  year={2013},
  pages={8096--8100}
}
This paper addresses the problem of locating phone boundaries without prior knowledge of the text of an utterance. A biomimetic model of human auditory processing is used to calculate the neural features of frequency synchrony and average signal level. Frequency synchrony and average signal level are used as input to a two-layered support vector machine (SVM)-based system to detect phone boundaries. Phone boundaries are detected with 87.0% precision and 84.8% recall when the automatic…

Key Quantitative Results

  • Phone boundaries are detected with 87.0% precision and 84.8% recall when the automatic segmentation system has no prior knowledge of the phone sequence in the utterance.
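
For readers unfamiliar with how precision and recall apply to boundary detection, the following is a minimal sketch of one common way to score detected boundaries against reference boundaries. The function name `boundary_precision_recall`, the 20 ms default tolerance, and the greedy one-to-one matching rule are all assumptions made for illustration; the abstract does not state which scoring procedure the authors used.

```python
from typing import Sequence, Tuple


def boundary_precision_recall(
    detected: Sequence[float],
    reference: Sequence[float],
    tolerance: float = 0.02,  # assumed 20 ms window; not taken from the paper
) -> Tuple[float, float]:
    """Score detected boundary times (in seconds) against reference times.

    A detected boundary counts as a hit if it lies within `tolerance` of a
    reference boundary that has not already been matched.
    """
    detected = sorted(detected)
    reference = sorted(reference)
    hits = 0
    i = j = 0
    # Greedy two-pointer sweep: each boundary is matched at most once.
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # spurious detection: earlier than any remaining reference
        else:
            j += 1  # missed reference boundary
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall
```

Under this scheme, precision is the fraction of detected boundaries that match a reference boundary, and recall is the fraction of reference boundaries that were found.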
