Lexical Stress Detection on Stress-Minimal Word Pairs
- G. S, L. H. Jamieson, R. Chen, C. D. Michel
Prosody is a suprasegmental feature of speech that plays an undeniable role in human speech perception and generation. However, employing prosodic features in continuous speech recognition (CSR) is usually difficult, and large accuracy gains should not be expected from them. The main problem is that prosodic patterns depend heavily on factors such as the speaker, the speaker's psychological state, and the superposition of higher-level prosodic patterns on lower-level ones. In our approach, the selected microprosodic features are the lexical stress pattern of words and the relative stress of syllables across word boundaries. We aim to verify whether, given adequate models for recognizing these prosodic features, the features can be used to modify the speech recognition process. We applied a neural network approach to the word and cross-word stress recognition task, and then incorporated the resulting features into a spontaneous Farsi speech recognition system called SHENAVA-1. Doing so improved word accuracy by 1.3%.