Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se
@inproceedings{Davis1980ComparisonOP,
  title={Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se},
  author={S. Davis and Paul Mermelstein},
  year={1980}
}
Several parametric representations of the acoustic signal were compared with regard to word recognition performance in a syllable-oriented continuous speech recognition system. The vocabulary included many phonetically similar monosyllabic words; therefore, the emphasis was on the ability to retain phonetically significant acoustic information in the face of syntactic and duration variations. For each parameter set (based on a mel-frequency cepstrum, a linear frequency cepstrum, a linear…
3,236 Citations
Speech recognition of mandarin monosyllables
- Computer Science
- Pattern Recognit.
- 2003
A comparison of feature representations for speaker-independent voiced-stop-consonant recognition
- Physics
- Proceedings of Southeastcon '93
- 1993
It is concluded that the feature representations produced by Seneff's (1988) auditory model, particularly the mean-rate response representation, are good representations for voiced-stop consonant speech as well as vowel speech, and that the addition of dynamic feature information in the form of differenced cepstral coefficients to the conglomerate mel-cepstral representative vectors made a difference in the recognition rate.
Evaluation of mel-LPC cepstrum in a large vocabulary continuous speech recognition
- Computer Science
- 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221)
- 2001
This paper compares the recognition performance of the mel-LPC cepstrum with that of both the standard LPC mel-cepstrum and the MFCC on a Japanese dictation system with a 20,000-word vocabulary, and finds that its performance is slightly superior to that of MFCC.
A syllable, articulatory-feature, and stress-accent model of speech recognition
- Linguistics
- 2002
Analysis results provide evidence for an alternative approach of speech modeling, one in which the syllable assumes pre-eminent status and is melded to the lower as well as the higher tiers of linguistic representation through the incorporation of prosodic information such as stress accent.
Feature representations and classification procedures for Slovene phoneme recognition
- Computer Science
- Pattern Recognit. Lett.
- 1992
Significance of group delay based acoustic features in the linguistic search space for robust speech recognition
- Physics
- INTERSPEECH
- 2008
In this paper we discuss the complementarity of the group delay features with respect to other conventional acoustic features and also propose the use of such diverse information in the linguistic…
Recognition Of Phonemes In A-Cappella Recordings Using Temporal Patterns And Mel Frequency Cepstral Coefficients
- Computer Science
- 2012
Two alternative classification methods for phonemes in singing, one using Mel-Frequency Cepstral Coefficient features and the other using Temporal Patterns, are combined to create a new type of classifier that performs better than either classifier alone.
Acoustic-Phonetic Feature Based Dialect Identification in Hindi Speech
- Physics
- 2015
A method is presented to identify Hindi dialects and to examine the contribution of different acoustic-phonetic features, with the aim of measuring how well auto-associative neural networks capture non-linear relations specific to information from spectral features.
Modeling lexical tones for mandarin large vocabulary continuous speech recognition
- Computer Science, Physics
- 2006
This dissertation proposes several new strategies for tone modeling and explores their effectiveness in state-of-the-art HMM-based Mandarin large vocabulary speech recognition systems in two domains: conversational telephone speech and broadcast news.
A novel feature transformation for vocal tract length normalization in automatic speech recognition
- Physics
- IEEE Trans. Speech Audio Process.
- 1998
This paper proposes a method to transform acoustic models that have been trained with a certain group of speakers for use on different speech in hidden Markov model based (HMM-based) automatic speech…
References
Evaluation of acoustic parameters for monosyllabic word identification
- Physics
- 1978
Several recent investigations have hypothesized that syllable‐sized segments may be more appropriate units than phoneme‐sized segments for use in continuous speech recognition systems. The…
Recognition of monosyllabic words in continuous sentences using composite word templates
- Linguistics
- ICASSP
- 1978
A modified dynamic programming algorithm is presented that allows building up of reference information from a speaker's productions in the face of variations in acoustic forms induced by variation in the syntactic role of the word in the sentences.
Order dependence in templates for monosyllabic word identification
- Computer Science
- ICASSP
- 1979
The ordering of words during template generation did not significantly affect word identification and the average correct identification in open tests for each speaker was 94.76% and 90.53%, with standard deviations of 0.53%.
Automatic segmentation of speech into syllabic units.
- Linguistics, Physics
- The Journal of the Acoustical Society of America
- 1975
It is suggested that inclusion of alternative fluent‐form syllabifications for multisyllabic words and the use of phonological rules for predicting syllabic contractions can further improve agreement between predicted and experimental syllable counts.
Minimum prediction residual principle applied to speech recognition
- Computer Science
- 1975
A computer system is described in which isolated words, spoken by a designated talker, are recognized through calculation of a minimum prediction residual through optimally registering the reference LPC onto the input autocorrelation coefficients using the dynamic programming algorithm.
Speech recognition experiments with linear prediction, bandpass filtering, and dynamic programming
- Computer Science
- 1976
Automatic speech recognition experiments are described in which several popular preprocessing and classification strategies are compared and it is shown that dynamic programming is of major importance for recognition of polysyllabic words.
Considerations in dynamic time warping algorithms for discrete word recognition
- Computer Science
- 1978
An algorithm in which uncertainty exists in the registration of both the initial and final frames was studied, along with another that constrains the dynamic path to follow the locally minimum path at each frame.
Syllable as a unit of speech recognition
- Linguistics
- 1975
Irregularities in phonetic manifestations of phonemes are discussed and it is argued that the syllable, phonologically redefined, will serve as the effective minimal unit in the time domain.
A phonetic-context controlled strategy for segmentation and phonetic labeling of speech
- Physics
- 1975
The extraction of acoustic cues pertinent to a phonetic feature can be tuned to classes of sounds separated on the basis of other cues, and this serves to increase the reliability of segment labeling.
On creating reference templates for speaker independent recognition of isolated words
- Computer Science
- 1978
A method of combining word patterns from a number of speakers is proposed in which a clustering type of analysis is used to determine which patterns are merged to create a word template.