In this paper, the hum of a person is used in a voice biometric system. In addition, a recently proposed feature set, i.e., Variable length Teager Energy Based Mel Frequency Cepstral Coefficients (VTMFCC), is found to capture perceptually meaningful source-like information from the hum signal. For person recognition, MFCC gives an EER of 13.14% and a %ID of 64.96%. A…
This paper addresses issues in phonetic transcription for Gujarati and Marathi (Indian languages). Some ad hoc approaches to fixing the relationship between the general alphabetical symbols and phonetic symbols may not always work. Hence, research issues such as the ambiguity between frication and aspirated plosives are addressed in this paper. The anusvara in…
Speaker recognition deals with designing systems that recognize a person from speech with the help of computers. In this paper, the various biometric signals produced by humans, viz., speech, singing and humming, are considered for the person recognition task. A corpus has been developed from 28 subjects in real-life settings. For the person recognition task,…
Landmarks are the time instants in a speech signal that mark important events (such as vowels, consonants and glides). This paper proposes the use of a novel vowel landmark detection (VLD) algorithm for a low-resource language, viz., Gujarati, an Indian language. The proposed VLD method uses speech source information to detect the vowel…
Obstruents are very important acoustical events (i.e., abrupt-consonantal landmarks) in the speech signal. This paper presents the use of a novel Spectral Transition Measure (STM) to locate the obstruents in a continuous speech signal. The problem of obstruent detection involves detecting the phonetic boundaries associated with obstruent sounds. In this…
Obstruents are the key landmark events found in the speech signal. In this paper, we propose the use of the spectral transition measure (STM) to locate the obstruents in continuous speech. The proposed approach does not take into account any prior information (such as the phonetic sequence, speech transcription, or number of obstruents in the speech). Hence, this…
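The last two abstracts rely on a spectral transition measure without spelling it out. As an illustration only, the sketch below uses one commonly cited formulation: the mean squared regression slope of each cepstral coefficient over a short window of frames. The function name `spectral_transition_measure`, the `half_window` parameter, and the edge-padding choice are assumptions made for this example and are not taken from the papers, whose exact STM variant and feature set may differ.

```python
import numpy as np


def spectral_transition_measure(cepstra, half_window=2):
    """Frame-wise STM for a (frames, coefficients) array of cepstral features.

    Hypothetical helper: for each frame, fit a straight line to every
    coefficient over 2 * half_window + 1 neighbouring frames and return
    the mean of the squared slopes. Large values suggest a rapid
    spectral change, such as a possible phonetic boundary.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames = cepstra.shape[0]
    offsets = np.arange(-half_window, half_window + 1)
    denom = np.sum(offsets ** 2)
    # Assumption: repeat the edge frames so every frame has a full window.
    padded = np.pad(cepstra, ((half_window, half_window), (0, 0)), mode="edge")
    stm = np.empty(n_frames)
    for t in range(n_frames):
        window = padded[t : t + 2 * half_window + 1]
        slopes = offsets @ window / denom  # least-squares slope per coefficient
        stm[t] = np.mean(slopes ** 2)
    return stm


# Synthetic example: features jump from 0 to 1 at frame 10.
features = np.vstack([np.zeros((10, 3)), np.ones((10, 3))])
stm = spectral_transition_measure(features)
boundary_frame = int(np.argmax(stm))
```

On this synthetic input the measure is zero in the steady regions and largest at the frames adjacent to the jump, which is the behaviour a boundary detector would threshold or peak-pick. No prior information such as a transcription is needed, in line with the abstract above.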