Maulik C. Madhavi

This paper addresses phonetic transcription issues in Gujarati and Marathi, two Indian languages. Ad hoc approaches that fix the relationship between the general alphabetical symbols and phonetic symbols may not always work. Hence, research issues such as the ambiguity between frication and aspirated plosives are addressed in this paper. The anusvara in…
In this paper, the hum of a person is used in a voice biometric system. In addition, a recently proposed feature set, Variable length Teager Energy Based Mel Frequency Cepstral Coefficients (VTMFCC), is found to capture perceptually meaningful source-like information from the hum signal. For person recognition, MFCC gives an equal error rate (EER) of 13.14% and an identification rate (%ID) of 64.96%. A…
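The EER quoted above is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be estimated from verification scores follows; the function name `equal_error_rate` and the toy scores are assumptions for illustration, not the paper's evaluation code, and higher scores are assumed to mean "more likely genuine".

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for similarity scores (higher = more likely genuine)."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    best_gap, eer, best_thr = np.inf, 1.0, thresholds[0]
    for thr in thresholds:
        far = np.mean(impostor >= thr)  # impostor trials wrongly accepted
        frr = np.mean(genuine < thr)    # genuine trials wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer, best_thr = gap, (far + frr) / 2.0, thr
    return eer, best_thr

# Toy scores (hypothetical): one genuine trial scores low, one impostor scores high.
eer, thr = equal_error_rate([0.9, 0.8, 0.3, 0.6], [0.1, 0.2, 0.7, 0.4])
print(f"EER = {eer:.2%} at threshold {thr}")
```

With few trials the two error curves rarely cross exactly, so the sketch averages them at the closest point; interpolating between thresholds is a common refinement.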
Landmarks are time instants in a speech signal that mark important events (such as vowels, consonants and glides). This paper proposes a novel vowel landmark detection (VLD) algorithm for a low-resource language, viz., Gujarati, an Indian language. The proposed VLD method uses speech source information to detect the vowel…
Obstruents are very important acoustical events (i.e., abrupt-consonantal landmarks) in the speech signal. This paper presents the use of a novel Spectral Transition Measure (STM) to locate obstruents in continuous speech. The problem of obstruent detection involves detecting the phonetic boundaries associated with obstruent sounds. In this…
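For orientation, the classical spectral transition measure averages the squared regression slopes of each cepstral coefficient over a short window of frames, so it peaks where the spectrum changes quickly. The sketch below implements that classical form only; the abstract describes its STM as novel, so the paper's variant may differ. The function name, the `half_window` parameter, and the edge padding are assumptions.

```python
import numpy as np

def spectral_transition_measure(features, half_window=2):
    """features: (num_frames, num_coeffs) array, e.g. MFCCs. Returns one STM value per frame."""
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    offsets = np.arange(-half_window, half_window + 1)
    denom = np.sum(offsets ** 2)
    # Repeat the edge frames so every frame has a full regression window.
    padded = np.pad(features, ((half_window, half_window), (0, 0)), mode="edge")
    slopes = np.zeros_like(features)
    for k in offsets:
        # Row n of this slice is the frame at position n + k.
        slopes += k * padded[half_window + k : half_window + k + num_frames]
    slopes /= denom
    # Mean squared slope across coefficients.
    return np.mean(slopes ** 2, axis=1)

# Toy input (hypothetical): a single coefficient that jumps at frame 10.
step = np.concatenate([np.zeros(10), np.ones(10)])[:, None]
stm = spectral_transition_measure(step)
print(int(np.argmax(stm)))  # the peak lies at the frames adjacent to the jump
```

Candidate boundaries would then be taken at local maxima of the STM contour; how peaks are selected and thresholded is a separate design choice not shown here.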
Query-by-Example Spoken Term Detection (QbE-STD) under low-resource settings is the task of retrieving spoken content using an audio example as the query. The search phase involves computationally intensive Dynamic Time Warping (DTW)-based matching techniques. Search space reduction is therefore important in order to reduce the space of searching and…
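To illustrate why DTW-based matching is costly, here is a minimal sketch of textbook DTW between two feature sequences. It fills a table with one cell per (query frame, reference frame) pair, so the cost grows with the product of the two lengths. QbE-STD systems typically use subsequence variants of DTW; this plain full-sequence version, the function name `dtw_distance`, and the toy sequences are assumptions for illustration.

```python
import numpy as np

def dtw_distance(query, reference):
    """Accumulated DTW cost between two (num_frames, num_dims) feature sequences."""
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n, m = len(query), len(reference)
    # Euclidean distance between every query frame and every reference frame.
    local = np.linalg.norm(query[:, None, :] - reference[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three allowed predecessor paths.
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m]

# Toy sequences (hypothetical): the second is a time-stretched copy of the first.
a = [[0.0], [1.0], [2.0]]
b = [[0.0], [0.0], [1.0], [2.0], [2.0]]
print(dtw_distance(a, b))
```

Because every query must be compared against every candidate region of the archive in this way, pruning candidates before running DTW directly reduces the number of tables that have to be filled.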
Obstruents are key landmark events in the speech signal. In this paper, we propose the use of the spectral transition measure (STM) to locate obstruents in continuous speech. The proposed approach does not take into account any prior information (such as the phonetic sequence, the speech transcription, or the number of obstruents in the speech). Hence this…