S. R. Mahadeva Prasanna

The vowel onset point (VOP) is the instant at which the onset of a vowel takes place during speech production. Significant changes occur in the energies of the excitation source, spectral peaks, and modulation spectrum at the VOP. This paper demonstrates the independent use of each of these three energies in detecting the VOPs. Since each of these(More)
This paper proposes a text-dependent (fixed-text) speaker verification system which uses different types of information for making a decision regarding the identity claim of a speaker. The baseline system uses the dynamic time warping (DTW) technique for matching. Detection of the end-points of an utterance is crucial for the performance of the DTW-based(More)
In this paper, through different experimental studies we demonstrate that the excitation component of speech can be exploited for speaker recognition studies. Linear prediction (LP) residual is used as a representation of excitation information in speech. The speaker-specific information in the excitation of voiced speech is captured using the(More)
Vowel-like regions (VLRs) in speech include vowel, semi-vowel, and diphthong sound units. A VLR can be identified using a vowel-like region onset point (VLROP) event. By production, the VLR has impulse-like excitation and therefore information about the vocal tract system may be better manifested in it. Also, the VLR is a relatively high signal-to-noise(More)
This paper proposes an approach for processing speech from multiple microphones to enhance speech degraded by noise and reverberation. The approach is based on exploiting the features of the excitation source in speech production. In particular, the characteristics of voiced speech can be used to derive a coherently added signal from the linear prediction(More)
The paper proposes a method for the extraction of pitch in adverse conditions. The real environment, in which degradation is due to several unpredictable sources, like additive noise, reverberation and channel noise, is treated as an adverse condition. The proposed method is based on knowledge of glottal closure (GC) events. A GC event is the instant at(More)
In this paper, we discuss a consortium effort on building text-to-speech (TTS) systems for 13 Indian languages. There are about 1652 Indian languages. A unified framework is therefore required for building TTS systems for Indian languages. As Indian languages are syllable-timed, a syllable-based framework is developed. As quality of speech synthesis is(More)
The objective of the present work is to provide a detailed review of expressive speech synthesis (ESS). Among various approaches for ESS, the present paper focuses on the development of ESS systems by explicit control. In this approach, ESS is achieved by modifying the parameters of the neutral speech which is synthesized from the text. The present paper(More)