Prasanta Kumar Ghosh

We propose a novel long-term signal variability (LTSV) measure, which describes the degree of nonstationarity of the signal. We analyze the LTSV measure both analytically and empirically for speech and various stationary and nonstationary noises. Based on the analysis, we find that the LTSV measure can be used to discriminate noise from a noisy speech signal …
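A minimal sketch of how such a long-term variability statistic could be computed from a short-time spectrogram is shown below, assuming an entropy-then-variance formulation over the last R analysis frames of each frequency bin; the window length R, the FFT parameters, and any decision threshold are illustrative choices, not the exact parameters of the proposed measure.

```python
import numpy as np
from scipy.signal import stft

def ltsv(signal, fs, R=30, nperseg=512, noverlap=256):
    """Long-term signal variability sketch: for each frequency bin,
    compute the entropy of the spectral power normalized over the last
    R frames, then take the variance of those entropies across frequency.
    Low values suggest a stationary (noise-like) region, high values a
    nonstationary (speech-like) region."""
    _, _, Z = stft(signal, fs=fs, nperseg=nperseg, noverlap=noverlap)
    S = np.abs(Z) ** 2 + 1e-12          # power spectrogram, K bins x M frames
    K, M = S.shape
    scores = np.full(M, np.nan)
    for m in range(R - 1, M):
        block = S[:, m - R + 1:m + 1]                  # last R frames per bin
        p = block / block.sum(axis=1, keepdims=True)   # normalize over time
        entropy = -(p * np.log(p)).sum(axis=1)         # one entropy per bin
        scores[m] = entropy.var()                      # variance across bins
    return scores
```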
The many-to-one mapping from representations in the speech articulatory space to acoustic space renders the associated acoustic-to-articulatory inverse mapping non-unique. Among various techniques, imposing smoothness constraints on the articulator trajectories is a common approach to handling the non-uniqueness in the acoustic-to-articulatory …
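The toy sketch below illustrates the smoothness idea only, not the specific constraint used in this work: it selects one articulatory candidate per frame from a hypothetical set of acoustically plausible candidates by dynamic programming, penalizing large frame-to-frame jumps. The candidate set, the squared-jump penalty, and the weight `lam` are assumptions made for the example.

```python
import numpy as np

def smooth_path(candidates, acoustic_cost, lam=1.0):
    """Choose one articulatory candidate per frame so that the total
    acoustic mismatch plus a smoothness penalty on frame-to-frame jumps
    is minimized (Viterbi-style dynamic programming).

    candidates   : (T, N, D) array, N candidate articulatory vectors per frame
    acoustic_cost: (T, N) array, mismatch of each candidate with the acoustics
    lam          : weight of the smoothness (squared-jump) penalty
    """
    T, N, D = candidates.shape
    cost = acoustic_cost[0].copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        # squared distance between every candidate at t-1 and every one at t
        jump = ((candidates[t - 1, :, None, :] - candidates[t, None, :, :]) ** 2).sum(-1)
        total = cost[:, None] + lam * jump + acoustic_cost[t][None, :]
        back[t] = total.argmin(axis=0)
        cost = total.min(axis=0)
    # trace back the lowest-cost sequence of candidate indices
    path = [int(cost.argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return np.stack([candidates[t, path[t]] for t in range(T)])
```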
We present MRI-TIMIT: a large-scale database of synchronized audio and real-time magnetic resonance imaging (rtMRI) data for speech research. The database currently consists of speech data acquired from two male and two female speakers of American English. Subjects’ upper airways were imaged in the midsagittal plane while reading the same 460-sentence …
USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American …
Acoustic-to-articulatory inversion is usually done in a subject-dependent manner, i.e., the inversion procedure may not work well if parallel acoustic and articulatory training data are not available from the subjects in the test set. In this paper, we propose a subject-independent acoustic-to-articulatory inversion procedure; the proposed scheme …
To investigate the ameliorative potential of sodium selenite and zinc sulfate on intensive-swimming-induced testicular disorders, 48 male Wistar rats (age, 4 months; mass, 146.2 ± 3.6 g) were randomly divided into 4 groups: the unexercised-control group (n = 12); the exercised group (n = 12); the control supplemented group (n = 12); and the exercised …
An automatic speech recognition approach is presented which uses articulatory features estimated by a subject-independent acoustic-to-articulatory inversion. The inversion allows estimation of articulatory features from any talker's speech acoustics using only an exemplary subject's articulatory-to-acoustic map. Results are reported on a broad class …
We propose a glottal source estimation method robust to shimmer and jitter in the glottal flow. The proposed estimation method is based on a joint source-filter optimization technique. The glottal source is modeled by the Liljencrants–Fant (LF) model and the vocal-tract filter is modeled by an auto-regressive filter, which is common in the source-filter …
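For orientation, the sketch below shows only the classical linear-prediction inverse-filtering baseline (fit an auto-regressive vocal-tract filter, then inverse-filter the frame to obtain a source residual), not the joint LF-model/filter optimization proposed here; the prediction order and windowing are illustrative choices.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def inverse_filter_glottal(frame, order=18):
    """Estimate a rough glottal-source (residual) signal by linear
    prediction: fit an all-pole (auto-regressive) vocal-tract filter to a
    voiced frame and inverse-filter the frame with it.  This is the
    classical baseline, not a joint LF-model/filter optimization."""
    x = frame * np.hanning(len(frame))                 # taper the analysis frame
    r = np.correlate(x, x, mode="full")[len(x) - 1:]   # autocorrelation sequence
    a = solve_toeplitz(r[:order], r[1:order + 1])      # Yule-Walker equations
    A = np.concatenate(([1.0], -a))                    # inverse filter 1 - sum a_k z^-k
    residual = lfilter(A, [1.0], x)                    # approximate source signal
    return residual, A
```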
We propose a dynamic programming (DP) based piecewise polynomial approximation of discrete data such that the L2 norm of the approximation error is minimized. We apply this technique to the stylization of the speech pitch contour. Objective evaluation verifies that the DP-based technique indeed yields the minimum mean square error (MSE) compared …
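A minimal sketch of one possible DP formulation is given below, assuming a fixed number of segments, a fixed polynomial degree per segment, and a minimum segment length; the exact cost definition and knot handling of the proposed method may differ.

```python
import numpy as np

def dp_piecewise_poly(y, num_segments, degree=2, min_len=4):
    """Dynamic-programming segmentation of a contour y into a fixed number
    of segments, fitting a least-squares polynomial of the given degree on
    each segment and minimizing the total squared error.  Returns the
    segment boundaries and the stylized (reconstructed) contour."""
    n = len(y)
    x = np.arange(n)

    def seg_err(i, j):          # squared error of a polynomial fit on y[i:j]
        coeffs = np.polyfit(x[i:j], y[i:j], degree)
        return float(((np.polyval(coeffs, x[i:j]) - y[i:j]) ** 2).sum())

    INF = np.inf
    # cost[k][j] = best total error when k segments cover y[:j]
    cost = np.full((num_segments + 1, n + 1), INF)
    back = np.zeros((num_segments + 1, n + 1), dtype=int)
    cost[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for j in range(k * min_len, n + 1):
            for i in range((k - 1) * min_len, j - min_len + 1):
                if cost[k - 1][i] == INF:
                    continue
                c = cost[k - 1][i] + seg_err(i, j)
                if c < cost[k][j]:
                    cost[k][j], back[k][j] = c, i
    # recover the boundaries and rebuild the stylized contour
    bounds, j = [n], n
    for k in range(num_segments, 0, -1):
        j = back[k][j]
        bounds.append(j)
    bounds.reverse()
    styl = np.empty(n)
    for i, j in zip(bounds[:-1], bounds[1:]):
        coeffs = np.polyfit(x[i:j], y[i:j], degree)
        styl[i:j] = np.polyval(coeffs, x[i:j])
    return bounds, styl
```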
We propose a practical, feature-level fusion approach for speaker verification using information from both acoustic and articulatory signals. We find that concatenating articulation features obtained from actual speech production data with conventional Mel-frequency cepstral coefficients (MFCCs) improves the overall speaker verification performance. …
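As a rough illustration of feature-level fusion, the sketch below concatenates frame-level MFCCs with time-aligned articulatory features and scores a claimed speaker with a GMM against a background model; the librosa/scikit-learn calls, the frame alignment, and the GMM configuration are assumptions made for the example, not the evaluation setup used here.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def fused_features(wav_path, artic, sr=16000, n_mfcc=13):
    """Concatenate frame-level MFCCs with articulatory features; `artic`
    is assumed to be an (n_frames, n_artic) array already synchronized
    with the MFCC frame rate."""
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # frames x n_mfcc
    n = min(len(mfcc), len(artic))
    return np.hstack([mfcc[:n], artic[:n]])

def train_speaker_model(feature_list, n_components=32):
    """Fit one diagonal-covariance GMM per speaker on fused features."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
    gmm.fit(np.vstack(feature_list))
    return gmm

def verify(speaker_gmm, background_gmm, features):
    """Average log-likelihood ratio of the claimed-speaker model against a
    background model; higher scores favor acceptance."""
    return speaker_gmm.score(features) - background_gmm.score(features)
```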