Prashanth Gurunath Shivakumar

Native language identification from acoustic signals of L2 speakers can be useful in a range of applications, such as informing automatic speech recognition (ASR), speaker recognition, and speech biometrics. In this paper, we follow a multistream and multi-rate approach to native language identification across feature extraction, classification, and fusion. On ...
Developing a robust Automatic Speech Recognition (ASR) system for children is a challenging task because of increased variability in acoustic and linguistic correlates as a function of young age. The acoustic variability is mainly due to the developmental changes associated with vocal tract growth. On the linguistic side, the variability is associated with ...
We propose a simplified and supervised i-vector modeling scheme for the speaker age regression task. The supervised i-vector is obtained by concatenating the label vector and the linear regression matrix at the end of the mean super-vector and the i-vector factor loading matrix, respectively. Different label vector designs are proposed to increase the ...
Automatic classification of depression using audiovisual cues can support its objective diagnosis. In this paper, we present a multimodal depression classification system as part of the 2016 Audio/Visual Emotion Challenge and Workshop (AVEC2016). We investigate a number of audio and video features for classification with different fusion techniques ...
Speech enhancement is a challenging and important area of research because many applications depend on improved signal quality. It is a pre-processing step in speech processing systems and is also used to perceptually improve the quality of speech for human listeners. With recent advances in Deep Neural Networks (DNNs), deep Denoising Auto-Encoders have proved to be ...