Vidhyasaharan Sethu

The focus of this paper is speech-based emotion detection using only acoustic data, i.e. without any linguistic or semantic information. In general, however, this approach suffers from the fact that acoustic data is speaker-dependent, which can result in inefficient estimation of the statistics modelled by classifiers such as hidden Markov models. …
Speech is a promising modality for the convenient measurement of cognitive load, and recent years have seen the development of several cognitive load classification systems. Many of these systems have utilised mel frequency cepstral coefficients (MFCCs) and prosodic features such as pitch and intensity to discriminate between different cognitive load levels. …
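
The MFCC-plus-prosody front end mentioned above can be sketched as follows. This is a generic illustration assuming librosa; the feature count, pitch range and sample rate are assumptions for the sketch, not the configuration used in the paper.

import librosa
import numpy as np

def extract_load_features(wav_path, n_mfcc=13):
    # Load audio as mono at 16 kHz (assumed sample rate).
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, n_frames)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)            # frame-level pitch track
    rms = librosa.feature.rms(y=y)[0]                        # frame-level intensity proxy
    # Trim to a common number of frames before stacking.
    n = min(mfcc.shape[1], len(f0), len(rms))
    return np.vstack([mfcc[:, :n], f0[:n], rms[:n]])         # (n_mfcc + 2, n)
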
Quantifying behavioural changes in depression using affective computing techniques is the first step in developing an objective diagnostic aid, with clinical utility, for clinical depression. As part of the AVEC 2013 Challenge, we present a multimodal approach for the Depression Sub-Challenge using a GMM-UBM system with three different kernels for the audio …
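
For reference, a GMM-UBM back end of the kind named above can be sketched as follows: a universal background model is trained on pooled frames, its component means are MAP-adapted to each recording, and the stacked adapted means serve as that recording's representation. This is a minimal illustration using scikit-learn, assuming diagonal covariances and a relevance factor of 16; it is not the exact system or kernels from the paper.

import numpy as np
from sklearn.mixture import GaussianMixture

def train_ubm(pooled_frames, n_components=64):
    # pooled_frames: (total_frames, dim) acoustic features pooled over all recordings.
    ubm = GaussianMixture(n_components=n_components, covariance_type='diag')
    ubm.fit(pooled_frames)
    return ubm

def map_adapted_supervector(ubm, frames, relevance=16.0):
    # Zeroth- and first-order statistics for one recording.
    post = ubm.predict_proba(frames)                  # (n_frames, n_components)
    n_k = post.sum(axis=0)                            # soft counts per component
    f_k = post.T @ frames                             # (n_components, dim)
    # MAP adaptation of the component means towards the recording's statistics.
    alpha = (n_k / (n_k + relevance))[:, None]
    mu_hat = alpha * (f_k / np.maximum(n_k, 1e-8)[:, None]) + (1.0 - alpha) * ubm.means_
    return mu_hat.ravel()                             # supervector representation
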
This paper focuses on speech-based emotion classification utilizing acoustic data. The most commonly used acoustic features are pitch and energy, along with prosodic information such as rate of speech. We propose the use of a novel feature based on the instantaneous frequency of the speech signal, in addition to the aforementioned features, in order to take …
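
The abstract is truncated before the feature is defined, but one standard way to obtain an instantaneous-frequency signal is via the analytic signal (Hilbert transform). The sketch below shows that generic computation and should not be read as the paper's actual feature.

import numpy as np
from scipy.signal import hilbert

def instantaneous_frequency(y, sr):
    # Analytic signal via the Hilbert transform.
    analytic = hilbert(y)
    # Unwrapped instantaneous phase, differentiated to give frequency in Hz.
    phase = np.unwrap(np.angle(analytic))
    return np.diff(phase) * sr / (2.0 * np.pi)
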
Variations in the acoustic space due to changes in speaker mental state are potentially overshadowed by variability due to speaker identity and phonetic content. Using the Audio/Visual Emotion Challenge and Workshop 2013 Depression Dataset, we explore the suitability of i-vectors for reducing these latter sources of variability for distinguishing between low …
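
In i-vector modelling, a recording's GMM mean supervector is represented as M = m + Tw, where m is the UBM supervector, T is a low-rank total variability matrix and w is the i-vector. Training T is too involved for a snippet, but a minimal back-end sketch, assuming i-vectors have already been extracted, is length normalisation followed by cosine scoring:

import numpy as np

def length_normalise(w):
    return w / np.linalg.norm(w)

def cosine_score(w_test, w_reference):
    # Similarity between two i-vectors after length normalisation.
    return float(np.dot(length_normalise(w_test), length_normalise(w_reference)))
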
Continuous emotion dimension prediction has increased in popularity over the last few years, as the shift away from discrete classification-based tasks has introduced more realism into emotion modeling. However, many questions remain, including how best to combine information from several modalities (e.g. audio, video). As part of the AV+EC 2015 …
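
One common baseline for the modality-combination question raised above is decision-level (late) fusion: a regressor is trained per modality and a simple model then learns weights over their outputs. The sketch below, using scikit-learn, is illustrative only and is not the fusion scheme from the AV+EC 2015 entry.

import numpy as np
from sklearn.linear_model import LinearRegression

def train_late_fusion(per_modality_predictions, targets):
    # per_modality_predictions: list of (n_frames,) arrays, one per modality (e.g. audio, video).
    stacked = np.column_stack(per_modality_predictions)
    return LinearRegression().fit(stacked, targets)

def fuse(fusion_model, per_modality_predictions):
    return fusion_model.predict(np.column_stack(per_modality_predictions))
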
Alterations in speech motor control in depressed individuals have been found to manifest as a reduction in spectral variability. In this paper we present a novel method for measuring acoustic volume, a model-based measure that reflects this decrease in spectral variability, and assess the ability of the resulting features to index a …
None of the features commonly utilised in automatic emotion classification systems completely dissociates emotion-specific information from speaker-specific information. Consequently, this speaker-specific variability adversely affects the performance of the emotion classification system, and in existing systems it is frequently mitigated by some form of …
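
The abstract is cut off before the mitigation is named, but one widely used option is per-speaker feature normalisation, where each speaker's features are z-normalised with that speaker's own statistics so speaker-specific offsets are reduced before classification. The sketch below illustrates that generic idea and is not presented as the paper's method.

import numpy as np

def per_speaker_znorm(features_by_speaker):
    # features_by_speaker: dict mapping speaker id -> (n_frames, dim) feature array.
    normalised = {}
    for speaker, feats in features_by_speaker.items():
        mu = feats.mean(axis=0)
        sigma = feats.std(axis=0) + 1e-8      # avoid division by zero
        normalised[speaker] = (feats - mu) / sigma
    return normalised
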
Spectral and excitation features, commonly used in automatic emotion classification systems, parameterise different aspects of the speech signal. This paper groups these features as speech production cues, broad spectral measures and detailed spectral measures, and examines how their performance differs in both speaker-dependent and speaker-independent …