Seyedmahdad Mirsamadi

Automatic emotion recognition from speech is a challenging task which relies heavily on the effectiveness of the speech features used for classification. In this work, we study the use of deep learning to automatically discover emotionally relevant features from speech. It is shown that using a deep recurrent neural network, we can learn both the short-time …
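As a rough illustration only, the kind of recurrent model this abstract alludes to might look like the following PyTorch sketch; the layer sizes, mean-pooling choice, and four-class output below are assumptions, not the paper's configuration.

```python
# Hypothetical sketch: a recurrent model that learns utterance-level emotion
# representations from frame-level acoustic features (e.g., log-Mel frames).
# All dimensions and the class count are illustrative assumptions.
import torch
import torch.nn as nn

class EmotionRNN(nn.Module):
    def __init__(self, n_features=40, hidden=128, n_classes=4):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, frames):          # frames: (batch, time, n_features)
        outputs, _ = self.rnn(frames)   # short-time context for each frame
        pooled = outputs.mean(dim=1)    # aggregate frames into one utterance vector
        return self.classifier(pooled)  # utterance-level emotion logits

model = EmotionRNN()
logits = model(torch.randn(8, 300, 40))  # 8 utterances, 300 frames each
```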
Even though deep neural network acoustic models provide an increased degree of robustness in automatic speech recognition, there is still a large performance drop in the task of far-field speech recognition in reverberant and noisy environments. In this study, we explore DNN adaptation techniques to achieve improved robustness to environmental mismatch for …
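One classic family of DNN adaptation techniques trains a small linear transform at the input of a frozen pretrained acoustic model; whether this particular variant is among the techniques the paper explores is an assumption. A minimal sketch:

```python
# Hypothetical sketch of linear-input-network (LIN) style adaptation:
# freeze a pretrained acoustic model and train only a small linear layer
# that maps far-field features toward the clean-speech input distribution.
import torch
import torch.nn as nn

feat_dim = 40
acoustic_model = nn.Sequential(            # stand-in for a pretrained DNN
    nn.Linear(feat_dim, 512), nn.ReLU(),
    nn.Linear(512, 2000),                  # senone posteriors (size assumed)
)
for p in acoustic_model.parameters():
    p.requires_grad = False                # keep the pretrained weights fixed

lin = nn.Linear(feat_dim, feat_dim)        # the only adapted parameters
nn.init.eye_(lin.weight)                   # start from an identity mapping
nn.init.zeros_(lin.bias)

optimizer = torch.optim.SGD(lin.parameters(), lr=1e-3)
farfield_feats = torch.randn(32, feat_dim) # a batch of mismatched features
targets = torch.randint(0, 2000, (32,))
loss = nn.functional.cross_entropy(acoustic_model(lin(farfield_feats)), targets)
loss.backward()
optimizer.step()                           # only the adaptation layer moves
```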
Room reverberation and environmental noise present challenges for the integration of speech recognition technology in smart room applications. We present a multichannel enhancement framework for distributed microphone arrays to mitigate the effects of both additive noise and reverberation on distant-talking microphones. The proposed approach uses techniques of …
Room reverberation is a primary cause of failure in distant speech recognition (DSR) systems. In this study, we present a multichannel spectrum enhancement method for reverberant speech recognition, which is an extension of a single-channel dereverberation algorithm based on convolutive nonnegative matrix factorization (NMF). The generalization to a …
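For context, a common magnitude-domain form of the convolutive model behind such NMF-based dereverberation methods is shown below; the paper's exact formulation and update rules may differ.

```latex
% A common convolutive model for NMF dereverberation (illustrative form):
% the reverberant power spectrum is modeled, per frequency bin f, as the
% clean power spectrum convolved across frame lags d with the room
% response energy, and the clean spectrum is further factorized into
% nonnegative bases W and activations A estimated by multiplicative updates.
\[
  |X(f,t)|^{2} \;\approx\; \sum_{d=0}^{D-1} |H(f,d)|^{2}\, |S(f,t-d)|^{2},
  \qquad
  |S(f,t)|^{2} \;\approx\; \sum_{k} W(f,k)\, A(k,t).
\]
```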
The problem of single-channel speech enhancement has been traditionally addressed by using statistical signal processing algorithms that are designed to suppress time-frequency regions affected by noise. We study an alternative data-driven approach which uses deep neural networks (DNNs) to learn the transformation from noisy and reverberant speech to clean …
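A minimal sketch of such a mapping network, assuming log-spectral input frames stacked with context and mean-squared-error regression (the sizes and context width are illustrative, not the paper's):

```python
# Hypothetical sketch of the data-driven alternative: a feedforward DNN
# trained to map noisy/reverberant log-spectral frames (with context) to
# the corresponding clean frames.
import torch
import torch.nn as nn

n_bins, context = 257, 5                   # spectral bins, +/- context frames
in_dim = n_bins * (2 * context + 1)

enhancer = nn.Sequential(
    nn.Linear(in_dim, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, n_bins),               # predict one clean frame
)

noisy = torch.randn(64, in_dim)            # stacked noisy context windows
clean = torch.randn(64, n_bins)            # aligned clean target frames
loss = nn.functional.mse_loss(enhancer(noisy), clean)
loss.backward()                            # standard regression training step
```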
Multichannel blind deconvolution (MCBD) algorithms are known to suffer from high computational complexity, which makes them impractical for blind source separation (BSS) of speech and audio signals. This problem is even more serious with noncausal MCBD algorithms, which must be used in many frequently occurring BSS setups. In this paper, we …
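As a generic illustration of where such savings typically come from (this is standard FFT-based convolution, not the paper's algorithm), applying a long demixing filter in the frequency domain replaces O(NL) time-domain multiply-adds with O(N log L):

```python
# Generic illustration: applying a long demixing filter via FFT-based
# convolution instead of direct time-domain convolution. The filter length
# and signal are arbitrary stand-ins.
import numpy as np
from scipy.signal import fftconvolve

L = 4096                                   # demixing filter length (assumed)
x = np.random.randn(16000)                 # one channel of mixture audio
w = np.random.randn(L)                     # one separation filter

y_direct = np.convolve(x, w)               # O(N * L) multiply-adds
y_fft = fftconvolve(x, w)                  # O(N log L) via the FFT

print(np.allclose(y_direct, y_fft))        # same output, far cheaper
```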
Automatic speech recognition (ASR) using distant far-field microphones is a challenging task, in which room reverberation is one of the primary causes of performance degradation. This study proposes a multichannel spectral enhancement method for reverberation-robust ASR using distributed microphones. The proposed method uses the techniques of nonnegative …
Voice activity detectors (VADs) are important components in audio processing algorithms. In general, VADs are two-way classifiers that flag the audio frames containing voice activity. Most of them are based on the signal energy and build statistical models of the noise background and the speech signal. In the process of derivation, we are limited to …
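A minimal sketch of the classical energy-based scheme the abstract describes, with a noise-floor estimate updated during non-speech frames; the thresholds and smoothing constants below are assumptions:

```python
# Hypothetical energy-based VAD: track a running noise-floor estimate and
# flag frames whose energy exceeds it by a fixed margin.
import numpy as np

def energy_vad(signal, frame_len=400, hop=160, margin_db=6.0, alpha=0.95):
    flags = []
    noise_floor = None
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len]
        energy_db = 10 * np.log10(np.mean(frame ** 2) + 1e-12)
        if noise_floor is None:
            noise_floor = energy_db                  # initialize from first frame
        is_speech = energy_db > noise_floor + margin_db
        if not is_speech:                            # update noise model in silence
            noise_floor = alpha * noise_floor + (1 - alpha) * energy_db
        flags.append(is_speech)
    return np.array(flags)

flags = energy_vad(np.random.randn(16000) * 0.01)    # mostly "noise" frames
```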
Recognition of distant (far-field) speech is a challenge for ASR due to the mismatch in recording conditions caused by room reverberation and environmental noise. Given the remarkable learning capacity of deep neural networks, there is increasing interest in addressing this problem by using a large corpus of reverberant far-field speech to train robust models. …
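A common data-side recipe for building such a corpus is to convolve clean utterances with room impulse responses and add noise at a chosen SNR; whether the paper uses simulated or real far-field data is not stated in this excerpt. A sketch:

```python
# Hypothetical sketch: simulate far-field training audio by convolving a
# clean utterance with a room impulse response (RIR) and adding noise.
# The SNR, RIR shape, and signals are illustrative stand-ins.
import numpy as np
from scipy.signal import fftconvolve

def reverberate(clean, rir, noise, snr_db=15.0):
    wet = fftconvolve(clean, rir)[: len(clean)]       # apply room reverberation
    noise = noise[: len(wet)]
    scale = np.sqrt(np.mean(wet ** 2) /
                    (np.mean(noise ** 2) * 10 ** (snr_db / 10) + 1e-12))
    return wet + scale * noise                        # add environment noise

clean = np.random.randn(16000)                        # stand-in clean utterance
rir = np.random.randn(2000) * np.exp(-np.arange(2000) / 400)  # decaying RIR
farfield = reverberate(clean, rir, np.random.randn(16000))
```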