Meet H. Soni

Deep Neural Networks (DNNs) have been used extensively in Automatic Speech Recognition (ASR) applications. Very recently, DNNs have also been applied to detecting natural vs. spoofed speech in the ASVspoof challenge held at INTERSPEECH 2015. Along similar lines, in this work, we propose a new DNN feature extraction architecture called the subband ...
In this paper, we propose a novel Deep Neural Network (DNN) architecture for feature extraction, namely, the subband autoencoder (SBAE). The proposed architecture is inspired by the Human Auditory System (HAS) and extracts features from the speech spectrum in an unsupervised manner. We have used features extracted by this architecture for non-intrusive objective ...
In this paper, we propose to use a modified Gammatone filterbank with the Teager Energy Operator (TEO) for the environmental sound classification (ESC) task. The TEO can track energy as a function of both the amplitude and the frequency of an audio signal. The TEO is better suited to capturing energy variations in a signal that is produced by a real physical system, such as ...
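The claim that the TEO tracks both amplitude and frequency can be checked with the standard discrete-time form of the operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. For a pure sinusoid A*cos(omega*n + phase) this evaluates to A^2 * sin(omega)^2 at every sample. The sketch below is only an illustration of that textbook operator; the function names and test signals are assumptions made here, and it does not reproduce the modified Gammatone filterbank pipeline from the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator on interior samples.

    psi[n] = x[n]^2 - x[n-1] * x[n+1]  (textbook form, not the paper's full pipeline)
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def sinusoid(amplitude, omega, length, phase=0.0):
    """Hypothetical test signal: amplitude * cos(omega * n + phase)."""
    return [amplitude * math.cos(omega * n + phase) for n in range(length)]


# Same amplitude, different frequency: the TEO output grows with frequency.
low = teager_energy(sinusoid(1.0, 0.1, 64))
high = teager_energy(sinusoid(1.0, 0.4, 64))

# Same frequency, doubled amplitude: the TEO output grows with amplitude squared.
loud = teager_energy(sinusoid(2.0, 0.1, 64))

print(low[0], high[0], loud[0])
```

A plain squared-amplitude energy measure would give the same value for `low` and `high`; the TEO separates them because its output depends on frequency as well.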
In this paper, we propose a new quality assessment method for synthesized speech. Unlike previous approaches, which use a Hidden Markov Model (HMM) trained on natural utterances as a reference model to predict the quality of synthesized speech, the proposed approach uses knowledge about synthesized speech while training the model. The previous approach has been ...
The Ideal Ratio Mask (IRM) has proven to be a very effective tool in many applications, such as speech segregation, speech enhancement for hearing aid design, and noise-robust speech recognition. The IRM provides information regarding the amount of signal power at each Time-Frequency (T-F) unit in a given signal-plus-noise mixture. In this paper, we ...
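As a rough illustration of what the IRM encodes, the sketch below uses one commonly cited definition: at each T-F unit, the mask is the ratio of speech power to speech-plus-noise power, raised to an exponent beta (often 0.5). The truncated abstract does not state which definition the paper uses, so the formula, the function name `ideal_ratio_mask`, the `beta` and `eps` parameters, and the toy power values are all assumptions made for illustration.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5, eps=1e-12):
    """Compute an IRM over a grid of T-F units.

    speech_power, noise_power: equally shaped lists of rows (time frames),
    each row holding per-frequency-channel power values.
    mask = (S / (S + N)) ** beta; eps guards against division by zero.
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        mask.append([(s / (s + n + eps)) ** beta for s, n in zip(s_row, n_row)])
    return mask


# Toy 2x2 example (hypothetical values): rows are frames, columns are channels.
speech_power = [[4.0, 0.0], [1.0, 9.0]]
noise_power = [[0.0, 1.0], [1.0, 1.0]]

mask = ideal_ratio_mask(speech_power, noise_power)
for row in mask:
    print([round(v, 3) for v in row])
```

Values near 1 mark units dominated by speech and values near 0 mark units dominated by noise, which is why such a mask is useful for segregation and enhancement.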
To emulate human perception in quality assessment, an objective metric or assessment method is required, and designing one is a challenging task. Moreover, assessing the quality of speech without any reference or ground truth is more difficult still. In this paper, we propose a new non-intrusive speech quality assessment metric for objective evaluation of ...