• Corpus ID: 237532414

NORESQA - A Framework for Speech Quality Assessment using Non-Matching References

Pranay Manocha, Buye Xu, Anurag Kumar
The perceptual task of speech quality assessment (SQA) is a challenging task for machines to do. Objective SQA methods that rely on the availability of the corresponding clean reference have been the primary go-to approaches for SQA. Clearly, these methods fail in real-world scenarios where the ground truth clean references are not available. In recent years, non-intrusive methods that train neural networks to predict ratings or scores have attracted much attention, but they suffer from several… 
1 Citation
REAL-M: Towards Speech Separation on Real Mixtures
The problem of performance evaluation of real-life mixtures, where the ground truth is not available, is addressed by carefully designing a blind Scale-Invariant Signal-to-Noise Ratio (SI-SNR) neural estimator, and it is shown that this estimator reliably evaluates the separation performance on real mixtures.


A Classification-Aided Framework for Non-Intrusive Speech Quality Assessment
This work proposes an utterance-level classification-aided non-intrusive (UCAN) assessment approach that combines the task of quality score classification with the regression task of quality score estimation, and uses a categorical quality ranking as an auxiliary constraint to assist with quality score estimation.
Novel deep autoencoder features for non-intrusive speech quality assessment
Quantification of the experimental results suggests that the proposed metric gives more accurate and better-correlated scores than the ITU-T P.563 standard, an existing benchmark for objective, non-intrusive quality assessment.
Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model based on BLSTM
This study proposes a novel end-to-end, non-intrusive speech quality evaluation model, termed Quality-Net, based on bidirectional long short-term memory, which has potential to be used in a wide variety of applications of speech signal processing.
An Attention Enhanced Multi-Task Model for Objective Speech Assessment in Real-World Environments
  • Xuan Dong, Donald S. Williamson
  • Computer Science
    ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
This work presents a novel reference-less framework called the attention enhanced multi-task speech assessment (AMSA) model, which provides reliable estimates of multiple objective quality and intelligibility measures in simulated and real-world environments.
Intrusive and Non-Intrusive Perceptual Speech Quality Assessment Using a Convolutional Neural Network
A convolutional neural network is proposed to predict the perceived quality of speech with noise, reverberation, and distortions, both intrusively and non-intrusively, i.e., with and without a clean reference signal.
The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework
A large clean speech and noise corpus is open-sourced for training noise suppression models, together with a test set representative of real-world scenarios, consisting of both synthetic and real recordings, and an online subjective test framework based on ITU-T P.808 that lets researchers quickly test their developments.
Non-intrusive Speech Quality Assessment Using Neural Networks
This work presents an investigation of the applicability of neural networks for non-intrusive audio quality assessment, and proposes three neural network-based approaches for mean opinion score (MOS) estimation.
Nonintrusive Quality Assessment of Noise Suppressed Speech With Mel-Filtered Energies and Support Vector Regression
This paper proposes a nonintrusive metric for the quality assessment of noise-suppressed speech and utilizes the sensitivity of filter bank energies (FBEs) to noise in order to obtain an effective representation of speech for quality assessment.
An End-To-End Non-Intrusive Model for Subjective and Objective Real-World Speech Assessment Using a Multi-Task Framework
A novel multi-task non-intrusive approach is proposed that simultaneously estimates both subjective and objective scores of real-world speech to help facilitate learning.
On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement
It is shown that a loss function based on scale-invariant signal-to-distortion ratio (SI-SDR) achieves good general performance across a range of popular speech enhancement evaluation metrics, which suggests that SI-SDR is a good candidate as a general-purpose loss function for speech enhancement systems.
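For orientation, the scale-invariant SDR named in the last entry can be sketched in a few lines. This is a minimal illustration of the commonly used definition, not code from any of the listed papers: the function name `si_sdr`, the mean removal step, and the stabilizing constant `eps` are assumptions made for this example. The estimate is projected onto the reference, and the score is the energy ratio, in dB, between that projection and the residual.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length sample sequences.

    `eps` is a hypothetical stabilizer added for this sketch.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")

    # Remove the mean (a common convention, assumed here) so that a
    # constant offset does not affect the score.
    mean_e = sum(estimate) / len(estimate)
    mean_r = sum(reference) / len(reference)
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]

    # Optimal scaling of the reference: alpha = <e, r> / <r, r>.
    dot = sum(a * b for a, b in zip(e, r))
    ref_energy = sum(b * b for b in r)
    alpha = dot / (ref_energy + eps)

    # Split the estimate into a scaled-reference part and a residual.
    target = [alpha * b for b in r]
    residual = [a - t for a, t in zip(e, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))
```

Because the reference is rescaled before the residual is computed, multiplying the estimate by a constant gain leaves the score essentially unchanged. When used as a training loss, the value is typically negated so that minimizing the loss maximizes SI-SDR.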