Blind Speech Signal Quality Estimation for Speaker Verification Systems
@inproceedings{Lavrentyeva2020BlindSS, title={Blind Speech Signal Quality Estimation for Speaker Verification Systems}, author={Galina Lavrentyeva and Marina Volkova and Anastasia Avdeeva and Sergey Novoselov and Artem Gorlanov and Tseren Andzhukaev and Artem Ivanov and Alexander Kozlov}, booktitle={INTERSPEECH}, year={2020} }
The problem of system performance degradation in mismatched acoustic conditions has been widely acknowledged in the community and is common to different fields. The present state-of-the-art deep speaker embedding models are domain-sensitive. The main idea of the current research is to develop a single method for automatic signal quality estimation that makes it possible to evaluate short-term signal characteristics. This paper presents a neural network based approach for blind speech signal quality…
4 Citations
Directional and Qualitative Feature Classification for Speaker Diarization with Dual Microphone Arrays
- Computer Science, MICSECS
- 2020
A set of directional and qualitative features extracted from a dual microphone array signal is evaluated, and it is shown that specific sets of features result in satisfactory classification accuracy and can be further investigated in experiments combining them with biometric and other types of properties.
Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems
- Computer Science, ArXiv
- 2022
This paper presents an investigation of several methods of score calibration: a classical approach based on the logistic regression model; the recently presented magnitude estimation network MagnetO, which uses activations from the pooling layer of the trained deep speaker extractor; and a generalization of this approach based on separate scale and offset prediction neural networks.
STC-Innovation Speaker Recognition Systems for Far-Field Speaker Verification Challenge 2020
- Computer Science, INTERSPEECH
- 2020
A number of experiments on x-vector-based and ResNet-like architectures show that ResNet-based networks outperform x-vector-based systems.
Magnitude-aware Probabilistic Speaker Embeddings
- Computer Science, ArXiv
- 2022
A new probabilistic speaker embedding extractor is proposed that uses the information encoded in the embedding magnitude and leverages it in the speaker verification pipeline; several quality-aware diarization methods that incorporate the magnitudes are also proposed.
References
STC Speaker Recognition Systems for the VOiCES From a Distance Challenge
- Computer Science, INTERSPEECH
- 2019
This work investigates different deep neural network architectures for speaker embedding extraction to solve the task of speaker recognition in single-channel distant/far-field audio under noisy conditions and shows that deep networks with residual frame-level connections outperform shallower architectures.
Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances
- Computer Science, Odyssey
- 2020
This paper presents approaches aimed at improving the quality of far-field speaker verification systems in the presence of environmental noise and reverberation and at reducing the system quality degradation for short utterances, and confirms that ResNet architectures outperform the standard x-vector approach in terms of speaker verification quality.
Scores Calibration in Speaker Recognition Systems
- Physics, SPECOM
- 2016
The effects of speech duration variability on calibration when enrollment and test speech utterances originate from the same channel are investigated, and an effective method of score stabilization is presented.
Estimation of Room Acoustic Parameters: The ACE Challenge
- Physics, IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2016
The acoustic characterization of environments (ACE) challenge showed that T60 estimation is a mature field where analytical approaches dominate whilst DRR estimation is one of the less mature fields where machine learning approaches are currently more successful.
The ACE challenge — Corpus description and performance evaluation
- Physics, 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- 2015
The Acoustic Characterization of Environments (ACE) Challenge is a competition to identify the most promising non-intrusive DRR and T60 estimation methods using real noisy reverberant speech.
Quality measures based calibration with duration and noise dependency for speaker recognition
- Computer Science, Speech Commun.
- 2015
The STC ASR System for the VOiCES from a Distance Challenge 2019
- Computer Science, INTERSPEECH
- 2019
The Speech Technology Center (STC) automatic speech recognition (ASR) system for the "VOiCES from a Distance Challenge 2019" participated in the Fixed condition of the ASR task, which means that the only training data available was an 80-hour subset of the LibriSpeech corpus.
Perceptual Objective Listening Quality Assessment (POLQA), The Third Generation ITU-T Standard for End-to-End Speech Quality Measurement Part I-Temporal Alignment
- Business
- 2013
The authors present the Perceptual Objective Listening Quality Assessment (POLQA), the third-generation speech quality measurement algorithm, which provides a new measurement standard for predicting Mean Opinion Scores that outperforms the older PESQ standard.
Blind estimators for reverberation time and direct-to-reverberant energy ratio using subband speech decomposition
- Computer Science, 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- 2015
Algorithms for estimating the reverberation time and direct-to-reverberant energy ratio are described, indicating the effectiveness of both techniques particularly in high-SNR situations.
Quality Measure Functions for Calibration of Speaker Recognition Systems in Various Duration Conditions
- Computer Science, IEEE Transactions on Audio, Speech, and Language Processing
- 2013
Results highlight the importance of considering quality metrics such as duration when calibrating the scores of automatic speaker recognition systems, and the need for a calibration approach that deals with these effects using quality measure functions (QMFs).