Real Additive Margin Softmax for Speaker Verification
@article{Li2021RealAM, title={Real Additive Margin Softmax for Speaker Verification}, author={Lantian Li and Ruiqian Nai and Dong Wang}, journal={ArXiv}, year={2021}, volume={abs/2110.09116} }
The additive margin softmax (AM-Softmax) loss has delivered remarkable performance in speaker verification. A supposed behavior of AM-Softmax is that it can shrink within-class variation by putting emphasis on target logits, which in turn improves the margin between target and non-target classes. In this paper, we conduct a careful analysis of the behavior of the AM-Softmax loss, and show that this loss does not implement real max-margin training. Based on this observation, we present a Real AM-Softmax…
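For orientation, the following is a minimal sketch of the standard AM-Softmax loss that the abstract analyzes, not the Real AM-Softmax variant the paper proposes (its definition is not given in the text above). The function name `am_softmax_loss` and the default scale `s=30.0` and margin `m=0.35` are illustrative assumptions, and the input is assumed to be the cosine similarities between one embedding and each class weight vector.

```python
import math

def am_softmax_loss(cosines, target, s=30.0, m=0.35):
    """Standard AM-Softmax loss for a single sample (illustrative sketch).

    cosines: cosine similarity between the embedding and each class weight.
    target:  index of the true class.
    s, m:    scale and additive margin (assumed defaults, not from the paper).
    """
    # Subtract the margin m from the target cosine only, then scale all logits.
    logits = [s * (c - m) if j == target else s * c
              for j, c in enumerate(cosines)]
    # Numerically stable log-sum-exp over all classes.
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    # Cross-entropy: negative log-probability of the target class.
    return log_sum - logits[target]
```

With the margin set to zero the function reduces to an ordinary scaled softmax cross-entropy, so comparing `am_softmax_loss(cosines, y)` with `am_softmax_loss(cosines, y, m=0.0)` shows how the margin raises the penalty on the target logit.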
References
Large Margin Softmax Loss for Speaker Verification
- Computer Science, INTERSPEECH
- 2019
Ring loss and minimum hyperspherical energy criterion are introduced to further improve the performance of the large margin softmax loss with different configurations in speaker verification.
Max-margin metric learning for speaker recognition
- Computer Science, 2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)
- 2016
Experiments conducted on the SRE08 core test show that compared to PLDA, the new approach can obtain comparable or even better performance, though the scoring is simply a cosine computation.
Angular Margin Centroid Loss for Text-Independent Speaker Recognition
- Computer Science, INTERSPEECH
- 2020
This paper optimizes the cosine distances between speaker embeddings and their corresponding centroids, rather than the weight vectors in the classification layer, to enhance the intra-class compactness of speaker embeddings and explicitly improve inter-class separability.
Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
- Computer Science, 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
- 2019
Three different margin-based losses, which not only separate classes but also demand a fixed margin between classes, are introduced to deep speaker embedding learning, and it is demonstrated that the margin is the key to obtaining more discriminative speaker embeddings.
Gaussian-constrained Training for Speaker Verification
- Computer Science, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
A Gaussian-constrained training approach that discards the parametric classifier, and enforces the distribution of the derived speaker vectors to be Gaussian, leading to consistent performance improvement.
X-Vectors: Robust DNN Embeddings for Speaker Recognition
- Computer Science, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
This paper uses data augmentation, consisting of added noise and reverberation, as an inexpensive method to multiply the amount of training data and improve robustness of deep neural network embeddings for speaker recognition.
Additive Margin Softmax for Face Verification
- Computer Science, IEEE Signal Processing Letters
- 2018
A conceptually simple and intuitive learning objective function, i.e., additive margin softmax, for face verification, which performs better when the evaluation criteria are designed for very low false alarm rate.
Deep Discriminative Embeddings for Duration Robust Speaker Verification
- Engineering, INTERSPEECH
- 2018
A novel algorithm to learn more discriminative utterance-level embeddings based on the Inception-ResNet speaker classifier is proposed, which outperforms the i-vector/PLDA framework for short utterances and is effective for long utterances.
Unified Hypersphere Embedding for Speaker Recognition
- Computer Science, ArXiv
- 2018
Experimental results suggest that simple repetition and random time-reversion of utterances can reduce prediction errors by up to 18%, and that the proposed logistic margin loss function leads to unified embeddings with state-of-the-art identification and competitive verification accuracies.
Generalized End-to-End Loss for Speaker Verification
- Computer Science, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
A new loss function called generalized end-to-end (GE2E) loss is proposed, which makes the training of speaker verification models more efficient than the previous tuple-based end-to-end (TE2E) loss function.