A Study of F0 Modification for X-Vector Based Speech Pseudonymization Across Gender
@article{Champion2021ASO,
  title={A Study of F0 Modification for X-Vector Based Speech Pseudonymization Across Gender},
  author={Pierre Champion and Denis Jouvet and Anthony Larcher},
  journal={ArXiv},
  year={2021},
  volume={abs/2101.08478}
}
Speech pseudonymization aims to alter a speech signal so that the identifiable personal characteristics of a given speaker are mapped to another identity. In other words, it aims to hide the source speaker identity while preserving the intelligibility of the spoken content. This study is conducted within the VoicePrivacy 2020 challenge framework, where the baseline system performs pseudonymization by modifying x-vector information to match a target speaker while keeping the fundamental frequency (F0…
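The abstract is truncated here, so the exact F0 modification studied in the paper is not shown. As an illustration only, the sketch below uses one common way to move a source F0 contour toward a target speaker: matching the mean and standard deviation of log-F0 over voiced frames. The function name `shift_f0`, its parameters, and the convention that unvoiced frames are coded as 0 Hz are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def shift_f0(f0_source, target_log_mean, target_log_std):
    """Map voiced F0 values (Hz) onto target log-F0 statistics.

    Hypothetical helper: frames with F0 == 0 are treated as unvoiced
    and are left at 0.
    """
    f0 = np.asarray(f0_source, dtype=float)
    voiced = f0 > 0
    out = np.zeros_like(f0)
    if not voiced.any():
        return out

    # Statistics of the source contour in the log domain.
    log_f0 = np.log(f0[voiced])
    src_mean = log_f0.mean()
    src_std = log_f0.std()

    # Avoid dividing by zero for a perfectly flat contour.
    scale = target_log_std / src_std if src_std > 0 else 1.0

    # Linear transform in the log domain, then back to Hz.
    out[voiced] = np.exp((log_f0 - src_mean) * scale + target_log_mean)
    return out


# Example: raise a contour centred on 200 Hz to one centred on 220 Hz
# while keeping its log-domain spread unchanged.
source = [0.0, 100.0, 200.0, 0.0, 400.0]
converted = shift_f0(source, np.log(220.0), np.log(2.0) * np.sqrt(2.0 / 3.0))
```

Working in the log domain keeps the transformed values positive and treats pitch changes as ratios rather than absolute offsets, which is why this form is often preferred over a linear shift in Hz.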
6 Citations
Exploring the Importance of F0 Trajectories for Speaker Anonymization using X-vectors and Neural Waveform Models
- Computer Science
- 2021
Modifying the F0 can improve speaker anonymization by as much as 8% with minor word-error rate degradation, as evaluated with the VoicePrivacy Challenge 2020 framework and datasets.
Improving speaker de-identification with functional data analysis of f0 trajectories
- Physics
- Speech Commun.
- 2022
Evaluating X-Vector-Based Speaker Anonymization Under White-Box Assessment
- Computer Science
- SPECOM
- 2021
This article proposes constraining the target selection to a specific identity, i.e., removing the random selection of identity, in order to evaluate the extreme threat under a white-box assessment (in which the attacker has complete knowledge of the system).
Differentially Private Speaker Anonymization
- Computer Science
- ArXiv
- 2022
Experimental results show that the generated utterances retain very high utility for automatic speech recognition training and inference, while being much better protected against strong adversaries who leverage the full knowledge of the anonymization process to try to infer the speaker identity.
A Tandem Framework Balancing Privacy and Security for Voice User Interfaces
- Computer Science
- ArXiv
- 2021
It is demonstrated that, to defend effectively against potential attacks on VUIs, it is necessary to investigate the attacks from multiple complementary perspectives and to carefully account for the effects of deploying countermeasures, pointing to several promising research directions.
The VoicePrivacy 2020 Challenge: Results and findings
- Computer Science
- Comput. Speech Lang.
- 2022
References
Speaker Anonymization Using X-vector and Neural Waveform Models
- Computer Science
- 10th ISCA Workshop on Speech Synthesis (SSW 10)
- 2019
A new approach to speaker anonymization is presented, which exploits state-of-the-art x-vector speaker representations and uses them to derive anonymized pseudo speaker identities through the combination of multiple, random speaker x-vectors.
Reversible speaker de-identification using pre-trained transformation functions
- Physics
- Comput. Speech Lang.
- 2017
F0-Consistent Many-To-Many Non-Parallel Voice Conversion Via Conditional Autoencoder
- Computer Science
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
This work modifies and improves autoencoder-based voice conversion to disentangle content, F0, and speaker identity at the same time. The resulting model can control the F0 contour, generate speech with F0 consistent with the target speaker, and significantly improve quality and similarity.
Phonetic posteriorgrams for many-to-one voice conversion without parallel data training
- Computer Science
- 2016 IEEE International Conference on Multimedia and Expo (ICME)
- 2016
This paper proposes a novel approach to voice conversion with non-parallel training data. The idea is to bridge between speakers by means of Phonetic PosteriorGrams (PPGs) obtained from a…
Individuality-Preserving Spectrum Modification for Articulation Disorders Using Phone Selective Synthesis
- Physics
- SLPAT@Interspeech
- 2015
This paper presents a Hidden Markov Model (HMM)-based text-to-speech synthesis approach that preserves the voice individuality of people with articulation disorders and aids them in their communication.
Design Choices for X-vector Based Speaker Anonymization
- Computer Science
- INTERSPEECH
- 2020
A flexible pseudo-speaker selection technique is presented as a baseline for the first VoicePrivacy Challenge, and several design choices are explored: the distance metric between speakers, the region of x-vector space where the pseudo-speaker is picked, and gender selection.
X-Vectors: Robust DNN Embeddings for Speaker Recognition
- Computer Science
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
This paper uses data augmentation, consisting of added noise and reverberation, as an inexpensive method to multiply the amount of training data and improve robustness of deep neural network embeddings for speaker recognition.
Unsupervised Representation Disentanglement Using Cross Domain Features and Adversarial Learning in Variational Autoencoder Based Voice Conversion
- Computer Science
- IEEE Transactions on Emerging Topics in Computational Intelligence
- 2020
This article extends the CDVAE-VC framework by incorporating the concept of adversarial learning, in order to further increase the degree of disentanglement, thereby improving the quality and similarity of converted speech.
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
- Computer Science, Physics
- INTERSPEECH
- 2019
Experimental results show that neural end-to-end TTS models trained on the LibriTTS corpus achieved mean opinion scores above 4.0 for naturalness for five out of six evaluation speakers.
Application-independent evaluation of speaker detection
- Computer Science
- Comput. Speech Lang.
- 2004