ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech
- Xin Wang, J. Yamagishi, Zhenhua Ling
- Computer Science, Computer Speech and Language
- 5 November 2019
A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis
- Xin Wang, Jaime Lorenzo-Trueba, Shinji Takaki, Lauri Juvela, J. Yamagishi
- Physics, IEEE International Conference on Acoustics…
- 7 April 2018
This paper builds a framework for comparing new vocoding and acoustic modeling techniques with conventional approaches through a large-scale crowdsourced evaluation. It shows that generative adversarial networks and an autoregressive (AR) model outperformed a normal recurrent network, with the AR model performing best.
Real-Time Modeling of Audio Distortion Circuits with Deep Learning
- Eero-Pekka Damskägg, Lauri Juvela, V. Välimäki
- Computer Science
- 2019
The deep neural network studied in this work is suited to virtual analog modeling of nonlinear audio circuits and can run in real time on a desktop computer.
Non-parallel voice conversion using i-vector PLDA: towards unifying speaker verification and transformation
- T. Kinnunen, Lauri Juvela, P. Alku, J. Yamagishi
- Computer Science, IEEE International Conference on Acoustics…
- 16 June 2017
This work adopts the i-vector method and probabilistic linear discriminant analysis (PLDA) for voice conversion, an approach that requires no parallel utterances, transcriptions, or time alignment procedures at any stage.
Real-Time Guitar Amplifier Emulation with Deep Learning
- Alec Wright, Eero-Pekka Damskägg, Lauri Juvela, V. Välimäki
- Computer Science, Applied Sciences
- 21 January 2020
It is demonstrated that the neural network models can convincingly emulate highly nonlinear audio distortion circuits while running in real time, with some models requiring only a relatively small amount of processing power to run on a modern desktop computer.
GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis
- Manu Airaksinen, B. Bollepalli, Lauri Juvela, Zhizheng Wu, Simon King, P. Alku
- Computer Science, Interspeech
- 8 September 2016
The proposed GlottDNN vocoder was evaluated as part of a full-band state-of-the-art DNN-based text-to-speech (TTS) synthesis system and compared against the release version of the original GlottHMM vocoder, and the well-known STRAIGHT vocoder.
Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs
- Ana Ramírez López, Shreyas Seshadri, Lauri Juvela, O. Räsänen, P. Alku
- Physics, Interspeech
- 20 August 2017
A parametric approach that uses a vocoder to extract speech features from utterances spoken in normal style and maps them to the corresponding features of Lombard speech; the results show that the system is able to convert normal speech into Lombard speech for the two vocoders.
A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis
- Manu Airaksinen, Lauri Juvela, B. Bollepalli, J. Yamagishi, P. Alku
- Computer Science, IEEE/ACM Transactions on Audio Speech and…
- 11 May 2018
The obtained results suggest that the choice of the voice has a profound impact on the overall quality of the vocoder-generated speech, and the best vocoder for each voice can vary case by case, indicating that the waveform generation method of a vocoder is essential for quality improvements.
Speaker-independent raw waveform model for glottal excitation
- Lauri Juvela, Vassilis Tsiaras, B. Bollepalli, Manu Airaksinen, J. Yamagishi, P. Alku
- Physics, Interspeech
- 25 April 2018
A multi-speaker 'GlotNet' vocoder that uses a WaveNet to generate glottal excitation waveforms, which are then used to excite the corresponding vocal tract filter to produce speech.
Deep Learning for Tube Amplifier Emulation
- Eero-Pekka Damskägg, Lauri Juvela, Etienne Thuillier, V. Välimäki
- Computer Science, IEEE International Conference on Acoustics…
- 1 November 2018
This work proposes a generic data-driven approach to virtual analog modeling and applies it to the Fender Bassman 56F-A vacuum-tube amplifier; the resulting model faithfully reproduces the range of sonic characteristics found across the configurations of the original device.