Alexandros Lazaridis

Automatic speaker verification (ASV) systems are subject to various kinds of malicious attacks. Replay, voice conversion and speech synthesis attacks drastically degrade the performance of a standard ASV system by increasing its false acceptance rates. This issue has attracted considerable interest in the speech research community, where the possible voice…
In the present work we aim at optimizing the performance of a speaker-independent emotion recognition system through a speech feature selection process. Specifically, relying on the speech feature set defined in the Interspeech 2009 Emotion Challenge, we studied the relative importance of the individual speech parameters and, based on their ranking, a subset of…
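
As a rough illustration of this kind of ranking-based feature selection, the sketch below scores each acoustic feature with a univariate criterion and keeps only a top-ranked subset before training a classifier. The feature matrix, labels, ranking criterion and classifier are placeholders for illustration and not necessarily those used in the paper.

```python
# Illustrative sketch: rank acoustic features by a univariate score and keep
# the top-k subset before training a speaker-independent emotion classifier.
# X (n_utterances x 384 IS09-style features) and y are placeholders.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 384))      # placeholder openSMILE-style feature vectors
y = rng.integers(0, 5, size=200)     # placeholder emotion labels

for k in (50, 100, 384):             # compare feature subsets against the full set
    clf = make_pipeline(StandardScaler(),
                        SelectKBest(f_classif, k=k),
                        SVC(kernel="linear"))
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"top-{k} features: CV accuracy = {score:.3f}")
```
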
Using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing and, in broader terms, between the abstract and physical structures of a speech signal. Our goal is to take a step towards bridging phonology and speech processing and to contribute to the program of Laboratory…
In the present work, we propose a scheme for the fusion of different phone duration models operating in parallel. Specifically, the predictions from a group of dissimilar, mutually independent duration models are fed to a machine learning algorithm, which reconciles and fuses the outputs of the individual models, yielding more precise…
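
The sketch below illustrates parallel-model fusion with a stacking regressor: several base duration models are trained independently and a meta-learner combines their per-phone predictions into a single estimate. The base models, the fusing algorithm and all data are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of fusing parallel phone duration models: the predictions of
# several independent regressors are reconciled by a meta-learner.
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                          # placeholder phone-level features
y = 80 + 10 * X[:, 0] + rng.normal(scale=5, size=500)   # placeholder durations (ms)

fusion = StackingRegressor(
    estimators=[("svr", SVR()),
                ("tree", DecisionTreeRegressor(max_depth=6)),
                ("mlp", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000))],
    final_estimator=LinearRegression(),                  # the fusing model
)
fusion.fit(X[:400], y[:400])
print("fused duration predictions (ms):", fusion.predict(X[400:405]).round(1))
```
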
In this paper an attempt is made to automatically recognize the speaker’s accent among regional Swiss French accents from four different regions of Switzerland, i.e. Geneva (GE), Martigny (MA), Neuchâtel (NE) and Nyon (NY). To achieve this goal, we rely on a generative probabilistic framework for classification based on Gaussian mixture modelling (GMM). Two…
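
A minimal sketch of such a generative GMM classifier follows: one mixture model is fitted per accent class, and a test utterance is assigned to the accent whose model yields the highest average frame log-likelihood. Feature extraction (e.g. MFCCs), mixture sizes and the data are assumed here for illustration.

```python
# Sketch of generative GMM-based accent classification: one GMM per accent,
# decision by maximum average frame log-likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

ACCENTS = ["GE", "MA", "NE", "NY"]
rng = np.random.default_rng(0)

# Placeholder frame-level features per accent (e.g. MFCC vectors).
train = {a: rng.normal(loc=i, size=(1000, 13)) for i, a in enumerate(ACCENTS)}

models = {a: GaussianMixture(n_components=8, covariance_type="diag",
                             random_state=0).fit(feats)
          for a, feats in train.items()}

def classify(utterance_frames):
    # score() returns the average per-frame log-likelihood under each accent GMM.
    scores = {a: gmm.score(utterance_frames) for a, gmm in models.items()}
    return max(scores, key=scores.get)

test_utt = rng.normal(loc=2, size=(300, 13))   # synthetic utterance resembling "NE"
print("predicted accent:", classify(test_utt))
```
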
This paper describes the construction and evaluation of a segmental duration prediction model for the Greek language using the CART (Classification and Regression Tree) machine learning approach. A ToBI-annotated prosodic speech corpus was used to construct the training and test sets. Our phoneme set comprised 34 phonemes…
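
As an illustration of CART-based segmental duration prediction, the sketch below one-hot encodes a few categorical phone and context attributes and fits a regression tree to durations in milliseconds. The attributes, data and tree settings are placeholders rather than the paper's actual corpus and feature set.

```python
# Rough sketch of CART-style duration prediction from categorical
# phoneme/context attributes (placeholders, not the Greek corpus features).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# Placeholder training rows: [phoneme, previous phoneme, stressed?]
X = np.array([["a", "k", "yes"],
              ["k", "a", "no"],
              ["s", "i", "no"],
              ["i", "s", "yes"]] * 50)
y = np.array([95.0, 60.0, 110.0, 85.0] * 50)   # segment durations in ms

model = make_pipeline(OneHotEncoder(handle_unknown="ignore"),
                      DecisionTreeRegressor(min_samples_leaf=10))
model.fit(X, y)
print("predicted duration (ms):", model.predict([["a", "k", "yes"]]))
```
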
In this paper we investigate external phone duration models (PDMs) for improving the quality of synthetic speech in hidden Markov model (HMM)-based speech synthesis. Support vector regression (SVR) and multilayer perceptron (MLP) models were used for this task. The SVR and MLP PDMs were compared with the explicit duration modelling of hidden semi-Markov models…
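
The sketch below trains SVR and MLP regressors as external PDMs on placeholder contextual phone features and compares their held-out error; in a full system their predicted durations would stand in for the HSMM's own duration estimates at synthesis time. All data and hyperparameters are illustrative assumptions.

```python
# Sketch of external SVR and MLP phone duration models trained on
# placeholder contextual features, compared by test RMSE.
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 30))                                  # placeholder context features
y = 70 + 15 * np.tanh(X[:, 0]) + rng.normal(scale=6, size=1000)  # placeholder durations (ms)
X_tr, X_te, y_tr, y_te = X[:800], X[800:], y[:800], y[800:]

pdms = {
    "SVR": make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=1.0)),
    "MLP": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(64, 64),
                                      max_iter=3000, random_state=0)),
}
for name, pdm in pdms.items():
    pdm.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, pdm.predict(X_te)) ** 0.5
    print(f"{name} PDM test RMSE: {rmse:.1f} ms")
```
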
A fusion scheme of phone duration models (PDMs) is presented in this work. Specifically, a support vector regression (SVR)-fusion model is fed with the predictions of a group of independent PDMs operating in parallel. The American-English KED TIMIT and the Greek WCL-1 databases are used for evaluating the PDMs and the fusion scheme. The fusion scheme…
Most current very low bit rate (VLBR) speech coding systems use hidden Markov model (HMM)-based speech recognition and synthesis techniques. This allows information such as phonemes to be transmitted segment by segment, which decreases the bit rate. However, an encoder based on phoneme speech recognition may produce bursts of segmental errors; these would be…
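
A back-of-the-envelope calculation, using assumed rather than measured figures, shows why segment-wise transmission keeps the rate very low: only a phoneme index plus coarse duration and pitch information needs to be sent per recognized segment.

```python
# Illustrative bit-rate estimate for segment-wise phonetic transmission.
# All numbers below are assumptions chosen for illustration only.
import math

n_phonemes = 40          # assumed phoneme inventory size
phones_per_sec = 12.0    # assumed average speaking rate (segments per second)
duration_bits = 5        # assumed bits for a quantized segment duration
pitch_bits = 4           # assumed bits for coarse per-segment pitch

symbol_bits = math.ceil(math.log2(n_phonemes))        # 6 bits per phoneme label
bits_per_segment = symbol_bits + duration_bits + pitch_bits
print(f"~{phones_per_sec * bits_per_segment:.0f} bit/s")  # ~180 bit/s, far below 1 kbit/s
```
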