Ascensión Gallardo-Antolín

In the context of speech and speaker recognition systems, it is well known that combining different feature streams can significantly improve performance. However, the application of multi-stream (MS) techniques to speaker diarization systems has not been extensively studied. In this paper, we address this issue: we formulate different MS...
Hidden Markov Models (HMMs) are, undoubtedly, the most employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them...
The Internet Protocol (IP) environment poses two relevant sources of distortion to the speech recognition problem: lossy speech coding and packet loss. In this paper, we propose a new front-end for speech recognition over IP networks. Specifically, we suggest extracting the recognition feature vectors directly from the encoded speech (i.e., the bit stream)...
In this paper, we have extended our previous research on a new approach to ASR in the GSM environment. Instead of recognizing from the decoded speech signal, our system works from the digital speech representation used by the GSM encoder. We have compared the performance of a conventional system and the one we propose on a speaker independent,...
The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives due to their max-margin training paradigm have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time...
We are improving a flexible, large vocabulary, speaker independent, isolated-word recognition system in a telephone environment, originally designed as an integrated system performing the whole recognition process in one step. We have transformed it by adopting the hypothesis-verification paradigm. In this paper, we will describe the architecture and results of...
At ICSLP'96 we presented a flexible, large vocabulary, speaker independent, isolated-word preselection system in a telephone environment, using a two stage, bottom-up strategy [6]. We achieved reasonable performance in large and very large vocabulary tasks, ranging from 1200 to 10000 words. In this paper, we describe recent studies we have carried out on...
Automatic Speech Recognition (ASR) is essentially a problem of pattern classification; however, the time dimension of the speech signal has prevented ASR from being posed as a simple static classification problem. Support Vector Machine (SVM) classifiers could provide an appropriate solution, since they are very well adapted to high-dimensional classification...