In this paper, we address the problem of phonetic segmentation under the hierarchical clustering framework. We extend the framework with an unsupervised segmentation algorithm based on a divisive clustering technique and compare the two approaches: agglomerative nesting (bottom-up) against divisive analysis (top-down). As both approaches require prior knowledge...
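The contrast between bottom-up and top-down segmentation can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes one scalar feature per frame, a sum-of-squared-error merge/split criterion, and a known target segment count. The names `sse`, `bottom_up`, `top_down`, and `n_segments` are hypothetical.

```python
from typing import List, Sequence


def sse(x: Sequence[float], lo: int, hi: int) -> float:
    """Sum of squared deviations from the mean over frames x[lo:hi]."""
    seg = x[lo:hi]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def bottom_up(x: Sequence[float], n_segments: int) -> List[int]:
    """Agglomerative: start with one segment per frame, then repeatedly
    merge the adjacent pair whose merge increases the total SSE least."""
    bounds = list(range(len(x) + 1))  # segment i covers x[bounds[i]:bounds[i+1]]
    while len(bounds) - 1 > n_segments:
        best_i, best_cost = 0, float("inf")
        for i in range(len(bounds) - 2):
            a, b, c = bounds[i], bounds[i + 1], bounds[i + 2]
            cost = sse(x, a, c) - sse(x, a, b) - sse(x, b, c)
            if cost < best_cost:
                best_i, best_cost = i, cost
        del bounds[best_i + 1]  # drop the boundary between the merged pair
    return bounds


def top_down(x: Sequence[float], n_segments: int) -> List[int]:
    """Divisive: start with one segment, then repeatedly add the single
    cut that reduces the total SSE most."""
    bounds = [0, len(x)]
    while len(bounds) - 1 < n_segments:
        best_cut, best_gain = None, -1.0
        for i in range(len(bounds) - 1):
            lo, hi = bounds[i], bounds[i + 1]
            whole = sse(x, lo, hi)
            for cut in range(lo + 1, hi):
                gain = whole - sse(x, lo, cut) - sse(x, cut, hi)
                if gain > best_gain:
                    best_cut, best_gain = cut, gain
        if best_cut is None:  # every segment is a single frame
            break
        bounds.append(best_cut)
        bounds.sort()
    return bounds


frames = [0.0, 0.1, 0.0, 5.0, 5.1, 4.9, 10.0, 10.2]
print(bottom_up(frames, 3))
print(top_down(frames, 3))
```

Both functions return segment boundaries as frame indices. On clearly separated data such as `frames` they agree; on noisier input the greedy merge order and the greedy split order can lead to different boundaries.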
In this paper we present our efforts in building a speech recognizer constrained by the availability of very limited resources. We assume that neither proper training databases nor initial acoustic models are available for the target language. Moreover, for the experiments shown here, we use grapheme-based speech recognizers. Most prior work in the area...
We present a system for query by example on zero-resource languages. The system compares speech patterns by fusing the contributions of two acoustic models to cover both their spectral characteristics and their temporal evolution. The spectral model uses standard Gaussian mixtures to model classical MFCC features. We introduce phonetic priors in order to...
We present a system for Query-by-Example Spoken Term Detection on zero-resource languages. The system compares speech patterns by representing the signal using two different acoustic models, a Spectral Acoustic (SA) model covering the spectral characteristics of the signal, and a Temporal Acoustic (TA) model covering the temporal evolution of the speech...
In this paper we focus on the unsupervised discovery of acoustic classes suitable for use in pattern recognition applications. Our approach is based on a two-level clustering of an initial acoustic segmentation of the audio data, which allows complex acoustic classes to be discovered and modeled correctly. Initially, at the first level, the acoustic space is...
When people communicate, their states of mind are coupled with the explicit content of the messages being transmitted. The implicit information conveyed by mental states is essential to correctly understand and frame those messages. In mediation, professional mediators include empathy as a fundamental skill when dealing with the relational and...
Search, retrieval, and management of multimedia contents are challenging tasks for users and researchers alike. We introduce a software-hardware system for the global management of the multimedia contents produced by Spanish Civil Courts. The ultimate goal is to obtain an automatic classification of images and segments of the audiovisual records that,...
A novel phoneme-lattice to phoneme-sequence matching algorithm based on dynamic programming is presented in this paper. Phoneme lattices have been shown to be a good choice for compactly encoding alternative decoding hypotheses from a speech recognition system. They are typically used for spoken term detection and keyword-spotting tasks, where a...
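To make the idea of matching a phoneme sequence against a lattice concrete, here is a minimal sketch of a generic edit-distance dynamic program over a directed acyclic graph. It is not the algorithm from the paper, whose details are not given in the abstract. It assumes unit costs for substitutions, skipped arcs, and unmatched query phonemes, and lattice nodes numbered so that every arc goes from a lower to a higher index. The name `lattice_edit_distance` and the `(src, dst, phoneme)` arc format are hypothetical.

```python
from typing import List, Sequence, Tuple

Arc = Tuple[int, int, str]  # (source node, destination node, phoneme label)


def lattice_edit_distance(n_nodes: int, arcs: Sequence[Arc],
                          query: Sequence[str]) -> float:
    """Minimum edit distance between `query` and any path from node 0 to
    node n_nodes - 1. Assumes src < dst for every arc (topological order)."""
    inf = float("inf")
    m = len(query)
    # dist[u][j]: cheapest way to reach node u having consumed j query phonemes
    dist: List[List[float]] = [[inf] * (m + 1) for _ in range(n_nodes)]
    dist[0][0] = 0.0

    outgoing: List[List[Tuple[int, str]]] = [[] for _ in range(n_nodes)]
    for src, dst, label in arcs:
        outgoing[src].append((dst, label))

    for u in range(n_nodes):
        row = dist[u]
        # Query phoneme with no lattice counterpart: stay on node u.
        for j in range(m):
            if row[j] + 1 < row[j + 1]:
                row[j + 1] = row[j] + 1
        for v, label in outgoing[u]:
            for j in range(m + 1):
                if row[j] == inf:
                    continue
                # Traverse the arc without consuming a query phoneme.
                dist[v][j] = min(dist[v][j], row[j] + 1)
                if j < m:
                    # Match (cost 0) or substitute (cost 1).
                    step = 0 if label == query[j] else 1
                    dist[v][j + 1] = min(dist[v][j + 1], row[j] + step)
    return dist[n_nodes - 1][m]


# Toy lattice with two alternative phonemes between each pair of nodes.
toy_arcs = [(0, 1, "k"), (0, 1, "g"), (1, 2, "a"), (1, 2, "o"),
            (2, 3, "t"), (2, 3, "d")]
print(lattice_edit_distance(4, toy_arcs, ["g", "o", "t"]))
```

The table has one row per lattice node and one column per query prefix, so the cost grows with the number of arcs times the query length. A keyword-spotting variant would typically allow the match to start and end at any node rather than only at the lattice endpoints.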