This paper discusses various methods for query by example of audio signals. Query by example aims at automatically retrieving audio excerpts similar to a user-provided sample from the user's personal audio database. Methods based on hidden Markov models, feature histograms, and the likelihood ratio test are discussed. A class-based approach was…
Calculating similarity estimates between the query sample and the database samples becomes an exhaustive task with large, usually continuously updated multimedia databases. In this paper, a fast, low-complexity transformation from the original feature space into a k-dimensional vector space and clustering are proposed to alleviate the problem. First k…
Users expect efficient ways of handling their personal audio and image data using only their mobile device. Data sharing with friends and interest groups is also becoming increasingly important. Browsing and manually searching through your own and a shared data library can be difficult. For example, the simplest traditional attributes like time and date are…
In this paper, we present a novel lightweight speaker clustering algorithm based on the Bayesian information criterion (BIC). The algorithm utilises BIC profiles, which were earlier used for false alarm compensation (FAC) in our speaker change detector (SCD). The proposed speaker segmentation, followed by lightweight clustering, is targeted to segment and label…
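For readers unfamiliar with BIC-based speaker change detection, the following is a minimal sketch of the commonly used ΔBIC test between two adjacent feature segments, each modelled by a single full-covariance Gaussian. It is not the BIC-profile or clustering method of the paper, whose details are not given in the abstract. The function name `delta_bic` and the parameter `penalty_weight` are assumptions made for illustration.

```python
import numpy as np

def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC between two segments of feature vectors (rows = frames).

    Illustrative textbook formulation, not the paper's algorithm.
    A positive value favours modelling the segments with two separate
    Gaussians, i.e. it suggests a speaker change between x and y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, d = x.shape
    n2 = y.shape[0]
    n = n1 + n2
    z = np.vstack([x, y])  # pooled segment

    def logdet_cov(a):
        # Maximum-likelihood covariance estimate; reshape covers the d == 1 case.
        cov = np.cov(a, rowvar=False, bias=True).reshape(d, d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Log-likelihood gain of two Gaussians over one pooled Gaussian.
    gain = 0.5 * (n * logdet_cov(z) - n1 * logdet_cov(x) - n2 * logdet_cov(y))
    # Model-complexity penalty: d mean parameters plus d(d+1)/2 covariance parameters.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty
```

In this sketch, two segments would be merged into one cluster when `delta_bic` is negative and kept apart when it is positive; `penalty_weight` tunes how readily segments are merged.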
In this paper, we propose a new approach to reduce the memory footprint of HMM-based ASR systems. The proposed method involves three steps. Starting from the continuous density HMMs, mixture variances are tied using k-means-based vector quantization. Next, the resulting models are re-estimated with tied variances. Finally, scalar quantization…
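The first step, tying mixture variances with k-means vector quantization, can be sketched as follows. This is only an illustration under stated assumptions: diagonal-covariance Gaussians, Euclidean distance on the raw variance vectors, and a farthest-first initialisation, none of which is specified in the abstract. The function name `tie_variances` is hypothetical, and the re-estimation and scalar quantization steps are not shown.

```python
import numpy as np

def tie_variances(variances, codebook_size, n_iter=20):
    """Tie diagonal variance vectors to a small codebook with k-means.

    variances: array of shape (num_gaussians, dim).
    Returns (codebook, index); Gaussian i then uses codebook[index[i]].
    Distance measure and initialisation are illustrative assumptions.
    """
    v = np.asarray(variances, dtype=float)

    def assign(codebook):
        # Squared Euclidean distance from every variance vector to every codeword.
        dists = ((v[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    # Farthest-first initialisation: deterministic and spreads the codewords out.
    centers = [v[0]]
    for _ in range(1, codebook_size):
        d2 = np.min([((v - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(v[d2.argmax()])
    codebook = np.array(centers)

    # Lloyd iterations: reassign, then move each codeword to its cluster mean.
    for _ in range(n_iter):
        index = assign(codebook)
        for k in range(codebook_size):
            members = v[index == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook, assign(codebook)
```

After tying, each Gaussian stores only a codebook index instead of a full variance vector, which is where the memory saving in this step would come from.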
State-of-the-art automatic analysis tools for personal audio content management are discussed in this paper. Our main target is to create a system with several co-operating management tools for an audio database that improve each other's results. A Bayesian-network-based audio classification algorithm provides classification into four main…
In this paper, a novel out-of-vocabulary (OOV) word detection method relying on phoneme-level acoustic measures and support vector machines (SVMs) is proposed. Word-level OOV scores are computed from the phoneme-level in-vocabulary (IV) and OOV information provided by an HMM-based speech recognizer. The OOV word decision is based on the confidence feature…
A phonetic vocoding system for Finnish is described. This very low bit rate speech coding method is a subclass of segmental vocoding. The proposed system utilizes speech recognition based on an HMM system to phonetically segment and label the input speech. The fundamental frequency is estimated using a robust pitch tracking algorithm. Speech reconstruction…