Calculating the similarity estimates between the query sample and the database samples becomes an exhaustive task for large, often continuously updated multimedia databases. In this paper, a fast, low-complexity transformation from the original feature space into a k-dimensional vector space and clustering are proposed to alleviate the problem. First k...
Users expect efficient ways of handling their personal audio and image data using only their mobile device. Data sharing with friends and interest groups is also becoming increasingly important. Browsing and manually searching through one's own and shared data libraries can be difficult. For example, the simplest traditional attributes like time and date are...
In this paper, we present a novel lightweight speaker clustering algorithm based on the Bayesian information criterion (BIC). The algorithm utilises BIC profiles, which were earlier used for false alarm compensation (FAC) in our speaker change detector (SCD). The proposed speaker segmentation, followed by lightweight clustering, is targeted to segment and label...
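As a rough illustration of the criterion this abstract builds on, the sketch below computes the standard delta-BIC score for deciding whether two feature segments are better modelled by one Gaussian or by two. It is not the paper's algorithm: the BIC profiles and false alarm compensation are not reproduced, and the diagonal covariance, the function names `log_det_diag` and `delta_bic`, and the `penalty_weight` parameter are assumptions made for this example.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of the ML diagonal covariance of a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for j in range(dim):
        column = [frame[j] for frame in frames]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        total += math.log(max(var, 1e-12))  # floor avoids log(0)
    return total


def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Positive values favour two separate models (e.g. two speakers).

    Assumption: each segment is modelled by one diagonal-covariance Gaussian,
    so a model has 2 * dim free parameters (means and variances).
    """
    n_a, n_b = len(seg_a), len(seg_b)
    n = n_a + n_b
    dim = len(seg_a[0])
    merged = seg_a + seg_b
    gain = 0.5 * (
        n * log_det_diag(merged)
        - n_a * log_det_diag(seg_a)
        - n_b * log_det_diag(seg_b)
    )
    penalty = 0.5 * penalty_weight * (2 * dim) * math.log(n)
    return gain - penalty


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic 2-D "features": two segments with clearly different means.
    speaker_1 = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
    speaker_2 = [[rng.gauss(5.0, 1.0), rng.gauss(5.0, 1.0)] for _ in range(200)]
    print("delta BIC:", delta_bic(speaker_1, speaker_2))
```

In a clustering setting, a pair of segments with a negative score would be a candidate for merging under the same speaker label, while a positive score suggests keeping them apart.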
State-of-the-art automatic analysis tools for personal audio content management are discussed in this paper. Our main target is to create a system with several co-operating management tools for an audio database that improve each other's results. A Bayesian-network-based audio classification algorithm provides classification into four main...
A phonetic vocoding system for Finnish is described. This very low bit rate speech coding method is a subclass of segmental vocoding. The proposed system utilizes speech recognition based on an HMM system to phonetically segment and label the input speech. The fundamental frequency is estimated using a robust pitch tracking algorithm. Speech reconstruction...
In this paper, a novel out-of-vocabulary (OOV) word detection method relying on phoneme-level acoustic measures and Support Vector Machines (SVM) is proposed. Word-level OOV scores are computed from the phoneme-level in-vocabulary (IV) and OOV information provided by an HMM-based speech recognizer. The OOV word decision is based on the confidence feature...
In this paper, we propose a new approach to reduce the memory footprint of HMM-based ASR systems. The proposed method involves three steps. Starting from the continuous density HMMs, mixture variances are tied using k-means-based vector quantization. Next, the resulting models are re-estimated with tied variances. Finally, scalar...
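The first step, tying mixture variances with k-means vector quantization, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `tie_variances`, the Euclidean distance, the evenly spaced initialisation, and the fixed iteration count are all choices made for this example, and the re-estimation and later steps are not shown.

```python
def tie_variances(variances, k, iterations=20):
    """Cluster variance vectors with plain Lloyd-style k-means.

    Returns (codebook, assignment): each Gaussian keeps only an index into
    the shared codebook instead of its own variance vector.
    """
    # Assumption: deterministic initialisation from evenly spaced entries.
    step = len(variances) / k
    codebook = [list(variances[int(i * step)]) for i in range(k)]
    assignment = [0] * len(variances)

    for _ in range(iterations):
        # Assignment step: nearest codeword by squared Euclidean distance.
        for idx, vec in enumerate(variances):
            assignment[idx] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, codebook[c])),
            )
        # Update step: each codeword becomes the mean of its members.
        for c in range(k):
            members = [variances[i] for i, a in enumerate(assignment) if a == c]
            if members:  # keep the old codeword if a cluster is empty
                codebook[c] = [sum(col) / len(members) for col in zip(*members)]

    return codebook, assignment


if __name__ == "__main__":
    # Hypothetical variance vectors of four Gaussian mixture components.
    mixture_variances = [[1.0, 1.0], [1.1, 0.9], [5.0, 5.0], [5.2, 4.8]]
    codebook, assignment = tie_variances(mixture_variances, k=2)
    print(codebook, assignment)
```

The memory saving comes from storing k codewords plus one small index per Gaussian rather than a full variance vector per Gaussian.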