Music mood describes the inherent emotional expression of a music clip. It is helpful in music understanding, music retrieval, and other music-related applications. In this paper, a hierarchical framework is presented to automate the task of mood detection from acoustic music data, following music-psychological theories from Western cultures. The…
Author(s): Yu-Fei Ma (yfma@microsoft.com), Xian-Sheng Hua (xshua@microsoft.com), Lie Lu (llu@microsoft.com), Hong-Jiang Zhang (hjzhang@microsoft.com). Affiliation(s): Microsoft Research Asia, 5/F, Beijing Sigma Center, 49 Zhichun Road, Haidian District, Beijing (100080), P.R. China. TEL: (8610) 62617711; FAX: (8610) 62555337. ABSTRACT: Due to the information…
In this paper, we present our study of audio content analysis for classification and segmentation, in which an audio stream is segmented according to audio type or speaker identity. We propose a robust approach capable of classifying and segmenting an audio stream into speech, music, environment sound, and silence. Audio classification is processed…
Automatic generation of video summaries is one of the key techniques in video management and browsing. In this paper, we present a generic framework for video summarization based on modeling the viewer's attention. Without full semantic understanding of video content, this framework takes advantage of…
Automatic music type classification is very helpful for managing digital music databases. In this paper, an Octave-based Spectral Contrast feature is proposed to represent the spectral characteristics of a music clip. It represents the relative spectral distribution instead of the average spectral envelope. Experiments showed that Octave-based Spectral…
Content-based audio classification and segmentation is a basis for further audio/video analysis. In this paper, we present our work on audio segmentation and classification, which employs support vector machines (SVMs). Five audio classes are considered: silence, music, background sound, pure speech, and non-pure speech, which includes speech…
This paper addresses the problem of real-time speaker change detection and speaker tracking in broadcast news video analysis. In this setting, both the speaker identities and the number of speakers are assumed unknown. A two-step speaker change detection algorithm, consisting of potential change detection and refinement, is proposed. Speaker tracking is performed…