Classification of general audio data for content-based retrieval
TLDR
This work describes a scheme that classifies audio segments into seven categories: silence, single-speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise. It shows that cepstral-based features such as the Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC) provide better classification accuracy than temporal and spectral features.
Summarization of video programs based on closed captions
TLDR
A summarization system for processing incoming video, extracting and analyzing closed caption text, determining the boundaries of program segments as well as commercial breaks and extracting a program summary from a complete broadcast to enable video transparency is presented.
Video keyframe extraction and filtering: a keyframe is not a keyframe to everyone
TLDR
The experiments show that the number of keyframes is reduced to a manageable size, so that only important visual information is presented to the user.
MPEG-7: a content description standard beyond compression
TLDR
This paper presents several technologies that have been developed at Philips Research and proposed to the international standardization organization called MPEG-7, which include visual descriptors for both still images and video.
SmartWatch: an automated video event finder
TLDR
This paper presents an automated video event detection system that combines textual and aural analysis techniques and shows promising results in terms of speed, accuracy, and efficiency.
Evolvable visual commercial detector
TLDR
Genetic algorithms (GAs) drastically improved the approach and enabled fast prototyping and performance tuning of commercial detection algorithms. It is shown how a scalar genetic algorithm can locate sets of parameters in a multi-objective space (precision and recall) that outperform the values selected by an expert engineer.
Integrated multimedia processing for topic segmentation and classification
TLDR
This paper describes the elements of the Video Scout system, which incorporates a Bayesian framework that integrates information from the audio, visual, and transcript (closed captions) domains, and presents results from running Video Scout on real TV programs.
Video scouting: an architecture and system for the integration of multimedia information in personal TV applications
TLDR
A system that automatically segments and indexes story segments from the programs according to viewers' profiles and combines information from the audio, visual, and transcript domains in a probabilistic framework based on Bayesian networks is described.
Parsing TV programs for identification and removal of nonstory segments
TLDR
Results indicate that adding detection of text, in addition to cut rate, to reduce the number of false positives, appears to be a promising method that should further increase reliability.
Real time commercial detection using MPEG features
TLDR
This paper presents an algorithm that uses features as triggers and verifiers to perform commercial detection, and that achieves a recall of 93% and a precision of 95% when station logos and trailers are excluded.