This paper describes the details of our systems for the feature extraction and search tasks of TRECVID-2004. For feature extraction, we emphasize the use of a visual auto-concept annotation technique, fused with text and specialized detectors, to induce concepts in videos. For the search task, our emphasis is twofold. First, we employ query-specific…
This paper extends our previous work [Chaisorn et al. 2002]. The system is enhanced to perform news story segmentation on the large video corpus used in the TRECVID 2003 evaluation. We use a combination of features, including visual-based features such as color, object-based features such as face and video-text, and temporal features such as audio…
Most existing systems detect events in broadcast team sports video using only internal audio-visual (AV) features with limited success. We found that there are many widely available external knowledge sources - such as match reports and real-time game logs in newspapers and on the Web - that can help in detecting events. This paper proposes a scalable…
The use of AV features alone is insufficient to induce high-level semantics. This article proposes a framework that utilizes both internal AV features and various types of external information sources for event detection in team sports video. Three schemes are also proposed to tackle the asynchronism between the fusion of AV and external information. The…
Our previous research shows that the use of multiple sources of information based on intrinsic AV features and external knowledge helps to detect events in soccer video. To make the system scalable, we process each source of information independently before fusing the detection results. The fusion of results is vital to success under this architecture…
This paper describes the details of our systems for the story segmentation and search tasks of the TREC-2003 Video Track. In the story segmentation task, we propose a two-level multi-modal framework. First, we analyze the video at the shot level using a variety of low- and high-level features, and classify the shots into pre-defined categories using a Decision…
2011 Acknowledgements. First and foremost, I must thank my advisor, Min-Yen Kan, for all his advice, guidance, and patience in seeing me through my Ph.D. years. Without his generous and unwavering support, I would not have completed this Ph.D. thesis. He heads the Web Information Retrieval / Natural Language Processing Group (WING), and he is known among…