SeeNSearch: A Context Directed Search Facilitator for Home Entertainment Devices
This paper presents general purpose video analysis and annotation tools, which combine high-level and low-level information, and which learn through user interaction and feedback. The use of these tools is illustrated through the construction of two video browsers, which allow a user to fast forward (or rewind) to frames, shots, or scenes containing a particular character, characters, or other labeled content. The two browsers developed in this work are: (1) a basic video browser, which exploits relations between high-level scripting information and closed captions, and (2) an advanced video browser, which augments the basic browser with annotations gained from applying machine learning. The learner helps the system adapt to different peoples' labelings by accepting positive and negative examples of labeled content from a user, and relating these to low-level color and texture features extracted from the digitized video. This learning happens interactively, and is used to infer labels on data the user has not yet seen. The labeled data may then be browsed or retrieved from the database in real time.An evaluation of the learning performance shows that a combination of low-level color signal features outperforms several other combinations of signal features in learning character labels in an episode of the TV situation comedy, Seinfeld. We discuss several issues that arise in the combination of low-level and high-level information, and illustrate solutions to these issues within the context of browsing television sitcoms.