Most state-of-the-art action feature extractors involve differential operators, which act as high-pass filters and tend to attenuate low-frequency action information. This attenuation biases the resulting features and produces ill-conditioned feature matrices. The Gaussian Pyramid has been used as a feature-enhancing technique that encodes...
Recent improvements in content-based video search have led to systems with promising accuracy, thus opening up the possibility for interactive content-based video search to the general public. We present an interactive system based on a state-of-the-art content-based video search pipeline which enables users to do multimodal text-to-video and video-to-video...
Historically, researchers in the field have spent a great deal of effort to create image representations that have scale invariance and retain spatial location information. This paper proposes to encode equivalent temporal characteristics in video representations for action recognition. To achieve temporal scale invariance, we develop a method called...
In the first part of this three-part report we describe our system and novel approaches used in the TRECVID 2013 Multimedia Event Detection (MED) and Multimedia Event Recounting (MER) tasks. A separate section of the report (SIN) details methods and results for the Semantic Indexing task. The final section (SED) describes our approaches and results on the...
Massive Open Online Courses (MOOCs) enable everyone to receive high-quality education. However, current MOOC creators cannot provide an effective, economical, and scalable method to detect cheating on tests, which would be required for any certification. In this paper, we propose a Massive Open Online Proctoring (MOOP) framework, which combines both...
Users are often interested in finding video segments that contain further information about the content of a segment of interest. To help users find and browse related video content, video hyperlinking aims to construct links among video segments with relevant information in a large video collection. In this study, we...
We propose a method for representing motion information for video classification and retrieval. We improve upon local-descriptor-based methods, which have been among the most popular and successful models for representing videos. The desired local descriptors need to satisfy two requirements: they must be (1) representative and (2) discriminative. Therefore, they...
The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search. However, metadata is often lacking for user-generated videos, so these videos are unsearchable by current search engines. Therefore, content-based video retrieval (CBVR) tackles this...
In multi-person tracking scenarios, gaining access to the identity of each tracked individual is crucial for many applications such as long-term surveillance video analysis. Therefore, we propose a long-term multi-person tracker which utilizes face recognition information to not only enhance tracking performance, but also assign identities to tracked...