Alexander G. Hauptmann

Bridging the semantic gap is essential to exploiting this growing data. Toward this goal, recent research has focused on automatically tagging multimedia content to support end-user interactions such as searching, filtering, mining, content-based routing, personalization, and summarization. However, to date, there’s been limited progress on …
Many multimedia applications can benefit from techniques for adapting existing classifiers to data with different distributions. One example is cross-domain video concept detection, which aims to adapt concept classifiers across various video domains. In this paper, we explore two key problems for classifier adaptation: (1) how to transform existing …
Current web video search results rely exclusively on text keywords or user-supplied tags. A search for a typical popular video often returns many duplicate and near-duplicate videos in the top results. This paper outlines ways to cluster and filter out the near-duplicate videos using a hierarchical approach. Initial triage is performed using fast signatures …
In this paper, we propose a discriminative video representation for event detection over a large-scale video dataset when only limited hardware resources are available. The focus of this paper is to effectively leverage deep Convolutional Neural Networks (CNNs) to advance event detection, where only frame-level static descriptors can be extracted by the …
Based on keypoints extracted as salient image patches, an image can be described as a "bag of visual words" and this representation has been used in scene classification. The choice of dimension, selection, and weighting of visual words in this representation is crucial to the classification performance but has not been thoroughly studied in previous work.
Based on the local keypoints extracted as salient image patches, an image can be described as a “bag-of-visual-words (BoW)” and this representation has appeared promising for object and scene classification. The performance of BoW features in semantic concept detection for large-scale multimedia databases is subject to various representation …
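As a minimal sketch of the bag-of-visual-words idea described in the two entries above, the following Python snippet assigns each local descriptor to its nearest visual word and counts the assignments. The codebook, the descriptors, and the function names are hypothetical toy values chosen for illustration; they are not taken from either paper. Real systems typically learn the codebook by clustering keypoint descriptors, and the weighting shown here (plain L1 normalization) is only one of several possible choices.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def bow_histogram(descriptors, codebook):
    """Count how many descriptors fall nearest to each visual word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the codebook entry closest to this descriptor.
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    return counts


def l1_normalize(counts):
    """Turn raw counts into relative frequencies (assumed weighting scheme)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Hypothetical 2-D toy data: three visual words, four keypoint descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.7, 5.3)]

hist = bow_histogram(descriptors, codebook)
print(hist)                # [1, 2, 1]
print(l1_normalize(hist))  # [0.25, 0.5, 0.25]
```

The size of the codebook sets the dimension of the resulting feature vector, which is one of the representation choices the abstracts identify as affecting classification performance.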
In this paper we investigate a new problem of identifying the perspective from which a document is written. By perspective we mean a point of view, for example, from the perspective of Democrats or Republicans. Can computers learn to identify the perspective of a document? Not every sentence is written strongly from a perspective. Can computers learn to …
Many data mining applications can benefit from adapting existing classifiers to new data with shifted distributions. In this paper, we present Adaptive Support Vector Machine (Adapt-SVM) as an efficient model for adapting an SVM classifier trained from one dataset to a new dataset where only limited labeled examples are available. By introducing a new …
Most state-of-the-art action feature extractors involve differential operators, which act as high-pass filters and tend to attenuate low-frequency action information. This attenuation introduces bias to the resulting features and generates ill-conditioned feature matrices. The Gaussian Pyramid has been used as a feature-enhancing technique that encodes …
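The claim that differential operators act as high-pass filters can be illustrated with a short sketch. It assumes the simplest possible stand-in, a first temporal difference y[t] = x[t] - x[t-1], whose gain at angular frequency ω is |1 - e^{-jω}| = 2|sin(ω/2)|. This operator and the function name are illustrative assumptions, not the feature extractors studied in the paper.

```python
import cmath
import math


def first_difference_gain(omega):
    """Magnitude response of y[t] = x[t] - x[t-1] at angular frequency omega."""
    return abs(1 - cmath.exp(-1j * omega))


# Gain for a slow signal component versus the fastest representable one.
low = first_difference_gain(0.05 * math.pi)   # low frequency
high = first_difference_gain(math.pi)         # Nyquist frequency

print(round(low, 4), round(high, 4))
```

The gain is close to zero for slow components and reaches 2 at the Nyquist frequency, which is why slowly varying motion is suppressed by such operators.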