John R. Zhang

Various methods of content-based video copy detection have been proposed to find video copies in a large video database. In this paper, we represent video features obtained by global and/or local detectors as signature time series. We observe that the curves of such time series under various kinds of modifications and transformations follow similar trends. …
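As a rough illustration of the signature-time-series idea, the sketch below computes a per-frame scalar signature and compares two curves by normalized correlation; the choice of feature (mean frame intensity) and the matching score are assumptions for illustration, not the detectors or matching scheme of the paper.

    # Hedged sketch: per-frame signatures as a time series, compared by
    # normalized correlation. Feature choice and score are illustrative only.
    import numpy as np

    def signature_series(frames):
        """frames: iterable of H x W grayscale arrays -> 1-D signature curve."""
        return np.array([f.mean() for f in frames], dtype=np.float64)

    def similarity(sig_a, sig_b):
        """Normalized correlation of two equal-length signature curves."""
        a = (sig_a - sig_a.mean()) / (sig_a.std() + 1e-9)
        b = (sig_b - sig_b.mean()) / (sig_b.std() + 1e-9)
        return float(np.dot(a, b) / len(a))

    # A brightness-scaled copy keeps nearly the same curve shape.
    rng = np.random.default_rng(0)
    original = [rng.random((120, 160)) for _ in range(100)]
    copy = [f * 0.8 + 0.1 for f in original]   # simulated transformation
    print(similarity(signature_series(original), signature_series(copy)))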
Human arm and body gestures have long been known to hold significance in communication, especially with respect to teaching. We gather ground truth annotations of gesture appearance using a 27-bit pose vector. We manually annotate and analyze the gestures of two instructors, each in a 75-minute computer science lecture recorded to digital video, finding 866 …
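For illustration only, a 27-bit pose vector can be packed and unpacked as below; the attribute names and bit assignments are hypothetical placeholders, since the snippet does not enumerate the actual annotation bits.

    # Hedged sketch: pack 27 binary pose attributes into one integer.
    # Attribute names and bit positions are hypothetical placeholders.
    POSE_BITS = [f"attr_{i:02d}" for i in range(27)]   # e.g. "left_arm_raised", ...

    def encode_pose(attributes):
        """attributes: dict of attribute name -> bool; returns a 27-bit int."""
        vec = 0
        for i, name in enumerate(POSE_BITS):
            if attributes.get(name, False):
                vec |= 1 << i
        return vec

    def decode_pose(vec):
        """Inverse of encode_pose: 27-bit int -> dict of booleans."""
        return {name: bool((vec >> i) & 1) for i, name in enumerate(POSE_BITS)}

    code = encode_pose({"attr_00": True, "attr_05": True})
    assert decode_pose(code)["attr_05"] is True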
The growth of digitally recorded educational lectures has led to a problem of information overload. Semantic video browsers present one solution whereby content-based features are used to highlight points of interest. We focus on the domain of single-instructor lecture videos. We hypothesize that arm and upper body gestures made by the instructor can yield …
Public camera feeds are increasingly being opened to use by multiple authorities (e.g., police, fire, traffic) as well as to the public. Because of the difficulty and insecurity of sharing cryptographic keys, these data are available in the clear. However, authorities must have a mechanism to assure trust in the video, that is, to authenticate it. While …
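One generic way to make an open feed tamper-evident without distributing shared secret keys is to hash-chain the frames and have the camera authority publish (or sign) only the final digest; this is a hedged illustration, not the authentication mechanism proposed in the paper.

    # Hedged sketch: a SHA-256 hash chain over frame bytes. Altering any frame
    # changes every later digest, so a trusted copy of the final digest suffices
    # to verify the feed. A generic illustration, not the paper's scheme.
    import hashlib

    def chain_digest(frames, prev=b"\x00" * 32):
        """frames: iterable of frame byte strings; returns the chained digest."""
        for frame in frames:
            prev = hashlib.sha256(prev + frame).digest()
        return prev

    feed = [b"frame-0-bytes", b"frame-1-bytes", b"frame-2-bytes"]
    trusted = chain_digest(feed)                     # published by the authority
    tampered = [feed[0], b"frame-1-FORGED", feed[2]]
    print(chain_digest(tampered) == trusted)         # False: tampering detected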
The communicative importance of gestures in teaching environments has been widely studied. Two classes of gestures, point gestures and spread gestures, have been identified as indicators of pedagogical importance in teaching discourse [1]. In this work, we propose a system for the identification of the poses of point and spread gestures as a preliminary …
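A minimal sketch of how point and spread poses might be separated from 2-D arm keypoints; the keypoint layout and the reach threshold are assumptions, not the system described in the abstract.

    # Hedged sketch: classify a pose as "point", "spread", or "other" from
    # shoulder and wrist keypoints. Threshold and keypoint format are assumed.
    def arm_reach(shoulder, wrist):
        """Horizontal reach of one arm in pixels."""
        return abs(wrist[0] - shoulder[0])

    def classify_pose(kp, reach_thresh=80.0):
        """kp: dict with 'l_shoulder', 'l_wrist', 'r_shoulder', 'r_wrist' as (x, y)."""
        left = arm_reach(kp["l_shoulder"], kp["l_wrist"])
        right = arm_reach(kp["r_shoulder"], kp["r_wrist"])
        if left > reach_thresh and right > reach_thresh:
            return "spread"      # both arms extended outward
        if left > reach_thresh or right > reach_thresh:
            return "point"       # a single extended arm
        return "other"

    pose = {"l_shoulder": (200, 150), "l_wrist": (90, 160),
            "r_shoulder": (280, 150), "r_wrist": (300, 220)}
    print(classify_pose(pose))   # "point" under these assumed thresholds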
We hypothesize that certain speaker gestures can convey significant information that is correlated with audience engagement. We propose gesture attributes, derived from speakers' tracked hand motions, to automatically quantify these gestures from video. Then, we demonstrate a correlation between gesture attributes and an objective method of measuring audience …
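The sketch below shows one way such gesture attributes could be derived from a tracked hand trajectory and correlated with an engagement score; the attribute definitions, trajectories, and engagement numbers are placeholders, not the paper's data.

    # Hedged sketch: simple gesture attributes from a hand trajectory, and a
    # Pearson correlation of one attribute with per-talk engagement scores.
    # All numbers and attribute definitions are illustrative placeholders.
    import numpy as np

    def gesture_attributes(trajectory):
        """trajectory: (N, 2) array of tracked hand (x, y) positions per frame."""
        traj = np.asarray(trajectory, dtype=np.float64)
        speeds = np.linalg.norm(np.diff(traj, axis=0), axis=1)   # frame-to-frame motion
        extent = np.ptp(traj, axis=0).sum()                      # total range of motion
        return {"mean_speed": float(speeds.mean()), "extent": float(extent)}

    print(gesture_attributes(np.cumsum(np.ones((5, 2)), axis=0)))

    # One attribute value per talk, paired with an assumed engagement score.
    mean_speeds = [2.1, 3.4, 1.2, 4.0, 2.8]
    engagement  = [0.55, 0.71, 0.40, 0.80, 0.62]
    print(float(np.corrcoef(mean_speeds, engagement)[0, 1]))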
We present a purely algorithmic method for distinguishing when two hands are visually merged together and tracking their positions by propagating tracking information from anchor frames in single-camera video without depth information. We demonstrate and evaluate on a manually labeled dataset selected primarily for clasped hands with 698 images of a single …
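For illustration, the sketch below shows the two ingredients named here under assumed data structures: deciding when two hand boxes count as visually merged, and carrying identity labels forward from the nearest preceding anchor frame. It is not the paper's algorithm.

    # Hedged sketch: (1) flag frames whose two hand boxes overlap enough to be
    # treated as merged, and (2) propagate hand identity from anchor frames.
    # Box format and overlap threshold are illustrative assumptions.
    def iou(a, b):
        """a, b: boxes as (x1, y1, x2, y2); returns intersection-over-union."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        union = area(a) + area(b) - inter
        return inter / union if union else 0.0

    def hands_merged(box_a, box_b, thresh=0.3):
        return iou(box_a, box_b) >= thresh

    def propagate_labels(frame_ids, anchors):
        """anchors: {frame_id: ('left', 'right') ordering}; labels flow forward."""
        labels, current = {}, None
        for f in frame_ids:
            current = anchors.get(f, current)
            labels[f] = current
        return labels

    print(hands_merged((10, 10, 60, 60), (30, 20, 80, 70)))      # overlapping -> True
    print(propagate_labels(range(5), {0: ("left", "right"), 3: ("right", "left")}))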
Studies in linguistics and psychology have long observed correlations between gestures and content in speech. We explore an aspect of this phenomenon within the framework of the automatic classification of upper body gestures. We demonstrate a correlation between the variances of natural arm motions and the presence of those conjunctions that are used to …
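As a hedged illustration of the kind of measurement implied here: per-utterance variance of tracked arm motion against a 0/1 indicator of whether the utterance contains a conjunction, correlated with Pearson's formula (which, against a binary variable, is the point-biserial correlation). The conjunction list and all numbers are placeholders.

    # Hedged sketch: motion variance per utterance vs. a binary conjunction
    # indicator. Conjunction list, transcript lines, and variances are
    # illustrative placeholders; variances would come from tracked arm positions.
    import numpy as np

    CONJUNCTIONS = {"and", "but", "so", "because", "however"}

    def has_conjunction(utterance):
        return int(any(w in CONJUNCTIONS for w in utterance.lower().split()))

    utterances = ["so we move to the next proof",
                  "the matrix is symmetric",
                  "but notice the sign flips",
                  "write this down"]
    variances = [14.2, 3.1, 11.8, 2.5]        # assumed per-utterance motion variances
    labels = [has_conjunction(u) for u in utterances]
    print(float(np.corrcoef(variances, labels)[0, 1]))   # point-biserial correlation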