John R. Zhang

Various methods of content-based video copy detection have been proposed to find video copies in a large video database. In this paper, we represent video features obtained by global and/or local detectors as signature time series. We observe that the curves of such time series under various kinds of modifications and transformations follow similar trends.
Human arm and body gestures have long been known to hold significance in communication, especially with respect to teaching. We gather ground truth annotations of gesture appearance using a 27-bit pose vector. We manually annotate and analyze the gestures of two instructors, each in a 75-minute computer science lecture recorded to digital video, finding 866…
There is substantial academic interest in modeling consumer experiential learning. However, (approximately) optimal solutions to forward-looking experiential learning problems are complex, limiting their behavioral plausibility…
We hypothesize that certain speaker gestures can convey significant information that is correlated with audience engagement. We propose gesture attributes, derived from speakers' tracked hand motions, to automatically quantify these gestures from video. Then, we demonstrate a correlation between gesture attributes and an objective method of measuring audience…
The growth of digitally recorded educational lectures has led to a problem of information overload. Semantic video browsers present one solution whereby content-based features are used to highlight points of interest. We focus on the domain of single-instructor lecture videos. We hypothesize that arm and upper body gestures made by the instructor can yield…