Audio-visual grouplet: temporal audio-visual interactions for general video concept classification

@inproceedings{Jiang2011AudiovisualGT,
  title={Audio-visual grouplet: temporal audio-visual interactions for general video concept classification},
  author={Wei Jiang and Alexander C. Loui},
  booktitle={ACM Multimedia},
  year={2011}
}
We investigate general concept classification in unconstrained videos by joint audio-visual analysis. A novel representation, the Audio-Visual Grouplet (AVG), is extracted by studying the statistical temporal audio-visual interactions. An AVG is defined as a set of audio and visual codewords that are grouped together according to their strong temporal correlations in videos. The AVGs carry unique audio-visual cues to represent the video content, based on which an audio-visual dictionary can be…
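To make the grouping idea in the abstract concrete, here is a minimal sketch, not the authors' method. It assumes each codeword is summarized by a per-time-window occurrence count, and it uses plain Pearson correlation with a fixed threshold as a stand-in for whatever temporal-interaction statistic the paper actually uses. The names `pearson`, `build_grouplets`, the `threshold` parameter, and the example codeword names are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_grouplets(audio, visual, threshold=0.8):
    """Group audio and visual codewords whose occurrence series correlate strongly.

    `audio` and `visual` map codeword names (assumed distinct across the two
    dicts) to occurrence counts per time window. Codewords linked by a
    correlation >= threshold end up in the same group (connected components).
    """
    parent = {name: name for name in list(audio) + list(visual)}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a_name, a_series in audio.items():
        for v_name, v_series in visual.items():
            if pearson(a_series, v_series) >= threshold:
                parent[find(a_name)] = find(v_name)

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    # Keep only groups that actually link more than one codeword.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy data: counts of each codeword in six consecutive time windows.
audio = {"a_bark": [0, 3, 0, 4, 0, 3], "a_music": [2, 2, 3, 2, 2, 3]}
visual = {"v_dog": [0, 2, 0, 3, 0, 2], "v_stage": [5, 1, 4, 0, 5, 1]}
print(build_grouplets(audio, visual))
```

In this toy data only the "bark" audio codeword and the "dog" visual codeword rise and fall together, so they form the single group; the other codewords stay ungrouped. A real system would derive the codewords from learned audio and visual vocabularies rather than hand-named series.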

Citations

Publications citing this paper.

Multiview approaches to event detection and scene analysis

A consumer video search system by audio-visual concept classification

  • 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
  • 2012

Grouplet-Based Distance Metric Learning for Video Concept Detection

  • 2012 IEEE International Conference on Multimedia and Expo
  • 2012

Complex Activity Recognition Using Granger Constrained DBN (GCDBN) in Sports and Surveillance Video

  • 2014 IEEE Conference on Computer Vision and Pattern Recognition
  • 2014

Spatio-temporal information for human action recognition

  • EURASIP J. Image and Video Processing
  • 2016
