Compact Audio Representation for Event Detection in Consumer Media

@inproceedings{Zhuang2012CompactAR,
  title={Compact Audio Representation for Event Detection in Consumer Media},
  author={Xiaodan Zhuang and Stavros Tsakalidis and Shuang Wu and Pradeep Natarajan and Rohit Prasad and Premkumar Natarajan},
  booktitle={INTERSPEECH},
  year={2012}
}
Local audio-visual descriptors are often compactly stored using representations such as the soft quantization histogram [1]. Typically, classification performance with histogram representations is improved through the use of large codeword sets. Unfortunately, this approach runs into overfitting and scalability challenges when applied to richly diverse real-world collections. A novel “i-vector” approach was recently proposed for the speaker-verification task [2]. In this work, we study the…
This paper has 26 citations.

Citations

Publications citing this paper.
Showing 1-10 of 18 extracted citations

Audio-based multimedia event detection using deep recurrent neural networks

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) • 2016

Improved audio features for large-scale multimedia event detection

2014 IEEE International Conference on Multimedia and Expo (ICME) • 2014

References

Publications referenced by this paper.
Showing 1-6 of 6 references

Visual Word Ambiguity

IEEE Transactions on Pattern Analysis and Machine Intelligence • 2010

BBN VISER TRECVID 2011 Multimedia Event Detection System

P. Natarajan, V. Manohar, +14 authors L. Davis
Proceedings of the NIST TRECVID 2011 Workshop, Gaithersburg, MD, December 2011 • 2011

Joint Factor Analysis Versus Eigenchannels in Speaker Recognition

IEEE Transactions on Audio, Speech, and Language Processing • 2007

Eigenvoice modeling with sparse training data

IEEE Transactions on Speech and Audio Processing • 2005
