Pyramidal Temporal Pooling With Discriminative Mapping for Audio Classification

Liwen Zhang, Ziqiang Shi, Jiqing Han
IEEE/ACM Transactions on Audio, Speech, and Language Processing

Audio signals are temporally structured data, and learning discriminative representations that retain temporal information is crucial for audio classification. In this article, we propose an audio representation learning method with a hierarchical pyramid structure, called pyramidal temporal pooling (PTP), which aims to capture the temporal information of an entire audio sample. By stacking a global temporal pooling layer on multiple local temporal pooling layers, the PTP can capture the…
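
The stacking idea in the abstract, several local temporal pooling layers topped by one global temporal pooling layer, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say which pooling operator or discriminative mapping the paper uses, so plain mean pooling stands in for both. The function names, the `window`, `stride`, and `num_local_layers` parameters, and their default values are hypothetical choices made only for illustration.

```python
import numpy as np


def local_temporal_pooling(frames, window, stride):
    """Mean-pool frame features over sliding windows along the time axis.

    frames: array of shape (num_frames, feature_dim).
    Returns an array of shape (num_segments, feature_dim).
    Mean pooling is an assumption; the paper's operator is not given here.
    """
    num_frames = frames.shape[0]
    if num_frames < window:
        raise ValueError("sequence is shorter than the pooling window")
    starts = range(0, num_frames - window + 1, stride)
    return np.stack([frames[s:s + window].mean(axis=0) for s in starts])


def pyramidal_temporal_pooling(frames, window=4, stride=2, num_local_layers=2):
    """Stack local pooling layers, then pool globally into one vector.

    Each local layer shortens the sequence, so higher layers summarize
    longer stretches of the original audio. The final global layer pools
    whatever remains into a single clip-level representation.
    """
    pooled = frames
    for _ in range(num_local_layers):
        pooled = local_temporal_pooling(pooled, window, stride)
    # Global temporal pooling over all remaining segments.
    return pooled.mean(axis=0)


# Hypothetical input: 20 frames of 3-dimensional features.
features = np.ones((20, 3))
clip_vector = pyramidal_temporal_pooling(features)
```

With these defaults, 20 frames shrink to 9 segments after the first local layer and to 3 after the second, and the global layer then reduces them to one vector per clip. A learned pooling function could replace the mean at each level without changing this overall structure.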
