Deep Learning for Audio Signal Processing

  title={Deep Learning for Audio Signal Processing},
  author={Hendrik Purwins and B. Li and T. Virtanen and Jan Schlüter and Shuo-Yiin Chang and Tara Sainath},
  journal={IEEE Journal of Selected Topics in Signal Processing},
  • Hendrik Purwins, B. Li, +3 authors Tara Sainath
  • Published 2019
  • Computer Science, Engineering, Mathematics
  • IEEE Journal of Selected Topics in Signal Processing
  • Given the recent surge in developments of deep learning, this paper provides a review of the state-of-the-art deep learning techniques for audio signal processing. [...] The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are…
    86 Citations
    Listening to Sounds of Silence for Speech Denoising
    Environment Sound Event Classification With a Two-Stream Convolutional Neural Network
    • 1
    A Comparative Study of Time and Frequency Domain Approaches to Deep Learning based Speech Enhancement
    • 1
    Sparse and Parametric Modeling with Applications to Acoustics and Audio
    Deep generative models for musical audio synthesis
    How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition
    • 6
    A Survey on Deep Reinforcement Learning for Audio-Based Applications
    • 2
    End-to-End Music Transcription Using Fine-Tuned Variable-Q Filterbanks


    Deep beamforming networks for multi-channel speech recognition
    • X. Xiao, Shinji Watanabe, +7 authors Dong Yu
    • Computer Science, Engineering
    • 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • 2016
    • 118
    End-to-end learning for music audio
    • S. Dieleman, B. Schrauwen
    • Computer Science
    • 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • 2014
    • 262
    Speech acoustic modeling from raw multichannel waveforms
    • 178
    Sample-level Deep Convolutional Neural Networks for Music Auto-tagging Using Raw Waveforms
    • 104
    Acoustic modeling with deep neural networks using raw time signal for LVCSR
    • 170
    Efficient Neural Audio Synthesis
    • 327
    Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks
    • 125
    Speaker Recognition from Raw Waveform with SincNet
    • 182
    Filterbank learning for deep neural network based polyphonic sound event detection
    • 23
    Supervised Speech Separation Based on Deep Learning: An Overview
    • D. Wang, J. Chen
    • Computer Science, Medicine
    • IEEE/ACM Transactions on Audio, Speech, and Language Processing
    • 2018
    • 474