Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation

Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Hyung Yong Kim, Nam Soo Kim
For multi-channel speech recognition, speech enhancement techniques such as denoising or dereverberation are conventionally applied as a front-end processor. Deep learning-based front-ends using such techniques require aligned clean and noisy speech pairs, which are generally obtained via data simulation. Recently, several joint optimization techniques have been proposed to train the front-end without parallel data within an end-to-end automatic speech recognition (ASR) scheme. However, the ASR…
