WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

Binbin Zhang, Di Wu, Zhendong Peng, Xingcheng Song, Zhuoyuan Yao, Hang Lv, Linfu Xie, Chao Yang, Fuping Pan, Jianwei Niu

We recently released WeNet [1], a production-oriented end-to-end speech recognition toolkit that introduces a unified two-pass (U2) framework and a built-in runtime to support both streaming and non-streaming decoding in a single model. To further improve ASR performance and meet a range of production requirements, this paper presents WeNet 2.0 with four important updates. (1) We propose U2++, a unified two-pass framework with bidirectional attention decoders, which…

