Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation

Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Speech segmentation, which splits long speech into short segments, is essential for speech translation (ST). Popular voice activity detection (VAD) tools such as WebRTC VAD generally rely on pause-based segmentation. Unfortunately, pauses in speech do not necessarily coincide with sentence boundaries, and two sentences can be joined by a pause too short for VAD to detect. In this study, we propose a speech segmentation method using a binary classification model trained using a segmented bilingual speech…
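
To make the limitation of pause-based segmentation concrete, the following is a minimal sketch that assumes frame-level speech/non-speech decisions are already available. It is neither the WebRTC VAD algorithm nor the method proposed in this work; the function name `pause_based_segments`, the parameter `min_pause_frames`, and the toy `frames` sequence are hypothetical and introduced only for illustration.

```python
from typing import List, Tuple


def pause_based_segments(
    is_speech: List[bool], min_pause_frames: int
) -> List[Tuple[int, int]]:
    """Split a frame sequence at silences lasting at least min_pause_frames.

    Returns (start, end) frame indices with end exclusive.
    Hypothetical helper for illustration only.
    """
    segments: List[Tuple[int, int]] = []
    start = None        # first frame of the segment being built
    last_speech = None  # most recent speech frame in that segment
    silence = 0         # length of the current run of silent frames

    for i, flag in enumerate(is_speech):
        if flag:
            if start is None:
                start = i
            last_speech = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_pause_frames:
                # The pause is long enough to count as a boundary.
                segments.append((start, last_speech + 1))
                start = None
                silence = 0

    if start is not None:
        segments.append((start, last_speech + 1))
    return segments


# Toy input (assumed): two utterances separated by a 2-frame pause,
# followed by a 6-frame pause and a third utterance.
frames = [True] * 5 + [False] * 2 + [True] * 4 + [False] * 6 + [True] * 3

coarse = pause_based_segments(frames, min_pause_frames=5)
fine = pause_based_segments(frames, min_pause_frames=2)
print(coarse)
print(fine)
```

With the larger threshold, the 2-frame pause goes undetected and the first two utterances are merged into one segment. Lowering the threshold separates them, but in real speech it would also split at hesitations inside a sentence, which is the mismatch between pauses and sentence boundaries described above.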

