Voice Activity Detector (VAD) Based on Long-Term Mel Frequency Band Features

Abstract

We propose a VAD that uses long-term (200 ms) Mel frequency band statistics, auditory masking, and a pre-trained two-level decision tree ensemble classifier, which allows it to capture the syllable-level structure of speech and discriminate it from common noises. On the test dataset, the proposed algorithm demonstrates almost 100% acceptance of clear voice for English…
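
To make the idea of long-term Mel band statistics concrete, here is a minimal sketch, not the authors' implementation. Everything specific in it is an assumption made for illustration: the 8 kHz sampling rate, 10 ms non-overlapping frames, 6 triangular Mel bands, and the choice of per-band mean and standard deviation of log energy as the statistics. Only the 200 ms span (20 frames here) comes from the abstract. Auditory masking and the decision tree ensemble are left out, and all function names are hypothetical.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (weights per DFT bin)."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_bands + 1)) for i in range(n_bands + 2)]
    n_bins = n_fft // 2 + 1
    bank = []
    for b in range(n_bands):
        lo, mid, hi = edges_hz[b], edges_hz[b + 1], edges_hz[b + 2]
        row = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Naive DFT power spectrum; fine for a sketch, use an FFT in practice."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec

def log_mel_energies(signal, sample_rate, frame_len, hop, n_bands):
    """Log energy in each mel band for every short frame of the signal."""
    bank = mel_filterbank(n_bands, frame_len, sample_rate)
    out = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        out.append([math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank])
    return out

def long_term_stats(frames, window):
    """Per-band mean and standard deviation over a sliding window of frames."""
    feats = []
    for end in range(window, len(frames) + 1):
        chunk = frames[end - window:end]
        vec = []
        for b in range(len(chunk[0])):
            vals = [f[b] for f in chunk]
            mean = sum(vals) / window
            var = sum((v - mean) ** 2 for v in vals) / window
            vec.extend([mean, math.sqrt(var)])
        feats.append(vec)
    return feats

# Assumed setup: 0.5 s of a 440 Hz test tone at 8 kHz, 10 ms frames,
# and a 20-frame window so each feature vector spans 200 ms.
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(4000)]
frames = log_mel_energies(signal, SAMPLE_RATE, frame_len=80, hop=80, n_bands=6)
feats = long_term_stats(frames, window=20)
print(len(feats), len(feats[0]))
```

Each resulting vector could then be passed to a classifier. A window of roughly 200 ms is long enough to span a syllable, which is presumably why statistics at this scale help separate speech from steadier noise.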
DOI: 10.1007/978-3-319-45510-5_40

Cite this paper

@inproceedings{Salishev2016VoiceAD,
  title={Voice Activity Detector (VAD) Based on Long-Term Mel Frequency Band Features},
  author={Sergey I. Salishev and Andrey Barabanov and Daniil Kocharov and Pavel A. Skrelin and Mikhail J. Moiseev},
  booktitle={TSD},
  year={2016}
}