Complete-linkage clustering for voice activity detection in audio and visual speech

@inproceedings{Ghaemmaghami2015CompletelinkageCF,
  title={Complete-linkage clustering for voice activity detection in audio and visual speech},
  author={Houman Ghaemmaghami and David Dean and Shahram Kalantari and Sridha Sridharan and Clinton Fookes},
  booktitle={INTERSPEECH},
  year={2015}
}
We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two likelihood scores per segment, one from each model. These…
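The segment-scoring step described in the abstract can be illustrated with a minimal sketch. It assumes the two GMMs are already trained and have diagonal covariances, and it scores each fixed-length segment by its mean per-frame log-likelihood under each model. The parameter values, the one-dimensional features, the segment length, and all function names are hypothetical and chosen only for illustration; the abstract does not specify them, and this sketch does not cover how the paper uses the resulting scores.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def score_segments(frames, seg_len, speech_gmm, nonspeech_gmm):
    """Return one (speech_score, nonspeech_score) pair per full segment.

    Each score is the mean per-frame log-likelihood of the segment under
    the corresponding model. Trailing frames that do not fill a segment
    are ignored.
    """
    scores = []
    for start in range(0, len(frames) - seg_len + 1, seg_len):
        seg = frames[start:start + seg_len]
        speech = sum(gmm_log_likelihood(f, speech_gmm) for f in seg) / seg_len
        nonspeech = sum(gmm_log_likelihood(f, nonspeech_gmm) for f in seg) / seg_len
        scores.append((speech, nonspeech))
    return scores


# Hypothetical "pre-trained" models over a 1-D feature (assumption, not from the paper).
SPEECH_GMM = [(0.5, [2.0], [1.0]), (0.5, [4.0], [1.0])]
NONSPEECH_GMM = [(1.0, [0.0], [0.5])]

# Four toy frames: the first two lie near the non-speech model, the last two near speech.
frames = [[0.1], [-0.2], [3.1], [3.8]]
scores = score_segments(frames, 2, SPEECH_GMM, NONSPEECH_GMM)
```

With these toy values, the first segment scores higher under the non-speech model and the second scores higher under the speech model, which gives the two scores per segment that the abstract refers to.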