• Computer Science
  • Published in Semantic Audio 2017

Enhancing LSTM RNN-Based Speech Overlap Detection by Artificially Mixed Data

@inproceedings{Hagerer2017EnhancingLR,
  title={Enhancing LSTM RNN-Based Speech Overlap Detection by Artificially Mixed Data},
  author={Gerhard Hagerer and Vedhas Pandit and Florian Eyben and Bj{\"o}rn W. Schuller},
  booktitle={Semantic Audio},
  year={2017}
}
This paper presents a new method for Long Short-Term Memory Recurrent Neural Network (LSTM) based speech overlap detection. To this end, speech overlap data is created artificially by mixing large amounts of speech utterances. Our elaborate training strategies and presented network structures demonstrate performance surpassing the considered state-of-the-art overlap detectors. Thereby we target the full ternary task of non-speech, speech, and overlap detection. Furthermore, speakers’ gender is…
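The abstract's idea of creating overlap data by mixing utterances, labelled for the ternary task of non-speech, speech, and overlap, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `frame_activity` and `mix_utterances`, the frame length, the amplitude threshold used as a stand-in voice activity decision, and the plain sample-wise addition of two signals.

```python
# Illustrative sketch only: the paper's actual mixing and labelling
# procedure is not described in this excerpt.
NON_SPEECH, SPEECH, OVERLAP = 0, 1, 2

def frame_activity(signal, frame_len, threshold):
    """Return one flag per full frame: True if the mean absolute
    amplitude of the frame exceeds the (assumed) threshold."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        level = sum(abs(x) for x in frame) / frame_len
        flags.append(level > threshold)
    return flags

def mix_utterances(a, b, offset, frame_len=4, threshold=0.05):
    """Add utterance b onto utterance a, starting at sample `offset`.

    Returns the mixture and one label per frame, where the label is
    the number of active speakers in that frame (0, 1, or 2).
    """
    total = max(len(a), offset + len(b))
    pad_a = a + [0.0] * (total - len(a))
    pad_b = [0.0] * offset + b + [0.0] * (total - offset - len(b))
    mixture = [x + y for x, y in zip(pad_a, pad_b)]
    act_a = frame_activity(pad_a, frame_len, threshold)
    act_b = frame_activity(pad_b, frame_len, threshold)
    labels = [int(fa) + int(fb) for fa, fb in zip(act_a, act_b)]
    return mixture, labels

# Toy signals: a short silence followed by "speech", and a second
# "utterance" that starts halfway through the first one.
utt_a = [0.0] * 4 + [0.5] * 8
utt_b = [0.5] * 8
mixture, labels = mix_utterances(utt_a, utt_b, offset=8)
```

Because the labels are derived from the clean, unmixed signals, the overlap regions are known exactly, which is the practical appeal of artificially mixed data over hand-annotated recordings.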

Citations

Publications citing this paper.

SphereDiar - an efficient speaker diarization system for meeting data

Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation

VoicePlay — An affective sports game operated by speech emotion recognition based on the component process model

