Corpus ID: 211817858

Voice Separation with an Unknown Number of Multiple Speakers

@inproceedings{Nachmani2020VoiceSW,
  title={Voice Separation with an Unknown Number of Multiple Speakers},
  author={Eliya Nachmani and Yossi Adi and L. Wolf},
  booktitle={ICML},
  year={2020}
}
We present a new method for separating a mixed audio sequence, in which multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while maintaining the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly…
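The selection step described in the abstract (one model per speaker count, with the largest model used to decide how many speakers are present) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the `models` dictionary, the toy separators, and in particular the simple energy threshold used to decide whether an output channel is silent. The abstract does not say how the number of speakers is actually detected.

```python
import numpy as np


def count_active_channels(channels, silence_threshold=1e-3):
    """Count output channels whose mean power exceeds a threshold.

    Assumption: a plain energy threshold stands in for whatever
    activity detection the paper really uses.
    """
    power = np.mean(np.square(channels), axis=1)
    return int(np.sum(power > silence_threshold))


def select_and_separate(mixture, models, silence_threshold=1e-3):
    """Pick a speaker count with the largest model, then separate.

    `models` is a hypothetical dict mapping a speaker count to a callable
    that takes a 1-D mixture and returns an array of shape (count, samples).
    """
    largest = max(models)
    estimate = models[largest](mixture)
    n_active = count_active_channels(estimate, silence_threshold)

    # Fall back to the closest speaker count for which a model exists.
    chosen = min(sorted(models), key=lambda c: (abs(c - n_active), c))
    if chosen == largest:
        return chosen, estimate
    return chosen, models[chosen](mixture)


def make_toy_model(num_outputs, num_true_sources=2):
    """Build a fake separator: fills `num_true_sources` channels, zeros the rest."""
    def model(mixture):
        out = np.zeros((num_outputs, mixture.shape[0]))
        for i in range(min(num_outputs, num_true_sources)):
            out[i] = mixture / num_true_sources
        return out
    return model


if __name__ == "__main__":
    mixture = np.sin(np.linspace(0.0, 100.0, 8000))
    models = {n: make_toy_model(n) for n in (2, 3, 4)}
    count, separated = select_and_separate(mixture, models)
    print(count, separated.shape)
```

With the toy separators, the four-output model leaves two channels silent, so the sketch dispatches to the two-speaker model. A real system would replace both the toy models and the threshold test with trained components.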
Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals
Multi-Decoder Dprnn: Source Separation for Variable Number of Speakers
Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation
Single channel voice separation for unknown number of speakers under reverberant and noisy settings
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect
Toward the pre-cocktail party problem with TasTas+
Many-Speakers Single Channel Speech Separation with Optimal Permutation Training
SAGRNN: Self-Attentive Gated RNN For Binaural Speaker Separation With Interaural Cue Preservation
A dual-stream deep attractor network with multi-domain learning for speech dereverberation and separation.
  • Hangting Chen, Pengyuan Zhang
  • Computer Science, Medicine
  • Neural networks : the official journal of the International Neural Network Society
  • 2021

References

Permutation invariant training of deep models for speaker-independent multi-talker speech separation
Deep clustering: Discriminative embeddings for segmentation and separation
Music Source Separation in the Waveform Domain
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
Alternative Objective Functions for Deep Clustering
End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction
TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation
  • Yi Luo, N. Mesgarani
  • Computer Science, Engineering
  • 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2018
The 2018 Signal Separation Evaluation Campaign
Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation
  • Yi Luo, Zhuo Chen, T. Yoshioka
  • Computer Science, Engineering
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation