Similarity-and-Independence-Aware Beamformer With Iterative Casting and Boost Start for Target Source Extraction Using Reference

@article{Hiroe2022SimilarityandIndependenceAwareBW,
  title={Similarity-and-Independence-Aware Beamformer With Iterative Casting and Boost Start for Target Source Extraction Using Reference},
  author={Atsuo Hiroe},
  journal={IEEE Open Journal of Signal Processing},
  year={2022},
  volume={3},
  pages={1--20}
}
  • Atsuo Hiroe
  • Published 18 October 2021
  • Computer Science, Engineering
  • IEEE Open Journal of Signal Processing
Target source extraction is significant for improving human speech intelligibility and the speech recognition performance of computers. This study describes a method for target source extraction, called the similarity-and-independence-aware beamformer (SIBF). The SIBF extracts the target source using a rough magnitude spectrogram as the reference signal. The advantage of the SIBF is that it can obtain a more accurate signal than the spectrogram generated by target-enhancing methods such as speech…
