Signal-Aware Direction-of-Arrival Estimation Using Attention Mechanisms

@article{Mack2022SignalAwareDE,
  title={Signal-Aware Direction-of-Arrival Estimation Using Attention Mechanisms},
  author={Wolfgang Mack and Julian Wechsler and Emanu{\"e}l Habets},
  journal={Comput. Speech Lang.},
  year={2022},
  volume={75},
  pages={101363}
}

End-to-End Signal-Aware Direction-of-Arrival Estimation Using Weighted Steered-Response Power

A loss function is proposed that enables training hybrid DOA estimation systems end-to-end using the noisy microphone signals and the ground-truth DOAs of the SOIs, and hence does not depend on the ground-truth microphone signals.

Deep Learning Based Two-dimensional Speaker Localization With Large Ad-hoc Microphone Arrays

A deep-learning-based 2D speaker localization method with large ad-hoc microphone arrays that achieves better performance than conventional methods in both simulated and real-world environments and a softmax-based node selection algorithm that improves the estimation accuracy.

Geometry-aware DoA Estimation using a Deep Neural Network with mixed-data input features

This paper proposes a geometry-aware DoA estimation algorithm that uses a fully connected DNN and takes mixed data as input features, namely the time lags maximizing the generalized cross-correlation with phase transform and the microphone coordinates, which are assumed to be known.

Fast Cross-Correlation for TDoA Estimation on Small Aperture Microphone Arrays

This paper introduces the Fast Cross-Correlation (FCC) method for Time Difference of Arrival (TDoA) Estimation for pairs of microphones on a small aperture microphone array. FCC relies on low-rank

References

Signal-Aware Broadband DOA Estimation Using Attention Mechanisms

To obtain a flexible signal-aware DOA estimator, it is proposed to use binary mask attention with a DNN for multi-source DOA estimation trained with artificial noise.

Robust Source Counting and DOA Estimation Using Spatial Pseudo-Spectrum and Convolutional Neural Network

This work proposes to use a 2D convolutional neural network with multi-task learning to robustly estimate the number of sources and the directions-of-arrival from short-time spatial pseudo-spectra, which have useful directional information from audio input signals.

A learning-based approach to direction of arrival estimation in noisy and reverberant environments

A learning-based approach that can learn from a large amount of simulated noisy and reverberant microphone array inputs for robust DOA estimation and uses a multilayer perceptron neural network to learn the nonlinear mapping from such features to the DOA.

Robust Speaker Localization Guided by Deep Learning-Based Time-Frequency Masking

Deep learning-based time-frequency (T-F) masking has dramatically advanced monaural (single-channel) speech separation and enhancement. This study investigates its potential for direction of arrival

Time delay estimation via multichannel cross-correlation [audio signal processing applications]

A multichannel cross-correlation algorithm that can be treated as a natural generalization of the generalized cross-correlation (GCC) TDE method to the multichannel case and can take advantage of the redundancy provided by multiple microphone sensors to improve TDE against both reverberation and noise.

Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization

  • A. Subramanian, Chao Weng, Dong Yu
  • Computer Science
    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
D-ASR provides explicit speaker locations, improves explainability, and achieves better ASR performance because the process is more streamlined, which makes it more appropriate for realistic data.

Time Delay Estimation via Multichannel Cross-Correlation

A multichannel cross-correlation algorithm that can be treated as a natural generalization of the generalized cross-correlation (GCC) TDE method to the multichannel case and can take advantage of the redundancy provided by multiple microphone sensors to improve TDE against both reverberation and noise.

Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation

This paper suggests an effective development procedure for DOA estimation models applied to new types of microphone arrays with minimal data collection efforts and demonstrates that the proposed approach achieves similar performance as if the fully-labeled real data are used.

Direction of Arrival Estimation for Multiple Sound Sources Using Convolutional Recurrent Neural Network

The results show that the proposed DOAnet is capable of estimating the number of sources and their respective DOAs with good precision and of generating SPS with a high signal-to-noise ratio.

Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking

A mask estimation network is developed to assist direction of arrival (DOA) estimation by either appending or multiplying the estimated masks to the original input feature, and a multi-task learning architecture to optimize the mask and DOA estimation networks jointly is proposed.
...