Fréchet Audio Distance: A Reference-Free Metric for Evaluating Music Enhancement Algorithms
@inproceedings{Kilgour2019FrchetAD, title={Fr{\'e}chet Audio Distance: A Reference-Free Metric for Evaluating Music Enhancement Algorithms}, author={K. Kilgour and Mauricio Zuluaga and Dominik Roblek and M. Sharifi}, booktitle={INTERSPEECH}, year={2019} }
We propose the Fréchet Audio Distance (FAD), a novel, reference-free evaluation metric for music enhancement algorithms. [...] FAD is validated using a wide variety of artificial distortions and is compared to the signal-based metrics signal-to-distortion ratio (SDR), cosine distance, and magnitude L2 distance. We show that, with a correlation coefficient of 0.52, FAD correlates more closely with human perception than SDR, cosine distance, or magnitude L2 distance, with correlation coefficients…
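The abstract above is truncated and does not spell out how the distance is computed. As a rough illustration only, the sketch below assumes the usual Fréchet distance between two multivariate Gaussians fitted to sets of embedding vectors. The function names `fit_gaussian` and `frechet_distance` are hypothetical, and the random 8-dimensional vectors merely stand in for real audio embeddings, which are not reproduced here.

```python
import numpy as np


def fit_gaussian(embeddings):
    """Estimate the mean vector and covariance matrix of row-wise embeddings."""
    mu = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False)
    return mu, cov


def frechet_distance(mu_a, cov_a, mu_b, cov_b):
    """Fréchet distance between N(mu_a, cov_a) and N(mu_b, cov_b).

    Computes ||mu_a - mu_b||^2 + tr(cov_a) + tr(cov_b) - 2 tr((cov_a cov_b)^(1/2)).
    """
    diff = mu_a - mu_b
    # The trace of the matrix square root equals the sum of the square roots
    # of the eigenvalues of cov_a @ cov_b, which are real and non-negative
    # for positive semi-definite inputs (tiny numerical noise is clipped).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


# Assumption: synthetic stand-ins for embeddings of "clean" and "distorted" audio.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(2000, 8))
shifted = reference + 0.5  # same covariance, mean moved by 0.5 in each dimension

d_same = frechet_distance(*fit_gaussian(reference), *fit_gaussian(reference))
d_shift = frechet_distance(*fit_gaussian(reference), *fit_gaussian(shifted))
print(d_same, d_shift)
```

Because the two synthetic sets differ only by a constant offset, the covariance terms cancel and the distance reduces to the squared mean offset, 8 × 0.5² = 2. Comparing distributions of embeddings rather than paired signals is what allows a metric of this kind to be reference-free.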
7 Citations
Audio Inpainting based on Self-similarity for Sound Source Separation Applications
- Computer Science
- 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)
- 2020
On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks
- Computer Science, Engineering
- IEEE Journal of Selected Topics in Signal Processing
- 2021
Conditioned Source Separation for Music Instrument Performances
- Computer Science, Engineering
- ArXiv
- 2020
- 5
Speech gesture generation from the trimodal context of text, audio, and speaker identity
- Computer Science
- ACM Trans. Graph.
- 2020
- 3
Trusted Artificial Intelligence: Towards Certification of Machine Learning Applications
- Computer Science, Mathematics
- ArXiv
- 2021