• Publications
Input complexity and out-of-distribution detection with likelihood-based generative models
TLDR
We use an estimate of input complexity to derive an efficient and parameter-free OOD score, which can be seen as a likelihood ratio, akin to Bayesian model comparison.
  • 27
  • 7
Timbre analysis of music audio signals with convolutional neural networks
TLDR
The focus of this work is to study how to efficiently tailor Convolutional Neural Networks (CNNs) towards learning timbre representations from log-mel magnitude spectrograms.
  • 54
  • 3
Acoustic scene classification by ensembling gradient boosting machine and convolutional neural networks
TLDR
We use a Gradient Boosting Machine (GBM) to evaluate and benchmark different approaches for acoustic scene classification and acoustic event detection for DCASE2017.
  • 20
  • 2
End-to-end Sound Source Separation Conditioned on Instrument Labels
TLDR
This paper presents an extension of the Wave-U-Net [1] model that allows end-to-end monaural source separation with a non-fixed number of sources.
  • 11
  • 1
Musical Instrument Recognition in User-generated Videos using a Multimodal Convolutional Neural Network Architecture
TLDR
This paper presents a method for recognizing musical instruments in user-generated videos that exploits the multimodal information in the audio and visual domains by means of a Convolutional Neural Network architecture.
  • 7
Acoustic Scene Classification by Fusing LightGBM and VGG-Net Multichannel Predictions
This report provides a solution for Task 1 of the DCASE 2017 challenge. We build two parallel audio scene classification systems: LightGBM and VGG-net. Their prediction scores are output…
  • 6
Automatic musical instrument recognition in audiovisual recordings by combining image and audio classification strategies
TLDR
We evaluate state-of-the-art image recognition techniques in the context of musical instruments and demonstrate how they are integrated for musical instrument detection.
  • 5
Vocoder-Based Speech Synthesis from Silent Videos
TLDR
We present a way to synthesise speech from the silent video of a talker using deep learning with a vocoder synthesis algorithm.
  • 3
Conditioned Source Separation for Music Instrument Performances
TLDR
This paper proposes a source separation method for multiple musical instruments sounding simultaneously and explores how much additional information beyond the audio stream can improve the quality of source separation.
  • 3
GENERATIVE MODELS
Likelihood-based generative models are a promising resource to detect out-of-distribution (OOD) inputs which could compromise the robustness or reliability of a machine learning system. However, …