Monoaural Audio Source Separation Using Deep Convolutional Neural Networks
TLDR
A low-latency monaural source separation framework using a convolutional neural network is presented, and its performance is evaluated on a database comprising musical mixtures of three instruments as well as other instruments that vary from song to song.
A Comparison of Sound Segregation Techniques for Predominant Instrument Recognition in Musical Audio Signals
TLDR
The authors address the identification of predominant musical instruments in polytimbral audio by first dividing the original signal into several streams, and show that performance is enhanced only if the recognition models are trained with features extracted from the separated audio streams.
Score-Informed Source Separation for Multichannel Orchestral Recordings
TLDR
Results show that the original score can be aligned with the audio of the performance and used to separate the sources corresponding to the instrument sections, including annotations of the notes played by multiple instruments from an orchestral ensemble.
Ecological Acoustics Perspective for Content-Based Retrieval of Environmental Sounds
TLDR
A method is presented to search for environmental sounds in large unstructured databases of user-submitted audio, using a general sound-event taxonomy from ecological acoustics.
Monaural Score-Informed Source Separation for Classical Music Using Convolutional Neural Networks
Paper presented at the 18th International Society for Music Information Retrieval Conference (ISMIR 2017), held October 23 to 27, 2017, in Suzhou, China.
Extending voice-driven synthesis to audio mosaicing
TLDR
A system for controlling audio mosaicing with a voice signal, which can be interpreted as a further step in voice-driven sound synthesis. The system works in pseudo-realtime, analyzing voice input and sending control messages to the concatenative synthesis module.
Groovator - An Implementation of Real-Time Rhythm Transformations
TLDR
A real-time system for rhythm manipulation of polyphonic audio signals that applies different transformations: Tempo, Swing, Meter and Accent, focusing on live performance situations.
Sound Retrieval From Voice Imitation Queries In Collaborative Databases
TLDR
This work introduces the use of non-speech voice imitations as input queries in a large user-contributed sound repository. It first addresses the analysis of human voice properties when imitating sounds, and then studies the automatic classification of voice imitations into clusters by means of user experiments.
Audio-to-score Alignment at the Note Level for Orchestral Recordings
TLDR
This paper introduces a note-wise pitch salience function that weighs the harmonic contribution according to the notes present in the score and performs image binarization and blob detection based on connectivity rules to determine note onsets and offsets in complex orchestral mixtures.
PHENICX: Performances as Highly Enriched aNd Interactive Concert Experiences
TLDR
Concerts will then become multimodal, multi-perspective and multilayer digital artifacts that can be easily explored, customized, personalized, (re)enjoyed and shared among the users.