Perfecto Herrera

Learn More
We present a new technique for audio signal comparison based on tonal subsequence alignment and its application to detect cover versions (i.e., different performances of the same underlying musical piece). Cover song identification is a task whose popularity has increased in the music information retrieval (MIR) community along in the past, as it provides a(More)
Music consumption is biased towards a few popular artists. For instance, in 2007 only 1% of all digital tracks accounted for 80% of all sales. Similarly, 1,000 albums accounted for 50% of all album sales, and 80% of all albums sold were purchased less than 100 times. There is a need to assist people to filter, discover, personalise and recommend from the(More)
In this paper we present a study on music mood classification using audio and lyrics information. The mood of a song is expressed by means of musical features but a relevant part also seems to be conveyed by the lyrics. We evaluate each factor independently and explore the possibility to combine both, using natural language processing and music information(More)
A cover version1 is an alternative rendition of a previously recorded song. Given that a cover may differ from the original song in timbre, tempo, structure, key, arrangement, or language of the vocals, automatically identifying cover songs in a given music collection is a rather difficult task. The music information retrieval (MIR) community has paid much(More)
We present Essentia 2.0, an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set(More)
be achieved by Machine Learning. • Necessity to consider structural description for key: combine with rhythmic and structural description. System block diagram In this paper, we evaluate two approaches for key estimation from polyphonic audio recordings. Our goal is to compare between a strategy using a cognition-inspired model and several machine learning(More)
This paper presents findings about mood representations. We aim to analyze how do people tag music by mood, to create representations based on this data and to study the agreement between experts and a large community. For this purpose, we create a semantic mood space from last.fm tags using Latent Semantic Analysis. With an unsupervised clustering(More)
A system capable of describing the musical content of any kind of sound file or sound stream, as it is supposed to be done in MPEG7-compliant applications, should provide an account of the different moments where a certain instrument can be listened to. In this paper we concentrate on reviewing the different techniques that have been so far proposed for(More)