• Corpus ID: 17296090

Audio Clip Classification Using Social Tags and the Effect of Tag Expansion

Authors: Frederic Font, Joan Serrà, Xavier Serra
Published in: Semantic Audio
Paper presented at the 53rd International Conference: Semantic Audio, held 27 to 29 January 2014 in London, United Kingdom.


Harvesting and Structuring Social Data in Music Information Retrieval
This work proposes a methodology that combines Social Media Mining, Knowledge Extraction and Natural Language Processing techniques, to extract meaningful context information from social data and aims to improve retrieval, discovery and annotation of music and sound resources.
Citizen Tagger: Exploring Social Tagging of Conversational Audio
Citizen Tagger (CT), a mobile application for tagging audio-based chat-show content, was designed and deployed with 16 members of a faith-based community who tagged a panel discussion about 'faith and vocation'.
The VU Sound Corpus: Adding More Fine-grained Annotations to the Freesound Database
A collection of annotations for a set of 2,133 environmental sounds taken from the Freesound database is presented, finding that it is not only feasible to perform crowd-labeling for a large collection of sounds, but it is also very useful to highlight different aspects of the sounds that authors may fail to mention.
Sound-based distributional models
The first results of the efforts to build a perceptually grounded semantic model based on sound data collected from freesound.org show that the models are able to capture semantic relatedness, with the tag-based model scoring higher than the sound-based model and the combined model.
Analysis of the Impact of a Tag Recommendation System in a Real-World Folksonomy
This work analyzes in depth the impact of a tag recommendation system on the folksonomy of Freesound, a real-world, large-scale online sound sharing platform, and suggests that tag recommendation effectively increases vocabulary sharing among users of the platform.
Tag Recommendation using Folksonomy Information for Online Sound Sharing Platforms
A new perspective for tag recommendation is explored which, besides taking advantage of information from the folksonomy, employs a sound-specific ontology to guide users during the annotation process, and can be easily generalised and thus be of use to other information sharing platforms.
AWtoolbox: Characterizing Audio Information Using Audio Words
The evaluation result shows that sparse coding (SC) based AW representation leads to very competitive performances across the three tested sound and music classification tasks.
A simple method for analyzing the temporal evolution of the “whoosh” sound, based on the per-band piecewise linear function approximation of the sound envelope signal, using spectral centroid and fuzzy membership functions to estimate a degree to which the sound energy moves upwards or downwards in the frequency domain along the audio file.
Empirical Methodology for Crowdsourcing Ground Truth
This work shows that measuring disagreement is essential for acquiring a high quality ground truth, by comparing the quality of the data aggregated with CrowdTruth metrics with majority vote, over a set of diverse crowdsourcing tasks: Medical Relation Extraction, Twitter Event Identification, News Event Extraction and Sound Interpretation.


Music Information Retrieval Using Social Tags and Audio
In this paper we describe a novel approach to applying text-based information retrieval techniques to music collections. We represent tracks with a joint vocabulary consisting of both conventional
Improving music genre classification using collaborative tagging data
This paper proposes to exploit the semantic information embedded in tags supplied by users of social networking websites by building a graph of tracks in which tracks are neighbors if they are similar in terms of their associated tags.
Multimodal Music Mood Classification Using Audio and Lyrics
It is demonstrated that lyrics and audio information are complementary, and can be combined to improve a classification system, and integrating this in a multimodal system allows an improvement in the overall performance.
Automatic genre classification of music content: a survey
The state-of-the-art in automatic genre classification of music collections through three main paradigms: expert systems, unsupervised classification, and supervised classification is reviewed.
This paper analyzes the folksonomy of Freesound, an online audio clip sharing site with more than two million users and 150,000 user-contributed sound samples covering a wide variety of sounds, and computes metrics that characterize the folksonomy at both the global level and the tag level.
Categorising social tags to improve folksonomy-based recommendations
Multi-Tasking with Joint Semantic Spaces for Large-Scale Music Annotation and Retrieval
A method is proposed which attempts to capture the semantic similarities between the database items by modelling audio, artist names, and tags in a single low-dimensional semantic embedding space by optimizing the set of prediction tasks of interest jointly using multi-task learning.
Music Mood and Theme Classification - a Hybrid Approach
This paper develops algorithms for classifying songs by mood and theme, extending existing approaches by also considering the songs’ thematic dimensions and by using social data from the Last.fm music portal as support for the classification tasks.
Classification and retrieval of sound effects in audiovisual data management
Tong Zhang, C.-C. Jay Kuo. Conference Record of the Thirty-Third Asilomar Conference on Signals, Systems, and Computers (Cat. No.CH37020), 1999.
A method for classifying sound effects is proposed that exploits time-frequency analysis of audio signals and uses a hidden Markov model as the classifier. A query-by-example retrieval approach for sound effects, built on top of the archiving scheme, is shown to be highly efficient and effective.
Classification of sound clips by two schemes: Using onomatopoeia and semantic labels
Using the recently proposed framework for latent perceptual indexing of audio clips, we present classification of whole clips categorized by two schemes: high-level semantic labels and the mid-level