Automatic sound annotation

@article{Cano2004AutomaticSA,
  title={Automatic sound annotation},
  author={Pedro Cano and Markus Koppenberger},
  journal={Proceedings of the 2004 14th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing},
  year={2004},
  pages={391-400}
}
  • P. Cano, M. Koppenberger
  • Published 29 September 2004
  • Computer Science
  • Proceedings of the 2004 14th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing
Sound engineers need to access vast collections of sound effects for their film and video productions. Sound effects providers rely on text-retrieval techniques to offer their collections. Currently, annotation of audio content is done manually, which is an arduous task. Automatic annotation methods, normally fine-tuned to reduced domains such as musical instruments or reduced sound effects taxonomies, are not mature enough for labeling with great detail any possible sound. A general sound… 

Citations

Semantic Annotation and Retrieval of Music and Sound Effects
We present a computer audition system that can both annotate novel audio tracks with semantically meaningful words and retrieve relevant tracks from a database of unlabeled audio content given a text-based query.
Exploring the Semantic Annotation and Retrieval of Sound
We present a computer audition system that can both annotate novel audio tracks with semantically meaningful words and use a semantic query to retrieve relevant tracks from a database of unlabeled audio content.
Acoustic lexemes for organizing internet audio
In this article, a method is proposed for automatic fine-scale audio description that draws inspiration from ontological sound description methods such as Schaeffer's Objets Sonores and Smalley's spectromorphology.
Semi-Automatic Annotation of Music Collections
This master's thesis proposes a semi-automatic strategy for annotating large music collections, based on audio similarity and a community of users who annotate music titles, improving the efficiency of manual annotation and the accuracy of automatic annotation.
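
As a rough illustration of this kind of semi-automatic strategy (not the thesis's actual implementation), the sketch below propagates tags from already-annotated tracks to unlabeled ones via audio similarity and leaves low-support suggestions to the user community; the feature vectors, vote threshold, and neighborhood size are assumptions made for the example.

```python
import numpy as np

def propose_tags(unlabeled_feats, labeled_feats, labeled_tags, k=5, min_votes=3):
    """Suggest tags for unlabeled tracks by majority vote among the k most
    similar annotated tracks; items with no confident tag go to human annotators."""
    suggestions = []
    for x in unlabeled_feats:
        # Euclidean distance in the (assumed) audio feature space.
        dists = np.linalg.norm(labeled_feats - x, axis=1)
        neighbors = np.argsort(dists)[:k]
        votes = {}
        for i in neighbors:
            for tag in labeled_tags[i]:
                votes[tag] = votes.get(tag, 0) + 1
        confident = [t for t, v in votes.items() if v >= min_votes]
        suggestions.append(confident if confident else None)  # None = needs review
    return suggestions
```
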
TagATune: A Game for Music and Sound Annotation
Describes the rationale, design, and preliminary results of a pilot study in which a prototype of TagATune, a game designed to extract descriptions of sounds and music from human players, was used to label a subset of the FreeSound database.
Shortest Path Techniques for Annotation and Retrieval of Environmental Sounds
This paper evaluates novel techniques for ordering query results using weights of both shortest paths and minimum cost paths of specified lengths, pruning outbound edges by nodes’ K nearest neighbors, and adjusting edge weights depending on type (acoustic, semantic, or user tagging).
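
A minimal sketch of the shortest-path idea under assumed data structures: sounds and tags are nodes in a weighted graph (weights standing in for acoustic, semantic, or tagging distances), each node keeps only its K cheapest outbound edges, and query results are ranked by shortest-path cost from the query node. The node names and weights below are illustrative only.

```python
import heapq

def prune_to_k_nearest(graph, k):
    """Keep only each node's k cheapest outbound edges."""
    return {u: dict(sorted(edges.items(), key=lambda e: e[1])[:k])
            for u, edges in graph.items()}

def shortest_path_costs(graph, source):
    """Dijkstra over a dict-of-dicts adjacency structure."""
    costs = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > costs.get(u, float("inf")):
            continue
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < costs.get(v, float("inf")):
                costs[v] = nd
                heapq.heappush(heap, (nd, v))
    return costs

# Illustrative graph mixing tag and sound nodes; ranking by path cost.
g = {"query:rain": {"sound_12": 0.3, "tag:storm": 0.5},
     "tag:storm": {"sound_7": 0.4},
     "sound_12": {}, "sound_7": {}}
ranked = sorted(shortest_path_costs(prune_to_k_nearest(g, k=2), "query:rain").items(),
                key=lambda kv: kv[1])
```
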
Towards musical query-by-semantic-description using the CAL500 data set
Qualitative and quantitative results demonstrate that the supervised multi-class labeling (SML) model can both annotate a novel song with meaningful words and retrieve relevant songs given a multi-word, text-based query.
Modeling the Semantics of Sound
It is shown that the supervised multi-class naïve Bayes model, which has successfully been used for image annotation, can be used to model the semantics of audio data, and that it can both annotate a novel audio track with semantically meaningful words and retrieve relevant audio tracks given a text-based query.
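
A toy version of this style of per-word model, simplified to a single diagonal Gaussian per word rather than the full models used in these papers, with assumed track-level feature vectors and vocabulary: annotation ranks words by the likelihood they assign to a track, and retrieval ranks tracks by the likelihood of the query word.

```python
import numpy as np

class WordModel:
    """One diagonal Gaussian over track-level audio features per vocabulary word."""
    def fit(self, feats):
        self.mean = feats.mean(axis=0)
        self.var = feats.var(axis=0) + 1e-6
        return self

    def log_likelihood(self, x):
        return -0.5 * np.sum(np.log(2 * np.pi * self.var)
                             + (x - self.mean) ** 2 / self.var)

def annotate(track_feat, models, top_n=3):
    """Return the words whose models best explain the track's features."""
    scores = {w: m.log_likelihood(track_feat) for w, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

def retrieve(word, track_feats, models):
    """Rank tracks by the likelihood of a single query word."""
    return np.argsort([-models[word].log_likelihood(x) for x in track_feats])
```
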
Techniques for soundscape retrieval and synthesis
This dissertation examines the application of several computational tools in the realms of digital signal processing, multimedia information retrieval, and computer music synthesis to the analysis of the soundscape.
Improving the description of instrumental sounds by using ontologies and automatic content analysis
A methodology is defined for building a sound collection using a proposed ontology of tags together with content analysis of its sounds, making it possible to automatically describe new sounds as they are integrated into the collection.

References

Showing 1–10 of 15 references
Nearest-neighbor Generic Sound Classification with a WordNet-based Taxonomy
This work uses WordNet, a semantic network that organizes real-world knowledge, to tackle the taxonomy definition problem, and uses a nearest-neighbor classifier with a database of isolated sounds unambiguously linked to WordNet concepts to avoid the need for a huge number of classifiers.
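
The core of such a nearest-neighbor scheme is small; the sketch below assumes a database of audio feature vectors whose labels are WordNet synset identifiers (the 'violin.n.01'-style names are an assumed format), and labels a new sound with the concept of its closest example.

```python
import numpy as np

def classify_nn(query_feat, db_feats, db_synsets):
    """Label a sound with the WordNet concept of its nearest neighbor.

    db_feats   -- (n_sounds, n_features) array of audio descriptors
    db_synsets -- list of WordNet synset names, e.g. 'violin.n.01'
    """
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    return db_synsets[int(np.argmin(dists))]
```
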
Sound Effect Taxonomy Management in Production Environments
A classification scheme for sound effects management, inspired by the MPEG-7 standard and built on top of an existing lexical network, WordNet, is presented; it eases the librarian's task and gives the user greater control over search and retrieval.
Review of ICA and HOS Methods for Retrieval of Natural Sounds and Sound Effects
Several methods for the statistical modeling of man-made sound effects are reviewed: non-Gaussian linear and non-linear source-filter models using HOS, optimal-basis/sparse geometrical representations using ICA, and methods that combine ICA-based features with temporal modeling (HMM).
Sound-source recognition: a theory and computational model
A computer model of the recognition process is developed that is capable of “listening” to a recording of a musical instrument and classifying the instrument as one of 25 possibilities, based on current models of signal processing in the human auditory system.
Classification and retrieval of sound effects in audiovisual data management
  • Tong Zhang, C.-C. Jay Kuo
  • Computer Science
    Conference Record of the Thirty-Third Asilomar Conference on Signals, Systems, and Computers (Cat. No.CH37020)
  • 1999
A method for the classification of sound effects is proposed that exploits time-frequency analysis of audio signals and uses the hidden Markov model as the classifier; a query-by-example retrieval approach for sound effects, built on top of the archiving scheme, is shown to be highly efficient and effective.
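
As a hedged sketch of the classify-by-HMM-likelihood idea (not the authors' exact features or model topology), one Gaussian HMM per sound-effect class can be trained on frame-level time-frequency features and a new clip assigned to the class whose model scores it highest. The sketch assumes the third-party hmmlearn package and MFCC-like feature matrices of shape (n_frames, n_features).

```python
import numpy as np
from hmmlearn import hmm  # third-party; pip install hmmlearn

def train_class_models(training_data, n_states=4):
    """training_data: {class_name: list of (n_frames, n_features) arrays}."""
    models = {}
    for name, clips in training_data.items():
        X = np.vstack(clips)                 # stack all clips of this class
        lengths = [len(c) for c in clips]    # per-clip frame counts
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[name] = m
    return models

def classify(clip_feats, models):
    """Pick the class whose HMM gives the clip the highest log-likelihood."""
    return max(models, key=lambda name: models[name].score(clip_feats))
```
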
Automatic Classification of Musical Instrument Sounds
An exhaustive review of research on automatic classification of sounds from musical instruments presents and discusses different techniques for similarity-based clustering of sounds and for classification into pre-defined instrumental categories.
Representing Musical Instrument Sounds for Their Automatic Classification
A study of the automatic classification of musical instrument sounds is presented, allowing a discussion of the efficiency of the feature extraction process and its limitations.
Hierarchical Gaussian Tree with Inertia Ratio Maximization for the Classification of Large Musical Instrument Databases
An efficient algorithm for selecting the most appropriate features for a given classification task, called IRMFSP, is proposed; it is based on maximizing the ratio of between-class inertia to total inertia, combined with a step-wise orthogonalization of the feature space.
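
A compact numpy sketch of the inertia-ratio idea, simplified relative to the original IRMFSP algorithm: at each step the feature dimension with the largest ratio of between-class to total inertia is selected, and the remaining features are then orthogonalized against it before the next pick. The feature matrix and class labels are assumed inputs.

```python
import numpy as np

def irmfsp(X, y, n_select):
    """Greedy feature selection by between-class / total inertia ratio
    with step-wise Gram-Schmidt orthogonalization (simplified sketch)."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                  # center features
    classes = np.unique(y)
    selected = []
    for _ in range(n_select):
        total = X.var(axis=0) + 1e-12       # total inertia per feature
        between = np.zeros(X.shape[1])
        for c in classes:
            Xc = X[y == c]
            between += len(Xc) / len(X) * Xc.mean(axis=0) ** 2
        ratio = between / total
        ratio[selected] = -np.inf           # never pick a feature twice
        best = int(np.argmax(ratio))
        selected.append(best)
        # Orthogonalize remaining features against the chosen column.
        v = X[:, best] / (np.linalg.norm(X[:, best]) + 1e-12)
        X = X - np.outer(v, v @ X)
    return selected
```
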
Content-Based Classification, Search, and Retrieval of Audio
The audio analysis, search, and classification engine described here reduces sounds to perceptual and acoustical features, which lets users search or retrieve sounds by any one feature or a combination of them, by specifying previously learned classes based on these features.
WordNet: A Lexical Database for English
WordNet provides a more effective combination of traditional lexicographic information and modern computing, and is an online lexical database designed for use under program control.
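
For reference, the sketch below shows how WordNet is typically queried under program control, here through NLTK's interface rather than the tooling used in the cited work; the query word is just an example of the is-a paths a sound-effect taxonomy can reuse.

```python
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')

# Look up the first noun sense of "violin" and walk up its hypernym chain.
violin = wn.synsets("violin", pos=wn.NOUN)[0]
print(violin.definition())

path, s = [], violin
while s.hypernyms():
    s = s.hypernyms()[0]
    path.append(s.name())
print(" -> ".join(path))  # e.g. bowed_stringed_instrument.n.01 -> ... -> entity.n.01
```
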