Musical genre classification of audio signals

@article{Tzanetakis2002MusicalGC,
  title={Musical genre classification of audio signals},
  author={George Tzanetakis and Perry R. Cook},
  journal={IEEE Trans. Speech Audio Process.},
  year={2002},
  volume={10},
  pages={293-302}
}
Musical genres are categorical labels created by humans to characterize pieces of music. The performance and relative importance of the proposed features are investigated by training statistical pattern recognition classifiers using real-world audio collections. Both whole-file and real-time frame-based classification schemes are described. Using the proposed feature sets, a classification accuracy of 61% for ten musical genres is achieved. This result is comparable to results reported for human musical genre classification.
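
As a rough illustration of the whole-file scheme the abstract describes, the sketch below aggregates per-frame spectral features over a clip and trains a simple classifier. The tiny feature set (spectral centroid and rolloff only), the k-NN classifier, and the data layout are assumptions made for the sketch; they are not the paper's exact timbral/rhythmic/pitch feature vector or its statistical classifiers.

```python
# Minimal sketch (not the paper's exact pipeline): whole-file genre
# classification from aggregated short-time spectral features.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def frame_features(signal, sr, frame=1024, hop=512):
    """Per-frame spectral centroid and rolloff: a tiny stand-in for the
    paper's timbral-texture feature set."""
    feats = []
    freqs = np.fft.rfftfreq(frame, d=1.0 / sr)
    win = np.hanning(frame)
    for start in range(0, len(signal) - frame, hop):
        power = np.abs(np.fft.rfft(signal[start:start + frame] * win)) ** 2
        total = power.sum() + 1e-12
        centroid = (freqs * power).sum() / total
        rolloff = freqs[np.searchsorted(np.cumsum(power), 0.85 * total)]
        feats.append((centroid, rolloff))
    return np.array(feats)

def clip_vector(signal, sr):
    """Whole-file representation: mean and variance of the frame features."""
    f = frame_features(signal, sr)
    return np.concatenate([f.mean(axis=0), f.var(axis=0)])

def train_genre_classifier(clips):
    """`clips` is a hypothetical list of (signal, sample_rate, genre_label)."""
    X = np.stack([clip_vector(s, sr) for s, sr, _ in clips])
    y = [label for _, _, label in clips]
    return KNeighborsClassifier(n_neighbors=3).fit(X, y)
```

A frame-based real-time variant would classify each row of frame_features (or a short sliding window of rows) instead of the per-file summary vector.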

Citations

Automatic Music Classification into Genres
TLDR
The task of automatic music genre classification is explored, multiple features based on timbral texture, rhythmic content and pitch content are extracted from a single music piece and used to train different classifiers for genre prediction.
Musical Genre Classification Using Support Vector Machines and Audio Features
TLDR
The results show that the most appropriate kernel for automatic musical genre classification is the polynomial kernel, and that the best combination of audio features is musical surface, Mel-Frequency Cepstral Coefficients, tonality, and LPC.
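
A hedged sketch of the setup this summary describes: a polynomial-kernel SVM over per-clip audio features. MFCC mean/standard-deviation statistics (via librosa) stand in for the full musical-surface + MFCC + tonality + LPC combination, and the helper names and parameter values are illustrative, not the authors' code.

```python
# Sketch: polynomial-kernel SVM genre classifier over MFCC statistics.
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def mfcc_stats(path, sr=22050, n_mfcc=13):
    """Per-clip feature vector: mean and std of each MFCC over time."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([m.mean(axis=1), m.std(axis=1)])

def fit_poly_svm(paths, labels, degree=3):
    """Train an SVC with a polynomial kernel on standardized clip features."""
    X = np.stack([mfcc_stats(p) for p in paths])
    clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=degree, C=1.0))
    return clf.fit(X, labels)
```
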
Audio content processing for automatic music genre classification: descriptors, databases, and classifiers
TLDR
This dissertation presents, discusses, and sheds some light on the problems that appear when computers try to automatically classify musical genres from audio signals, and proposes a new architecture of classifiers after analyzing music genre classification from different disciplines.
Musical genre classification of audio signals using geometric methods
TLDR
The algorithm uses a vector of features based on the timbral texture of the music and maps it into a new Euclidean space by a non-linear method called "Diffusion Maps" before the classification stage itself; this allows dimensionality reduction while preserving and emphasizing the distinction between different genres.
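
For readers unfamiliar with diffusion maps, a minimal NumPy illustration of this kind of non-linear embedding is sketched below; the kernel width epsilon, the diffusion time t, and the number of retained coordinates are arbitrary choices, and the cited paper's exact construction may differ.

```python
# Illustrative diffusion-maps embedding: feature vectors are mapped into a
# low-dimensional "diffusion space" before the classification stage.
import numpy as np

def diffusion_map(X, n_components=2, epsilon=1.0, t=1):
    """X: (n_samples, n_features) array of timbral feature vectors."""
    # Pairwise squared Euclidean distances and a Gaussian affinity kernel.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / epsilon)
    # Row-normalize to obtain a Markov transition matrix.
    P = K / K.sum(axis=1, keepdims=True)
    # Eigendecomposition, sorted by decreasing eigenvalue.
    vals, vecs = np.linalg.eig(P)
    vals, vecs = vals.real, vecs.real
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Diffusion coordinates lambda_k^t * psi_k, skipping the trivial
    # (constant) first eigenvector.
    return vecs[:, 1:n_components + 1] * (vals[1:n_components + 1] ** t)
```
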
Artificial Neural Network genre classification of musical signals
TLDR
This paper explores automatic genre classification of an Algerian music database and proposes two feature sets composed of objective signal descriptors that are closely related to perceptual ones, classified with a multilayer perceptron (MLP).
Comparative study of Classification of Audio Signals
TLDR
Automatic musical genre classification provides a framework for developing and evaluating features for any type of content-based analysis in music information retrieval systems.
A survey on symbolic data-based music genre classification
A comparison of human and automatic musical genre classification
TLDR
The results show that, although there is room for improvement, genre classification is inherently subjective, so perfect results cannot be expected from either automatic or human classification.
Effective Training Methods for Automatic Musical Genre Classification
TLDR
This paper suggests an alternative genre classification method based on which parts of songs are used, and discusses whether full versions or excerpts of songs yield better accuracy.
Automatic Music Genre Classification of Audio Signals with Machine Learning Approaches
TLDR
This paper presents a comprehensive machine learning approach to the problem of automatic musical genre classification using the audio signal, and proposes two feature sets for representing frequency domain, temporal domain, cepstral domain and modulation frequency domain audio features.
...
...

References

SHOWING 1-10 OF 41 REFERENCES
Automatic Musical Genre Classification of Audio Signals
TLDR
Algorithms for the automatic genre categorization of audio signals are described, and a set of features for representing texture and instrumentation as well as a novel set of features for representing rhythmic structure and strength are proposed.
Locating singing voice segments within music signals
  • A. Berenzweig, D. Ellis
  • Computer Science
    Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575)
  • 2001
TLDR
This work uses the acoustic classifier of a speech recognizer as a detector for speech-like sounds to reliably locate the portions of a musical track during which vocals are present, and proposes various statistics of the classifier's output to discriminate singing from instrumental accompaniment.
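
A purely hypothetical illustration of the statistics-over-posteriors idea: given per-frame "speech-likeness" scores from some acoustic classifier (here just an input array, not a real recognizer), windows are summarized by simple statistics and thresholded to flag likely vocal segments. The window length, hop, and thresholds are invented for the sketch.

```python
# Hypothetical sketch: flag vocal regions from frame-level speech-likeness scores.
import numpy as np

def vocal_segments(speech_posteriors, win=100, hop=50,
                   mean_thresh=0.5, var_thresh=0.1):
    """Return (start_frame, end_frame) windows likely to contain singing.

    `speech_posteriors` is a 1-D array of values in [0, 1] produced by some
    external speech/non-speech classifier (assumed, not provided here)."""
    segments = []
    for start in range(0, len(speech_posteriors) - win + 1, hop):
        window = speech_posteriors[start:start + win]
        # Mean and variance are examples of the kind of output statistics
        # used to separate singing from instrumental accompaniment.
        if window.mean() > mean_thresh and window.var() < var_thresh:
            segments.append((start, start + win))
    return segments
```
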
The beat spectrum: a new approach to rhythm analysis
  • J. Foote, Shingo Uchihashi
  • Physics, Computer Science
    IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.
  • 2001
TLDR
The beat spectrum is a measure of acoustic self-similarity versus lag time, computed from a representation of spectral similarity; it has a variety of applications, including music retrieval by similarity and automatic generation of music videos.
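
A rough sketch of a beat spectrum in the sense described here: short-time spectra are compared pairwise, and the resulting self-similarity is averaged along diagonals as a function of lag. Frame, hop, and maximum-lag settings are assumptions; a peak at lag l corresponds to a periodicity of roughly l·hop/sr seconds.

```python
# Sketch: beat spectrum as mean spectral self-similarity versus lag.
import numpy as np

def beat_spectrum(signal, sr, frame=2048, hop=512, max_lag=256):
    # Short-time log-magnitude spectra.
    win = np.hanning(frame)
    frames = [np.log1p(np.abs(np.fft.rfft(signal[s:s + frame] * win)))
              for s in range(0, len(signal) - frame, hop)]
    V = np.array(frames)
    # Cosine similarity between every pair of frames.
    V = V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-12)
    S = V @ V.T
    # Beat spectrum: mean similarity along each superdiagonal (lag in frames).
    lags = min(max_lag, S.shape[0] - 1)
    return np.array([np.diag(S, k=lag).mean() for lag in range(1, lags)])
```
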
Tatum grid analysis of musical signals
  • J. Seppanen
  • Computer Science
    Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575)
  • 2001
TLDR
An algorithm for analyzing the rhythmic content of acoustic signals of polyphonic and multitimbral Western music is presented, and the algorithm was found to be capable of finding the tatum grid from music with a regular rhythm.
Audio content analysis for online audiovisual data segmentation and classification
TLDR
A heuristic rule-based procedure is proposed to segment and classify audio signals; it is built upon morphological and statistical analysis of the time-varying behavior of short-time audio features.
Content-based methods for the management of digital music
  • D. Pye
  • Computer Science
    2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)
  • 2000
TLDR
A new parameterization, based on a partial decompression of MPEG layer III audio, is proposed to facilitate music processing at user-interactive speeds and provide useful tools in the management of a typical digital music library.
Content-based retrieval of music and audio
  • J. Foote
  • Computer Science
    Other Conferences
  • 1997
TLDR
A system to retrieve audio documents by acoustic similarity, based on statistics derived from a supervised vector quantizer rather than matching simple pitch or spectral characteristics; the approach may be applicable to image retrieval as well.
Tempo and beat analysis of acoustic musical signals.
  • E. D. Scheirer
  • Physics
    The Journal of the Acoustical Society of America
  • 1998
TLDR
A method is presented for using a small number of bandpass filters and banks of parallel comb filters to analyze the tempo of, and extract the beat from, musical signals of arbitrary polyphonic complexity containing arbitrary timbres; the analysis can also be used predictively to estimate when beats will occur in the future.
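
A simplified comb-filter tempo estimator in the spirit of this approach (a single onset-strength envelope rather than Scheirer's multi-band filterbank): the envelope is fed to a bank of feedback comb filters and the best-resonating delay is read off as the tempo. The spectral-flux envelope, feedback gain alpha, and BPM range are assumptions for the sketch.

```python
# Sketch: pick the comb-filter delay that resonates most with the onsets.
import numpy as np

def onset_envelope(signal, sr, frame=1024, hop=512):
    """Half-wave-rectified spectral flux as a crude onset-strength signal."""
    win = np.hanning(frame)
    prev, env = None, []
    for start in range(0, len(signal) - frame, hop):
        mag = np.abs(np.fft.rfft(signal[start:start + frame] * win))
        if prev is not None:
            env.append(np.maximum(mag - prev, 0.0).sum())
        prev = mag
    return np.array(env), sr / hop  # envelope and its own sample rate

def comb_filter_tempo(signal, sr, bpm_range=(40, 200), alpha=0.8):
    env, env_sr = onset_envelope(signal, sr)
    env = env - env.mean()  # remove DC so resonance dominates the energy
    best_bpm, best_energy = None, -np.inf
    for bpm in range(bpm_range[0], bpm_range[1] + 1):
        delay = max(1, int(round(env_sr * 60.0 / bpm)))
        y = np.zeros_like(env)
        for n in range(len(env)):   # y[n] = x[n] + alpha * y[n - delay]
            y[n] = env[n] + (alpha * y[n - delay] if n >= delay else 0.0)
        energy = float((y ** 2).sum())
        if energy > best_energy:
            best_bpm, best_energy = bpm, energy
    return best_bpm
```
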
Musical understanding at the beat level: real-time beat tracking for audio signals
TLDR
A real-time beat tracking system is presented that processes audio signals containing sounds of various instruments; it correctly tracked beats in 40 of 42 popular songs in which drums maintain the beat.
...
...