SPEECH / MUSIC CLASSIFICATION USING BLOCK BASED MFCC FEATURES
@inproceedings{Daimi2015SPEECHM, title={SPEECH / MUSIC CLASSIFICATION USING BLOCK BASED MFCC FEATURES}, author={Vikaskumar Ghodasara Daimi and Syed Naser and Shefali Waldekar and Goutam Saha}, year={2015} }
Classifying an audio stream as either speech or music is receiving widespread attention due to its varied applications. In this paper, we propose a novel block based mel frequency cepstral coefficient (MFCC) feature extraction method for music and speech classification. We found that the proposed features give better classification accuracy than conventional MFCC features and zero crossing rate (ZCR) features. Here, we use a support vector machine (SVM) classifier with 3-fold cross…
2 Citations
Wavelet Transform Based Mel-scaled Features for Acoustic Scene Classification
- Computer Science, INTERSPEECH
- 2018
This paper attempts acoustic scene classification (ASC) through a novel use of wavelet transform based mel-scaled features, and the proposed features are shown to possess better discriminative properties than other spectral features while using a similar classification framework.
Analysis and classification of acoustic scenes with wavelet transform-based mel-scaled features
- Computer Science, Multimedia Tools and Applications
- 2020
This paper attempts to classify acoustic scenes through a novel use of wavelet-based mel-scaled features, outperforming two benchmark systems: one based on mel-frequency cepstral coefficients and Gaussian mixture models, and the other based on log mel-band energies and a multi-layer perceptron.
7 References
Audio indexing using feature warping and fusion techniques
- Computer Science, IEEE 6th Workshop on Multimedia Signal Processing, 2004.
- 2004
This paper reports on the improvement of speech and music indexation performance under various noisy conditions for radio broadcast using warped features fused with traditional features at the output…
Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition
- Computer Science, Speech Commun.
- 2012
A comparison of features for speech, music discrimination
- Computer Science, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258)
- 1999
This paper examines the discrimination achieved by four types of feature (amplitude, cepstra, pitch and zero-crossings) using common training and test sets and the same classifier.
Automatic transcription of general audio data: preliminary analyses
- Computer Science, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
- 1996
Preliminary analyses and experiments conducted on data collected from a radio news program found that, using relatively straightforward acoustic measurements and classification techniques, it was possible to achieve better than 80% classification accuracy for seven salient sound classes present in the data, and nearly 94% classification accuracy for a speech/non-speech decision.
Real-time discrimination of broadcast speech/music
- Computer Science, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings
- 1996
A technique that successfully discriminates speech from music on broadcast FM radio is described; it robustly distinguishes the two classes and runs easily in real time.