Audio classification is an important problem in current audio processing and content analysis research. Speech/music classification is one of the most interesting branches of audio signal classification. In this paper we present an unsupervised clustering method, based on one-class support vector machines (OCSVM) and inspired by the classical K-means…
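As background, the classical K-means loop that the abstract says inspires the OCSVM-based method can be sketched as below; this is a plain-Python illustration of the standard algorithm only, not the paper's OCSVM variant (the `init` parameter is added here for deterministic demonstration):

```python
import random

def kmeans(points, k, iters=20, init=None, seed=0):
    """Plain K-means: alternate nearest-center assignment and mean update."""
    centers = list(init) if init else random.Random(seed).sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Update step: each center moves to the mean of its assigned points.
        centers = [tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return centers, clusters
```

Replacing the mean-based update with a per-cluster OCSVM model is, as the abstract suggests, where such a method would depart from this classical loop.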
This paper presents a new feature set for noisy speech recognition in the autocorrelation domain. The autocorrelation domain is well known for its pole-preserving and noise-separation properties. Therefore, in this paper we use the autocorrelation domain as an appropriate candidate for robust feature extraction. In our approach, initially, the lower lags of the…
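The (biased) autocorrelation of a speech frame, which such features start from, can be computed directly; this is a generic sketch, not the paper's full pipeline:

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[k] = sum_n x[n] * x[n+k] of one signal frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]
```

For noise that is roughly uncorrelated with the speech (e.g. white noise), its contribution concentrates in the lowest lags, particularly lag 0, which is why discarding or de-emphasizing lower lags can improve robustness.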
In this paper, we present a mean-quantization-based audio watermarking scheme in the wavelet transform domain. The watermark data were embedded by quantizing the means of two selected bands of the wavelet transform of the original audio signal. One of the bands was in the lower frequency range and the other in the higher frequency range. Adaptive step sizes…
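A minimal sketch of mean quantization on one band, assuming an even/odd parity rule with a fixed step `delta` (the paper's exact quantizer, band selection, and adaptive step sizes may differ):

```python
def embed_bit(band, bit, delta):
    """Shift all coefficients of a band equally so the band mean lands on an
    even (bit 0) or odd (bit 1) multiple of the quantization step delta."""
    mean = sum(band) / len(band)
    q = round(mean / delta)
    if q % 2 != bit:
        # Move to the nearest multiple of delta with the required parity.
        q += 1 if (mean / delta) - q >= 0 else -1
    shift = q * delta - mean
    return [c + shift for c in band]

def extract_bit(band, delta):
    """Read the bit back from the parity of the quantized band mean."""
    mean = sum(band) / len(band)
    return round(mean / delta) % 2
```

Quantizing the band mean, rather than individual coefficients, spreads the embedding distortion evenly across the band, which tends to keep it perceptually small.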
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features for speech recognition. However, MFCC-based speech recognition performance degrades in the presence of additive noise. In this paper, we propose a set of noise-robust features based on the conventional MFCC feature extraction method. Our proposed method consists of two steps. In the first…
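At the heart of conventional MFCC extraction is a filterbank spaced evenly on the mel scale; a small sketch of that frequency warping (the standard O'Shaughnessy formula, not anything specific to the proposed method):

```python
import math

def hz_to_mel(f):
    """Map frequency in Hz to the perceptual mel scale."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse mapping: mel back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_centers(n_filters, f_min, f_max):
    """Center frequencies (Hz) of n_filters triangular filters,
    equally spaced on the mel scale between f_min and f_max."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    return [mel_to_hz(lo + (hi - lo) * (i + 1) / (n_filters + 1))
            for i in range(n_filters)]
```

The full MFCC chain then applies this filterbank to the power spectrum, takes logs, and decorrelates with a DCT; noise-robust variants typically modify one of those stages.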
This paper presents a voice activity detection (VAD) scheme that combines several popular features. Since, in each noisy condition, one type of feature performs best at speech/non-speech classification, combining features can lead to better performance. Features are combined linearly, with weights that were obtained for each condition in the training stage…
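The linear fusion described above reduces to a weighted sum per frame followed by a threshold; a minimal sketch, where the weights and threshold are assumed to come from a training stage (the values below are illustrative, not the paper's):

```python
def vad_decision(features, weights, threshold):
    """Linear feature fusion for VAD: weighted sum of per-frame feature
    scores, thresholded into a speech (True) / non-speech (False) decision."""
    score = sum(w * f for w, f in zip(weights, features))
    return score >= threshold
```

Because the best-performing feature differs per noise condition, a practical system would select (or re-learn) the weight vector per detected condition rather than use one fixed set.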
Speech recognition in Persian (Farsi) has recently been addressed by a few native-speaking researchers, and some approaches to isolated word and phoneme recognition have been reported. A major bottleneck in this research field is the lack of a recognition-specific speech corpus. In this work, a phonetically balanced speech database of Persian has been…
This paper presents a novel feature extraction method for environmental sound classification. Although many kinds of audio features have been investigated for environmental sound classification tasks, most of them were designed to model the structure of speech signals, which explains their lower performance when dealing with other kinds of audio…
In this paper, our main purpose is to embed data in the frequency domain of audio signals. Data were embedded by means of quantization index modulation (QIM) in the frequency domain. To this end, the spectrum of the audio signal was divided into two parts: the first consisted of the first 19 Barks and the second of the remaining 6 Barks. Each of…
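Standard QIM selects between two interleaved quantization lattices according to the bit being hidden; a minimal sketch for one spectral coefficient (the paper's Bark-band partition and parameter choices are not modeled here):

```python
def qim_embed(x, bit, delta):
    """Quantization index modulation: quantize x onto the lattice selected
    by the bit (dither 0 for bit 0, delta/2 for bit 1)."""
    d = 0.0 if bit == 0 else delta / 2.0
    return delta * round((x - d) / delta) + d

def qim_extract(y, delta):
    """Decode by picking the bit whose lattice reconstructs y with less error."""
    e0 = abs(y - qim_embed(y, 0, delta))
    e1 = abs(y - qim_embed(y, 1, delta))
    return 0 if e0 <= e1 else 1
```

The step `delta` trades off robustness against embedding distortion: a larger step survives more channel noise but perturbs the spectrum more, which is why it is usually tuned per frequency band.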