Speech enhancement based on the hidden Markov model (HMM) and the minimum mean square error (MMSE) criterion in the Mel-frequency domain is generally regarded as weighted-sum filtering of the noisy speech. The filter weights are often estimated by the HMM of noisy speech, and estimating the filters usually requires an inverse operation from the ...
In this paper, a voice activity detection (VAD) algorithm based on the cyclic cumulant is proposed, using the cyclostationary properties of speech signals. The proposed scheme employs the third-order cyclic cumulant of the LPC residual of a speech signal. Analytical expressions for the third-order cyclic cumulant of the LPC residual of short-term speech are ...
Conventional bandwidth extension employs spectral patching methods, such as spectral folding, spectral translation, and non-linear processing, to reconstruct the high-frequency signal. However, this leads to spectral shifting between the reconstructed and original signals and does not retain the original harmonic relations. In this paper, a blind harmonic ...
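A minimal sketch of spectral folding, one of the patching methods named above, assuming its simplest textbook form: upsampling by 2 through zero insertion with no subsequent filtering. The function names, the frame length `N`, and the test tone at bin `k0` are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def spectral_fold(x):
    """Upsample by 2 via zero insertion (assumed textbook form of folding).

    The low band is mirrored about the old Nyquist frequency into the
    newly created high band.
    """
    y = []
    for sample in x:
        y.extend((sample, 0.0))
    return y


def dft_magnitudes(x):
    """Magnitudes of DFT bins 0..len(x)//2, computed directly (O(n^2))."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


N = 64   # narrowband frame length (illustrative)
k0 = 5   # DFT bin of the test tone (illustrative)
narrowband = [math.cos(2 * math.pi * k0 * t / N) for t in range(N)]

wideband = spectral_fold(narrowband)
magnitudes = dft_magnitudes(wideband)

# Bins that carry significant energy after folding.
peaks = sorted(k for k, m in enumerate(magnitudes) if m > 1.0)
```

Here `peaks` contains the original tone at bin `k0` and its mirror image at bin `N - k0`. A harmonic pair at bins 5 and 10 would be mirrored to bins 59 and 54, which are not multiples of 5, which is one way to see why folding does not retain harmonic relations.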
This paper addresses the problem of musical noise introduced by the conventional spectral subtraction method for speech enhancement. A modified spectral subtraction algorithm based on the masking properties of the human auditory system is proposed. In comparison with Virag's algorithm, the proposed method is modified in four respects. Firstly, ...
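For context, the conventional baseline can be sketched as per-bin power spectral subtraction with an over-subtraction factor and a spectral floor. This is not the masking-based modification proposed in the paper; the parameter names `alpha` and `beta`, their defaults, and the choice of a floor proportional to the noise estimate are assumptions made for illustration.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Per-bin power spectral subtraction (illustrative baseline).

    noisy_power: power spectrum of one noisy frame
    noise_power: estimated noise power spectrum, same length
    alpha:       over-subtraction factor (assumed name and default)
    beta:        spectral floor factor (assumed name and default)
    """
    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        residual = y - alpha * n   # subtract the scaled noise estimate
        floor = beta * n           # keep a small floor instead of a negative value
        enhanced.append(max(residual, floor))
    return enhanced


frame = spectral_subtract([4.0, 1.0], [1.0, 2.0], alpha=1.0, beta=0.1)
```

Bins where the noisy power only briefly exceeds the noise estimate survive as isolated peaks from frame to frame, which is commonly described as the source of musical noise.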
A multi-layer embedded speech and audio coding algorithm based on bit-plane coding and Scalar Quantized Vector Huffman Coding (SQVH) is proposed in this paper. In this codec, a signal sampled at 32 kHz can be coded at scalable bit rates. The core codec is International Telecommunication Union Telecommunication Standardization Sector (ITU-T) ...
A new 1 kb/s waveform interpolation (WI) speech coding algorithm based on non-negative matrix factorization (NMF) is proposed and implemented in this paper. Multi-frame parameter joint vector or matrix quantization, parameter prediction, and the discrete cosine transform (DCT) are used to reduce the bit rate and to improve the quality of speech. The results of ...
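As background on the NMF building block mentioned above, here is a minimal sketch of generic NMF with the standard multiplicative updates for the squared Frobenius error. It is not the paper's coder; the helper names, the random initialization, and the `eps` guard against division by zero are assumptions for illustration.

```python
import random


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    )


def nmf(v, rank, iterations=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V into W (rows x rank) and H (rank x cols)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(rows)]
    return w, h


# Illustrative rank-1 non-negative matrix.
v_demo = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
w_demo, h_demo = nmf(v_demo, rank=1, iterations=50)
```

Because the updates only multiply by non-negative ratios, `W` and `H` stay non-negative throughout, which is the property that makes the factors usable as additive parts of the input.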
This letter proposes an efficient parameterization of the three-dimensional room transfer function (RTF) that is robust to position variations of the source and receiver in their respective horizontal planes. Based on azimuth harmonic analysis, the proposed method exploits the underlying properties of the associated Legendre functions to remove a portion of the ...
Most current pitch detection algorithms do not work well in high-noise environments. For this reason, a pitch detection algorithm for noisy speech signals based on pre-filtering and weighted wavelet coefficients is proposed. Firstly, the noisy speech signals are pre-filtered. Secondly, the pre-filtered speech is decomposed by the quadratic ...
In this paper, projective non-negative matrix factorization (PNMF) with Bregman divergence is applied to musical instrument classification. A novel supervised learning algorithm for automatic classification of individual musical instrument sounds, inspired by PNMF with several versions of the Bregman divergence, is presented. Moreover, the ...
In this paper, a blind bandwidth extension approach is proposed based on a self-organizing map (SOM) prediction model. Owing to the nonlinear characteristics of audio spectrum series, the fine spectrum of low-frequency (LF) information is first described based on the principles of nonlinear dynamics, and then phase points are derived from reconstructing fine ...