Acoustic environment recognition has been widely used in many applications, but it remains a considerably difficult problem in complex real-life environments. This paper proposes a novel feature, named minimum statistics project coefficients (MSPC), intended to solve this problem. The MSPC feature is extracted from the background sound which is more ...
This paper presents a robust approach to improve the performance of a voice activity detector (VAD) in noisy environments with a low signal-to-noise ratio (SNR). To this end, we first generate sparse representations by Bregman-iteration-based sparse decomposition with a learned over-complete dictionary, and derive an audio feature called sparse power spectrum ...
As a promising technique, sparse representation has been extensively investigated in the signal processing community. Recently, sparse representation has been widely used for speech processing in noisy environments; however, many problems remain to be solved because of the particularity of speech. One assumption for speech denoising with sparse representation is that ...
The maximum a posteriori (MAP) criterion is broadly used in statistical model-based voice activity detection (VAD) approaches. In the conventional MAP criterion, however, the inter-frame correlation of the voice activity is not taken into consideration. In this paper, we propose a novel modified MAP criterion based on a two-state hidden Markov model ...
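As a rough illustration of how a two-state hidden Markov model can bring inter-frame correlation into a MAP-style decision, the sketch below runs a standard forward recursion over per-frame likelihoods. It is not the modified criterion proposed in the paper, whose details are not given here; the function names, the `stay_prob` transition parameter, the prior, and the threshold are all hypothetical choices made for the example.

```python
def hmm_speech_posteriors(likelihoods, stay_prob=0.9, prior_speech=0.5):
    """Forward-filtered P(speech | frames so far) for a two-state HMM.

    likelihoods: list of (p_obs_given_noise, p_obs_given_speech), one per frame.
    stay_prob: assumed probability of staying in the same state between frames.
    prior_speech: assumed prior probability of speech at the first frame.
    """
    posteriors = []
    p_speech = prior_speech
    for i, (lik_noise, lik_speech) in enumerate(likelihoods):
        if i > 0:
            # Predict: push the previous posterior through the transition model.
            p_speech = stay_prob * p_speech + (1 - stay_prob) * (1 - p_speech)
        # Update: weight the prediction by the current frame's likelihoods.
        num = lik_speech * p_speech
        den = num + lik_noise * (1 - p_speech)
        p_speech = num / den
        posteriors.append(p_speech)
    return posteriors


def map_decisions(posteriors, threshold=0.5):
    """Label a frame as speech when its posterior exceeds the threshold."""
    return [p > threshold for p in posteriors]


if __name__ == "__main__":
    # Hypothetical likelihoods: clear speech, ambiguous frame, clear noise.
    frames = [(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]
    post = hmm_speech_posteriors(frames)
    print(post)
    print(map_decisions(post))
```

In this toy input the second frame is ambiguous on its own (equal likelihoods), so a frame-by-frame rule with a flat prior has no evidence for speech; the recursion carries over the belief from the preceding speech frame instead.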
In this paper, we propose a novel confidence measure to improve the performance of spoken term detection (STD). The proposed confidence measure is based on the context consistency between a hypothesized word and its context in the word lattice. When calculating the context consistency of a hypothesized word, the proposed confidence measure considers not only ...
The sensitivity-field distribution of sensors in an electrical capacitance tomography (ECT) system is influenced by the distribution of the multiphase-flow media, and the soft-field characteristics make image reconstruction very difficult. This paper mainly investigates an 8-electrode ECT system for oil-water two-phase flow. ...
Mismatch in speech bandwidth between training and real operation greatly degrades the performance of automatic speech recognition (ASR) systems. The missing feature technique (MFT) is effective in handling bandwidth mismatch. However, current MFT-based methods ignore the mismatch in the filterbank channels which cover the upper and lower limit cutoff ...
This study presents a robust audio retrieval method and discusses its control strategy. In the method, the retrieval target is divided into short segments, each segment is searched separately, and a retrieval window is used to maintain a list of segments that can be searched simultaneously. The method can quickly detect and locate known sounds in real-time ...
A confidence measure (CM) method using syllable-based confidence features is proposed to improve false-alarm rejection in Mandarin keyword spotting (KWS). The features take advantage of the Mandarin syllable structure and describe the confidence at every sub-syllable level. The evaluation is performed with a support vector machine (SVM) on ...