In this work, we propose the marginal structured SVM (MSSVM) for structured prediction with hidden variables. MSSVM properly accounts for the uncertainty of hidden variables, and can significantly outperform the previously proposed latent structured SVM (LSSVM; Yu & Joachims (2009)) and other state-of-the-art methods, especially when that uncertainty is large.
We introduce a technique for augmenting neural text-to-speech (TTS) with low-dimensional trainable speaker embeddings to generate different voices from a single model. As a starting point, we show improvements over the two state-of-the-art approaches for single-speaker neural TTS: Deep Voice 1 and Tacotron. We introduce Deep Voice 2, which is based on a …
In this paper, we propose a fast and reliable impulse noise filter for highly corrupted images. The median filter was once the most popular nonlinear filter for removing impulse noise because of its good denoising power and computational efficiency, but its performance is unsatisfactory when the noise ratio is high. Many algorithms have therefore been proposed to …
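For orientation, the classic median filter that this abstract uses as its point of comparison can be sketched as follows. This is only the textbook baseline, not the filter proposed in the paper; the function name `median_filter`, the `radius` parameter, and the border handling (using only in-image neighbours) are assumptions made for illustration.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Apply a (2*radius+1) x (2*radius+1) median filter to a 2-D list of gray levels.

    Border pixels use only the neighbours that fall inside the image
    (an illustrative choice, not taken from the paper).
    """
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            # median_low always returns a value that is present in the window.
            output[r][c] = median_low(window)
    return output


# A flat gray patch with one "salt" (255) and one "pepper" (0) impulse.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
restored = median_filter(noisy)
```

With only two impulses in the patch, every window is dominated by the background value, so both outliers are replaced. When most pixels in a window are corrupted, the median itself is likely to be an impulse, which is the high-noise failure mode the abstract refers to.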
In this paper, we propose a new simplex unscented transform and a new filter, the modified unscented Kalman filter (MUKF), based on this transform. It has a lower computational cost than the UKF, the SUKF (Julier and Uhlmann, 2004) and the EKF (Athans et al., 1968). Computer simulations show that this filter has the same performance as the UKF and SUKF, and we analyze the …
Pseudocolor coding is a common method for medical image enhancement and display. Its key component is the gray-to-color mapping function. Based on the features of medical images and the human visual system, this paper develops a new nonlinear pseudocolor coding method based on gradient values. First, we compute the gradient image and define a threshold according …
In this paper, a weighted mutual information (WMI) measure is proposed for medical image registration. Conventional information measures, for example Shannon's entropy and mutual information (MI), are based on the statistical properties of the intensity values of images, and the floating image and the reference image play the same role when their MI is calculated. In fact, …
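As a reminder of the conventional measure this abstract contrasts itself with, the following is a minimal sketch of Shannon mutual information computed from the joint intensity histogram of two images. It does not implement the proposed WMI, whose weighting is not given in the truncated text; the function name and the flattened-list image representation are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(image_a, image_b):
    """Shannon mutual information (in bits) between two equally sized images.

    Each image is a flat sequence of intensity values; pixels at the same
    index are assumed to be spatially aligned.
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must contain the same number of pixels")
    n = len(image_a)
    joint = Counter(zip(image_a, image_b))  # joint intensity histogram
    hist_a = Counter(image_a)               # marginal histogram of image A
    hist_b = Counter(image_b)               # marginal histogram of image B
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) / (p(a) * p(b)) simplifies to count * n / (hist_a[a] * hist_b[b]).
        mi += (count / n) * log2(count * n / (hist_a[a] * hist_b[b]))
    return mi


reference = [0, 0, 1, 1]
aligned = [0, 0, 1, 1]    # intensities fully predictable from the reference
unrelated = [0, 1, 0, 1]  # intensities statistically independent of the reference
```

Swapping the two arguments leaves the result unchanged, which is the symmetry the abstract describes: standard MI treats the floating image and the reference image identically.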
In this paper, we present a novel near-duplicate video retrieval system serving one million web videos. To achieve both effectiveness and efficiency, a visual-word-based approach is proposed, which quantizes each video frame into a word and represents the whole video as a bag of words. The system can respond to a query in 41 ms with 78.4% MAP on average.
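The frame-to-word idea can be illustrated with a small sketch. The abstract does not say which frame features, codebook, or ranking measure the system uses, so the nearest-centroid quantizer, the toy 2-D features, and the cosine similarity below are all assumptions chosen only to make the representation concrete.

```python
from collections import Counter
from math import sqrt


def quantize_frame(feature, codebook):
    """Return the index of the codebook entry nearest to a frame feature vector."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((f - c) ** 2 for f, c in zip(feature, codebook[i])),
    )


def video_to_bag(frame_features, codebook):
    """Represent a video as a bag (multiset) of visual words, one word per frame."""
    return Counter(quantize_frame(feature, codebook) for feature in frame_features)


def cosine_similarity(bag_a, bag_b):
    """Cosine similarity between two word-count bags (0.0 if either is empty)."""
    dot = sum(bag_a[word] * bag_b[word] for word in bag_a.keys() & bag_b.keys())
    norm_a = sqrt(sum(count * count for count in bag_a.values()))
    norm_b = sqrt(sum(count * count for count in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical codebook of three visual words in a toy 2-D feature space.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
query = video_to_bag([(0.1, 0.0), (0.9, 1.1), (1.0, 0.8)], codebook)
near_duplicate = video_to_bag([(0.0, 0.2), (1.1, 1.0), (0.8, 0.9)], codebook)
different = video_to_bag([(5.1, 4.9), (4.8, 5.2)], codebook)
```

Because each video collapses to a sparse vector of word counts, small per-frame perturbations that stay within the same codebook cell leave the bag unchanged, and comparison reduces to sparse vector arithmetic.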
Marginal MAP inference involves making MAP predictions in systems defined with latent variables or missing information. It is significantly more difficult than pure marginalization and MAP tasks, for which a large class of efficient and convergent variational algorithms, such as dual decomposition, exist. In this work, we generalize dual decomposition to a …
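To make the task definition concrete, here is a brute-force sketch on a toy joint distribution. It enumerates every configuration, so it only illustrates what marginal MAP computes (maximize over the predicted variable after summing out the latent one); it is not the dual decomposition algorithm of the paper. The variable names and the probability table are made up for illustration.

```python
def marginal_map(joint):
    """Return argmax_x sum_z p(x, z) for a table mapping (x, z) -> probability."""
    totals = {}
    for (x, _z), p in joint.items():
        totals[x] = totals.get(x, 0.0) + p  # sum out the latent variable z
    return max(totals, key=totals.get)


def joint_map(joint):
    """Return the single most probable (x, z) configuration (pure MAP)."""
    return max(joint, key=joint.get)


# Toy distribution: x is the variable to predict, z is latent.
joint = {
    ("a", 0): 0.40,
    ("a", 1): 0.00,
    ("b", 0): 0.30,
    ("b", 1): 0.30,
}
best_x = marginal_map(joint)       # "b", since 0.30 + 0.30 > 0.40 + 0.00
best_config = joint_map(joint)     # ("a", 0), the single largest entry
```

The two answers disagree on `x`: pure MAP picks the configuration with the largest single entry, while marginal MAP accounts for all values of the latent variable. The sum inside the maximization is what prevents the two operations from being interchanged and makes the problem harder than either task alone.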
Many classical direction-of-arrival (DOA) estimation algorithms are sensitive to mutual coupling in the antenna array. In this paper, a stable method for DOA estimation is introduced. This method applies a group of auxiliary arrays, exploiting the banded symmetric Toeplitz matrix model for the mutual coupling in a uniform linear …